Thread overview
"strtok" D equivalent
Jul 28, 2022
pascal111
Jul 28, 2022
rikki cattermole
Jul 28, 2022
pascal111
Jul 28, 2022
H. S. Teoh
Jul 28, 2022
Paul Backus
Jul 28, 2022
pascal111
Jul 28, 2022
Paul Backus
Jul 28, 2022
pascal111
Jul 29, 2022
pascal111
July 28, 2022

What's the D equivalent of the C function "strtok"?

https://en.cppreference.com/w/cpp/string/byte/strtok

July 29, 2022
I don't know of a D version, although it should be pretty easy to write up yourself.

But you can always use strtok itself.

https://github.com/dlang/dmd/blob/09d04945bdbc0cba36f7bb1e19d5bd009d4b0ff2/druntime/src/core/stdc/string.d#L97

Very similar to the example given in the docs:

```d
void main()
{
    import std.stdio, std.algorithm;
    string input = "one + two * (three - four)!";
    string delimiters = "!+-(*)";

    foreach(value; input.splitWhen!((a, b) => delimiters.canFind(b))) {
        writeln(value);
    }
}
```
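
For reference, calling the C function directly might look like the sketch below. It assumes the `strtok` binding from `core.stdc.string` linked above; the helper name `cTokens` is made up for illustration. Because strtok overwrites delimiters in its input and keeps hidden static state, the sketch works on a mutable, zero-terminated copy, and it is neither thread-safe nor re-entrant.

```d
import core.stdc.string : strtok;
import std.stdio : writeln;
import std.string : fromStringz, toStringz;

// Hypothetical helper (not part of Phobos): tokenizes with C's strtok.
string[] cTokens(string input, string delims)
{
    // strtok overwrites delimiters with '\0', so give it a mutable,
    // zero-terminated copy instead of the immutable D string.
    char[] buffer = (input ~ '\0').dup;
    const(char)* cDelims = delims.toStringz;

    string[] tokens;
    for (char* tok = strtok(buffer.ptr, cDelims); tok !is null;
         tok = strtok(null, cDelims))
    {
        tokens ~= tok.fromStringz.idup; // copy each token out of the buffer
    }
    return tokens;
}

void main()
{
    // Prints one, two, three and four, each on its own line.
    foreach (token; cTokens("one + two * (three - four)!", " !+-(*)"))
        writeln(token);
}
```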
July 28, 2022

On Thursday, 28 July 2022 at 19:17:26 UTC, pascal111 wrote:
> What's the D equivalent of the C function "strtok"?
>
> https://en.cppreference.com/w/cpp/string/byte/strtok

Closest thing is probably std.algorithm.splitter with a predicate:

```d
import std.algorithm: splitter, canFind;
import std.stdio;

void main()
{
    string input = "one + two * (three - four)!";
    string delimiters = "! +- (*)";
    auto tokens = input.splitter!(c => delimiters.canFind(c));
    foreach (token; tokens) {
        writef("\"%s\" ", token);
    }
}
```

Output:

"one" "" "" "two" "" "" "" "three" "" "" "four" "" ""

Unlike strtok, this code does not skip over sequences of multiple consecutive delimiters, so you end up with a bunch of empty tokens in the output. To exclude them, you can use std.algorithm.filter:

```d
import std.algorithm: filter;
import std.range: empty;
import std.functional: not;

// ...

    auto tokens = input
        .splitter!(c => delimiters.canFind(c))
        .filter!(not!empty);

// ...
```
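
Putting the two fragments together, a complete program might look like the sketch below. It only assembles the pieces already shown in this post (same input, same delimiters, same pipeline); nothing new is assumed beyond merging the import lists.

```d
import std.algorithm : canFind, filter, splitter;
import std.functional : not;
import std.range : empty;
import std.stdio : writef, writeln;

void main()
{
    string input = "one + two * (three - four)!";
    string delimiters = "! +- (*)";

    // Split at every delimiter character, then drop the empty pieces
    // that appear between consecutive delimiters.
    auto tokens = input
        .splitter!(c => delimiters.canFind(c))
        .filter!(not!empty);

    foreach (token; tokens)
        writef("\"%s\" ", token);
    writeln();
    // Prints: "one" "two" "three" "four"
}
```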
July 28, 2022
On Thursday, 28 July 2022 at 19:37:31 UTC, rikki cattermole wrote:
> I don't know of a D version, although it should be pretty easy to write up yourself.
>
> But you can always use strtok itself.
>
> https://github.com/dlang/dmd/blob/09d04945bdbc0cba36f7bb1e19d5bd009d4b0ff2/druntime/src/core/stdc/string.d#L97
>
> Very similar to the example given in the docs:
>
> ```d
> void main()
> {
>     import std.stdio, std.algorithm;
>     string input = "one + two * (three - four)!";
>     string delimiters = "!+-(*)";
>
>     foreach(value; input.splitWhen!((a, b) => delimiters.canFind(b))) {
>         writeln(value);
>     }
> }
> ```

Where can I find details about properties like "canFind" and "splitWhen", or other properties? The following link mentions more properties:

https://dlang.org/spec/arrays.html
July 28, 2022
On Thu, Jul 28, 2022 at 09:03:55PM +0000, pascal111 via Digitalmars-d-learn wrote:
> On Thursday, 28 July 2022 at 19:37:31 UTC, rikki cattermole wrote:
[...]
> >     foreach(value; input.splitWhen!((a, b) => delimiters.canFind(b))) {
> >         writeln(value);
> >     }
> > }
> > ```
> 
> Where can I find details about properties like "canFind" and "splitWhen", or other properties? The following link mentions more properties:
> 
> https://dlang.org/spec/arrays.html

These are not properties; these are function calls using UFCS (Uniform Function Call Syntax).  In a nutshell, whenever the compiler sees a function call of the form:

	object.funcName(args);

but `object` does not have a member function named `funcName`, then the compiler will rewrite it instead to:

	funcName(object, args);

So, if you have a function that takes a string as a 1st argument, let's say:

	string myStringOp(string s) { ... }

then you can write:

	"abc".myStringOp();

instead of:

	myStringOp("abc");

This in itself may seem like a rather inane syntactic hack, but the swapping of function name and first argument allows you to chain several nested function calls together while keeping the calling order the same as the visual order:

	// What does this do?? You have to scan back and forth to figure
	// out what is nested in what.  Hard to read.
	writeln(walkLength(filter!(l => l > 3)(map!(e => e.length)(
		["a", "abc", "def", "ghij"]))));

can be rewritten using UFCS in the more readable form:

	// Much easier to read: take an array, map each element to
	// length, filter by some predicate, and count the number of
	// matches.
	writeln(["a", "abc", "def", "ghij"]
		.map!(e => e.length)
		.filter!(l => l > 3)
		.walkLength);

The identifiers splitWhen and canFind in the original code snippet are Phobos library functions. Pretty much all of the functions in std.algorithm, std.range, std.array, and std.string can be used in this manner.
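
As a quick check of the rewrite rule described above, here is a small self-contained sketch. `shout` is a made-up helper used only for illustration; it is declared at module scope because UFCS does not apply to functions nested inside another function. The asserts confirm that the nested and the chained spellings compute the same thing.

```d
import std.algorithm : filter, map;
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical free function taking a string as its first argument.
string shout(string s)
{
    return s ~ "!";
}

void main()
{
    // Both spellings call the same function.
    assert(shout("abc") == "abc".shout);

    auto words = ["a", "abc", "def", "ghij"];

    // Nested form: read inside-out.
    auto nested = walkLength(filter!(l => l > 3)(map!(e => e.length)(words)));

    // UFCS chain: read top to bottom.
    auto chained = words
        .map!(e => e.length)
        .filter!(l => l > 3)
        .walkLength;

    assert(nested == chained);
    writeln(chained); // 1, since only "ghij" is longer than 3
}
```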


T

-- 
People walk. Computers run.
July 28, 2022

On Thursday, 28 July 2022 at 20:36:31 UTC, Paul Backus wrote:
> [...]
>     auto tokens = input
>         .splitter!(c => delimiters.canFind(c))
>         .filter!(not!empty);

I think "tokens" is a range. I haven't read much about ranges, but as far as I can tell there's no direct way to know the number of elements in one. How can you find out the order of the elements and the length of the range?

July 28, 2022

On Thursday, 28 July 2022 at 21:52:28 UTC, pascal111 wrote:
> I think "tokens" is a range. [...] How can you find out the order of the elements and the length of the range?

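In this case, the practical way is to convert the range to an array, using std.array.array. The lazy range returned by filter has no length property and cannot be indexed, whereas an array supports both: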

```d
import std.array: array;

// ...

    string[] tokens = input
        .splitter!(c => delimiters.canFind(c))
        .filter!(not!empty)
        .array;
```
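
If only the number of tokens is needed, counting by iteration is another option. The sketch below is an illustration rather than part of the original answer: it uses `std.range.walkLength` (which walks the whole range, so it costs a full pass) on a saved copy of the range, and then converts to an array for comparison. The helper name `lazyTokens` is made up.

```d
import std.algorithm : canFind, filter, splitter;
import std.array : array;
import std.functional : not;
import std.range : empty, walkLength;
import std.stdio : writeln;

// Hypothetical helper: builds the lazy token range from the post above.
auto lazyTokens(string input, string delimiters)
{
    return input
        .splitter!(c => delimiters.canFind(c))
        .filter!(not!empty);
}

void main()
{
    auto tokens = lazyTokens("one + two * (three - four)!", "! +- (*)");

    // Count by walking a saved copy, leaving `tokens` itself untouched.
    writeln(tokens.save.walkLength); // 4

    // An array stores the elements, so length and indexing are available.
    string[] stored = tokens.array;
    writeln(stored.length); // 4
    writeln(stored[2]);     // three
}
```

Elements come out in the order they appear in the input, whether the range is iterated lazily or stored in an array.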
July 28, 2022

On Thursday, 28 July 2022 at 23:16:15 UTC, Paul Backus wrote:
> [...]
>     string[] tokens = input
>         .splitter!(c => delimiters.canFind(c))
>         .filter!(not!empty)
>         .array;

What about this version:

```d
// Presumably lives in dcollect.d; the imports below are the ones
// the function needs.
import std.algorithm: splitter, canFind, filter;
import std.range: empty;
import std.functional: not;
import std.array: array;

string[] d_strtok(const string ch, const string delim)
{
    // Split at each delimiter, drop empty pieces, store the rest.
    string[] tokens = ch
        .splitter!(c => delim.canFind(c))
        .filter!(not!empty)
        .array;

    return tokens;
}
```

```d
module main;

import std.stdio;
import std.string;
import std.conv;
import dcollect;
import std.math;
/*import std.algorithm;
import std.range: empty;
import std.functional: not;
import std.array;
*/

int main(string[] args)
{
    string bad_guy = "This is,, an, statement.";
    string[] coco = d_strtok(bad_guy, " ,.");

    for (int i = 0; i < coco.length; i++)
        writeln(coco[i]);

    return 0;
}
```

Honestly, I don't fully understand all of this code, although it works fine.
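
One way to see what the pipeline does is to spell the same steps out as an explicit loop. The sketch below is only an illustration: `tokenizeByHand` is a made-up name, and it assumes the delimiters are single-byte ASCII characters, since it walks the string one `char` at a time. The comments map each step to the corresponding stage of the splitter/filter/array chain.

```d
import std.algorithm : canFind;
import std.stdio : writeln;

// Hypothetical loop-based equivalent of d_strtok (ASCII delimiters assumed).
string[] tokenizeByHand(string text, string delims)
{
    string[] tokens;
    size_t start = 0; // index where the current token began

    foreach (i, char c; text)
    {
        if (delims.canFind(c))              // splitter: a delimiter ends the token
        {
            if (i > start)                  // filter!(not!empty): skip empty pieces
                tokens ~= text[start .. i]; // array: store the token
            start = i + 1;
        }
    }

    if (start < text.length)                // last token, if the text does not
        tokens ~= text[start .. $];         // end with a delimiter

    return tokens;
}

void main()
{
    foreach (token; tokenizeByHand("This is,, an, statement.", " ,."))
        writeln(token);
    // Prints This, is, an and statement, each on its own line.
}
```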

July 29, 2022

On Thursday, 28 July 2022 at 23:16:15 UTC, Paul Backus wrote:
> [...]
>     string[] tokens = input
>         .splitter!(c => delimiters.canFind(c))
>         .filter!(not!empty)
>         .array;

This is the first program using "d_strtok":
https://github.com/pascal111-fra/D/blob/main/proj03.d

This is the "dcollect" module:
https://github.com/pascal111-fra/D/blob/main/dcollect.d