Iterating chars by Word
November 13, 2020
Given:
wchar[] chars;  // e.g. "import core.sys.windows.windows;\nimport std.conv      : to;\n"

Goal:
foreach ( word; chars.byWord )
{
    // ...
}

How can I iterate over chars word by word?
(Ideally simple, fast, low on memory, and elegant.)

November 12, 2020
On 11/12/20 9:14 PM, Виталий Фадеев wrote:
> [...]

import std.stdio;
import std.algorithm;
import std.uni;

void main() {
  auto s = "Виталий abcçdefgğhı   Фадеев"w;
  auto words = s.splitter;
  words.writefln!"%-(%s\n%)";
}

Note that splitter() without arguments differs from splitter!isWhite: the version I used above skips empty parts. The output contains only three parts even though I put multiple spaces in one place:

Виталий
abcçdefgğhı
Фадеев

Ali
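
To make the difference Ali mentions concrete, here is a minimal sketch (not from the thread) comparing the two calls on a short test string. The input "a   b" is an assumption chosen only for illustration; both calls are the Phobos functions named above.

```d
import std.algorithm : equal, splitter;
import std.uni : isWhite;

void main()
{
    auto s = "a   b"w; // three spaces between the two words

    // No-argument splitter: splits on whitespace and skips empty parts.
    assert(s.splitter.equal(["a"w, "b"w]));

    // Predicate form: every whitespace character is a separator, so
    // consecutive spaces produce empty parts.
    assert(s.splitter!isWhite.equal(["a"w, ""w, ""w, "b"w]));
}
```

For word iteration the no-argument form is usually the more convenient one, since no filtering of empty parts is needed afterwards.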

November 13, 2020
On Friday, 13 November 2020 at 05:14:08 UTC, Виталий Фадеев wrote:
> Is:
> wchar[] chars;  // like a: "import core.sys.windows.windows;\nimport std.conv      : to;\n"
>
> Goal:
> foreach ( word; chars.byWord )
> {
>     // ...
> }

You can write your own range, but first look at this function (second example):
https://dlang.org/phobos/std_algorithm_iteration.html#.splitter

    // 1) you might need to convert your wchar[] to a wstring first
    // 2) splits on the space character only, so it assumes words are separated by single spaces
    foreach (word; chars.splitter(' '))
    {

    }

or this one, which is a bit smarter about what "a word" means:
https://dlang.org/phobos/std_array.html#.split

    import std.array : split;
    import std.stdio : writeln;

    // .dup makes a mutable copy; casting away immutable from a literal is unsafe
    wchar[] str = "some random stuff blah blah"w.dup;
    foreach (w; str.split())
    {
        writeln(w);
    }

Anyway, in both cases compiling with dmd's -vgc flag reports no GC allocations.

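The two suggestions differ in one way that matters for the "low memory" goal: the Phobos documentation describes std.array.split as eager (it returns an array of all the words), while splitter is lazy. The sketch below illustrates this under stated assumptions: byWord is a hypothetical helper, not a Phobos function, added only so that the `chars.byWord` spelling from the question works via UFCS.

```d
import std.algorithm : equal, splitter;
import std.array : split;

// Hypothetical helper (not part of Phobos): a thin wrapper around the
// lazy, whitespace-based splitter.
auto byWord(wchar[] chars)
{
    return chars.splitter; // lazy; yields slices of `chars`
}

void main()
{
    wchar[] chars = "import std.conv      : to;\n"w.dup;

    // Lazy: words are produced one at a time while iterating.
    size_t count;
    foreach (word; chars.byWord)
        ++count;
    assert(count == 4);
    assert(chars.byWord.equal(["import"w, "std.conv"w, ":"w, "to;"w]));

    // Eager: split builds the whole array of words up front.
    wchar[][] words = chars.split;
    assert(words.length == 4);
}
```

If only a single pass over the words is needed, the lazy form is probably the better fit; split is handy when the words must be indexed or stored.
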
November 13, 2020
On Friday, 13 November 2020 at 06:42:24 UTC, evilrat wrote:
> [...]

Thanks, Ali. Thanks, Evilrat.
I'm trying it out now: https://run.dlang.io/is/HlSFVY

November 13, 2020
On Friday, 13 November 2020 at 06:52:38 UTC, Виталий Фадеев wrote:
> On Friday, 13 November 2020 at 06:42:24 UTC, evilrat wrote:
>> [...]
>
> Thanks, Ali. Thanks, Evilrat.
> I taste it now: https://run.dlang.io/is/HlSFVY


Latest: https://run.dlang.io/is/dfrcYj
November 13, 2020
On Friday, 13 November 2020 at 07:23:13 UTC, Виталий Фадеев wrote:
> On Friday, 13 November 2020 at 06:52:38 UTC, Виталий Фадеев wrote:
>> On Friday, 13 November 2020 at 06:42:24 UTC, evilrat wrote:
>>> [...]
>>
>> Thanks, Ali. Thanks, Evilrat.
>> I taste it now: https://run.dlang.io/is/HlSFVY
>
>
> Latest: https://run.dlang.io/is/riY5BI

It splits on every whitespace character...

Thanks. I'm heading into the world of lexers and parsers.