[Issue 20184] String maxsplit
September 01, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

Jon Degenhardt <jrdemail2000-dlang@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jrdemail2000-dlang@yahoo.com

--- Comment #1 from Jon Degenhardt <jrdemail2000-dlang@yahoo.com> ---
This can be achieved using 'splitter' and 'take' or another range iteration algorithm that limits the number of candidates selected.

e.g.

assert("a|bc|def".splitter('|').take(4).equal([ "a", "bc", "def" ]));
assert("a|bc|def".splitter('|').take(3).equal([ "a", "bc", "def" ]));
assert("a|bc|def".splitter('|').take(2).equal([ "a", "bc" ]));
assert("a|bc|def".splitter('|').take(1).equal([ "a" ]));

'splitter' (from std.algorithm) is a lazy version of 'split', which is eager. It produces an input range. 'take' (from std.range) takes the first N elements from an input range. 'take' is also lazy. To convert the result to a fully realized array similar to the result of 'split', use 'array' (from std.array) or another eager range algorithm. e.g.

auto x = "a|bc|def".splitter('|').take(2).array;
assert(x.length == 2);
assert(x[0] == "a");
assert(x[1] == "bc");

--
September 01, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

--- Comment #2 from svnpenn@gmail.com ---
(In reply to Jon Degenhardt from comment #1)
> This can be achieved using 'splitter' and 'take' or another range iteration algorithm that limits the number of candidates selected.
> 
> e.g.
> 
> assert("a|bc|def".splitter('|').take(4).equal([ "a", "bc", "def" ]));
> assert("a|bc|def".splitter('|').take(3).equal([ "a", "bc", "def" ]));
> assert("a|bc|def".splitter('|').take(2).equal([ "a", "bc" ]));

It seems you have a profound misunderstanding of what split limiting is. Here is a result with Python:

    >>> 'one two three'.split(maxsplit = 1)
    ['one', 'two three']

As you can see, it doesn't discard any part of the original input. Instead, it stops splitting after the specified number of splits and puts the rest of the string in the final element.

--
September 01, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

--- Comment #3 from Jon Degenhardt <jrdemail2000-dlang@yahoo.com> ---
(In reply to svnpenn from comment #2)
> (In reply to Jon Degenhardt from comment #1)
> Here is a result with Python:
> 
>     >>> 'one two three'.split(maxsplit = 1)
>     ['one', 'two three']
> 
> As you can see, it doesn't discard any part of the original input. Instead, it stops splitting after the specified number of splits and puts the rest of the string in the final element.

Thanks for clarifying what you are looking for. This is a useful refinement of the original description, which is:

> D seems to have no way to limit the number of splits done on a string.

D does have a way to limit the number of splits, but as you point out, this mechanism doesn't preserve the remainder of the string in the fashion available in a number of other libraries.
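
To make the difference concrete, here is a minimal sketch contrasting the two behaviours. It assumes a single split is wanted and uses 'findSplit' (from std.algorithm.searching), which splits once at the first separator and keeps the remainder. Note that Python's 'split()' without a separator argument splits on runs of whitespace, so the equivalence is only claimed for this particular input.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.algorithm.searching : findSplit;
import std.range : take;

void main()
{
    // 'take' limits the number of pieces but drops the rest of the input.
    assert("one two three".splitter(' ').take(1).equal(["one"]));

    // 'findSplit' splits once at the first separator and keeps the remainder:
    // parts[0] is the text before it, parts[1] the separator, parts[2] the rest.
    auto parts = "one two three".findSplit(" ");
    assert(parts[0] == "one");
    assert(parts[2] == "two three");
}
```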

--
September 02, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

Alex <sascha.orlov@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sascha.orlov@gmail.com

--- Comment #4 from Alex <sascha.orlov@gmail.com> ---
As a workaround, this is possible:

´´´
import std;

void main()
{
    "one two three four".fun1(1).writeln;
    "one two three four".fun2(2).writeln;
}

auto fun1(string s, size_t num)
{
    size_t summe;
    auto r = s.splitter(' ').take(num).tee!(a => summe += a.length + 1).array;
    return r ~ s[summe .. $];
}

auto fun2(string s, size_t num)
{
    auto i = s.splitter(' ').take(num);
    return i.array ~ s[i.map!(el => el.length).sum + num .. $];
}
´´´

If the splitter construct allowed public access to its underlying range, more convenient solutions would be possible.
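
For comparison, an eager variant can be sketched by calling 'findSplit' repeatedly instead of summing element lengths. The name 'splitN' and its signature are hypothetical, not part of Phobos. One difference from the functions above is that the slice offsets never have to be computed by hand, so asking for more splits than there are separators cannot slice past the end of the string.

```d
import std.algorithm.searching : findSplit;

/// Hypothetical helper (not part of Phobos): splits `s` at most `maxSplit`
/// times on `sep` and keeps the unsplit remainder as the last element.
string[] splitN(string s, string sep, size_t maxSplit)
{
    string[] result;
    foreach (_; 0 .. maxSplit)
    {
        auto parts = s.findSplit(sep);
        if (parts[1].length == 0)   // separator not found: nothing left to split
            break;
        result ~= parts[0];
        s = parts[2];
    }
    result ~= s;                    // remainder, possibly empty
    return result;
}

void main()
{
    assert("one two three four".splitN(" ", 1) == ["one", "two three four"]);
    assert("one two three four".splitN(" ", 2) == ["one", "two", "three four"]);
    // Asking for more splits than there are separators is safe.
    assert("one two".splitN(" ", 5) == ["one", "two"]);
    assert("one".splitN(" ", 0) == ["one"]);
}
```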

--
September 07, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

--- Comment #5 from svnpenn@gmail.com ---
Here is a better workaround:

    import std.format, std.stdio;
    void main() {
       string s1 = "one two three", s2, s3;
       s1.formattedRead("%s %s", s2, s3);
       writeln(s2);
       writeln(s3);
    }

--
September 19, 2019
https://issues.dlang.org/show_bug.cgi?id=20184

Berni <dlang@croco-puzzle.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dlang@croco-puzzle.com

--- Comment #6 from Berni <dlang@croco-puzzle.com> ---
I've had a look at this. I think it's not feasible to add another parameter "maxsplit" to split. Internally, split uses splitter, and splitter works with BidirectionalRange. That means that, to implement back, splitter would have to go through all elements from the front to find the correct breakpoint. That breaks laziness, which in my eyes is not desirable.

Therefore I think it would be better to implement separate functions splitN and splitterN. splitterN would then be restricted to ForwardRange.
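
A rough sketch of what such a lazy, forward-only range could look like is shown below. Everything here is an assumption made for illustration: the names 'SplitterN' and 'splitterN', the restriction to 'string' input with a 'string' separator, and the internal layout. It is not a proposed Phobos implementation. It provides only 'front', 'popFront', 'empty' and 'save', so no 'back' is needed and nothing is scanned ahead of time.

```d
import std.algorithm.comparison : equal;
import std.algorithm.searching : findSplit;
import std.range.primitives : isForwardRange;

/// Hypothetical lazy range (not part of Phobos): yields at most `maxSplit`
/// split pieces, then the unsplit remainder as the final element.
struct SplitterN
{
    private string rest;        // input that has not been split yet
    private string sep;
    private size_t splitsLeft;
    private string current;     // element returned by front
    private bool exhausted;     // true once the remainder is the current element
    private bool isEmpty;

    this(string s, string sep, size_t maxSplit)
    {
        this.rest = s;
        this.sep = sep;
        this.splitsLeft = maxSplit;
        popFront();             // prime the first element
    }

    @property bool empty() { return isEmpty; }
    @property string front() { return current; }

    void popFront()
    {
        if (exhausted)
        {
            isEmpty = true;
            return;
        }
        if (splitsLeft > 0)
        {
            auto parts = rest.findSplit(sep);
            if (parts[1].length != 0)   // separator found: yield one piece
            {
                current = parts[0];
                rest = parts[2];
                --splitsLeft;
                return;
            }
        }
        current = rest;         // no splits left or no separator: yield remainder
        exhausted = true;
    }

    @property SplitterN save() { return this; }
}

/// Hypothetical convenience function for UFCS-style calls.
auto splitterN(string s, string sep, size_t maxSplit)
{
    return SplitterN(s, sep, maxSplit);
}

static assert(isForwardRange!SplitterN);

void main()
{
    assert("a|bc|def".splitterN("|", 1).equal(["a", "bc|def"]));
    assert("a|bc|def".splitterN("|", 2).equal(["a", "bc", "def"]));
    assert("a|bc|def".splitterN("|", 9).equal(["a", "bc", "def"]));
    assert("a|bc|def".splitterN("|", 0).equal(["a|bc|def"]));
}
```

An eager 'splitN' could then be obtained by applying 'array' (from std.array) to this range.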

--
December 17, 2022
https://issues.dlang.org/show_bug.cgi?id=20184

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P4

--