Thread overview
[Issue 7716] New: Add an indexed overload to countUntil
Mar 15, 2012
Andrej Mitrovic
Dec 10, 2012
Andrej Mitrovic
March 15, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7716

           Summary: Add an indexed overload to countUntil
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: andrej.mitrovich@gmail.com


--- Comment #0 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2012-03-15 09:55:27 PDT ---
Sometimes I want to find the index of a substring, but not necessarily the index of the first occurence. E.g. in the string "foo x foo y" I want to get the index of the second foo. Here's a simple implementation based on the existing countUntil:

import std.algorithm : startsWith;
import std.array;
import std.traits;

sizediff_t countUntil(alias pred = "a == b", R, N)(R haystack, N needle, size_t
index)
if (is(typeof(startsWith!pred(haystack, needle))))
{
    size_t count;
    static if (isNarrowString!R)
    {
        // Narrow strings are handled a bit differently
        auto length = haystack.length;
        for (; !haystack.empty; haystack.popFront())
        {
            if (startsWith!pred(haystack, needle))
            {
                if (count == index)
                    return length - haystack.length;
                else
                    count++;
            }
        }
    }
    else
    {
        typeof(return) result;
        for (; !haystack.empty; ++result, haystack.popFront())
        {
            if (startsWith!pred(haystack, needle))
            {
                if (count == index)
                    return result;
                else
                    count++;
            }
        }
    }
    return -1;
}

void main()
{
    string s = "foo x foo y";

    auto idx1 = s.countUntil("foo", 0);
    auto idx2 = s.countUntil("foo", 1);
    auto idx3 = s.countUntil("foo", 2);

    assert(idx1 == 0);
    assert(idx2 == 6);
    assert(idx3 == -1);
}

Equivalent functionality could be added to std.string.indexOf to allow case-unsensitive searches.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 10, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7716


Andrej Mitrovic <andrej.mitrovich@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX


--- Comment #1 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2012-12-10 14:51:28 PST ---
This niche feature belongs in user-space libraries, not Phobos.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 10, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7716


bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs@eml.cc


--- Comment #2 from bearophile_hugs@eml.cc 2012-12-10 15:28:19 PST ---
This belongs in Phobos. See Python string functions, that have optional start-end:

>>> help(str.find)
Help on method_descriptor:

find(...)
    S.find(sub [,start [,end]]) -> int

    Return the lowest index in S where substring sub is found,
    such that sub is contained within s[start:end].  Optional
    arguments start and end are interpreted as in slice notation.

    Return -1 on failure.

>>> help(str.index)
Help on method_descriptor:

index(...)
    S.index(sub [,start [,end]]) -> int

    Like S.find() but raise ValueError when the substring is not found.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7716


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra@gmail.com


--- Comment #3 from monarchdodra@gmail.com 2012-12-11 05:30:54 PST ---
(In reply to comment #0)
> Sometimes I want to find the index of a substring, but not necessarily the index of the first occurence. E.g. in the string "foo x foo y" I want to get the index of the second foo.

This would conflict with this other request: http://d.puremagic.com/issues/show_bug.cgi?id=5507

which would allow counting until multiple needles, eg:
//----
string s = "foo x bar y";
size_t index = s.countUntil("bar", 'x');
assert(index == 4);
//----

So what you are asking for may have to be implemented as "countUntilN" or something:

//-----
string s = "foo x bar y";
size_t index = s.countUntilN(1, "bar", 'x');
assert(index == 6); //First found 'x', then found "bar".
//-----

> Here's a simple implementation based on the
> existing countUntil:

Just want to point out that that implementation is based on an old and buggy
implementation of countUntil. 2 problems:
1. No support for reference type ranges.
2. Incorrect support of unicode.

The issue 1 is not that big of an issue, but is unacceptable of a standard library.

Issue 2 depends on what you want. Do you want the *index* in the string, or the logical *range_position*:

"日本語".countUntil(本); //1 popFront once
"日本語".countUntil(本); //3 slice index: "日本語"[3 .. $]

If you plan on using countUntil(N), then I'd recommend basing your implementation on https://github.com/D-Programming-Language/phobos/pull/951

Finally, keep in mind that the standard function "count" skips over overlaps:
"ababab".count("abab"); //produces 1. Only 1 match.
However, your countUntil implementation would find that second match:
"ababab".countUntilN(1, "abab"); //Produces 2
This would be kind of weird:
"Given a range which holds 1 instance of a needle, countUntilN is capable of
finding the second occurrence of said needle :/"

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------