February 15, 2017
On Wednesday, February 15, 2017 12:54:02 Walter Bright via Digitalmars-d wrote:
> On 2/15/2017 10:51 AM, Andrei Alexandrescu wrote:
> > isStringLike. I wanted to add this for a while already. Please do! -- Andrei
> What I've found messy and confusing with string overloads in Phobos is there are at least 6 kinds of strings:
>
> 1. auto decoding dynamic arrays
> 2. not auto decoding arrays
> 3. static arrays
> 4. aggregates with an 'alias this' to a string
> 5. ranges of characters
> 6. something convertible to a string
>
> These classifications seem to be tested for in a unique ad-hoc manner in every case.
>
> I'd like to take a step back and devise a consistent taxonomy of these things, based on how Phobos uses them, before adding more names.

Yeah. It's a bit of a mess. And types with alias this and enums _really_ don't help things.

- Jonathan M Davis

February 15, 2017
On Wednesday, 15 February 2017 at 20:54:02 UTC, Walter Bright wrote:
> I'd like to take a step back and devise a consistent taxonomy of these things

Ok

> 1. auto decoding dynamic arrays

Narrow strings

> 2. not auto decoding arrays

Wide strings

> 3. static arrays

Do these need to be called anything other than "static arrays"? Also, they're not ranges, so they're usually tested with isStaticArray

> 4. aggregates with an 'alias this' to a string

isConvertibleToString

> 5. ranges of characters

Character range

> 6. something convertible to a string

Same as 4

February 15, 2017
On 2/15/2017 1:03 PM, Jack Stouffer wrote:
> On Wednesday, 15 February 2017 at 20:54:02 UTC, Walter Bright wrote:
>> I'd like to take a step back and devise a consistent taxonomy of these things
>
> Ok
>
>> 1. auto decoding dynamic arrays
>
> Narrow strings
>
>> 2. not auto decoding arrays
>
> Wide strings
>
>> 3. static arrays
>
> Do these need to be called anything other than "static arrays"? Also, they're
> not ranges, so they're usually tested with isStaticArray
>
>> 4. aggregates with an 'alias this' to a string
>
> isConvertibleToString
>
>> 5. ranges of characters
>
> Character range
>
>> 6. something convertible to a string
>
> Same as 4
>

That's a good start. A test of that is to look at Phobos' actual usage of constraints and see if they fit in.

February 15, 2017
On Wednesday, February 15, 2017 21:03:46 Jack Stouffer via Digitalmars-d wrote:
> On Wednesday, 15 February 2017 at 20:54:02 UTC, Walter Bright wrote:
> > I'd like to take a step back and devise a consistent taxonomy of these things
>
> Ok
>
> > 1. auto decoding dynamic arrays
>
> Narrow strings
>
> > 2. not auto decoding arrays
>
> Wide strings
>
> > 3. static arrays
>
> Do these need to be called anything other than "static arrays"? Also, they're not ranges, so they're usually tested with isStaticArray
>
> > 4. aggregates with an 'alias this' to a string
>
> isConvertibleToString
>
> > 5. ranges of characters
>
> Character range
>
> > 6. something convertible to a string
>
> Same as 4

Except that you're forgetting enums. Also, there's this mess:

enum bool isNarrowString(T) =
    (is(T : const char[]) || is(T : const wchar[])) &&
    !isAggregateType!T &&
    !isStaticArray!T;

enum bool isAutodecodableString(T) =
    (is(T : const char[]) || is(T : const wchar[])) &&
    !isStaticArray!T;

A type with alias this passes isAutodecodableString but not isNarrowString, making for a really subtle difference. Also, enums of strings pass both, which is potentially a problem, as they really should be treated the same as types with alias this given how they need to be used in a templated function. Enums also pass isSomeString, which makes using isSomeString a no-go for any range-based function if it doesn't then test for enums - but aggregate types _don't_ pass isSomeString. And then there's

template isConvertibleToString(T)
{
    enum isConvertibleToString =
        (isAggregateType!T || isStaticArray!T || is(T == enum))
        && is(StringTypeOf!T);
}

So, regardless of the exact terminology, we have a whole set of very similar but subtly different traits. And as it stands, they _will_ get screwed up unless someone is carefully looking at each to make sure that they actually use the right one as well as testing with various types that frequently get missed in unit tests - like types which use alias this or enums with a base type of string.
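To make the difference between the first two traits concrete, here is a minimal sketch. It uses local copies of the two definitions quoted above, under the made-up names narrowAsQuoted and autodecodableAsQuoted, so it checks the definitions as posted rather than whatever a given Phobos release ships. Wrapper and Color are hypothetical test types.

```d
import std.traits : isAggregateType, isStaticArray;

// Local copies of the two quoted definitions (names are hypothetical).
enum bool narrowAsQuoted(T) =
    (is(T : const char[]) || is(T : const wchar[])) &&
    !isAggregateType!T &&
    !isStaticArray!T;

enum bool autodecodableAsQuoted(T) =
    (is(T : const char[]) || is(T : const wchar[])) &&
    !isStaticArray!T;

// Hypothetical test types: an aggregate with alias this, and a string enum.
struct Wrapper { string payload; alias payload this; }
enum Color : string { red = "red" }

// Plain strings pass both traits.
static assert( narrowAsQuoted!string  &&  autodecodableAsQuoted!string);
// The alias this type passes only the second one.
static assert(!narrowAsQuoted!Wrapper &&  autodecodableAsQuoted!Wrapper);
// The string enum passes both.
static assert( narrowAsQuoted!Color   &&  autodecodableAsQuoted!Color);
// Static arrays and dstring pass neither.
static assert(!narrowAsQuoted!(char[4]) && !autodecodableAsQuoted!(char[4]));
static assert(!narrowAsQuoted!dstring   && !autodecodableAsQuoted!dstring);

void main() {}
```

These are exactly the types that tend to be missing from unit tests, so a function constrained on one of the traits can look correct until someone instantiates it with Wrapper or Color.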

- Jonathan M Davis

February 15, 2017
On Wed, Feb 15, 2017 at 09:03:46PM +0000, Jack Stouffer via Digitalmars-d wrote:
> On Wednesday, 15 February 2017 at 20:54:02 UTC, Walter Bright wrote:
> > I'd like to take a step back and devise a consistent taxonomy of these things
> 
> Ok
> 
> > 1. auto decoding dynamic arrays
> 
> Narrow strings
> 
> > 2. not auto decoding arrays
> 
> Wide strings
> 
> > 3. static arrays
> 
> Do these need to be called anything other than "static arrays"? Also, they're not ranges, so they're usually tested with isStaticArray
> 
> > 4. aggregates with an 'alias this' to a string
> 
> isConvertibleToString
> 
> > 5. ranges of characters
> 
> Character range
> 
> > 6. something convertible to a string
> 
> Same as 4

This describes the current state of Phobos, but I think what Walter is driving at is, does it *have* to be this way?  I think we can (and should) simplify this taxonomy to avoid needless duplication and also create a more consistent conceptual model of how Phobos deals with strings and string-like things.

First of all, I think we should try our best to avoid treating arrays and character ranges differently.  I know this is inevitable because we're still in a state where autodecoding can't be fully eliminated yet, but as far as possible, I think we should try to treat them the same way.

(1) Functions that need to work with string contents (e.g., find, toUpperCase, etc.) should in general accept any input ranges whose elements are convertible in some way to dchar.  Some of these functions may require forward ranges or bidirectional / random-access ranges. Most of these functions can allow infinite ranges (so we only need the occasional !isInfinite check).

(2) Functions that don't need to work with string contents, i.e., the strings are treated as opaque blobs to be passed to, say, an underlying OS function, require finite ranges, but should be able to work with any type that converts to string in one form or another.  So anything from (finite) ranges of char-like elements to aggregates with alias this to string ought to be accepted.  They can be internally converted to strings if necessary (e.g., to pass to an OS call).

I suspect that many of the functions in category (1) can be coalesced with generic range algorithms, so they should not even be exposing their sig constraints in the public API.  Instead, they should take a generic range and then use static if or module-private helper functions to dispatch to the right implementation(s).
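A rough sketch of that shape, assuming a made-up function countSpaces (not a Phobos function): the signature constraint states only the requirement, and the array fast path is picked internally with static if.

```d
import std.range : only;
import std.range.primitives : ElementType, isInputRange;
import std.traits : isDynamicArray, isSomeChar;

/// Hypothetical example: counts ASCII spaces in any range of characters.
size_t countSpaces(R)(R r)
if (isInputRange!R && isSomeChar!(ElementType!R)) // the only public requirement
{
    size_t n;
    static if (isDynamicArray!R)
    {
        // Specialization: foreach over an array visits raw code units,
        // so no decoding happens. That is safe here because an ASCII
        // space never appears inside a multi-unit UTF sequence.
        foreach (c; r)
            if (c == ' ') ++n;
    }
    else
    {
        // Generic path: plain input range primitives.
        for (; !r.empty; r.popFront())
            if (r.front == ' ') ++n;
    }
    return n;
}

void main()
{
    assert(countSpaces("a b c") == 2);            // string, array path
    assert(countSpaces("a b"w) == 1);             // wstring, array path
    assert(countSpaces(only('x', ' ', 'y')) == 1); // non-array character range
}
```

The caller sees one constraint; which branch runs is an implementation detail that can change without touching the public signature.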

Category (2) functions can internally call helpers (maybe in std.string or std.array) that convert any incoming type that can be converted to string in some way.

As for static arrays, I think the consensus is (was?) that they are not ranges, and so the user is required to take a slice before handing it to a Phobos function expecting string or string-like arguments. Well, actually, not just string-like things, but anything to do with ranges. I don't think there's a need to treat static arrays of char separately from static arrays in general.
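For illustration, a small sketch of what "take a slice first" means in practice. The buffer name and contents are made up; count is the existing std.algorithm.searching function.

```d
import std.algorithm.searching : count;
import std.range.primitives : isInputRange;

// A static array has a fixed length, so it cannot pop its own front
// and does not satisfy the range check; its slice does.
static assert(!isInputRange!(char[5]));
static assert( isInputRange!(char[]));

void main()
{
    char[5] buf = "hello";
    // Passing buf directly is rejected by the range constraint;
    // buf[] is a char[] view over the same memory.
    assert(buf[].count('l') == 2);
}
```

Nothing here is specific to char: the same slicing step applies to a static array of any element type.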


T

-- 
Bare foot: (n.) A device for locating thumb tacks on the floor.

February 15, 2017
On 02/15/2017 03:31 PM, Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, February 15, 2017 14:30:02 Andrei Alexandrescu via
> Digitalmars-d wrote:
>> On 02/15/2017 02:22 PM, Jacob Carlborg wrote:
>>> On 2017-02-15 15:01, Andrei Alexandrescu wrote:
>>>> That's nice, could you please submit as an enhancement request on
>>>> bugzilla?
>>>
>>> https://issues.dlang.org/show_bug.cgi?id=17186
>>
>> Thanks. I'll take it up to Walter for preapproval. -- Andrei
>
> It's one of those features that I was surprised when you couldn't do it.

We agree. It's preapproved now. -- Andrei

February 15, 2017
On 2/15/2017 1:24 PM, Jonathan M Davis via Digitalmars-d wrote:
> So, regardless of the exact terminology, we have a whole set of very similar
> but subtly different traits. And as it stands, they _will_ get screwed up
> unless someone is carefully looking at each to make sure that they actually
> use the right one as well as testing with various types that frequently get
> missed in unit tests - like types which use alias this or enums with a base
> type of string.

I suspect the only way forward is to go through Phobos, collect the whole plethora of constraints used for strings, and examine them for commonalities. Not doing this will just result in more accumulation of confusing junk.

February 15, 2017
Also, as mentioned in the std.algorithm.mutation.remove case, constraints in Phobos often confuse "requirements" with "specializations".

Requirements should be user-facing constraints, while specializations are implementation details better handled with internal static if.

February 15, 2017
On 2/15/2017 12:31 PM, Jonathan M Davis via Digitalmars-d wrote:
> It's one of those features that I was surprised when you couldn't do it.

It was an oversight. We just never thought of it.

February 16, 2017
On Thursday, 16 February 2017 at 00:08:12 UTC, Walter Bright wrote:
> On 2/15/2017 12:31 PM, Jonathan M Davis via Digitalmars-d wrote:
>> It's one of those features that I was surprised when you couldn't do it.
>
> It was an oversight. We just never thought of it.

What do you think about generalizing this feature to allow introducing template value parameters without default value, i.e.:

// Note: `beg` and `end` have no default value

auto bounded
    (auto beg, auto end, auto onErrPolicy = Enforce("Value out of range"), T)
    (T value)
{
    return Bounded!(beg, end, onErrPolicy)(value);
}

struct Enforce
{ this(string) { /* */ } }

int tmp;
readf("%s", &tmp);

enum policy = Enforce("User age must be between 18 and 150");
auto userAge = bounded!(18, 150, policy)(tmp);

"Declaring non-type template arguments with auto" is also coming to C++17: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0127r1.html

BTW, shouldn't we use `enum`, instead of `auto`, since everywhere else `enum` means guaranteed to be computed at compile-time whereas `auto` means the opposite?
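For comparison, here is a rough sketch of how close existing D gets without any new syntax, using alias template parameters, which already accept literal values. The names Bounded and bounded mirror the snippet above but are simplified stand-ins: the policy parameter is dropped, and a plain assert takes its place.

```d
// Simplified stand-in: the bounds become part of the type.
struct Bounded(alias lo, alias hi, T)
{
    T value;
}

// `lo` and `hi` are alias parameters, so their types are never spelled out;
// T is still deduced from the runtime argument.
auto bounded(alias lo, alias hi, T)(T value)
{
    assert(lo <= value && value <= hi, "Value out of range");
    return Bounded!(lo, hi, T)(value);
}

void main()
{
    auto userAge = bounded!(18, 150)(42);
    static assert(is(typeof(userAge) == Bounded!(18, 150, int)));
    assert(userAge.value == 42);
}
```

The trade-off is that an alias parameter also accepts symbols and types, so this form is less precise than a dedicated "value of inferred type" parameter would be.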