ranges of characters and the overabundance of traits that go with them (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » ranges of characters and the overabundance of traits that go with them (page 2)

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by H. S. Teoh
in reply to Dukc

H. S. Teoh

Posted in reply to Dukc

On Thu, Mar 16, 2017 at 12:04:39PM +0000, Dukc via Digitalmars-d wrote:
> On Monday, 13 March 2017 at 19:08:12 UTC, H. S. Teoh wrote:
> > What the user sees should be the most generic API that makes sense relative to this function.  For cases where the implementation can't (yet) handle, a helpful error message is given, and the docs should also indicate the present limitations.
> 
> phobos each, and probably other functions too, is sometimes hard to use because template constrains make them disappear when you enter something faulty to them. Thus the error message only says something like: No function overload "each" for arguments (...) found.

Exactly, this is why we need to seriously take up Walter's suggestion to simplify Phobos sig constraints. As I commented in a recent PR[1], Phobos functions need to accept *all* logically-acceptable arguments, and use static ifs internally to generate error messages when said arguments don't work with the current implementation for whatever reason.

[1] https://github.com/dlang/phobos/pull/5148 - coincidentally, this one
is about each().

In this PR about each(), there are 4 overloads, and the PR combines them into two.  But they still have rather complicated constraints. However, if you look at it from a holistic POV, there really are only two constraints: (1) the argument can be iterated over with foreach; and (2) the argument has a range API. Thus, instead of having a whole bunch of complicated constraints that are hard to read and uninteresting to the user, the constraints really should be as simple as:

	void each(alias func, R)(R range)
		if (isInputRange!R)
	{ ... }

	void each(alias func, R)(R range)
		if (!isInputRange!R && isForeachIterable!R)
	{ ... }

There, isn't that much more readable?

Furthermore, whatever the current implementation can't handle, e.g. a range whose elements don't match the given function func, can be handled with static if's with an else clause that has an assert(0) with a nice error message. For example:

	void each(alias func, R)(R range)
		if (isInputRange!R)
	{
		static if (is(typeof(func(R.init.front))))
		{
			foreach (e; range)
				func(e);
		}
		else
			static assert(0, "Range element type does not "
				"match function parameters");
	}

This way, if the user passes a range whose elements don't match the given function, they get a nice error message that explains what went wrong, instead of a generic "no matching function for call" with a wall of text in encrypted Klingon that nobody understands.

T

-- 
Latin's a dead language, as dead as can be;
It killed off all the Romans, and now it's killing me!
-- Schoolboy

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Jonathan M Davis
in reply to Sebastiaan Koppe

Jonathan M Davis

Posted in reply to Sebastiaan Koppe

On Thursday, March 16, 2017 12:55:32 Sebastiaan Koppe via Digitalmars-d wrote:
> On Thursday, 16 March 2017 at 07:44:59 UTC, Jonathan M Davis
>
> wrote:
> > So, if we deprecated isSomeString, we'd be telling a lot of folks to change their code when it's perfectly fine as-is. I agree that it would be nice to fix isSomeString, but I just don't think that it's reasonable to deprecate it at this point.
>
> If you can't deprecate, what about only deprecating the enum behaviour (e.g. insert a static assert if anybody uses an enum on it)?

No, because isSomeString!T is supposed to work with an enum just like it works with int or a user-defined type Foo. It should be false. A _function_ that inadverently accepted enums could alter its template constraint and add an overload specifically for enums and deprecate it - though in a lot of cases, if the function didn't purposefully accept enum, it was going to get a compilation error anyway when an enum was used with it. But regardless, the only real options we have with isSomeString are

1. do nothing
2. change it to include !is(T == enum) and break anything that used it in a
   template constraint and actually did work with enums and had enums used
   with it.
3. make the downsides the current behavior in the documentation clear but
   not actually change the code.
4. deprecate it (and presumably replace it with something else that isn't
   true for enums)

Unfortunately, the nature of traits is such that altering them in a fashion that includes a deprecation cycle really doesn't work. You can deprecate the whole trait, but you really can't change its behavior without risking code breakage. Changing a trait means that who knows how many template constraints and static if braches will suddenly pass where they failed before or fail where they passed before, changing what does and doesn't compile all over the place.

It wouldn't entirely surprise me if we could get away with just changing it - I expect that comparatively little code would break, and the code that did would often be doing stuff like returning a value of the enum type that's not actually a valid value of that enum type, so the change would catch bugs - but the odds are quite high that _someone_ would have correct code that broke as a result. It's just that most code would either not care, or it would be broken as-is. Unfortunately, not all code would be in that boat, so making the change is likely unacceptable.

However, while deprecating isSomeString would not immediately break any code and would therefore be more acceptable in that regard, it would actually affect a _lot_ more code - most of which is likely perfectly good code - which on some level actualy makes deprecating it worse than just changing it.

Honestly, if I had to choose between just changing isSomeString (possibly warning about it beforehand) and deprecating it, I'd prefer to just change it.

In any case, while it would be nice to be able to deprecate the fact that isSomeString is true for enums, there's really no way to do so. The nature of traits just doesn't play well with that.

- Jonathan M Davis

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Andrei Alexandrescu
in reply to Jonathan M Davis

Andrei Alexandrescu

Posted in reply to Jonathan M Davis

On 3/16/17 9:44 AM, Jonathan M Davis via Digitalmars-d wrote:
> I completely agree that we should try and make the template constraints on
> public functions in Phobos - and other libraries be simple - and try and
> restrict a lot of the more complicated stuff to internals.

Acta, non verba!

We all are in violent agreement about this, yet there continues to be prolific discussion about it. Directing that energy toward simple, almost mechanical pull requests that merge unnecessary overloads would work wonders. -- Andrei

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Thu, Mar 16, 2017 at 07:53:14PM +0200, Andrei Alexandrescu via Digitalmars-d wrote:
> On 3/16/17 9:44 AM, Jonathan M Davis via Digitalmars-d wrote:
> > I completely agree that we should try and make the template constraints on public functions in Phobos - and other libraries be simple - and try and restrict a lot of the more complicated stuff to internals.
> 
> Acta, non verba!
> 
> We all are in violent agreement about this, yet there continues to be prolific discussion about it. Directing that energy toward simple, almost mechanical pull requests that merge unnecessary overloads would work wonders. -- Andrei

I agree that we should take action here. There has been some, e.g.:

	https://github.com/dlang/phobos/pull/5148

But, as I indicated in the discussion there, this may not be as mechanical as it first appears.  Some amount of judgment has to be made in deciding what the merged sig constraints ought to look like, and it's not always obvious.  Where "logical constraints" end and "implementation details" begin isn't always clear-cut, and opinions may differ as to where to draw that line.

T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Jonathan M Davis
in reply to Andrei Alexandrescu

Jonathan M Davis

Posted in reply to Andrei Alexandrescu

On Thursday, March 16, 2017 19:53:14 Andrei Alexandrescu via Digitalmars-d wrote:
> On 3/16/17 9:44 AM, Jonathan M Davis via Digitalmars-d wrote:
> > I completely agree that we should try and make the template constraints on public functions in Phobos - and other libraries be simple - and try and restrict a lot of the more complicated stuff to internals.
>
> Acta, non verba!
>
> We all are in violent agreement about this, yet there continues to be prolific discussion about it. Directing that energy toward simple, almost mechanical pull requests that merge unnecessary overloads would work wonders. -- Andrei

Yes, but the main point of this thread was to discuss what to do about the various string-related traits not about converting overloaded templated functions into static ifs. That part was already agreed upon previously and IMHO is largely orthogonal to the string-related traits. It's just that H.S. Teoh thinks that that then fixes the string-related traits with the idea that they should be changed to private inside of Phobos instead of public for folks to use in their own libraries, because the public-facing template constraints would mostly not be using them, and I don't agree with that at all. If Phobos needs them, then 3rd party libraries will as well, and they're already public. That's primarily why this discussion has dragged on.

- Jonathan M Davis

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Sebastiaan Koppe
in reply to Jonathan M Davis

Sebastiaan Koppe

Posted in reply to Jonathan M Davis

On Thursday, 16 March 2017 at 17:12:05 UTC, Jonathan M Davis wrote:
> Unfortunately, the nature of traits is such that altering them in a fashion that includes a deprecation cycle really doesn't work. You can deprecate the whole trait, but you really can't change its behavior without risking code breakage.

While not a nice solution, a pragma(msg,"deprecated") inside a static if would do the trick. You could also argue for a static-deprecated language feature.

> It wouldn't entirely surprise me if we could get away with just changing it - I expect that comparatively little code would break, and the code that did would often be doing stuff like returning a value of the enum type that's not actually a valid value of that enum type, so the change would catch bugs - but the odds are quite high that _someone_ would have correct code that broke as a result. It's just that most code would either not care, or it would be broken as-is. Unfortunately, not all code would be in that boat, so making the change is likely unacceptable.

Maybe a little bit off-topic, but would it be possible to scrape github and use dcd or dscanner to count the usages of enums and isSomeString? Just to measure how big the playing field is. Sometimes a little measuring brings up surprising insights and turns choices into no-brainers.

> In any case, while it would be nice to be able to deprecate the fact that isSomeString is true for enums, there's really no way to do so. The nature of traits just doesn't play well with that.
>
> - Jonathan M Davis

I get your point though, there is no easy out.

March 16, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Jonathan M Davis
in reply to Sebastiaan Koppe

Jonathan M Davis

Posted in reply to Sebastiaan Koppe

On Thursday, March 16, 2017 21:50:18 Sebastiaan Koppe via Digitalmars-d wrote:
> On Thursday, 16 March 2017 at 17:12:05 UTC, Jonathan M Davis
>
> wrote:
> > Unfortunately, the nature of traits is such that altering them in a fashion that includes a deprecation cycle really doesn't work. You can deprecate the whole trait, but you really can't change its behavior without risking code breakage.
>
> While not a nice solution, a pragma(msg,"deprecated") inside a static if would do the trick. You could also argue for a static-deprecated language feature.

The problem is that that then catches when an enum is passed to isSomeString, not when the template constraint or static if that isSomeString is being used in should have checked for enums and didn't. And checking isSomeString with an enum is perfectly valid. It's just that we'd like it to be false, whereas right now it's true. Code could quite legitimately be doing something like

auto foo(T)(T str)
    if(isSomeString!T)
{
    static if(is(T == enum))
        return foo!(StringTypeOf!T)(str);
    else
    {
        ...
    }
}

> Maybe a little bit off-topic, but would it be possible to scrape github and use dcd or dscanner to count the usages of enums and isSomeString? Just to measure how big the playing field is. Sometimes a little measuring brings up surprising insights and turns choices into no-brainers.

That could give us some insights into what some of the code out there is doing - certainly it's probably our best resource for examining D code in the wild - but it also is only going to be a portion of the D code out there - possibly even a small portion. And it's generally going to miss corporate users, who are arguably the most important, since messing them up could cost them money rather than simply annoying them. So, if someone did that, it could certainly be informative, but it wouldn't be conclusive. If anything, it would help us see some of what people are definitely doing, but it wouldn't let us see for sure what they aren't doing or necessarily how common any particular practice is in the D community as a whole.

As for how hard it is to actually do it, I don't know, because I don't know how advanced dcd or dscanner are, but if I had to guess, I'd say that checking much more than how much isSomeString gets used wouldn't be possible. Checking for when it is or isn't used in conjuction with a check for enums could get pretty complicated when you take into account that one part of the check could be in a template constraint while the other is in a static if, or how something like isInputRange won't pass for enums, but it doesn't explicitly check for enums - they just don't semantically work with the code that isInputRange requires to compile. So, probably, all we could do is figure out how often isSomeString is used in the D code that's publicly up on places like github or bitbucket and not much about how it's used. I don't know for sure though.

- Jonathan M Davis

March 17, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Sebastiaan Koppe
in reply to Jonathan M Davis

Sebastiaan Koppe

Posted in reply to Jonathan M Davis

On Thursday, 16 March 2017 at 22:42:21 UTC, Jonathan M Davis wrote:
> On Thursday, March 16, 2017 21:50:18 Sebastiaan Koppe via The problem is that that then catches when an enum is passed to isSomeString, not when the template constraint or static if that isSomeString is being used in should have checked for enums and didn't. And checking isSomeString with an enum is perfectly valid. It's just that we'd like it to be false, whereas right now it's true. Code could quite legitimately be doing something like
>
> auto foo(T)(T str)
>     if(isSomeString!T)
> {
>     static if(is(T == enum))
>         return foo!(StringTypeOf!T)(str);
>     else
>     {
>         ...
>     }
> }

Ahh yes, silly me. It's not so much about isSomeString but how people use it in conjunction with extra constraints. Quite messy.

So for my proposed static deprecated to work it would only have to show a message when it would have changed the outcome of the template constraint where it is used in. Which still gives false positives due to overloads.

So the only way to change a template trait is to either just do it - and risk breakage - or to create a new one under another name?

> So, probably, all we could do is figure out how often isSomeString is used in the D code that's publicly up on places like github or bitbucket and not much about how it's used. I don't know for sure though.
>
> - Jonathan M Davis

Yeah, you are right. You'll would need to instantiate the templates to resolve the types and as far as I know dcd/scanner don't.

March 17, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Andrei Alexandrescu
in reply to Jonathan M Davis

Andrei Alexandrescu

Posted in reply to Jonathan M Davis

On 3/17/17 12:42 AM, Jonathan M Davis via Digitalmars-d wrote:
> And
> checking isSomeString with an enum is perfectly valid. It's just that we'd
> like it to be false, whereas right now it's true.

That hasn't been the case for long, and there can't be many designs relying on that. I think this is one of those places where we can break compatibility if there's enough reason. -- Andrei

March 17, 2017

Re: ranges of characters and the overabundance of traits that go with them

Posted by Jonathan M Davis
in reply to Andrei Alexandrescu

Jonathan M Davis

Posted in reply to Andrei Alexandrescu

On Friday, March 17, 2017 10:25:22 Andrei Alexandrescu via Digitalmars-d wrote:
> On 3/17/17 12:42 AM, Jonathan M Davis via Digitalmars-d wrote:
> > And
> > checking isSomeString with an enum is perfectly valid. It's just that
> > we'd like it to be false, whereas right now it's true.
>
> That hasn't been the case for long, and there can't be many designs relying on that. I think this is one of those places where we can break compatibility if there's enough reason. -- Andrei

Actually, it's been that way for years (since at least  2012), and I suspect that it's been that way since the beginning, since it would need to explicitly check for enums to not work with enums thanks to the fact that it checks for implicit conversion and then disallows the implicit conversions that we don't want (like for static arrays and user-defined types). And sadly, when Kenji tried to fix it, I argued against it, because I didn't understand the issue well enough:

https://issues.dlang.org/show_bug.cgi?id=8508

At the time, I thought that we needed isSomeString to be true for enums in order to be able to templatize functions that took string, whereas that's actually a bad idea, because then you didn't actually convert the enum to a string, and the function is in serious risk of doing the wrong thing with enums (either by not compiling or by resulting in a value that's not a valid enum value but still has the enum's type), though it will work in some cases. Also, templatizing the function and checking isSomeString breaks code anyway, because then it doesn't accept static arrays or user-defined types that implicitly convert to string, whereas it did before. And as discussed in this thread, as far as I can tell, in the general case, the only way to safely templatize a function that took a string is to include an overload that templatizes on the character type so that the implicit conversion is done at the call site. Given all of that, it was ignorant and ultimately detrimental of me to argue to that isSomeString needed to be true for enums, and it's been that way for several years at least. If I'd been smarter at the time, this would have been fixed years ago.

However, in most cases, I would expect that fixing isSomeString to be false for enums would simply work. Anything range-based wouldn't care, because enums don't pass isInputRange anyway, and a decent chunk of the time, using an enum in place of a string would result in a compilation error, which means that that code wouldn't be broken but would instead catch the enum at the template constraint instead of not compiling when the template is instantiated. Really, the only code that we'd be at risk of breaking would be code that was written with the full understanding that enums passed isSomeString and did something like

auto foo(S)(S str)
    if(isSomeString!S)
{
    static if(is(T == enum))
        return foo!(StringTypeOf!T)(str);
    else
    {
        ...
    }
}

or that just so happens to use the subset of string transformations or function calls which work with enums.

So, if we don't fix isSomeString, and we want our code to be fully correct, we're left with the need to check is(T == enum) in non-range-based code in order to correctly handle enums (as in the example above or by disallowing enums) or risk code not compiling or being subtly incorrect when enums are used with it. And if we do fix isSomeString, then we'll probably have an occasional function call that happened to work with enums in spite of the problems and would then break - though the breakage would then be easy to fix.

So, I would love to fix isSomeString, and I think that the resulting breakage would be minimal, but I don't expect it to be zero. So, if you think that that presumably small amount of code breakage is acceptable, I would gladly create a PR to fix isSomeString and isNarrowString to be false for enums. I think that it's clear that in the long run, we'll be better off for it. It's just that we have no way to make this change and guarantee that we're not breaking anyone's code, much as in many (most?) cases, I would expect the code to be wrong anyway.

- Jonathan M Davis

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation