March 06, 2018
On Tue, Mar 06, 2018 at 08:39:41AM -0700, Jonathan M Davis via Digitalmars-d-announce wrote: [...]
> Yeah. If you're dealing with generic code rather than a specific range type that you know is implicitly saved when copied, you have to use save so often that it's painful, and almost no one does it. e.g.
> 
> equal(lhs.save, rhs.save)
> 
> or
> 
> immutable result = range.save.startsWith(needle.save);
> 
> How well Phobos has done with this has improved over time as more and better testing has been added (testing for reference type ranges is probably the most critical to finding this particular problem), but I doubt that Phobos has it right everywhere, and I'm sure that the average programmer's code has tons of these problems.

In my own code, I often run into subtle bugs that arise from ranges being unintentionally consumed because I forgot to call .save.  So I tend to be extra careful about this.  But yeah, it's so easy to miss unless your code actually uses ranges where it would make a difference.
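
A minimal sketch of how this bites, assuming a hypothetical class-based forward range named Countdown (it is not part of Phobos, just an illustration). Passing the range to equal directly consumes it, while passing r.save leaves the original untouched:

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isForwardRange;

// Hypothetical reference-type forward range over an int[] (illustration only).
final class Countdown
{
    private int[] data;
    this(int[] data) { this.data = data; }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property Countdown save() { return new Countdown(data); }
}

static assert(isForwardRange!Countdown);

// Compares a fresh Countdown against [3, 2, 1], then reports how many
// elements are still left in the original range object.
size_t leftAfterEqual(bool useSave)
{
    auto r = new Countdown([3, 2, 1]);
    immutable ok = useSave ? equal(r.save, [3, 2, 1]) : equal(r, [3, 2, 1]);
    assert(ok);

    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

unittest
{
    assert(leftAfterEqual(true) == 3);    // r.save protected the original
    assert(leftAfterEqual(false) == 0);   // equal consumed r itself
}
```

With a struct range or a slice the second call would happen to work, which is why this kind of bug tends to stay hidden until a reference-type range shows up.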


[...]
> Ranges are wonderfully powerful, but they become a royal pain to get right with truly generic code. And that's without getting into all of the arguments about stuff like whether transient fronts should be allowed...

I know we have disagreed on this before, but in my mind, it's very simple. Generic code should basically be written in such a way that it does the most while making the fewest assumptions. Meaning, don't assume the return value of .front persists beyond the next .popFront, don't assume iterating the range won't consume it, etc.  It's just basic defensive programming.  If the algorithm won't work without some of these assumptions, then make them explicit, either as part of the API, or clearly documented. IMNSHO, code that isn't written this way is just sloppy and a haven for hidden bugs.
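
One way to make such an assumption part of the API is a template constraint. The following is only a sketch: countAboveMean and OneShot are hypothetical names invented for illustration. The function needs two passes over its input, so it requires a forward range up front and calls .save before the first pass instead of assuming that iteration is non-destructive:

```d
import std.range.primitives;

// Hypothetical helper (illustration only): counts the elements above the mean.
// It needs two passes, so the forward-range requirement is stated in the
// template constraint instead of being silently assumed.
size_t countAboveMean(R)(R r)
if (isForwardRange!R && is(ElementType!R : double))
{
    double sum = 0;
    size_t n;
    foreach (x; r.save)   // first pass runs on a saved copy
    {
        sum += x;
        ++n;
    }
    if (n == 0)
        return 0;

    immutable mean = sum / n;
    size_t above;
    foreach (x; r)        // second pass is free to consume r
    {
        if (x > mean)
            ++above;
    }
    return above;
}

// Hypothetical input-only range: it has no save, so it cannot be replayed.
struct OneShot
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

static assert(isInputRange!OneShot && !isForwardRange!OneShot);
// The constraint rejects it at compile time rather than misbehaving at run time.
static assert(!__traits(compiles, countAboveMean(OneShot([1, 2, 3]))));

unittest
{
    assert(countAboveMean([1, 2, 3, 10]) == 1);   // mean is 4, only 10 exceeds it
}
```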


> Ranges are definitely one area where we could really use some redesign to iron out some of the issues that we've found over time, but their success makes them almost impossible to fix, because changing them would break tons of code. But annoyingly, that's often what happens when you implement a new idea. You simply don't have enough knowledge about it ahead of time to avoid mistakes; those are easy enough to make when you really know what you're doing, let alone with something new.
[...]

Andrei has said before, and probably on more than one occasion, that if he were to redesign ranges today, one of the things he would do differently was to change the definition of forward range so that .save is basically implicit on copying the range object, and non-forward input ranges would just be reference / non-copyable types.

But that boat has long sailed, and we just have to make do with what we have today. Changing this now will literally break just about *every* D program that uses ranges, which is breakage of an ecosystem-killing magnitude that I can't even contemplate.  I would much rather go with a less intrusive breakage like killing autodecoding with fire, than with something that will basically require me to rewrite practically every D program I ever wrote.


T

-- 
"You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
March 06, 2018
On Mon, Mar 05, 2018 at 10:21:47PM -0500, Nick Sabalausky (Abscissa) via Digitalmars-d-announce wrote:
> On 03/05/2018 12:38 PM, H. S. Teoh wrote:
> > 
> > This broke the by-value assumption inherent in much of Phobos code,
> 
> Wait, seriously? Phobos frequently passes ranges by value? I sincerely hope that's only true for class-based ranges and forward-ranges (and more specifically, only forward ranges where copying the range and calling .save are designed to do the exact same thing). Otherwise, that's really, *REALLY* bad since non-forward ranges *by definition* cannot be duplicated.

I think you misunderstood. :-D  Passing ranges by value means passing the range itself, usually a struct, which is a value type.  I did *not* say the *content* of ranges is *copied* -- that would be so horribly wrong that I would be thinking twice about using D for my projects. :-D
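
A small sketch of the distinction, using a plain slice as the range (dropAndMark is a hypothetical function made up for this illustration). The slice itself is passed by value, so the callee's popFront does not advance the caller's slice, yet both still refer to the same elements:

```d
import std.range.primitives : popFront;

// Hypothetical callee (illustration only): receives the slice by value.
void dropAndMark(int[] r)
{
    r.popFront();   // advances only the callee's copy of the slice
    r[0] = 99;      // writes through to the elements both slices share
}

unittest
{
    auto a = [1, 2, 3];
    dropAndMark(a);
    assert(a.length == 3);     // the caller's slice was not advanced
    assert(a == [1, 99, 3]);   // but the elements were never copied
}
```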


[...]
> The definition of "what is a forward/non-forward range" for
> struct-based ranges should have been "is this() @disabled (non-forward
> range), or is this() enabled *and* does the same thing as .save
> (forward range)?"
[...]

Yeah, Andrei has admitted before that this is probably what he would do today, if he were given a second chance to design ranges.  But at the time, the landscape of D was rather different, and certain language features didn't exist yet (sorry, can't recall exactly which off the top of my head), so he settled on the compromise that we have today.

As they say, hindsight is always 20/20.  But it wasn't so easy to foresee the consequences at the time when the very concept of ranges was still brand new.


T

-- 
What do you get if you drop a piano down a mineshaft? A flat minor.
March 06, 2018
On Tuesday, March 06, 2018 09:36:43 H. S. Teoh via Digitalmars-d-announce wrote:
> Andrei has said before, and probably on more than one occasion, that if he were to redesign ranges today, one of the things he would do differently was to change the definition of forward range so that .save is basically implicit on copying the range object, and non-forward input ranges would just be reference / non-copyable types.
>
> But that boat has long sailed, and we just have to make do with what we have today. Changing this now will literally break just about *every* D program that uses ranges, which is breakage of an ecosystem-killing magnitude that I can't even contemplate.  I would much rather go with a less intrusive breakage like killing autodecoding with fire, than with something that will basically require me to rewrite practically every D program I ever wrote.

I'm not actually convinced that killing auto-decoding is really much better. As it stands, changing it would break a large percentage of string-based code, and the functions in question sit in std.range.primitives along with all of the other core range stuff such that I don't see how we can change them any more than we can change the basic range API. I would love to be proven wrong, but I don't know how we could change it at this point without code breakage that comes pretty close to the breakage that changing the range API would cause.

- Jonathan M Davis

March 06, 2018
On Tuesday, March 06, 2018 09:41:42 H. S. Teoh via Digitalmars-d-announce wrote:
> As they say, hindsight is always 20/20.  But it wasn't so easy to foresee the consequences at the time when the very concept of ranges was still brand new.

Except that even worse, I'd argue that hindsight really isn't 20/20. We can see a lot of the mistakes that were made, and if we were starting from scratch or otherwise willing to break a lot of code, we could change stuff like the range API based on the lessons learned. But we'd probably still screw it up, because we wouldn't have the experience with the new API to know where it was wrong. Consider all of the stuff that was improved in D over C++ but which still has problems in D (like const). We build on experience to make the new stuff better and frequently lament that we didn't know better in the past, but we still make mistakes when we do new stuff or redesign old stuff. Frequently, the end result is better, but it's rarely perfect.

- Jonathan M Davis

March 06, 2018
On Tue, Mar 06, 2018 at 11:20:56AM -0700, Jonathan M Davis via Digitalmars-d-announce wrote:
> On Tuesday, March 06, 2018 09:41:42 H. S. Teoh via Digitalmars-d-announce wrote:
> > As they say, hindsight is always 20/20.  But it wasn't so easy to foresee the consequences at the time when the very concept of ranges was still brand new.
> 
> Except that even worse, I'd argue that hindsight really isn't 20/20. We can see a lot of the mistakes that were made, and if we were starting from scratch or otherwise willing to break a lot of code, we could change stuff like the range API based on the lessons learned. But we'd probably still screw it up, because we wouldn't have the experience with the new API to know where it was wrong.
[...]

Well, that means *hind*sight is still 20/20: we see where we went wrong, but *fore*sight is still blurry, because what we think is the solution to that wrong may not turn out to be a good solution later. :-D


T

-- 
Question authority. Don't ask why, just do it.
March 06, 2018
On 3/6/18 10:39 AM, Jonathan M Davis wrote:

> Yeah. If you're dealing with generic code rather than a specific range type
> that you know is implicitly saved when copied, you have to use save so often
> that it's painful, and almost no one does it. e.g.
> 
> equal(lhs.save, rhs.save)
> 
> or
> 
> immutable result = range.save.startsWith(needle.save);

Yep. The most frustrating thing about .save to me is that .save is nearly always implemented as:

auto save() { return this; }

This just screams "I really meant just copying".

-Steve
March 06, 2018
On Tue, Mar 06, 2018 at 01:31:39PM -0500, Steven Schveighoffer via Digitalmars-d-announce wrote:
> On 3/6/18 10:39 AM, Jonathan M Davis wrote:
> > Yeah. If you're dealing with generic code rather than a specific range type that you know is implicitly saved when copied, you have to use save so often that it's painful, and almost no one does it. e.g.
> > 
> > equal(lhs.save, rhs.save)
> > 
> > or
> > 
> > immutable result = range.save.startsWith(needle.save);
> 
> Yep. The most frustrating thing about .save to me is that .save is nearly always implemented as:
> 
> auto save() { return this; }
> 
> This just screams "I really meant just copying".

Yeah, and also:

	auto save() {
		auto copy = this;
		copy.blah = blah.dup;
		return copy;
	}

Which just screams "I'm really just a postblit in disguise".


T

-- 
This is not a sentence.
March 06, 2018
On Tuesday, March 06, 2018 10:47:36 H. S. Teoh via Digitalmars-d-announce wrote:
> On Tue, Mar 06, 2018 at 01:31:39PM -0500, Steven Schveighoffer via Digitalmars-d-announce wrote:
> > On 3/6/18 10:39 AM, Jonathan M Davis wrote:
> > > Yeah. If you're dealing with generic code rather than a specific range type that you know is implicitly saved when copied, you have to use save so often that it's painful, and almost no one does it. e.g.
> > >
> > > equal(lhs.save, rhs.save)
> > >
> > > or
> > >
> > > immutable result = range.save.startsWith(needle.save);
> >
> > Yep. The most frustrating thing about .save to me is that .save is nearly always implemented as:
> >
> > auto save() { return this; }
> >
> > This just screams "I really meant just copying".
>
> Yeah, and also:
>
>   auto save() {
>       auto copy = this;
>       copy.blah = blah.dup;
>       return copy;
>   }
>
> Which just screams "I'm really just a postblit in disguise".

That's exactly what it is. It's a postblit constructor that you have to call manually and which works for classes and dynamic arrays in addition to structs.
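
To make that concrete, here is a sketch with a hypothetical struct range named Counter (invented for this illustration) that keeps its state in a heap-allocated array. A plain copy aliases that state, while save duplicates it by hand, which is the job a postblit would otherwise do automatically:

```d
import std.range.primitives : isForwardRange;

// Hypothetical range (illustration only) whose position lives on the heap.
struct Counter
{
    private int[] cell;   // one-element array; plain copies share it
    private int limit;

    this(int limit)
    {
        this.cell = [0];
        this.limit = limit;
    }

    @property bool empty() const { return cell[0] >= limit; }
    @property int front() const { return cell[0]; }
    void popFront() { ++cell[0]; }

    // Manual "postblit": copy the struct, then duplicate the shared state.
    @property Counter save()
    {
        auto copy = this;
        copy.cell = cell.dup;
        return copy;
    }
}

static assert(isForwardRange!Counter);

unittest
{
    auto a = Counter(3);
    auto b = a;        // plain copy: still shares a's cell
    auto c = a.save;   // independent copy

    a.popFront();
    assert(b.front == 1);   // b advanced together with a
    assert(c.front == 0);   // c kept its own position
}
```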

- Jonathan M Davis

March 06, 2018
On Tuesday, 6 March 2018 at 18:17:58 UTC, Jonathan M Davis wrote:
> I'm not actually convinced that killing auto-decoding is really much better.

I don't think the problem is auto-decoding in string range adapters, but repeated validation.
https://issues.dlang.org/show_bug.cgi?id=14519#c32
If you know that something works on code units, just use .representation.

There is the related annoyance when the user of a function presumably knows to only deal with ASCII strings but algorithms fail, e.g. splitter.popBack or binary search. This one is tricky because broken Unicode support is often rooted in ignoring its existence.
March 06, 2018
On Tuesday, March 06, 2018 19:06:25 Martin Nowak via Digitalmars-d-announce wrote:
> On Tuesday, 6 March 2018 at 18:17:58 UTC, Jonathan M Davis wrote:
> > I'm not actually convinced that killing auto-decoding is really much better.
>
> I don't think the problem is auto-decoding in string range
> adapters, but repeated validation.
> https://issues.dlang.org/show_bug.cgi?id=14519#c32
> If you know that something works on code units, just use
> .representation.
>
> There is the related annoyance when the user of a function presumably knows to only deal with ASCII strings but algorithms fail, e.g. splitter.popBack or binary search. This one is tricky because broken Unicode support is often rooted in ignoring its existence.

Yes, using stuff like representation or byCodeUnit helps to work around the auto-decoding, but as long as it's there, you have to constantly work around it if you care about efficiency with strings and/or want to be able to retain the original string type where possible. At this point, I think that it's pretty clear that we wouldn't have it if we could do stuff from scratch, but of course, we can't do stuff from scratch, because that would break everything.
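
A short sketch of what the workaround looks like in practice. The sample string is just an assumption for illustration; byCodeUnit and representation are the Phobos functions mentioned above:

```d
import std.range.primitives : ElementType, walkLength;
import std.string : representation;
import std.utf : byCodeUnit;

// Treated as a range, a string is auto-decoded to code points.
static assert(is(ElementType!string == dchar));
// byCodeUnit yields UTF-8 code units instead of decoded code points.
static assert(is(ElementType!(typeof("".byCodeUnit)) : char));
// representation reinterprets the string as raw bytes.
static assert(is(typeof("".representation) == immutable(ubyte)[]));

unittest
{
    string s = "h\u00E9llo";   // U+00E9 takes two UTF-8 code units

    assert(s.length == 6);                  // code units
    assert(s.walkLength == 5);              // decoded code points
    assert(s.byCodeUnit.walkLength == 6);   // no decoding
    assert(s.representation.length == 6);   // raw bytes
}
```

Note that both workarounds change the type that flows through the pipeline, which is the "retain the original string type" cost described above.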

- Jonathan M Davis