Jump to page: 1 2 3
Thread overview
Re: Article: Why Const Sucks
Mar 05, 2018
H. S. Teoh
Mar 06, 2018
Jonathan M Davis
Mar 06, 2018
Adam D. Ruppe
Mar 06, 2018
Jonathan M Davis
Mar 06, 2018
jmh530
Mar 06, 2018
H. S. Teoh
Mar 06, 2018
Jonathan M Davis
Mar 06, 2018
SimonN
Mar 06, 2018
H. S. Teoh
Mar 06, 2018
H. S. Teoh
Mar 07, 2018
Simen Kjærås
Mar 06, 2018
Jonathan M Davis
Mar 06, 2018
H. S. Teoh
Mar 06, 2018
Martin Nowak
Mar 05, 2018
Jonathan M Davis
Mar 09, 2018
Nathan S.
Mar 05, 2018
H. S. Teoh
Mar 06, 2018
Jonathan M Davis
Mar 06, 2018
Martin Nowak
Mar 06, 2018
Jonathan M Davis
March 05, 2018
On Mon, Mar 05, 2018 at 03:57:35AM -0700, Jonathan M Davis via Digitalmars-d-announce wrote:
> Here's something I wrote up on const:
> 
> http://jmdavisprog.com/articles/why-const-sucks.html
> 
> I suppose that it's not exactly the most positive article, but I feel that it's accurate.
[...]

Yeah, I found myself in the same boat recently.  As you often say, const and ranges simply don't mix.  And modern idiomatic D is 90% about ranges, so that alone instantly reduces the scope of const's usefulness by a lot.

A case in point: I was implementing a container recently, and wrote an opSlice() method to return a range over its elements. My initial thought was that it could be made const, since the range wouldn't mutate the underlying elements, but only represent a mutable slice over a const container, i.e., the slice can mutate, but the elements cannot, just like iterating over string (== immutable(char)[]). Should be easy, right?

	struct Container {
		auto opSlice() const {
			static struct Result {
				private Container impl;
				private int n; // internal mutable state
				@property bool empty() { ... }
				... // rest of range API
			}
			return Result(this);
		}
	}

Well, that didn't work, because in opSlice, `this` is const, and I can't initialize Result.impl which is a mutable Container. Well, no biggie, just make it const:

	struct Container {
		auto opSlice() const {
			static struct Result {
				private const(Container) impl;
				private int n; // internal mutable state
				@property bool empty() { ... }
				... // rest of range API
			}
			return Result(this);
		}
	}

At first, this worked, and it seems that I could have my const cake and eat it too.  Until I decided at some point that I needed to make it a forward range, which requires a .save method. So I tried:

	...
	static struct Result {
		private const(Container) impl;
		...
		@property Result save() {
			Result copy = this;
			return this;
		}
	}

That seemed to do the trick, everything compiles and works.  But I soon ran into an unfixable problem: the resulting range, while it works in the simplest cases, started causing mysterious compile errors when I tried to use it with Phobos range algorithms. Eventually, I discovered that the underlying problem was that Result, as defined above, was a struct with a const member, and therefore it was illegal to assign it to a variable of the same type outside of initialization (since doing do meant you were overwriting a const field with something else, which violates the constness of the field).  This broke the by-value assumption inherent in much of Phobos code, so the resulting range ended being unusable with most Phobos algorithms.  Which defeated the whole purpose in the first place.

While I'm sure with enough time and patience Phobos could be fixed to support this kind of range, the decision I was faced with was: should I (1) persist in using const, and thereby stall my project while I work on huge chunks of Phobos range algorithms to make them usable with ranges that have const members (not to mention spending how much time waiting in the Phobos PR queue and potentially getting things rejected because it might cause breakage of unknown amounts of existing code), or (2) just remove `const` from my code, and be able to continue with my project *right now*?  The choice was a no-brainer, sad to say.

So yeah, while D's const provides actual guarantees unlike C++'s laughable const-by-documentation, that also limits its scope so much that in practice, it's rarely ever used outside of built-in types like string. Which also limits the usefulness of its guarantees so much that it's questionable whether it's actually worth the effort.


T

-- 
It won't be covered in the book. The source code has to be useful for something, after all. -- Larry Wall
March 05, 2018
On Monday, March 05, 2018 09:38:52 H. S. Teoh via Digitalmars-d-announce wrote:
> Eventually, I discovered
> that the underlying problem was that Result, as defined above, was a
> struct with a const member, and therefore it was illegal to assign it to
> a variable of the same type outside of initialization (since doing do
> meant you were overwriting a const field with something else, which
> violates the constness of the field).  This broke the by-value
> assumption inherent in much of Phobos code, so the resulting range ended
> being unusable with most Phobos algorithms.  Which defeated the whole
> purpose in the first place.

Honestly, I've come to the conclusion that structs should never have const or immutable members. It just causes too many problems. Treating them as read-only from the outside by having them be private and have member functions be const is fine (assuming that const works in that case), and having them work when the entire object gets marked as const is great (assuming that const works in that case), but I think that it's pretty much always a mistake to make individual member variables of a struct const or immutable.

Classes don't have the same problem, because they're on the heap and don't get copied, but with structs being on the stack and very much being designed with copying in mind, members that can't be copied becomes a definite problem.

Tail-const and tail-immutable would work fine with member variables in structs, but that basically means that the data for those members has to be on the heap, which isn't always a reasonable option.

> So yeah, while D's const provides actual guarantees unlike C++'s laughable const-by-documentation, that also limits its scope so much that in practice, it's rarely ever used outside of built-in types like string. Which also limits the usefulness of its guarantees so much that it's questionable whether it's actually worth the effort.

Exactly.

- Jonathan M Davis

March 05, 2018
On Mon, Mar 05, 2018 at 11:04:49AM -0700, Jonathan M Davis via Digitalmars-d-announce wrote:
> On Monday, March 05, 2018 09:38:52 H. S. Teoh via Digitalmars-d-announce wrote:
> > Eventually, I discovered that the underlying problem was that Result, as defined above, was a struct with a const member, and therefore it was illegal to assign it to a variable of the same type outside of initialization (since doing do meant you were overwriting a const field with something else, which violates the constness of the field).  This broke the by-value assumption inherent in much of Phobos code, so the resulting range ended being unusable with most Phobos algorithms.  Which defeated the whole purpose in the first place.
> 
> Honestly, I've come to the conclusion that structs should never have const or immutable members. It just causes too many problems. Treating them as read-only from the outside by having them be private and have member functions be const is fine (assuming that const works in that case), and having them work when the entire object gets marked as const is great (assuming that const works in that case), but I think that it's pretty much always a mistake to make individual member variables of a struct const or immutable.
[...]

Yeah, but in this case, since `this` is const, there's simply no way to get around the fact that there must be `const` somewhere in the Result struct.  The D compiler will not accept a mutable member referencing `this` that has a const access method, since that in theory breaks the const guarantee.  I suppose replacing `const(Container)` with tail-const references to Container's innards would fix the problem, but it would uglify the code too much and would be far too much effort just to be able to say "we support const", that it's simply not worth it.

Also, structs with const/immutable members are a rare case allowed by the language but almost never tested for in Phobos, so you can pretty much expect random things to break left, right, and center if you ever attempt to use such a struct with Phobos functions.  In fact, I vaguely remember that even the compiler may have bugs / strange behaviours if you try to use such structs in non-trivial ways.


T

-- 
Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
March 05, 2018
On 03/05/2018 12:38 PM, H. S. Teoh wrote:
> 
> This broke the by-value
> assumption inherent in much of Phobos code,

Wait, seriously? Phobos frequently passes ranges by value? I sincerely hope that's only true for class-based ranges and forward-ranges (and more specifically, only forward ranges where copying the range and calling .save are designed to do the exact same thing). Otherwise, that's really, *REALLY* bad since non-forward ranges *by definition* cannot be duplicated.

Honestly, I think this is the one big flaw in the otherwise really nice design of ranges.

The definition of "what is a forward/non-forward range" for struct-based ranges should have been "is this() @disabled (non-forward range), or is this() enabled *and* does the same thing as .save (forward range)?"

Without that, this is a serious hole in non-forward ranges.
March 05, 2018
On Monday, March 05, 2018 22:21:47 Nick Sabalausky  via Digitalmars-d- announce wrote:
> On 03/05/2018 12:38 PM, H. S. Teoh wrote:
> > This broke the by-value
> > assumption inherent in much of Phobos code,
>
> Wait, seriously? Phobos frequently passes ranges by value? I sincerely hope that's only true for class-based ranges and forward-ranges (and more specifically, only forward ranges where copying the range and calling .save are designed to do the exact same thing). Otherwise, that's really, *REALLY* bad since non-forward ranges *by definition* cannot be duplicated.
>
> Honestly, I think this is the one big flaw in the otherwise really nice design of ranges.
>
> The definition of "what is a forward/non-forward range" for struct-based
> ranges should have been "is this() @disabled (non-forward range), or is
> this() enabled *and* does the same thing as .save (forward range)?"
>
> Without that, this is a serious hole in non-forward ranges.

Passing ranges around by value is fine so long as you don't use the original after the copy is made. Where you get screwed is when you then use the original after the copy has been made. Almost nothing in Phobos passed ranges around by ref, and doing so would actually make it a royal pain to iteract with forward ranges, because save obviously isn't an lvalue, and range-based functions that return new ranges aren't returning lvalues. So, if range-based functions took their arguments by ref, then you couldn't chain them. And using auto ref wouldn't fix the problem, since you could still pass by value. It would just introduce all kinds of inconsistent behavior as to whether a range was copied or not depending on how exactly a range-based function were called, causing more bugs.

Honestly, I think that the correct way to implement forward ranges would have been to disallowing ranges that weren't dynamic arrays or structs and then use postblit constructors instead of save (classes could then be used as ranges by wrapping them in structs, though even then, it would be better to avoid classes as ranges, because all of those calls to new get to be inefficent). With that, you wouldn't have all of these problems with accidentally saving or not. Any time a forward range was copied, it would be saved automatically, unless it could be moved, in which case, saving wasn't necessary.

Unfortunately, that still leaves the problem of basic input ranges, since they wouldn't have postblit constructors, and they could still be defined as pseudo-reference types. Maybe we could require them to be defined as classes to force them to be full-on reference types (they obviously can't be value types, or they could be forward ranges), but then that would force allocations for basic input ranges. _Most_ ranges can be at least forward ranges, but some stuff can't reasonably be, and you wouldn't want to have to allocate all of those on the heap. So, I don't have a clean solution for how to deal with basic input ranges and copying, though I haven't sat down recently and tried to work through the problem. In principle though, they're reference types and ideally would be treated as such.

Regardless, I doubt that the design of ranges is going to be changed at this point given the amount of code that would break as a result, and these sort of changes are not backwards compatible.

- Jonathan M Davis

March 06, 2018
On 3/6/18 1:49 AM, Jonathan M Davis wrote:

> Regardless, I doubt that the design of ranges is going to be changed at this
> point given the amount of code that would break as a result, and these sort
> of changes are not backwards compatible.

I sometimes think we would be better off to drop InputRange, and base everything on the assumption that it can be copied (where isInputRange is renamed to isForwardRange, and `save` goes away). Then you could use @disable postblit to mimic what InputRange would have been.

I've never seen the point of having classes be ranges.

-Steve
March 06, 2018
On Tuesday, 6 March 2018 at 03:21:47 UTC, Nick Sabalausky (Abscissa) wrote:
> Wait, seriously? Phobos frequently passes ranges by value?

You *should* pass most ranges by value, just like how you should rarely use `ref T[]` or `T[]*`. Ranges, like slices, are typically already small references to some other container.

Where Phobos effs it up is not follow its own rules on range.save in most cases... Jonathan talked about this at dconf IIRC in 2015.

> The definition of "what is a forward/non-forward range" for struct-based ranges should have been "is this() @disabled (non-forward range), or is this() enabled *and* does the same thing as .save (forward range)?"

yeah.
March 06, 2018
On Tuesday, March 06, 2018 15:23:52 Adam D. Ruppe via Digitalmars-d-announce wrote:
> On Tuesday, 6 March 2018 at 03:21:47 UTC, Nick Sabalausky
>
> (Abscissa) wrote:
> > Wait, seriously? Phobos frequently passes ranges by value?
>
> You *should* pass most ranges by value, just like how you should rarely use `ref T[]` or `T[]*`. Ranges, like slices, are typically already small references to some other container.
>
> Where Phobos effs it up is not follow its own rules on range.save in most cases... Jonathan talked about this at dconf IIRC in 2015.

Yeah. If you're dealing with generic code rather than a specific range type that you know is implicitly saved when copied, you have to use save so often that it's painful, and almost no one does it. e.g.

equal(lhs.save, rhs.save)

or

immutable result = range.save.startsWith(needle.save);

How well Phobos has done with this has improved over time as more and better testing has been added (testing for reference type ranges is probably the most critical to finding this particular problem), but I doubt that Phobos has it right everywhere, and I'm sure that the average programmer's code has tons of these problems. Code mostly just works because most code really isn't used with arbitrary ranges, and a large percentage of ranges implicitly save on copy. It's not that hard to get a piece of code working with a particular range or just a few similar range types. It's when it needs to work with _any_ range type that matches the template constraint that things start getting hairy, and without thorough testing, it simply doesn't happen unless the code is very simple and the programmer in question is very mindful of stuff like save (which most programmers aren't). And of course, since most testing is done with dynamic arrays, issues with other range types simply aren't found unless the programmer is really putting in the effort to do their due diligence.

Ranges are wonderfully powerful, but they become a royal pain to get right with truly generic code. And that's without getting into all of the arguments about whether stuff like whether transitive fronts should be allowed...

Ranges are definitely one area where we could really use some redesign to iron out some of the issues that we've found over time, but their success makes them almost impossible to fix, because changing them would break tons of code. But annoyingly, that's often what happens when you implement a new idea. You simply don't have enough knowledge about it ahead of time to avoid mistakes; those are easy enough to make when you really know what you're doing, let alone with something new.

- Jonathan M Davis

March 06, 2018
On Tuesday, 6 March 2018 at 15:23:52 UTC, Adam D. Ruppe wrote:
> On Tuesday, 6 March 2018 at 03:21:47 UTC, Nick Sabalausky (Abscissa) wrote:
>> The definition of "what is a forward/non-forward range" for struct-based ranges should have been "is this() @disabled (non-forward range), or is this() enabled *and* does the same thing as .save (forward range)?"
>
> yeah.

I've skimmed through my range-using code, and indeed, very often I assume that the ranges are copied even without calling .save() when passing them to Phobos. That works out of the box -- or maybe I haven't hit the corner cases yet.

The sharpest redefinition of "forward range" would be: All ranges are struct-based, and a forward range is an input range that allows copy-construction/assignment. The save() method would not be part of the definition anymore.

That relies on the design guideline that ranges should be lightweight views into underlying containers. Here, it seems natural to design most of your input ranges as forward ranges anyway. With the redefinition (forward if copyable), forward ranges arise for free from input ranges. Without such a redefinition, all range designers must remember to implement .save(), even though this 4th member function is far less prominent than the 3 input range methods.

-- Simon
March 06, 2018
On Tuesday, 6 March 2018 at 15:39:41 UTC, Jonathan M Davis wrote:
> [snip]
>
> How well Phobos has done with this has improved over time as more and better testing has been added (testing for reference type ranges is probably the most critical to finding this particular problem), but I doubt that Phobos has it right everywhere, and I'm sure that the average programmer's code has tons of these problems. Code mostly just works because most code really isn't used with arbitrary ranges, and a large percentage of ranges implicitly save on copy. It's not that hard to get a piece of code working with a particular range or just a few similar range types. It's when it needs to work with _any_ range type that matches the template constraint that things start getting hairy, and without thorough testing, it simply doesn't happen unless the code is very simple and the programmer in question is very mindful of stuff like save (which most programmers aren't). And of course, since most testing is done with dynamic arrays, issues with other range types simply aren't found unless the programmer is really putting in the effort to do their due diligence.

Someone could make a range testing library of sorts. Really just a variant packed with a bunch of different range types covering a variety of use cases. The user could import it in a version(unittest) block and then just loop through it and assert that the function works for all of them (would need some kind of variant full of what to compare it to for each type).

>
> Ranges are wonderfully powerful, but they become a royal pain to get right with truly generic code. And that's without getting into all of the arguments about whether stuff like whether transitive fronts should be allowed...
>
> Ranges are definitely one area where we could really use some redesign to iron out some of the issues that we've found over time, but their success makes them almost impossible to fix, because changing them would break tons of code. But annoyingly, that's often what happens when you implement a new idea. You simply don't have enough knowledge about it ahead of time to avoid mistakes; those are easy enough to make when you really know what you're doing, let alone with something new.
>
> - Jonathan M Davis

There's probably value in writing up a retrospective of sorts on D's ranges. What works about the design, what are the limitations, and what could be improved (depending on the level of breakage considered acceptable). Even if D takes a long time to fix these issues, making the information more easily available to others outside the D community might be valuable.
« First   ‹ Prev
1 2 3