Re: More tricky range semantics
January 15, 2015
On 15/01/15 21:53, H. S. Teoh via Digitalmars-d wrote:
> a) At least as things currently stand, passing (wrapper) ranges around
> may exhibit "undefined" behaviour, like the above. Passing a range to a
> function may invalidate it unless you use .save.  Therefore, one should
> *always* use .save. (If we had passed wrapper.save to equal() instead,
> this problem would not have happened.) This applies even if the wrapper
> range is a by-value type. Or should we say, *especially* when it's a
> by-value type?
>
> b) One may argue that WrapperRange ought to .save the underlying range
> in its postblit... but what if the only thing we wanted to do was to
> call equal() on wrapper and nothing else? In that case, we'd be
> incurring needless overhead of .save'ing the range when we didn't care
> whether it got consumed.
>
> c) This issue is already latent in almost *all* Phobos algorithms.  We
> only haven't discovered it yet because most people just use arrays for
> ranges. But all it takes is for somebody to start using
> std.range.inputRangeObject (and there are cases where this is
> necessary), and this problem will surface.  Anytime you mix by-value and
> by-reference ranges, be ready for problems of this sort to arise.
>
> Yep, range semantics just got even trickier. (And we thought transient
> ranges were bad...)

Yes, there are a lot of complications here.  You may recall the issues discussed numerous times about std.random, and in particular the horrendous problems of value-type RNGs?

Switching to reference types still leaves some nasty possibilities.  For example, suppose you have an RNG that _is_ a reference-type range.  Now you pass it to a function that needs to re-use the range it's given as input, multiple times -- so it calls .save on the range it is given and only uses saved copies.

That function produces its output "correctly" with its .save'd range, but now your original RNG is unchanged, because the values were extracted from a saved copy.  So, the next time you use that RNG, the values it produces will be correlated with the values used in this function.

Ouch!
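
A minimal sketch of that scenario, under stated assumptions: LcgRng is a hypothetical toy class-based generator invented purely for illustration (it is not a std.random type), and drawThree stands in for any function that defensively works on a .save'd copy.

```d
// Hypothetical reference-type RNG: a toy linear congruential generator.
final class LcgRng
{
    private uint state;

    this(uint seed) { state = seed; }

    enum empty = false;                       // infinite range
    @property uint front() const { return state; }
    void popFront() { state = state * 1664525u + 1013904223u; }
    @property LcgRng save() { return new LcgRng(state); }
}

// Stand-in for an algorithm that only ever touches a saved copy.
uint[] drawThree(LcgRng rng)
{
    auto local = rng.save;                    // the caller's rng is never advanced
    uint[] result;
    foreach (i; 0 .. 3)
    {
        result ~= local.front;
        local.popFront();
    }
    return result;
}

void main()
{
    auto rng = new LcgRng(42);
    auto first = drawThree(rng);
    auto second = drawThree(rng);

    // Both calls started from the same state, so the "random" values repeat.
    assert(first == second);
}
```

If drawThree advanced rng directly instead of a saved copy, the two calls would produce different values, which is what a caller of an RNG normally expects.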
January 16, 2015
On Thu, Jan 15, 2015 at 03:24:29PM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> On 1/15/15 12:53 PM, H. S. Teoh via Digitalmars-d wrote:
> >Passing a range to a function may invalidate it unless you use .save. Therefore, one should *always* use .save.
> 
> That's right. To simplify the problem space we might decree that forward (or better) ranges with reference semantics are not allowed. -- Andrei

That's not workable. It instantly makes InputRangeObject useless.

InputRangeObject is required in situations where run-time polymorphism of ranges is needed, for example, if a particular function must return one of multiple possible range types depending on a runtime parameter. Simple example:

	auto wrapMyRange(R)(R range) {
		if (someRuntimeCondition) {
			return range.map!(a => a*2);
		} else {
			return range.map!(a => a*2 + 1)
				    .filter!(a => a > 10);
		}
	}

This code doesn't compile, of course, because the return types of map()
and filter() are incompatible. The only current way to make it work is
to wrap the return value in an InputRangeObject:

	auto wrapMyRange(R)(R range) {
		if (someRuntimeCondition) {
			return inputRangeObject(range.map!(a => a*2));
		} else {
			return inputRangeObject(range.map!(a => a*2 + 1)
				.filter!(a => a > 10));
		}
	}

This works because inputRangeObject returns an instance of a subclass of InputRangeObject, which serves as the common return type. The concrete type is specialized for the actual type being wrapped; in this case either the return type of map(), or the return type of filter(). Note that methods like .save, .popBack, etc., are forwarded, so the returned range retains forward or higher range functionality. This is, of course, necessary, since otherwise you couldn't do anything with the returned wrapped range except iterate over it once.
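
As a rough, self-contained sketch of the forwarding described above: the version below makes several assumptions that are not in the original snippet. The element type is fixed to int, someRuntimeCondition is passed as a parameter, and the return type is spelled out as the ForwardRange!int interface from std.range so that both branches share a declared type.

```d
import std.algorithm : equal, filter, map;
import std.range : ForwardRange, inputRangeObject;

// Assumption: explicit interface return type instead of `auto`.
ForwardRange!int wrapMyRange(int[] range, bool someRuntimeCondition)
{
    if (someRuntimeCondition)
        return inputRangeObject(range.map!(a => a * 2));
    else
        return inputRangeObject(range.map!(a => a * 2 + 1)
                                     .filter!(a => a > 10));
}

void main()
{
    auto r = wrapMyRange([1, 2, 3, 4, 5, 6], false); // elements: 11, 13

    // .save is forwarded, so the wrapped range is still a forward range.
    auto checkpoint = r.save;
    r.popFront();
    assert(r.front == 13);
    assert(checkpoint.front == 11);   // the saved copy was not advanced

    // The wrapper is a class reference: passing it without .save lets the
    // callee consume the caller's range.
    assert(equal(checkpoint, [11, 13]));
    assert(checkpoint.empty);
}
```

The last two assertions show the hazard this thread is about: equal() received the class reference itself, so the caller's range was exhausted.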

Forcing reference type ranges to be non-forward completely breaks this important use case. The only way to work around it would be to wrap the class object in a struct wrapper that calls .save in its postblit -- which introduces a prohibitive performance overhead.
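
For illustration only, such a struct wrapper might look like the sketch below. SavingWrapper and its saves counter are hypothetical names introduced here; nothing like this exists in Phobos as far as this sketch assumes. The counter is there just to make the extra .save calls visible.

```d
import std.algorithm : equal;
import std.range : ForwardRange, inputRangeObject;

// Hypothetical by-value wrapper around a class-based forward range.
struct SavingWrapper(E)
{
    ForwardRange!E impl;
    static size_t saves;              // counts postblit-triggered .save calls

    // Postblit: every copy of the struct duplicates the underlying range.
    this(this)
    {
        if (impl !is null)
        {
            impl = impl.save;
            ++saves;
        }
    }

    @property bool empty() { return impl.empty; }
    @property E front() { return impl.front; }
    void popFront() { impl.popFront(); }
    @property SavingWrapper save() { return this; } // the copy runs the postblit
}

void main()
{
    auto w = SavingWrapper!int(inputRangeObject([1, 2, 3]));

    // Passing w by value copies it, so equal() consumes a saved duplicate.
    assert(equal(w, [1, 2, 3]));
    assert(w.front == 1);             // the original is untouched

    // The price: at least one allocation-backed .save per pass-by-value.
    assert(SavingWrapper!int.saves >= 1);
}
```

Every pass-by-value now allocates a new range object whether or not the caller needed the original preserved, which is the overhead objected to above.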


T

-- 
Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
January 16, 2015
On Thu, Jan 15, 2015 at 03:25:38PM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> On 1/15/15 12:53 PM, H. S. Teoh via Digitalmars-d wrote:
> >This issue is already latent in almost *all* Phobos algorithms.  We only haven't discovered it yet because most people just use arrays for ranges. But all it takes is for somebody to start using std.range.inputRangeObject (and there are cases where this is necessary), and this problem will surface.
> 
> There's a distinction here. Input non-forward ranges can be considered
> "reference" because popFront()ing any copy is tantamount to
> popFront()ing any other copy. -- Andrei

I hope you realize that inputRangeObject, in spite of its name, does forward methods of the higher ranges (.save, .popBack, etc.), right?

Besides, conflating reference types with non-forward input ranges will cripple ranges built not only from class objects, but from *any* type (even structs) that exhibits reference semantics. One particularly poignant example is your proposed groupBy replacement, which uses RefCounted, which has reference semantics. :-) We wouldn't want to be breaking that now, would we?

(On the flip side, perhaps now you might finally see some justification for my hesitation about implementing groupBy with reference semantics...)


T

-- 
Let's eat some disquits while we format the biskettes.
January 16, 2015
On 1/15/15 4:46 PM, H. S. Teoh via Digitalmars-d wrote:
> On Thu, Jan 15, 2015 at 03:24:29PM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 1/15/15 12:53 PM, H. S. Teoh via Digitalmars-d wrote:
>>> Passing a range to a function may invalidate it unless you use .save.
>>> Therefore, one should *always* use .save.
>>
>> That's right. To simplify the problem space we might decree that
>> forward (or better) ranges with reference semantics are not allowed.
>> -- Andrei
>
> That's not workable. It instantly makes InputRangeObject useless.

Indeed. That said, we could achieve polymorphism with value objects; it's just more complicated. I agree it's a tricky matter. -- Andrei
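
One possible reading of "polymorphism with value objects", offered only as a sketch and not necessarily what was meant here, is a by-value range that holds one of several alternatives. Phobos's std.range.choose does this for the two-branch case. The int[] parameter and the boolean argument below are assumptions carried over from the earlier wrapMyRange example.

```d
import std.algorithm : equal, filter, map;
import std.range : choose;

// Both branches are wrapped in a single struct type, so `auto` works
// and no class object is involved.
auto wrapMyRange(int[] range, bool someRuntimeCondition)
{
    return choose(someRuntimeCondition,
                  range.map!(a => a * 2),
                  range.map!(a => a * 2 + 1).filter!(a => a > 10));
}

void main()
{
    auto r = wrapMyRange([1, 2, 3, 4, 5, 6], false); // elements: 11, 13

    // r is a struct over slices, so passing it by value copies it and
    // the caller's range is not consumed.
    assert(equal(r, [11, 13]));
    assert(r.front == 11);
}
```

This only covers a choice between two statically known range types; a fully general value-type equivalent of InputRangeObject would need more machinery.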