Tricky semantics of ranges & potentially numerous Phobos bugs (page 4) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Tricky semantics of ranges & potentially numerous Phobos bugs (page 4)

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Wed, Oct 17, 2012 at 11:17:08AM -0400, Andrei Alexandrescu wrote:
> On 10/16/12 1:59 AM, Jonathan M Davis wrote:
> >On Monday, October 15, 2012 22:48:15 Jonathan M Davis wrote:
> >>So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about the result of front changing after a call to popFront in every single function that ever uses front. In the general case, I just don't see how that's tenable. I'd _much_ rather that it be up to the programmer using abnormal ranges such as ByLine to use them correctly.
> >
> >And actually, it seems to me that issues like this make it look like it was a mistake to make ranges like ByLine ranges in the first place. They should have just defined opApply so that they'd work nicely in foreach but not with range- based functions. They're clearly not going to work with a _lot_ of range-based functions.
> 
> I think integration of pure streams within ranges is important and beneficial.
[...]

I agree, and I think that's what makes ranges such a powerful concept. I'd hate to lose it just over implementation details like popFront() pulling the carpet from under a few functions that assume the persistence of what .front returns. I'd much rather live with an isTransient property (which is only needed in a very few cases, and only needs to be checked in a handful of places where the dependence matters, and so is not a big inconvenience at all), than to devolve into Java's wrapper upon adapter upon wrapper design.

T

-- 
Give a man a fish, and he eats once. Teach a man to fish, and he will sit forever.

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Jonathan M Davis
in reply to Andrei Alexandrescu

Jonathan M Davis

Posted in reply to Andrei Alexandrescu

On Wednesday, October 17, 2012 11:17:08 Andrei Alexandrescu wrote:
> I think integration of pure streams within ranges is important and beneficial.

The problem isn't that it's a stream. The problem is that it reuses the buffer that it returns, which is great for efficiency but horrible for algorithms which actually try and use it as a range. Having to worry about front being invalidated after a call to popFront would be a big problem - especially when that's not at all normal behavior. And in many cases, you _can't_ code algorithms in a way that supports that (e.g. if it needs to compare the result of two subsequent calls to front). Having to worry about such ranges is just going to complicate things even more. We already have enough issues with special cases as it is - like save not being called because reference type ranges are so rare, and least that's actually part of the range API, whereas the idea that front could be invalidated really isn't.

- Jonathan M Davis

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to H. S. Teoh

Andrei Alexandrescu

Posted in reply to H. S. Teoh

On 10/16/12 11:17 AM, H. S. Teoh wrote:
> OTOH, if we clearly state that .front may not persist past the next
> popFront(), then it would be up to the caller to .dup the return value,
> or otherwise make a safe copy of it (or use the value before calling
> popFront()).
>
> The current ambiguous situation leads to one range doing one thing, and
> another range doing something else, and either way, either this code
> will break or that code will break.

Input ranges don't guarantee preservation of .front upon calling .popFront. They do allow several to .front (without an intervening call to .popFront) that should return the same result (i.e. a one-element buffer).

Andrei

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Wed, Oct 17, 2012 at 12:22:55PM -0400, Andrei Alexandrescu wrote:
> On 10/16/12 11:17 AM, H. S. Teoh wrote:
> >OTOH, if we clearly state that .front may not persist past the next
> >popFront(), then it would be up to the caller to .dup the return
> >value, or otherwise make a safe copy of it (or use the value before
> >calling popFront()).
> >
> >The current ambiguous situation leads to one range doing one thing, and another range doing something else, and either way, either this code will break or that code will break.
> 
> Input ranges don't guarantee preservation of .front upon calling .popFront. They do allow several to .front (without an intervening call to .popFront) that should return the same result (i.e. a one-element buffer).
[...]

This is contrary to what Jonathan has been saying. So which is it: should .front return a persistent value, or a transient value that may get invalidated by popFront? I have been assuming it's the latter, but others have been saying it's the former.

As Jonathan pointed out, some algorithms won't work with transient .front values. I can think of one: findAdjacent, which requires that the previous value of .front be preserved (otherwise you couldn't compare it with the following element -- the template has no reliable way of saving the previous value since we don't have a reliable way to deep-copy a possibly complex data structure that might be getting overwritten by popFront).

Other algorithms, like joiner, currently don't work with transient .front values, but can be made to work by tweaking the implementation. People will hate me for saying this, but I think Phobos algorithms *should* be written to work with minimal expectations, so I don't consider it unreasonable to expect std.algorithm to work with transient .front values where possible. (User code is a different matter, of course). There *are* cases for which it can't work, which is why I proposed the isTransient property, but people didn't seem to like that idea.

T

-- 
They pretend to pay us, and we pretend to work. -- Russian saying

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to Jonathan M Davis

Andrei Alexandrescu

Posted in reply to Jonathan M Davis

On 10/16/12 1:28 PM, Jonathan M Davis wrote:
> So, it's fine that ByLine is a range as long as we're willing to put up with it
> not working with a lot of range-based functions because of its abnormal
> behavior. But I don't think that it's at all reasonable for range-based
> functions in general to not be able to rely on front returning the same type
> every time or on its value disappearing or becoming invalid in some way after
> a call to popFront. That's completely untenable IMHO.

Then what is to you the difference between an input range and a forward range?

Andrei

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Wed, Oct 17, 2012 at 12:39:13PM -0400, Andrei Alexandrescu wrote:
> On 10/16/12 1:28 PM, Jonathan M Davis wrote:
> >So, it's fine that ByLine is a range as long as we're willing to put up with it not working with a lot of range-based functions because of its abnormal behavior. But I don't think that it's at all reasonable for range-based functions in general to not be able to rely on front returning the same type every time or on its value disappearing or becoming invalid in some way after a call to popFront. That's completely untenable IMHO.
> 
> Then what is to you the difference between an input range and a forward range?
[...]

So what you're saying, is that we cannot rely on the value of .front after popFront() is called, and that the only way to ensure the validity of .front is to use a forward range's .save, and access the saved copy's .front?

It makes sense to me, but then we'd need to revise quite a lot of code in Phobos, as I've seen quite a number of places where .front is cached in some local variable or struct field, which would be invalid by your definition.

T

-- 
This is not a sentence.

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to H. S. Teoh

Andrei Alexandrescu

Posted in reply to H. S. Teoh

On 10/16/12 3:41 PM, H. S. Teoh wrote:
> I think with D's compile-time introspection capabilities, it _should_ be
> possible to implement a generic deepCopy template that works for any
> type.
>
> Of course, then one has to address tricky issues such as complex data
> structures that have interlinking parts; a blind recursive deep-copy may
> not have the desired effect (e.g., if we're deep-copying a graph and
> there are multiple paths (via references) to the same node, then we
> shouldn't end up with multiple copies of that node). Some care will also
> need to be taken to deal with cyclic structures, etc.. And some
> optimizations can probably be done to avoid copying immutable objects,
> since that would be a waste of time&  memory.
>
> Probably some kind of graph traversal algorithm can be used to address
> these issues, I think, perhaps with the help of an AA or two to recreate
> the original linking structure in the copy.

Yes, deepDup should be implementable as a library and use a temporary dictionary of already-duplicated items to avoid infinite recursion.

We should add that to Phobos - could you please add a task to trello.


Andrei

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to H. S. Teoh

Andrei Alexandrescu

Posted in reply to H. S. Teoh

On 10/16/12 5:38 PM, H. S. Teoh wrote:
> I was thinking probably a bool[void*] should work as a way of keeping
> track of what has already been copied, so a simple depth-first traversal
> should work.

Should be void*[void*] to also get the actual new location.

Andrei

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to bearophile

Andrei Alexandrescu

Posted in reply to bearophile

On 10/17/12 5:59 AM, bearophile wrote:
> monarch_dodra:
>
>> Can't we just have a "byLineDeep" or "byLineSafe", but that returns
>> actual deep copied immutable strings?
>
> See this old enhancement of mine they have closed:
> http://d.puremagic.com/issues/show_bug.cgi?id=4474

You closed it.

Also that bug has nothing to do with this topic because you can't compile the buggy code and have it silently do the wrong thing.

Andrei

October 17, 2012

Re: Tricky semantics of ranges & potentially numerous Phobos bugs

Posted by Andrei Alexandrescu
in reply to H. S. Teoh

Andrei Alexandrescu

Posted in reply to H. S. Teoh

On 10/17/12 12:39 PM, H. S. Teoh wrote:
> On Wed, Oct 17, 2012 at 12:22:55PM -0400, Andrei Alexandrescu wrote:
>> On 10/16/12 11:17 AM, H. S. Teoh wrote:
>>> OTOH, if we clearly state that .front may not persist past the next
>>> popFront(), then it would be up to the caller to .dup the return
>>> value, or otherwise make a safe copy of it (or use the value before
>>> calling popFront()).
>>>
>>> The current ambiguous situation leads to one range doing one thing,
>>> and another range doing something else, and either way, either this
>>> code will break or that code will break.
>>
>> Input ranges don't guarantee preservation of .front upon calling
>> .popFront. They do allow several to .front (without an intervening
>> call to .popFront) that should return the same result (i.e. a
>> one-element buffer).
> [...]
>
> This is contrary to what Jonathan has been saying. So which is it:
> should .front return a persistent value, or a transient value that may
> get invalidated by popFront? I have been assuming it's the latter, but
> others have been saying it's the former.

The latter for input ranges, the former for anything stronger.

> As Jonathan pointed out, some algorithms won't work with transient
> .front values. I can think of one: findAdjacent, which requires that the
> previous value of .front be preserved (otherwise you couldn't compare it
> with the following element -- the template has no reliable way of saving
> the previous value since we don't have a reliable way to deep-copy a
> possibly complex data structure that might be getting overwritten by
> popFront).

findAdjacent correctly restricts itself to work only with forward ranges.

> Other algorithms, like joiner, currently don't work with transient
> .front values, but can be made to work by tweaking the implementation.
> People will hate me for saying this, but I think Phobos algorithms
> *should* be written to work with minimal expectations, so I don't
> consider it unreasonable to expect std.algorithm to work with transient
> .front values where possible. (User code is a different matter, of
> course). There *are* cases for which it can't work, which is why I
> proposed the isTransient property, but people didn't seem to like that
> idea.

No isTransient is needed.


Andrei

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation