Transient ranges (page 3) - D Programming Language Discussion Forum

May 29, 2016

Re: Transient ranges

Posted by Steven Schveighoffer
in reply to Seb

On 5/27/16 7:42 PM, Seb wrote:

> So what about the convention to explicitly declare a `.transient` enum
> member on a range, if the front element value can change?

enum isTransient(R) = is(typeof(() {
   static assert(isInputRange!R);
   static assert(hasIndirections!(ElementType!R));
   static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
}));

-Steve

May 29, 2016

Re: Transient ranges

Posted by Steven Schveighoffer
in reply to Steven Schveighoffer

On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
> On 5/27/16 7:42 PM, Seb wrote:
>
>> So what about the convention to explicitly declare a `.transient` enum
>> member on a range, if the front element value can change?
>
> enum isTransient(R) = is(typeof(() {
>    static assert(isInputRange!R);
>    static assert(hasIndirections!(ElementType!R));
>    static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> }));

obviously, this is better as a simple && statement between the three requirements :) When I started writing, I thought I'd have to write some runtime code.

-Steve

May 29, 2016

Re: Transient ranges

Posted by default0
in reply to Steven Schveighoffer

On Sunday, 29 May 2016 at 18:09:29 UTC, Steven Schveighoffer wrote:
> On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
>> On 5/27/16 7:42 PM, Seb wrote:
>>
>>> So what about the convention to explicitly declare a `.transient` enum
>>> member on a range, if the front element value can change?
>>
>> enum isTransient(R) = is(typeof(() {
>>    static assert(isInputRange!R);
>>    static assert(hasIndirections!(ElementType!R));
>>    static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
>> }));
>
> obviously, this is better as a simple && statement between the three requirements :) When I started writing, I thought I'd have to write some runtime code.
>
> -Steve

Would that make a range of polymorphic objects transient?

May 30, 2016

Re: Transient ranges

Posted by Alex Parrill
in reply to Steven Schveighoffer

On Sunday, 29 May 2016 at 17:45:00 UTC, Steven Schveighoffer wrote:
> On 5/27/16 7:42 PM, Seb wrote:
>
>> So what about the convention to explicitly declare a `.transient` enum
>> member on a range, if the front element value can change?
>
> enum isTransient(R) = is(typeof(() {
>    static assert(isInputRange!R);
>    static assert(hasIndirections!(ElementType!R));
>    static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> }));
>
> -Steve

allIndirectionsImmutable could probably just be is(T : immutable) (i.e., implicitly convertible to immutable). Value types without indirections (or with only immutable indirections) should be convertible to immutable, since the value is being copied.
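A minimal sketch of how the two suggestions above might combine: the three requirements joined with `&&`, and the missing trait written as an implicit-conversion test. The spelling `is(T : immutable T)` is an assumption about how to express "implicitly convertible to immutable", not something confirmed in this thread, and the trait is only a heuristic for which element types *could* be transient.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : hasIndirections;

// Assumed spelling of "implicitly convertible to immutable".
enum allIndirectionsImmutable(T) = is(T : immutable T);

// The three requirements from the earlier post, joined with &&.
enum isTransient(R) = isInputRange!R
    && hasIndirections!(ElementType!R)
    && !allIndirectionsImmutable!(ElementType!R);

static assert(!isTransient!(int[]));     // plain values: no indirections
static assert(!isTransient!(string[]));  // indirections, but all immutable
static assert( isTransient!(char[][]));  // mutable buffers
static assert( isTransient!(Object[]));  // mutable class references

void main() {}
```

Note that the last line already answers the polymorphic-objects question: a range of mutable class references is classified as transient under this definition.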

May 29, 2016

Re: Transient ranges

Posted by Jonathan M Davis
in reply to Steven Schveighoffer

On Sunday, May 29, 2016 13:36:24 Steven Schveighoffer via Digitalmars-d wrote:
> On 5/27/16 9:48 PM, Jonathan M Davis via Digitalmars-d wrote:
> > On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote:
> >> So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?
> >
> > Honestly, I don't think that supporting transient ranges is worth it.
> > Every single range-based function would have to either test that the
> > "transient" enum wasn't there or take transient ranges into account, and
> > realistically, that isn't going to happen. For better or worse, we do have
> > byLine in std.stdio, which has a transient front, but aside from the
> > performance benefits, it's been a disaster.
>
> Wholly disagree. If we didn't cache the element, D would be a laughingstock of performance-minded tests.

Having byLine not copy its buffer is fine. Having it be a range is not. Algorithms in general just do not play well with that behavior, and I don't think that it's reasonable to expect them to.

> > It's way too error-prone. We now have byLineCopy to combat that, but of
> > course, byLine is the more obvious function and thus more likely to be used
> > (plus it's been around longer), so a _lot_ of code is going to end up using
> > it, and a good chunk of that code really should be using byLineCopy.
>
> There's nothing actually wrong with using byLine, and copying on demand. Why such a negative connotation?

Because it does not play nicely with ranges, and aside from a few rare ranges like byLine that have to deal directly with I/O, transience isn't even useful. Having an efficient solution that plays nicely with I/O is definitely important, but it doesn't need to be a range, especially when it complicates ranges in general. byLine doesn't even work with std.array.array, and if even that doesn't work, I don't see how a range could be considered well-behaved.
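To make the std.array.array problem concrete without involving real I/O, here is a small sketch. `ReusingLines` is a hypothetical stand-in that mimics byLine's buffer reuse; it is not how std.stdio implements byLine, and the fixed 16-byte buffer is an assumption that only suits these short inputs.

```d
import std.algorithm.iteration : map;
import std.array : array;

// Hypothetical range that hands out slices of one shared buffer.
struct ReusingLines
{
    private string[] source; // stands in for lines arriving from a file
    private char[] buffer;   // reused for every element
    private size_t len;

    this(string[] source)
    {
        this.source = source;
        buffer = new char[](16); // assumption: every line fits
        fill();
    }

    private void fill()
    {
        if (source.length == 0)
            return;
        len = source[0].length;
        buffer[0 .. len] = source[0][];
    }

    bool empty() const { return source.length == 0; }
    char[] front() { return buffer[0 .. len]; }
    void popFront() { source = source[1 .. $]; fill(); }
}

void main()
{
    auto lines = ["one", "two", "six"];

    // Every stored slice aliases the same buffer, so after iteration
    // they all show whatever was written last.
    auto stored = ReusingLines(lines).array;
    assert(stored.length == 3);
    foreach (s; stored)
        assert(s == "six");

    // Copying each front before the next popFront preserves the data.
    auto copied = ReusingLines(lines).map!(l => l.idup).array;
    assert(copied == ["one", "two", "six"]);
}
```

The second half is the "copy on demand" usage: the range stays transient, and the caller decides where a copy is needed.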

> > I'm of the opinion that if you want a transient front, you should just use opApply and skip ranges entirely.
>
> So you want to make this code invalid? Why?
>
> foreach(i; map!(a => a.to!int)(stdin.byLine))
> {
>     // process each integer
>     ...
> }
>
> You want to make me copy each line to a heap-allocated string so I can parse it?!!

If it's a range, then it can be passed around to other algorithms with impunity, and almost nothing is written with the idea that a range's front is transient. There's no way to check for transience, and I don't think that it's even vaguely worth adding yet another range primitive that has to be checked for everywhere just for this case. Transience does _not_ play nicely with algorithms in general.

Using opApply doesn't completely solve the problem (since the buffer could still escape - we'd need some kind of scope attribute or wrapper to fix that problem), but it makes it so that you can't pass such a range around and run into problems with all of the algorithms that don't play nicely with it. So, instead, you end up with code that looks something like

foreach(line; stdin.byLine())
{
    auto i = line.to!int();
    ...
}

And yes, it's slightly longer, but it prevents a whole class of bugs by not having it be a range with a transient front.
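As a rough sketch of what an opApply-only line source could look like: `LineFeeder` is a hypothetical type, not a proposed std.stdio API, and it feeds lines from an in-memory array with an assumed 64-byte buffer so the example runs without a file.

```d
import std.conv : to;
import std.range.primitives : isInputRange;

// Hypothetical iteration-only source: usable with foreach, but not a range.
struct LineFeeder
{
    string[] source; // stands in for lines read from a file

    int opApply(scope int delegate(const(char)[] line) dg)
    {
        char[64] buffer; // reused on every iteration; assumes short lines
        foreach (s; source)
        {
            buffer[0 .. s.length] = s[];
            if (auto result = dg(buffer[0 .. s.length]))
                return result; // propagate break/return from the loop body
        }
        return 0;
    }
}

// With no front/popFront/empty, range algorithms reject it at compile time.
static assert(!isInputRange!LineFeeder);

void main()
{
    auto feeder = LineFeeder(["1", "20", "300"]);
    int total;
    foreach (line; feeder)
        total += line.to!int;
    assert(total == 321);
}
```

As noted above, nothing here stops the loop body from storing `line` somewhere, so the escaping problem remains.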

> > Allowing for front to be transient -
> > whether you can check for it or not - simply is not worth the extra
> > complications. I'd love it if we deprecated byLine's range functions, and
> > made it use opApply instead and just declare transient ranges to be
> > completely unsupported. If you want to write your code to have a transient
> > front, you can obviously take that risk, but you're on your own.
>
> There is no way to disallow front from being transient. In fact, it should be assumed that it is the default unless it's wholly a value-type.

Pretty much no range-based code is written with the idea that front is transient. It's pretty much the opposite. Unfortunately, we can't check for all of the proper range semantics at compile time (be it having to do with transience, the fact that front needs to be the same every time until popFront is called, that save has to actually result in a range that will have exactly the same elements, or whatever other runtime behavior that ranges are supposed to adhere to), but just because something can't be checked for doesn't mean that it should be considered reasonable or valid. IMHO, a range with a transient front should be considered as valid as a range that returns a different value every time that front is called without popFront having been called. Neither can be tested for, but both cause problems.

If we're going to support transience, then we _need_ to have some sort of flag/enum in the type to indicate that the range is transient, but that complicates everything, because then all range implementations have to check for it and pass it on when they wrap that type, and many algorithms will have to explicitly check for it in their template constraints to make it invalid. You end up with a whole lot of extra machinery in range-based code to support a very small number of ranges.
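For illustration, the flag-based machinery might look roughly like this. The names `hasTransientFront`, `Reuser`, and `keepAll` are hypothetical; only the `transient` enum member comes from the proposal quoted in this thread, and nothing in Phobos checks for it.

```d
import std.range.primitives : isInputRange;

// Hypothetical trait: true only for input ranges that opt in via the flag.
template hasTransientFront(R)
{
    static if (isInputRange!R && is(typeof(R.transient) : bool))
        enum hasTransientFront = R.transient;
    else
        enum hasTransientFront = false;
}

// Hypothetical range that declares itself transient.
struct Reuser
{
    enum bool transient = true;
    char[] buffer;
    int remaining;

    bool empty() const { return remaining == 0; }
    char[] front() { return buffer; }
    void popFront() { --remaining; }
}

// An algorithm that stores elements has to reject such ranges explicitly.
auto keepAll(R)(R range)
if (isInputRange!R && !hasTransientFront!R)
{
    import std.array : array;
    return range.array;
}

static assert( hasTransientFront!Reuser);
static assert(!hasTransientFront!(int[]));
static assert(!__traits(compiles, keepAll(Reuser.init)));
static assert( __traits(compiles, keepAll([1, 2, 3])));

void main() {}
```

Every wrapper (a map-like or filter-like adaptor) would also need to forward `transient` from the range it wraps, which is the extra machinery being objected to.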

The number of things that range-based code has to check for is already arguably way too high without adding yet more into the mix.

- Jonathan M Davis

May 30, 2016

Re: Transient ranges

Posted by Jack Stouffer
in reply to Steven Schveighoffer

On Sunday, 29 May 2016 at 17:36:24 UTC, Steven Schveighoffer wrote:
> Wholly disagree. If we didn't cache the element, D would be a laughingstock of performance-minded tests.

byLine is already a laughingstock performance-wise: https://issues.dlang.org/show_bug.cgi?id=11810

It's way faster to read the entire file into a buffer and iterate by line over that.
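A sketch of that approach, with an in-memory string standing in for the file contents so it runs as-is. In real code the text would presumably come from something like `std.file.readText`; the path and the input data here are made up.

```d
import std.algorithm.iteration : map, sum;
import std.array : array;
import std.conv : to;
import std.string : lineSplitter;

void main()
{
    // Stand-in for the whole file, e.g. readText("numbers.txt") (hypothetical path).
    string text = "10\n20\n30";

    // Each line is a slice of `text`, which is immutable, so nothing is
    // overwritten between iterations and storing the slices is safe.
    auto lines = text.lineSplitter.array;
    assert(lines == ["10", "20", "30"]);

    assert(lines.map!(l => l.to!int).sum == 60);
}
```

The trade-off is memory: the whole file has to fit in the buffer at once.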

I have to agree with Jonathan. I see a lot of proposals in this thread, but I have yet to see a cost/benefit analysis that favors transient support. The number of changes needed to support them is not commensurate with any possible benefits.

May 29, 2016

Re: Transient ranges

Posted by Jonathan M Davis
in reply to default0

On Sunday, May 29, 2016 18:27:53 default0 via Digitalmars-d wrote:
> On Sunday, 29 May 2016 at 18:09:29 UTC, Steven Schveighoffer wrote:
> > On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
> >> On 5/27/16 7:42 PM, Seb wrote:
> >>> So what about the convention to explicitly declare a `.transient` enum
> >>> member on a range, if the front element value can change?
> >>
> >> enum isTransient(R) = is(typeof(() {
> >>    static assert(isInputRange!R);
> >>    static assert(hasIndirections!(ElementType!R));
> >>    static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> >> }));
> >
> > obviously, this is better as a simple && statement between the three requirements :) When I started writing, I thought I'd have to write some runtime code.
> >
> > -Steve
>
> Would that make a range of polymorphic objects transient?

It would make pretty much anything that isn't a value type - including a type that's actually a value but uses postblit to do it - be treated as transient, with the one exception being that if the reference types involved are immutable (be they the element type or members in the element type), then it's not treated as transient. This means a very large number of ranges will be treated as being transient, which is completely unacceptable IMHO. Having a transient front is _not_ the norm, and code is usually written with the assumption that front is not transient. In almost all cases, if a range-based function happens to work with a transient front, it's by luck and not because it was designed that way.

You can't statically check for transience, because it depends on runtime behavior. At best, you can statically eliminate a fairly small portion of the ranges as not having transient fronts. If we want to actually support transient fronts, it really needs to be explicit IMHO.

Regardless, I don't think that we want to need to be checking for transience in range-based functions in general. It's too much extra complication for too little benefit. A very small number of ranges actually have or benefit from having a transient front, and I don't think that it's worth supporting them as ranges given how much that affects everything else. Otherwise, you end up with the 1% case causing problems for all range-based code.

- Jonathan M Davis

May 30, 2016

Re: Transient ranges

Posted by Dicebot
in reply to Steven Schveighoffer

On Sunday, 29 May 2016 at 17:25:47 UTC, Steven Schveighoffer wrote:
> What problems are solvable only by not caching the front element? I can't think of any.

As far as I know, currently it is done mostly for performance reasons: if the result fits in a register, there is no need to allocate stack space for the cache, or something like that. One of the most annoying examples is map, which calls the lambda on each `front` call: https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L587-L590

> And there is no way to define "transient" ranges in a way other than explicitly declaring they are transient. There isn't anything inherent or introspectable about such ranges.

I don't really care about the concept of transient ranges; it is the fact that there is no guarantee of front stability for plain input ranges that worries me.
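A small sketch of the map behaviour mentioned above, and of one possible way to pin the value down using `std.algorithm.iteration.cache`. The counter is only there to make the repeated evaluation visible.

```d
import std.algorithm.iteration : cache, map;

void main()
{
    int calls;
    auto mapped = [1, 2, 3].map!((int x) { ++calls; return x * 10; });

    // map does not cache: every read of front runs the lambda again.
    auto a = mapped.front;
    auto b = mapped.front;
    assert(a == 10 && b == 10);
    assert(calls == 2);

    // cache stores the current element, so repeated reads of front
    // do not re-run the lambda.
    auto cached = [1, 2, 3].map!((int x) { ++calls; return x * 10; }).cache;
    immutable before = calls;
    auto c = cached.front;
    auto d = cached.front;
    assert(c == 10 && d == 10);
    assert(calls == before);
}
```

With a pure lambda the repeated calls are only a cost issue; with an impure one, `front` can return different values between reads, which is the stability concern.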

May 30, 2016

Re: Transient ranges

Posted by Nordlöw
in reply to Seb

On Friday, 27 May 2016 at 23:42:24 UTC, Seb wrote:
> So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?

An alternative solution is to extend data-flow/escape analysis to forbid references to `scope`d variables from leaking outside of the scope where they are declared. If the element variable in the `foreach` statement is then qualified with `scope`, the developer can safely use the front reference inside the foreach scope without worrying about it leaking into the enclosing scopes.

However, this solution, of course, requires the developer to remember to use the `scope` keyword every time they iterate over a transient range, which might not be what we want in terms of simplicity.

For a very technical plan on how to implement this in D see

http://wiki.dlang.org/User:Schuetzm/scope

Could this big undertaking be split up into smaller, more manageable parts?

May 30, 2016

Re: Transient ranges

Posted by Steven Schveighoffer
in reply to default0

On 5/29/16 2:27 PM, default0 wrote:
> On Sunday, 29 May 2016 at 18:09:29 UTC, Steven Schveighoffer wrote:
>> On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
>>> On 5/27/16 7:42 PM, Seb wrote:
>>>
>>>> So what about the convention to explicitly declare a `.transient` enum
>>>> member on a range, if the front element value can change?
>>>
>>> enum isTransient(R) = is(typeof(() {
>>>    static assert(isInputRange!R);
>>>    static assert(hasIndirections!(ElementType!R));
>>>    static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
>>> }));
>>
>> obviously, this is better as a simple && statement between the three
>> requirements :) When I started writing, I thought I'd have to write
>> some runtime code.
>>
>
> Would that make a range of polymorphic objects transient?

Of course! If it's mutable (or marked const), it can change from call to call.

-Steve

Copyright © 1999-2021 by the D Language Foundation