November 15, 2012
On Thursday, November 15, 2012 22:07:22 Daniel Murphy wrote:
> Is that really good enough?  Keeping ranges simple is important, but so is making the obvious solution 'just work'.

std.array.array will never work with ranges with a transient front unless it somehow knew when it was and wasn't appropriate to dup, which it's not going to know purely by looking at the type of front. The creator of the range would have to tell them somehow. And even then, it wouldn't work beyond the built-in types, because there's no generic way to dup stuff.

So, either std.array.array will not work directly with byLine or byChunk, or byLine and byChunk need to not have transient fronts. If them not working with std.array.array is too un-user-friendly, then they need to be changed so that they don't have transient fronts, and transient fronts should just be considered invalid ranges (though there's no way to actually test for them, so anyone who wrote them would still be able to try and use them - they just wouldn't work).

- Jonathan M Davis
November 15, 2012
> std.array.array will never work with ranges with a transient front unless it
> somehow knew when it was and wasn't appropriate to dup, which it's not going
> to know purely by looking at the type of front. The creator of the range would
> have to tell them somehow. And even then, it wouldn't work beyond the built-in
> types, because there's no generic way to dup stuff.

Daniel was actually talking about std.byLine.map!"a.dup", which is not a transient range, but would be considered transient if we did what Andrei suggests.
November 15, 2012
On Thursday, November 15, 2012 13:17:12 jerro wrote:
> > std.array.array will never work with ranges with a transient
> > front unless it
> > somehow knew when it was and wasn't appropriate to dup, which
> > it's not going
> > to know purely by looking at the type of front. The creator of
> > the range would
> > have to tell them somehow. And even then, it wouldn't work
> > beyond the built-in
> > types, because there's no generic way to dup stuff.
> 
> Daniel was actually talking about std.byLine.map!"a.dup", which is not a transient range, but would be considered transient if we did what Andrei suggests.

Well, there's no way around that as far as I can see. Even if all ranges had to be explicitly marked as transient or not, map would be in a bind here, because it knows nothing about what the function it was given is doing, so it has no way of knowing how it affects transience. At minimum, it would be forced to mark itself as transient if the original range was (even if the function used idup), or it would _always_ be forced to mark it as transient (I'm not sure which). The only way out would be if there were a way to tell map explicitly to mark the resultant range as having a non-transient front.

By using type deduction like Andrei is suggesting, then we can at least deduce that map!"a.idup" has a non-transient front, but the only way that we'd know that map!"a.dup" was non-transient was if map were told somehow, and it defined an enum that the hasTransientFront trait could examine (i.e. we're back in the boat we'd be in if all ranges had to declare whether they were transient or not). So, as long as we can have transient fronts, map!"a.dup" is screwed, which may or may not be a problem. It's arguably a lot like how we keep having to explain why functions don't work with narrow strings because of how narrow strings aren't random-access, don't have length, etc. And that's definitely annoying, but we can't really fix it.

It's looking like this comes down to either banning ranges with transient fronts entirely (and changing how ByLine and ByChunk work), or we're going to have to deal with quirks like array(map!"a.dup"(file.byLine())) not working whereas array(map!"a.idup"(file.byLine())) does work.

- Jonathan M Davis
November 15, 2012
On Wed, 2012-11-14 at 18:31 -0800, Andrei Alexandrescu wrote:
> > array(map!"a.dup"(stdin.byLine()))

As it seems there is a good way of handling ranges with transient front for algorithms that need a persistent front?

Why not simply document any transient range to be transient (should be anyway) and add the little hint to map. Also note that some algorithms might not work as expected with transient fronts. In addition, at least the algorithms in phobos should state in their documentation whether they rely on non transient front or not.

To me it seems that ranges and algorithms already offer a solution to the problem.

The other way round it would of course be better (safe behaviour the default, fast the option) but as a matter of fact there is no real unsafe behaviour, it just might be unexpected if you don't know what you are doing.

On the other hand if an algorithm depends unnecessarily on non transient fronts it should be fixed. If there are many algorithms which can be more efficient with the dependency on non transient front, we could simply provide a second module, called std.transalgorithm (or something) offering dedicated algorithms for transient fronts. (So people don't have to role their own)

I think this is a very clean and straight forward solution. If you want something that simply works you just use map!"a.dup" ( or whatever you need to copy your elements) and don't care. If you want performance then you would have to check what algorithms to use and have a look at std.transalgorithm.

My apologies if someone else already suggested something like this, I haven't read all the threads about this topic entirely.

November 15, 2012
On Thursday, 15 November 2012 at 12:57:24 UTC, Jonathan M Davis wrote:
> It's looking like this comes down to either banning ranges with transient
> fronts entirely (and changing how ByLine and ByChunk work), or we're going to
> have to deal with quirks like array(map!"a.dup"(file.byLine())) not working
> whereas array(map!"a.idup"(file.byLine())) does work.
>
> - Jonathan M Davis

I still say this could be "simply" solved the same way we solved the "size_t" indexing problem: Only ranges that have non-transient elements are guaranteed supported by phobos algorithms/functions.

Everything else: Use at your own risk.

November 15, 2012
11/15/2012 5:20 PM, monarch_dodra пишет:
> On Thursday, 15 November 2012 at 12:57:24 UTC, Jonathan M Davis wrote:
>> It's looking like this comes down to either banning ranges with transient
>> fronts entirely (and changing how ByLine and ByChunk work), or we're
>> going to
>> have to deal with quirks like array(map!"a.dup"(file.byLine())) not
>> working
>> whereas array(map!"a.idup"(file.byLine())) does work.
>>
>> - Jonathan M Davis
>
> I still say this could be "simply" solved the same way we solved the
> "size_t" indexing problem: Only ranges that have non-transient elements
> are guaranteed supported by phobos algorithms/functions.
>
> Everything else: Use at your own risk.
>

Yeah! Let's introduce undefined behavior into the standard library!

Wrong type of index at least breaks at compile time.

-- 
Dmitry Olshansky
November 15, 2012
On Thu, Nov 15, 2012 at 04:38:04AM -0800, Jonathan M Davis wrote:
> On Thursday, November 15, 2012 13:17:12 jerro wrote:
> > > std.array.array will never work with ranges with a transient front unless it somehow knew when it was and wasn't appropriate to dup, which it's not going to know purely by looking at the type of front. The creator of the range would have to tell them somehow. And even then, it wouldn't work beyond the built-in types, because there's no generic way to dup stuff.
> > 
> > Daniel was actually talking about std.byLine.map!"a.dup", which is not a transient range, but would be considered transient if we did what Andrei suggests.
> 
> Well, there's no way around that as far as I can see. Even if all ranges had to be explicitly marked as transient or not, map would be in a bind here, because it knows nothing about what the function it was given is doing, so it has no way of knowing how it affects transience. At minimum, it would be forced to mark itself as transient if the original range was (even if the function used idup), or it would _always_ be forced to mark it as transient (I'm not sure which). The only way out would be if there were a way to tell map explicitly to mark the resultant range as having a non-transient front.

OK, you've convinced me. The only way to take care of all these corner cases is to make transient ranges illegal, period. Just the fact that map!a and map!"a.dup" may be transient or not, shows that this isn't going to be solved by any simple means. None of our proposals so far even comes close to handling this one correctly.


> By using type deduction like Andrei is suggesting, then we can at least deduce that map!"a.idup" has a non-transient front,

Well, this at least gives us some semblance of workability for this particular case, though it is very leaky around the edges.


> but the only way that we'd know that map!"a.dup" was non-transient was if map were told somehow, and it defined an enum that the hasTransientFront trait could examine (i.e. we're back in the boat we'd be in if all ranges had to declare whether they were transient or not). So, as long as we can have transient fronts, map!"a.dup" is screwed, which may or may not be a problem.

This is not good, because it relies on the user to declare whether or not something is transient when they aren't even the implementor of the delegate passed to map. It's one thing to require users to declare their ranges transient or not, but it's quite another thing to require them to tell map whether or not a.dup is transient (where a.dup can be substituted with an arbitrarily complex delegate which may not even be implemented by the user).


[...]
> It's looking like this comes down to either banning ranges with transient fronts entirely (and changing how ByLine and ByChunk work),

This is looking like the more attractive option right now.


> or we're going to have to deal with quirks like
> array(map!"a.dup"(file.byLine())) not working whereas
> array(map!"a.idup"(file.byLine())) does work.
[...]

This is ugly. I don't like it. But at least, it does give you a compile time error when array requires a non-transient range, but gets a transient one. Better than subtle runtime breakage, for sure.


T

-- 
It said to install Windows 2000 or better, so I installed Linux instead.
November 15, 2012
On Thu, Nov 15, 2012 at 02:14:15PM +0100, eskimo wrote:
> On Wed, 2012-11-14 at 18:31 -0800, Andrei Alexandrescu wrote:
> > > array(map!"a.dup"(stdin.byLine()))
> 
> As it seems there is a good way of handling ranges with transient front for algorithms that need a persistent front?
> 
> Why not simply document any transient range to be transient (should be anyway) and add the little hint to map. Also note that some algorithms might not work as expected with transient fronts. In addition, at least the algorithms in phobos should state in their documentation whether they rely on non transient front or not.

This is better than nothing, of course, but still, relying purely on documentation is not desirable if we can do better. Though at this point, it looks like we can't, so this may be the only option left.


[...]
> On the other hand if an algorithm depends unnecessarily on non transient fronts it should be fixed.

Definitely! I have a fix for std.algorithm.joiner already, and there are a few others that can be fixed without too much effort (I hope).


> If there are many algorithms which can be more efficient with the dependency on non transient front, we could simply provide a second module, called std.transalgorithm (or something) offering dedicated algorithms for transient fronts. (So people don't have to role their own)

AFAIK, none of the algorithms will be more or less efficient depending on whether non-transience can be assumed. It's just a matter of reordering operations (don't call .popFront until .front is used); a bit trickier to write the code, but doesn't change the asymptotic complexity. The algorithms that *are* affected are those that can't work with transient ranges anyway, so it doesn't really matter.


> I think this is a very clean and straight forward solution. If you want something that simply works you just use map!"a.dup" ( or whatever you need to copy your elements) and don't care. If you want performance then you would have to check what algorithms to use and have a look at std.transalgorithm.
[...]

I don't like duplicating a whole bunch of algorithms in transalgorithm.

However, there *may* be something to the idea of splitting up std.algorithm so that those algorithms that aren't sensitive to transience are in one module, and the fragile algorithms in another module. Then one module can be clearly marked as usable with *all* ranges, and the other as usable only with non-transient ranges.


T

-- 
Be in denial for long enough, and one day you'll deny yourself of things you wish you hadn't.
November 15, 2012
On 11/14/2012 11:18 PM, Andrei Alexandrescu wrote:
> On 11/14/12 11:18 AM, Timon Gehr wrote:
>> On 11/14/2012 06:43 PM, Andrei Alexandrescu wrote:
>>> On 11/14/12 7:29 AM, H. S. Teoh wrote:
>>>> But since this isn't going to be fixed properly, then the only solution
>>>> left is to arbitrarily declare transient ranges as not ranges (even
>>>> though the concept of ranges itself has no such implication, and many
>>>> algorithms don't even need such assumptions), and move on. We will just
>>>> have to put up with an inferior implementation of std.algorithm and
>>>> duplicate code when one*does* need to work with transient ranges. It is
>>>> not a big loss anyway, since one can simply implement one's own library
>>>> to deal with this issue properly.
>>>
>>> What is your answer to my solution?
>>>
>>> transient elements == input range && not forward range && element type
>>> has mutable indirections.
>>>
>>> This is testable by any interested clients, covers a whole lot of
>>> ground, and has a good intuition behind it.
>>>
>>>
>>> Andrei
>>
>> That is a very imprecise approximation. I think it does not cover any
>> ground: The day eg. 'array' will require this kind of non-transient
>> element range is the day where I will write my own.
>
> What would be an example where array would have trouble with using this
> definition?
>
> Andrei

import std.array, std.range, std.algorithm, std.stdio, std.conv;

class C{
    int	x;
    this(int x){ this.x = x; }
    string toString(){ return "C("~to!string(x)~")"; }
}

void main(){
    auto a = iota(0,100).map!(a=>new C(a)).array;
    writeln(a);
}
November 15, 2012
On 11/14/2012 08:32 PM, Jonathan M Davis wrote:
> On Wednesday, November 14, 2012 20:18:26 Timon Gehr wrote:
>> That is a very imprecise approximation. I think it does not cover any
>> ground: The day eg. 'array' will require this kind of non-transient
>> element range is the day where I will write my own.
>
> std.array.array _cannot_ work with a transient front. ...

It can work if 'transient' is over-approximated like suggested in the parent post.