December 23, 2013 std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Sorry for the amount of text, but I tried to explain everything in a clear and simple way. I'm willing to volunteer for adding support for more types in iota. The relevant discussion is: https://d.puremagic.com/issues/show_bug.cgi?id=10762 A few preliminary considerations: - iota() currently returns a Random Access range, and I'm sure nobody would agree to downgrade the result to a ForwardRange or an InputRange, because that would break an unknown amount of code. - I think everyone agrees that even after enhancements, iota() should always return a RA range, even if we use types that are currently not supported. It would make little sense returning different kind of ranges for different input types. Let's define the types: When calling iota(start,end,inc) S=typeof(start), E=typeof(end), I=typeof(inc), C=CommonType!(S,E). My proposal: iota(start, end, inc) accepts any type, as long as {C c; c += inc; c < end;} or {C c; c = c + inc; c < end;} is valid code; Since inc type might not be directly comparable to 0, we also check if (start + inc < start). If this returns true, this means that "inc is negative" in a more general sense. We can use this information to return an empty range when needed, respecting the behaviour defined by the library reference. eg: (start < end && inc < 0) => return empty range. end is not required to be equal to (start + n*inc) for any given n; eg: iota(4, 9, 2) should be valid and return [4, 6, 8]. We can discuss if this makes sense or should be changed, but I think it would be better this way. If the code {int n; I n_inc = n * inc;} is valid, this instruction is used for providing O(1) access for opIndex and opSlice methods. If the code {C c; c -= inc;} or {C c; c = c - inc} is valid, this instruction is used for providing O(1) access for popBack() method. If no optimization is available, the code will provide O(n) performance for those methods. The real issue however is that construction of the range might end up taking O(n) time, because we have to provide the length and the back property. One work around is computing those lazily only if the user requests them. By doing this, anyone only using iota() for forward iteration will still obtain O(1) performance. iota(start, end) accepts any type, as long as {C c; ++c; c < end;} or {iota(start, end, 1);} is valid code. If both forms are valid, the iota(start, end, 1) version is preferred (allowing possible optimizations as discussed before). Everything else ends up being the same as before: popBack() can be optimized if {--c;} is valid code, the opIndex, opSlice, back and length methods will take O(n) time, no way around this sadly. hsteoh suggested: If start+inc is *not* supported but start-- is, and end < start, should we support iota(start,end)? I think this idea should be discarded since some code could quickly become a minefield: type T implements ++t and --t; type P only implements ++p; t1 < t2 => iota(t2, t1) has a way to compute a non-empty range; p1 < p2 => iota(p2, p1) can do nothing but return an empty range; And this behaviour really smells bad, IMHO. The only way to solve this is requiring opUnary!"--" being implemented for the type to be used, and I am not sure about this (most iota uses only really need ++). Any suggestion/critique? If this looks good to you, I can start working on it. If you disagree on something (and I know you have something to say!), discussion is more than welcome! Francesco |
December 23, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | Francesco Cattoglio:
> - iota() currently returns a Random Access range, and I'm sure nobody would agree to downgrade the result to a ForwardRange or an InputRange, because that would break an unknown amount of code.
> ...
> Everything else ends up being the same as before: popBack() can be optimized if {--c;} is valid code, the opIndex, opSlice, back and length methods will take O(n) time, no way around this sadly.
If the new iota accepts new types, then no existing code is using iota for such cases. So you are not breaking code is you offer a more restricted range for such types, avoiding O(n) behavior for them.
Bye,
bearophile
|
December 23, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Monday, 23 December 2013 at 15:23:45 UTC, bearophile wrote:
> If the new iota accepts new types, then no existing code is using iota for such cases. So you are not breaking code is you offer a more restricted range for such types, avoiding O(n) behavior for them.
I do realize this, but I don't really like having iota()
returning different ranges for different types. Do you think this
would make sense?
Something like "iota returns a RA range for arithmetical types
and a ForwardRange for any other type supporting opUnary!++"?
|
December 23, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | Francesco Cattoglio:
> Do you think this would make sense?
Other things in Phobos work like this. Usually in Phobos Big-O requirements are more important than keeping the range type unchanged.
Bye,
bearophile
|
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | On Mon, Dec 23, 2013 at 03:00:25PM +0000, Francesco Cattoglio wrote: > Sorry for the amount of text, but I tried to explain everything in a > clear and simple way. > I'm willing to volunteer for adding support for more types in iota. > The relevant discussion is: > https://d.puremagic.com/issues/show_bug.cgi?id=10762 > > A few preliminary considerations: > - iota() currently returns a Random Access range, and I'm sure > nobody would agree to downgrade the result to a ForwardRange or an > InputRange, because that would break an unknown amount of code. Certainly, the new iota() implementation should not reduce current functionality, but only add to it. > - I think everyone agrees that even after enhancements, iota() should always return a RA range, even if we use types that are currently not supported. It would make little sense returning different kind of ranges for different input types. This is false. Phobos has a lot of code that returns slightly different types depending on what the input was. For example, .map returns a random access range if the input is random access, a bidirectional / forward / input range if that's that the input is, etc.. This is perfectly normal. It makes little sense to require, e.g., .map to always return, say, a forward range even if you hand it an input range -- it wouldn't be able to do this without introducing very inefficient code (like caching entries so that it can implement .save). Phobos range adaptors should always return the "best of input functionality", i.e., if the incoming range has .save, then where reasonable the resulting range also should have .save; ditto for opIndex, .back, .popBack, etc.. > Let's define the types: When calling iota(start,end,inc) > S=typeof(start), > E=typeof(end), > I=typeof(inc), > C=CommonType!(S,E). > > My proposal: > iota(start, end, inc) accepts any type, as long as > {C c; c += inc; c < end;} or {C c; c = c + inc; c < end;} is > valid code; > > Since inc type might not be directly comparable to 0, we also check if (start + inc < start). If this returns true, this means that "inc is negative" in a more general sense. I'm a bit uneasy about this, but I guess it's the price of completely generic code, since you're right, there's no guarantee the inc type is comparable to 0. > We can use this information to return an empty range when needed, > respecting the behaviour defined by the library reference. eg: (start > < end && inc < 0) => return > empty range. > > end is not required to be equal to (start + n*inc) for any given n; eg: iota(4, 9, 2) should be valid and return [4, 6, 8]. We can discuss if this makes sense or should be changed, but I think it would be better this way. I think this way is better. > If the code {int n; I n_inc = n * inc;} is valid, this instruction is used for providing O(1) access for opIndex and opSlice methods. Yes. > If the code {C c; c -= inc;} or {C c; c = c - inc} is valid, this > instruction is used for providing O(1) access for popBack() method. Yes. > If no optimization is available, the code will provide O(n) > performance for those methods. No. If (start + n*inc) is not compilable, then iota should not be a random-access range. I don't like the idea that iota[n] is O(1) for some types and O(n) for other types. > The real issue however is that construction of the range might end up > taking O(n) time, because we have to provide the length and the back > property. One work around is computing those lazily only if the > user requests them. No, if (start + n*inc) is not available, don't provide .length and don't provide .back. Otherwise you're adding unnecessary baggage in the returned range type for most code that don't care about .length or .back. > By doing this, anyone only using iota() for forward iteration will > still obtain O(1) performance. > > iota(start, end) accepts any type, as long as > {C c; ++c; c < end;} or {iota(start, end, 1);} is valid code. If > both forms are valid, the iota(start, end, 1) version is preferred > (allowing possible optimizations as discussed before). > Everything else ends up being the same as before: popBack() can be > optimized if {--c;} is valid code, the opIndex, opSlice, back and > length methods will take O(n) time, no way around this sadly. No, we should not implement any O(n) time methods in iota. If only ++ is available in the base type, then .length, .back, opSlice, will not be available. Basically, if the user type only supports ++, then in normal user code they will not expect to compute length easily, or to be able jump to over several consecutive values. So chances are that they won't be using them with the range returned by iota either. So this will just be unneeded added complexity in the iota implementation. I vote against it. > hsteoh suggested: > If start+inc is *not* supported but start-- is, and end < start, > should we support iota(start,end)? > I think this idea should be discarded since some code could quickly > become a minefield: > type T implements ++t and --t; > type P only implements ++p; > t1 < t2 => iota(t2, t1) has a way to compute a non-empty range; > p1 < p2 => iota(p2, p1) can do nothing but return an empty range; > And this behaviour really smells bad, IMHO. The only way to solve > this is requiring opUnary!"--" being implemented for the type to be > used, and I am not sure about this (most iota uses only really need > ++). I think this part is optional, I won't miss it if it won't be not there, actually. I only suggested it because I thought it would be nice to support iota(a,b,-1) for types that don't have a way to specify inc. But like you said, it's probably very rare for people to actually want to do this. > Any suggestion/critique? If this looks good to you, I can start working on it. If you disagree on something (and I know you have something to say!), discussion is more than welcome! [...] I think overall it looks good, except for the parts where you try to implement .back, .length, opSlice, etc., when the underlying type doesn't support it. I think that's unnecessary work. Just as we don't expect sqrt() to return a meaningful value for -1 for real numbers, but we *do* expect a meaningful value for complex numbers, so there's no reason iota() must always return a RA range. If the user wants RA access, then he should implement more primitives in his type, it's not iota's job to do that for him. Another thing to be mindful of is the current existing overloads of iota. Your implementation should either be in distinct overloads from them (so that they continue to work as they do now), or it should subsume them and provide (at least) equivalent functionality, if not better. In no case should existing functionality be broken by the new replacement. This sounds obvious but may be harder than it appears, because there's an overload (or multiple overloads?) specifically for floating-point types, and you'll have to take into account the quirks of floating-point types to handle all the cases correctly. Keep in mind that if a float is too large, iota() may not be able to return a meaningful range (e.g., x+inc may not have a distinct floating-point representation from x when abs(x) is too big relative to inc). Also, what should be done if mathematically x + n*inc == end exactly, but due to roundoff error it misses by a few ulps. Then iota() may have different lengths depending on CPU (e.g. 80-bit real vs. 128-bit real), rounding flags, etc., and things could get hairy. You might want to keep the floating-point overload(s) of iota as-is to avoid having to deal with this mess. :) T -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot |
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | On Mon, Dec 23, 2013 at 03:30:12PM +0000, Francesco Cattoglio wrote: > On Monday, 23 December 2013 at 15:23:45 UTC, bearophile wrote: > >If the new iota accepts new types, then no existing code is using iota for such cases. So you are not breaking code is you offer a more restricted range for such types, avoiding O(n) behavior for them. > > I do realize this, but I don't really like having iota() returning > different ranges for different types. Do you think this would make > sense? > Something like "iota returns a RA range for arithmetical types and a > ForwardRange for any other type supporting opUnary!++"? I think this is OK. It's just like map() returning an input range if you give it an input range, a forward range if you give it a forward range, etc.. Or joiner() returning a forward range only if you give it a forward range of a forward range, otherwise it's just an input range, even if one of the inner / outer ranges is forward (both the outer and inner range must be forward ranges, otherwise the result can't guarantee that .save will work correctly). Basically, return the best range you can without adding hidden costs (like O(n) methods), otherwise fall back to a more primitive range type. This rule is observed by other parts of Phobos too. T -- The richest man is not he who has the most, but he who needs the least. |
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | On 12/23/13 7:00 AM, Francesco Cattoglio wrote: > A few preliminary considerations: > - iota() currently returns a Random Access range, and I'm sure nobody > would agree to downgrade the result to a ForwardRange or an InputRange, > because that would break an unknown amount of code. Agreed. > - I think everyone agrees that even after enhancements, iota() should > always return a RA range, even if we use types that are currently not > supported. It would make little sense returning different kind of ranges > for different input types. I think it makes a lot of sense to relax that. Different types have different properties leading to different iota capabilities. > Let's define the types: When calling iota(start,end,inc) > S=typeof(start), > E=typeof(end), > I=typeof(inc), > C=CommonType!(S,E). > > My proposal: > iota(start, end, inc) accepts any type, as long as > {C c; c += inc; c < end;} or {C c; c = c + inc; c < end;} is valid > code; Sounds good. That's not sufficient to make it random access though. > Since inc type might not be directly comparable to 0, we also check if > (start + inc < start). If this returns true, this means that "inc is > negative" in a more general sense. We can use this information to return > an empty range when needed, respecting the behaviour defined by the > library reference. eg: (start < end && inc < 0) => return empty range. > end is not required to be equal to (start + n*inc) for any given n; eg: > iota(4, 9, 2) should be valid and return [4, 6, 8]. We can discuss if > this makes sense or should be changed, but I think it would be better > this way. Anything sensible is fine so long as we don't pay every iteration. > If the code {int n; I n_inc = n * inc;} is valid, this instruction is > used for providing O(1) access for opIndex and opSlice methods. Ah, nice. There's an unstated assumption here - adding inc n times is the same as adding n * inc. > If the code {C c; c -= inc;} or {C c; c = c - inc} is valid, this > instruction is used for providing O(1) access for popBack() method. Noice. > If no optimization is available, the code will provide O(n) performance > for those methods. I don't think any primitive should be O(n). > The real issue however is that construction of the range might end up > taking O(n) time, because we have to provide the length and the back > property. One work around is computing those lazily only if the user > requests them. > By doing this, anyone only using iota() for forward iteration will still > obtain O(1) performance. All primitives must be O(1). > iota(start, end) accepts any type, as long as > {C c; ++c; c < end;} or {iota(start, end, 1);} is valid code. If > both forms are valid, the iota(start, end, 1) version is preferred > (allowing possible optimizations as discussed before). No, there are types for which increment is sensible but not adding an arbitrary number. (Impractical example: Peano arithmetic. Practical examples anyone?) So iota(start, end) should be a special range, not derived from the general one. Also x++ is faster than x += a where a is a variable. > Everything else ends up being the same as before: popBack() can be > optimized if {--c;} is valid code, the opIndex, opSlice, back and length > methods will take O(n) time, no way around this sadly. Must be O(1) or don't implement. This is the D way. > hsteoh suggested: > If start+inc is *not* supported but start-- is, and end < start, > should we support iota(start,end)? Absolutely. > I think this idea should be discarded since some code could quickly > become a minefield: > type T implements ++t and --t; > type P only implements ++p; > t1 < t2 => iota(t2, t1) has a way to compute a non-empty range; > p1 < p2 => iota(p2, p1) can do nothing but return an empty range; > And this behaviour really smells bad, IMHO. The only way to solve this > is requiring opUnary!"--" being implemented for the type to be used, and > I am not sure about this (most iota uses only really need ++). Not sure I understand this, but any reasoning that leads to the conclusion that simple ++ ranges should be discarded must be carefully revised. Andrei |
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | Thank you everyone for the feedback. I made the wrong assumption about phobos design, I didn't knew that the policy here is "when needed, relax the range", and now I see it makes perfect sense. The range will only be a RA range for types that implement (inc * n) and ((end - begin) / step) (used for lenght computation), otherwise it will be a ForwardRange, because it can't be directional if we can't compute the last element in O(1). On Tuesday, 24 December 2013 at 01:42:21 UTC, H. S. Teoh wrote: > there's > an overload (or multiple overloads?) specifically for floating-point > types, and you'll have to take into account the quirks of floating-point > types to handle all the cases correctly. > [...] You might want to keep the floating-point > overload(s) of iota as-is to avoid having to deal with this mess. :) I actually think the current float doesn't behave perfectly either. I will not forget about float quirks, don't worry. >Ah, nice. There's an unstated assumption here - adding inc n times is the same as adding n * inc. Right, completely missed that. I guess checking a few test cases at compile time is the best one can do. Checking for every n could take some infinite amount of time :P >No, there are types for which increment is sensible but not adding an arbitrary number. I agree. After relaxing the range, we can prefer a specialized version over the iota(begin, end, 1) version. The latter should be used as a backup instead for cases where ++ is not implemented. >Not sure I understand this, but any reasoning that leads to the conclusion that simple ++ ranges should be discarded must be carefully revised. Maybe I should have pasted all the suggestions from h.s. teoh, to make it more clear which one I didn't like. He suggested to add extra functionality if the type provides opUnary!--. ++ ranges will be supported. what will not be supported is extra functionality if -- is defined, because similar types used in the same case might produce different results. |
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | Francesco Cattoglio:
> I agree. After relaxing the range, we can prefer a specialized version over the iota(begin, end, 1) version. The latter should be used as a backup instead for cases where ++ is not implemented.
One possible disadvantage is when you want an array of various iota (all of the same indexed type, like int) (currently this doesn't compile), this was a potential use case for me:
void main() {
import std.range: iota;
auto intervals = [iota(10), iota(2, 7), iota(2, 15, 2)];
}
Bye,
bearophile
|
December 24, 2013 Re: std.range.iota enhancement: supporting more types (AKA issue 10762) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | On Tuesday, 24 December 2013 at 10:38:17 UTC, Francesco Cattoglio wrote:
> The range will only be a RA range for types that implement (inc * n) and ((end - begin) / step) (used for lenght computation), otherwise it will be a ForwardRange, because it can't be directional if we can't compute the last element in O(1).
Bidirectionality and random access are separate capabilities; the former can be supported without the latter through `--end`, and `end == start`, and we should support that. If it causes problems when the same range is traversed from both ends, then that is an issue with the UDT's operator overloads.
|
Copyright © 1999-2021 by the D Language Foundation