March 25, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #50 from Kenji Hara <k.hara.pg@gmail.com> 2013-03-24 17:52:41 PDT ---
(In reply to comment #47)
> @Kenji, shall we go for this?

I always think that it is not good to enforce particular semantics on user programs. This enhancement violates that rule, and we already have an example showing why we should not do it (AAs and containers like them).

(I know we already have special treatment for ranges: foreach recognizes an object with input range primitives as iterable. But this enhancement only works for random access ranges. It is too specialized.)

And I found that this is not sufficient - it does not work for infinite forward ranges!

https://github.com/9rnsr/phobos/commit/dd0d4c139828013c34e76acc74884341f31db298#L0R1379

    struct IFR  // infinite forward range
    {
        enum empty = false;
        @property front() { return 1; }
        auto popFront() {}

        @property save() { return this; }

        auto opSlice(size_t b, size_t e) { return this.take(e - b); }
        auto opSlice(size_t b, Infinity) { return this; }
    }

IFR does not have a 'length' primitive, but it should be sliceable like r[n .. $] (see the std.range.hasSlicing definition). This enhancement cannot cover that case, so a user-defined IFR would always have to define its own opDollar.

So, I think the combination of std.range.opDollar and UFCS would be much better than a compiler-provided implicit alias that only covers the 'length' primitive.
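
As a rough sketch of what "define their own opDollar" looks like for such a range: the names `Ones` and `DollarToken` below are made up for illustration (the snippet above uses `Infinity` in the same role), and this is only one possible way to wire it up, not the code from the linked commit.

```d
import std.algorithm : equal;
import std.range : isForwardRange, isInfinite, take;

// Hypothetical marker type meaning "the end of an infinite range".
struct DollarToken {}

// Hypothetical infinite forward range that always yields 1.
struct Ones
{
    enum bool empty = false;
    @property int front() const { return 1; }
    void popFront() {}
    @property Ones save() { return this; }

    // Inside r[...], `$` is looked up as r.opDollar.
    enum opDollar = DollarToken.init;

    // Finite slice: a length-limited view of the range.
    auto opSlice(size_t b, size_t e) { return this.take(e - b); }
    // Slice to end: still infinite, so just return the range itself.
    Ones opSlice(size_t b, DollarToken) { return this; }
}

void main()
{
    static assert(isInfinite!Ones && isForwardRange!Ones);

    Ones r;
    auto tail = r[3 .. $];                   // uses the DollarToken overload
    static assert(is(typeof(tail) == Ones));

    auto head = r[0 .. 4];                   // uses the (size_t, size_t) overload
    assert(head.equal([1, 1, 1, 1]));
}
```

Note that the range never defines `length`, so an implicit `$` => `length` rule would have nothing to forward to here.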

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
March 25, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #51 from Kenji Hara <k.hara.pg@gmail.com> 2013-03-24 17:56:57 PDT ---
(In reply to comment #49)
> I'm thinking of putting this decision in the compiler for now, it's the least committal change.

To make my suggestion a "committal change", I have posted it as a pull request.

https://github.com/D-Programming-Language/dmd/pull/1793

March 25, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #52 from monarchdodra@gmail.com 2013-03-25 03:53:11 PDT ---
(In reply to comment #50)
> And I found that this is not sufficient - it does not work for infinite forward range!
> 
> https://github.com/9rnsr/phobos/commit/dd0d4c139828013c34e76acc74884341f31db298#L0R1379
> 
>     struct IFR  // infinite forward range
>     {
>         enum empty = false;
>         @property front() { return 1; }
>         auto popFront() {}
> 
>         @property save() { return this; }
> 
>         auto opSlice(size_t b, size_t e) { return this.take(e - b); }
>         auto opSlice(size_t b, Infinity) { return this; }
>     }
> 
> IFR does not have 'length' primitive, but should be slicable like r[n .. $]; (see std.range.hasSlicing definition) But this enhancement cannot cover this - therefore user defined IFR should always define their own opDollar.
> 
> So, I think the combination of std.range.opDollar and UFCS would be much better than compiler's implicit alias just only for 'length' primitive.

Well, I think the compiler can't do *everything* for the user. If you want an infinite range to adhere to "hasSlicing", then at the very least, it has to implement the slicing primitive.

The notion of a "slice-able infinite range" has always been ambiguous, but IMO, the "slice to end" primitive is the important one that *must* be implemented and checked. I doubt we'd break much code by enforcing this.

Besides, at worst, ranges that can't be sliced to end would cease being considered sliceable, which, IMO, is a good thing anyway.

I think we should make this change (for which there would be no automatic workaround) sooner rather than later.

March 27, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #53 from Jonathan M Davis <jmdavisProg@gmx.com> 2013-03-26 19:22:26 PDT ---
Implementing it in the language has the advantage that opDollar does not have to work with UFCS, which avoids the risk of someone overloading opDollar in another module and breaking things when it clashes with the one in std.range. It has the disadvantage of not working for infinite ranges, while also making $ mean length in cases where we don't want it to, forcing us to @disable it.

Implementing it in the library has the advantage of making it only work with ranges (unless another overload of opDollar is created for non-ranges) and making it so that it works with infinite ranges. But it has the disadvantage of allowing opDollar to be used with UFCS and risks conflicts if other code also overloads it that way.

However, even if we go with the library route, opSlice will still have to explicitly support the type that opDollar returns, so it can't really be supported automatically for infinite ranges. We just save them the trouble of actually aliasing the type to opDollar and make it so that there's a standard type to use with infinite ranges and opDollar (which we could already do by simply declaring something like std.range.InfiniteDollar with the idea that all infinite ranges with slicing would alias it to opDollar and use it with opSlice).

So, I don't think that it really matters which way we go as far as infinite ranges go. In either case, some work is required to support it, and requiring opDollar on infinite ranges with slicing will break code (though likely not a lot, since infinite ranges are likely to be a lot rarer than finite ones, and sliceable ones even more so).

I think that it mainly comes down to whether we'd rather require that types with length in addition to opIndex and/or opSlice @disable opDollar if they don't want it, or whether we'd rather risk 3rd party code defining opDollar as a free function, causing conflicts with std.range.opDollar (conflicts which wouldn't be resolvable in the normal fashion, because $ is an operator, and you can't use an import path with it without explicitly calling opDollar). Beyond that, I don't think that it matters much which route we take, as the two routes seem pretty much functionally equivalent. I do worry somewhat that making opDollar UFCS-able would open the door to making other operators UFCS-able, which I think would be a big mistake, but we wouldn't actually be required to do that if we made opDollar UFCS-able.

March 27, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #54 from monarchdodra@gmail.com 2013-03-27 10:43:35 PDT ---
I think I agree that implementing it in the language is the better choice. For one, it means $ will *always* work, as opposed to having to import std.range just to expect generic code to work. As long as it's properly documented, I don't really view the length => $ conversion as problematic. Yes, it allows illegal code to compile, but it doesn't break anything existing (except for strange static ifs?), and we would be giving users a way to prevent it.

I don't view the fact that it wouldn't work "out of the box" with infinite ranges as problematic. There is really no way around the fact that for an infinite range to support "slice to end", it must implement it via a "DollarToken" approach, which pretty much means the code *has* to be written explicitly. So there is no way to "accidentally" forget opDollar.

--------

That said, if we do go ahead and implement this, I STRONGLY urge that we enforce "hasSlicing" => "can slice to end", even for infinite ranges. Having used infinite ranges in phobos, I can say that:
1. A *LOT* of the slicing that occurs is of the form r[i .. $].
2. A *LOT* of algorithms in phobos would naturally support sliceable infinite ranges with no extra code, if they could rely on being able to write r[i .. $].

Being able to *reliably* slice *finite* ranges with opDollar is only half of what we need.
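
To illustrate the second point, here is a minimal sketch of a generic helper that relies only on r[i .. $]. The name `dropBySlicing` is hypothetical, and the constraint is just one way to spell "can slice to end"; it is not how any phobos algorithm is actually written.

```d
import std.range : hasSlicing, isInfinite, repeat;

// Hypothetical helper: skip the first n elements by slicing to the end.
// The same body serves finite and infinite ranges, as long as the
// range accepts r[n .. $].
auto dropBySlicing(R)(R r, size_t n)
    if (hasSlicing!R && is(typeof((R x) => x[1 .. $])))
{
    return r[n .. $];
}

void main()
{
    // Finite case: built-in arrays support `$` natively.
    auto finite = [10, 20, 30, 40];
    assert(dropBySlicing(finite, 1) == [20, 30, 40]);

    // Infinite case: std.range.repeat defines its own opDollar.
    auto tail = dropBySlicing(repeat(7), 5);
    static assert(isInfinite!(typeof(tail)));
    assert(tail.front == 7);
}
```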

I don't think this would break a lot of code, as:
1. We are modifying a *trait*, so code that slices would not actually be broken.
2. Most of *our* algorithms have fall-backs should a range cease to be sliceable.
3. There are very few infinite ranges anyway.

@jmdavis: Would you be willing to go forward with such a change? You said "I'm very much inclined to put a note in the changelog (probably in red) [...]" would you be inclined to do such a note for only infinite ranges?

March 31, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #55 from Andrei Alexandrescu <andrei@erdani.com> 2013-03-31 09:41:17 PDT ---
(In reply to comment #50)
> (In reply to comment #47)
> > @Kenji, shall we go for this?
> 
> I always think that is not good to enforce particular semantics to the user programs. This enhancement violates the rule, and we already have an example that we should not do it (AA and containers like that).
> 
> (I know we already have special treatement for ranges - foreach can recognize the object with input range primitives is iterable. But this enhancement just only works for random access range. It is too specialized.)

@Kenji: I'm a bit unclear on your view on this. You mention you'd prefer a library solution, but your pull request seems to go for an in-compiler solution.

> And I found that this is not sufficient - it does not work for infinite forward range!
> 
> https://github.com/9rnsr/phobos/commit/dd0d4c139828013c34e76acc74884341f31db298#L0R1379
> 
>     struct IFR  // infinite forward range
>     {
>         enum empty = false;
>         @property front() { return 1; }
>         auto popFront() {}
> 
>         @property save() { return this; }
> 
>         auto opSlice(size_t b, size_t e) { return this.take(e - b); }
>         auto opSlice(size_t b, Infinity) { return this; }
>     }
> 
> IFR does not have 'length' primitive, but should be slicable like r[n .. $]; (see std.range.hasSlicing definition) But this enhancement cannot cover this - therefore user defined IFR should always define their own opDollar.

There is no intent to cover infinite forward ranges with default behavior. Again, the primary goal here should be, I think, to make code defining casual ranges easy and boilerplate-free. Infinite ranges or ranges that don't define length yet want to define slicing through to the end are comparatively rare. It is reasonable to request people defining those to define opDollar appropriately.

> So, I think the combination of std.range.opDollar and UFCS would be much better than compiler's implicit alias just only for 'length' primitive.

If we go for a library solution, we should define opDollar in object.d so it's available by default. It should look something like this (in the 1-dimensional case):

    auto ref opDollar(R)(auto ref R r) if (is(typeof(r.length))) {
        return r.length;
    }
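
For what it's worth, such a free function can already be exercised through explicit calls. The sketch below is only that: `Buf` is a made-up type, the return type is simplified to plain `auto`, and it deliberately does not write `$`, since whether `$` would find a free-function opDollar is exactly the open question here.

```d
// Free-function form of the idea above (return type simplified to `auto`).
auto opDollar(R)(auto ref R r) if (is(typeof(r.length)))
{
    return r.length;
}

// Hypothetical type that defines length and indexing but no opDollar member.
struct Buf
{
    int[] data;
    @property size_t length() const { return data.length; }
    int opIndex(size_t i) const { return data[i]; }
}

void main()
{
    auto b = Buf([1, 2, 3]);

    // Explicit UFCS call: resolves to the free function because Buf
    // has no member named opDollar.
    assert(b.opDollar() == 3);
    assert(b[b.opDollar() - 1] == 3);
}
```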

March 31, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #56 from Kenji Hara <k.hara.pg@gmail.com> 2013-03-31 10:21:20 PDT ---
(In reply to comment #55)
> @Kenji: I'm a bit unclear on your view on this. You mention you'd prefer a library solution but your pull request seems to go for a in-compiler solution.

As Steven already mentioned in comment#14, automatically forwarding from $ to 'length' would break user-defined containers.

That fact clearly shows that we cannot always regard $ as "length". It holds for built-in arrays and for the range concept, but it does not hold for associative arrays and some user-defined containers. So I think that this enhancement is biased toward the range concept.

And, as far as possible, the compiler should be neutral.

In other words, this is only reasonable for std.range users. Most D programmers would use std.range, but not all.

> There is no intent to cover infinite forward ranges with default behavior. Again, the primary goal here should be, I think, to make code defining casual ranges easy and boilerplate-free. Infinite ranges or ranges that don't define length yet want to define slicing through to the end are comparatively rare. It is reasonable to request people defining those to define opDollar appropriately.

I am not saying that an infinite range _should_ support r[n..$], only that it is possible.

> If we go for a library solution, we should define opDollar in object.d so it's available by default. It should look something like this (in the 1-dimensional case):
>
>     auto ref opDollar(R)(auto ref R r) if (is(typeof(r.length))) {
>         return r.length;
>     }

This is bad. druntime should not contain things that exist for the range concept. That is the job of std.range.

March 31, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #57 from Andrei Alexandrescu <andrei@erdani.com> 2013-03-31 11:04:45 PDT ---
(In reply to comment #56)
> (In reply to comment #55)
> > @Kenji: I'm a bit unclear on your view on this. You mention you'd prefer a library solution but your pull request seems to go for a in-compiler solution.
> 
> As Steven already mentioned in comment#14, automatically forwarding from $ to 'length' would break user-defined containers.

I don't think there's any breakage. My understanding is that those containers already need to define opDollar. If opDollar is defined there is no change in semantics.

> The fact clearly represents that we cannot always regard $ as "length". Indeed, it is true in built-in arrays and range concept, but isn't true in associative arrays and some user-defined containers. So I think that this enhancement has a bias toward range concept.

I'd have quite a bit of difficulty agreeing with this. The notion that "$" is a synonym for "length" predates ranges and has been there for strings and arrays ever since they were defined. On the contrary, I'd argue that notions such as length-less ranges and infinite ranges contributed to the notion that opDollar may be (rarely) something distinct from length.

> And, as far as possible, compiler should be a neutral.

That shouldn't be done to a fault, either. The D language is not neutral on user-defined expr++ vs. ++expr; it forces both to have similar semantics to their built-in counterparts. In contrast, the C++ language allows defining ++expr and expr++ with different semantics. But wait, it's worse: (a) C++ forces a net loss of efficiency for expr++ barring heroic optimization efforts that are not generally applicable, and (b) C++ requires actual boilerplate to ensure both variants work. I think it is plain that C++ made the wrong decision and D learned from it and made the right decision.

Similarly, we should not require boilerplate for $ just to convince it to do what it's always done for arrays and strings. Associative arrays and user-defined containers are free to disable it or define it, depending on what's most useful for them.
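
The expr++ vs. ++expr point can be checked with a small sketch. `Counter` is a made-up type; the only language fact relied on is that a user-defined type writes a single opUnary!"++" and the compiler derives the postfix form from it.

```d
// Hypothetical type with a single user-defined increment operator.
struct Counter
{
    int n;

    // Only the prefix form is written by hand.
    ref Counter opUnary(string op : "++")()
    {
        ++n;
        return this;
    }
}

void main()
{
    Counter c;

    // Postfix form: the compiler copies c, applies ++, and yields the copy.
    auto old = c++;
    assert(old.n == 0 && c.n == 1);

    // Prefix form: calls opUnary!"++" directly.
    auto cur = ++c;
    assert(cur.n == 2 && c.n == 2);
}
```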

> In other words, this is just reasonable for std.range users. Most of D programmers would use std.range, but not all.

Again, I find it very difficult to agree with this. $ has always been length wherever meaningful - long before ranges came up.

> I don't mention that infinite range _should_ have r[n..$], rather mention that it is possible.

Yes, and defaulting $ to length does not prevent that. Again, if a range does define opDollar, that will always be chosen.

March 31, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #58 from Kenji Hara <k.hara.pg@gmail.com> 2013-03-31 12:09:31 PDT ---
(In reply to comment #57)
> I don't think there's any breakage. My understanding is that those containers already need to define opDollar. If opDollar is defined there is no change in semantics.

No. This enhancement will suddenly turn "correctly invalid" code into "accepts-invalid" code.

    SparseArray a;  // SparseArray does not have opDollar,
                    // because it is unnecessary now.

    auto n = a.length;  // actually contains element count, or maximum index number
    a[$-1];  // now: Error: undefined identifier __dollar
             // after: incorrectly translated to a[a.length-1];

In the end, SparseArray's author would have to add @disable opDollar().
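
A concrete version of that scenario, as a sketch only: this `SparseArray` is a stand-in written for illustration (backed by a built-in associative array, with `slots` as a made-up field name), and it assumes a compiler without this enhancement.

```d
// Hypothetical sparse container: `length` is the number of stored
// elements, not "last index + 1".
struct SparseArray
{
    int[size_t] slots;

    @property size_t length() const { return slots.length; }
    int opIndex(size_t i) const { return slots[i]; }
    void opIndexAssign(int value, size_t i) { slots[i] = value; }
}

void main()
{
    SparseArray a;
    a[10] = 1;
    a[1000] = 2;

    assert(a.length == 2);

    // a.length - 1 == 1, and nothing is stored at index 1, so
    // a[a.length - 1] would not refer to the last element.
    assert(((a.length - 1) in a.slots) is null);

    // Without opDollar (and without the enhancement), `$` is rejected.
    static assert(!__traits(compiles, a[$ - 1]));
}
```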

> > The fact clearly represents that we cannot always regard $ as "length". Indeed, it is true in built-in arrays and range concept, but isn't true in associative arrays and some user-defined containers. So I think that this enhancement has a bias toward range concept.
> 
> I'd have quite a bit of difficulty agreeing with this. The notion that "$" is a synonym for "length" predates ranges and has been there for strings and arrays ever since they were defined.

"$" is a synonym for "length" in D - yes. But, **in general**, "$" is not a synonym for "length". The difference is important.

The advantage of the UFCS and std.range.opDollar approach is that it is "opt-in" for the meaning of "$". It does not change the meaning of existing code. (Again, "change" == "currently invalid/meaningless code becomes accepted, potentially and unintentionally".)

On the other hand, your compiler approach is "opt-out". It moves boilerplate code from every range definition to the SparseArray definition. This is unacceptable to me.

> > And, as far as possible, compiler should be a neutral.
> 
> That shouldn't be done to a fault, either. The D language is not neutral on user-defined expr++ vs. ++expr; it forces both to have similar semantics to their built-in counterparts. In contrast, the C++ language allows defining ++expr and expr++ with different semantics. But wait, it's worse: (a) C++ forces a net loss of efficiency for expr++ barring heroic optimization efforts that are not generally applicable, and (b) C++ requires actual boilerplate to ensure both variants work. I think it is plain that C++ made the wrong decision and D learned from it and made the right decision.

It is an entirely different thing.

> Similarly, we should not require boilerplate for $ just to convince it to do what it's always done for arrays and strings. Associative arrays and user-defined containers are free to disable it or define it, depending on what's most useful for them.

Why do you force other container authors to disable opDollar? It is equivalent to forcing all range authors to write "alias opDollar = length;". I cannot see any difference there. If you favor the former, I can say it is definitely a kind of bias.

April 01, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=7177



--- Comment #59 from Andrei Alexandrescu <andrei@erdani.com> 2013-03-31 17:30:22 PDT ---
(In reply to comment #58)
> (In reply to comment #57)
> > I don't think there's any breakage. My understanding is that those containers already need to define opDollar. If opDollar is defined there is no change in semantics.
> 
> No. This enhancement will suddenly change a "correctly invalid" code to "accepts-invalid".
> 
> SparseArray a;  // SparseArray does not have opDollar,
> // because it is unnecessary now.
> 
> auto n = a.length;  // actually contains element count, or maximum index number
> a[$-1];  // now: Error: undefined identifier __dollar
>          // after: incorrectly translated to a[a.length-1];
> 
> Finally SparseArray's author should add @disable opDollar().

Understood. I would argue that the behavior changes from "invalid" to "valid and correct". I think it would be hard to find one D programmer who'd write a[$ - 1] and expect it to be anything other than a[a.length - 1]. This behavior has been built in forever.

Whether SparseArray should support operator [] and length are separate questions. If it does, and if anyone ever writes a[$ - 1], that should be correct code and there is no other possible semantics than a[a.length - 1].

> > I'd have quite a bit of difficulty agreeing with this. The notion that "$" is a synonym for "length" predates ranges and has been there for strings and arrays ever since they were defined.
> 
> "$" is a synonym for "length" in D - yes. But, **in general**, "$" is not a synonym for "length". The difference is important.

Agreed.

> The advantage of UFCS and std.range.opDollar approach is that is "opt-in" for the "$" meaning. It does not change the meaning of current existing code. (Again, "change" == "currently invalid/meaningless code will be changed to acceptable, potentially and unintendedly")

I'd say the change is to a behavior that has a null surprise factor. If anyone writes a[$ - 1] or whatnot, it is clear what they meant and what they expect.

> On the other hand, your compiler approach is "opt-out". It moves boilerplate code from all range definition to SparseArray definition. This is unacceptable to me.

Let me submit for your consideration that (a) it's undecided whether that's bad for SparseArray, and (b) the vast majority of ranges simply want opDollar to be the same as length. This is the case for all of Phobos, and this bug originated in the wake of a large diff that added very many "alias length opDollar;" declarations to virtually all ranges that support [] and length. It's just what everybody expects. I would argue we can't afford to make everybody pay for the sake of a rare occurrence.
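
For reference, this is roughly the boilerplate in question. `Window` is a made-up range used only for illustration, and the alias is written in the newer "alias opDollar = length;" form.

```d
// Hypothetical random-access-style range over an int[].
struct Window
{
    int[] data;

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    @property size_t length() const { return data.length; }

    // The line every such range currently has to repeat:
    alias opDollar = length;

    int opIndex(size_t i) const { return data[i]; }
    Window opSlice(size_t b, size_t e) { return Window(data[b .. e]); }
}

void main()
{
    auto w = Window([1, 2, 3, 4]);

    assert(w[$ - 1] == 4);          // `$` forwards to length via the alias
    assert(w[1 .. $].length == 3);  // same inside a slice expression
}
```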

> > > And, as far as possible, compiler should be a neutral.
> > 
> > That shouldn't be done to a fault, either. The D language is not neutral on user-defined expr++ vs. ++expr; it forces both to have similar semantics to their built-in counterparts. In contrast, the C++ language allows defining ++expr and expr++ with different semantics. But wait, it's worse: (a) C++ forces a net loss of efficiency for expr++ barring heroic optimization efforts that are not generally applicable, and (b) C++ requires actual boilerplate to ensure both variants work. I think it is plain that C++ made the wrong decision and D learned from it and made the right decision.
> 
> It is entirely different thing.

I agree it is a different thing. I was replying to the part with "As far as possible, compiler should be neutral" by simply showing that sometimes being neutral is exactly the wrong thing to do.

> > Similarly, we should not require boilerplate for $ just to convince it to do what it's always done for arrays and strings. Associative arrays and user-defined containers are free to disable it or define it, depending on what's most useful for them.
> 
> Why you enforce disabling opDollar to other container authors?

I think most container authors will benefit from that behavior.

> It is equivalent
> to enforce writing "alias opDollar = length;` to all range authors.
> I cannot see any difference there. If you favor the former, I can say it is
> definitely a kind of bias.

Yes, there is a bias - frequency. My core argument is that the vast majority of ranges simply want $ and length to mean the same thing, and very few need to have them mean different things.
