More complexity creep in Phobos
March 27, 2019
I started an alphabetical search through std modules, and as luck would have it I got as far as the first module: std/algorithm/comparison.d. In there, there's these overloads:

size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
    (Range1 s, Range2 t)
if (isForwardRange!(Range1) && isForwardRange!(Range2));

size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
    (auto ref Range1 s, auto ref Range2 t)
if (isConvertibleToString!Range1 || isConvertibleToString!Range2)

(similar for levenshteinDistanceAndPath)

What's with the second overload nonsense? The Levenshtein algorithm works on forward ranges. And that's about it, in a profound sense: Platonically what the algorithm needs is two forward ranges to operate. (We ought to be pretty proud of it, too: in all other languages I looked at, it's misimplemented to require random access.)

The second overload comes from a warped sense of DWIM: well if this type converts to a string, we commit to support that too. Really. How about a struct that converts to an array? Should we put in a pull request to that, too? Where do we even stop?

I hope there's not much more of this nonsense, because it all should be deprecated with fire.
March 28, 2019
On Thursday, 28 March 2019 at 02:17:35 UTC, Andrei Alexandrescu wrote:
> I started an alphabetical search through std modules, and as luck would have it I got as far as the first module: std/algorithm/comparison.d. In there, there's these overloads:
>
> size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
>     (Range1 s, Range2 t)
> if (isForwardRange!(Range1) && isForwardRange!(Range2));
>
> size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
>     (auto ref Range1 s, auto ref Range2 t)
> if (isConvertibleToString!Range1 || isConvertibleToString!Range2)
>
> (similar for levenshteinDistanceAndPath)
>
> What's with the second overload nonsense? The Levenshtein algorithm works on forward ranges. And that's about it, in a profound sense: Platonically what the algorithm needs is two forward ranges to operate. (We ought to be pretty proud of it, too: in all other languages I looked at, it's misimplemented to require random access.)
>
> The second overload comes from a warped sense of DWIM: well if this type converts to a string, we commit to support that too. Really. How about a struct that converts to an array? Should we put in a pull request to that, too? Where do we even stop?

See https://github.com/dlang/phobos/pull/3770 for the historical reason.

> I hope there's not much more of this nonsense, because it all should be deprecated with fire.

Then we would have to deprecate more than half of Phobos, because a lot of similar cruft has accumulated over the last ten years. Much of it can't be deprecated as easily as this example...

March 28, 2019
On Thursday, 28 March 2019 at 02:17:35 UTC, Andrei Alexandrescu wrote:
> I started an alphabetical search through std modules, and as luck would have it I got as far as the first module: std/algorithm/comparison.d. In there, there's these overloads:
>
> size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
>     (Range1 s, Range2 t)
> if (isForwardRange!(Range1) && isForwardRange!(Range2));
>
> size_t levenshteinDistance(alias equals = (a,b) => a == b, Range1, Range2)
>     (auto ref Range1 s, auto ref Range2 t)
> if (isConvertibleToString!Range1 || isConvertibleToString!Range2)
>
> (similar for levenshteinDistanceAndPath)
>
> What's with the second overload nonsense? The Levenshtein algorithm works on forward ranges. And that's about it, in a profound sense: Platonically what the algorithm needs is two forward ranges to operate. (We ought to be pretty proud of it, too: in all other languages I looked at, it's misimplemented to require random access.)
>
> The second overload comes from a warped sense of DWIM: well if this type converts to a string, we commit to support that too. Really. How about a struct that converts to an array? Should we put in a pull request to that, too? Where do we even stop?
>
> I hope there's not much more of this nonsense, because it all should be deprecated with fire.

Maybe the implementation of the fix is not ideal, but the reason it was added in the first place is valid, IMO. Looking at the original issue[1], the minimized example is as follows:

void popFront(T)(ref T[] a) { a = a[1..$]; }

enum bool isInputRange(R) = is(typeof(
{
    R r;
    r.popFront();
}));

struct DirEntry
{
    @property string name() { return ""; }
    alias name this;
}
pragma(msg, isInputRange!DirEntry); // prints 'false'
pragma(msg, isInputRange!(typeof(DirEntry.init.name))); // prints 'true'

bool isDir(R)(R r) if (isInputRange!R) { return true; }

void main()
{
    DirEntry de;
    bool c = isDir(de); // Error: isDir cannot deduce function from argument types !()(DirEntry)
}

And trying this code out on dmd-nightly[2], it still fails today. Personally, I think it would be a bad thing if code that uses a DirEntry like above stopped compiling, but I agree that the fix could be implemented differently.
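For illustration only, here is a rough sketch of the general shape such a forwarding overload can take. Entry is a hypothetical stand-in for DirEntry, countSlashes is a made-up function, and the `is(R : string)` constraint is a simplification I am assuming here in place of the isConvertibleToString check that the Phobos signatures use:

```d
import std.range.primitives : isInputRange;

// Hypothetical stand-in for DirEntry: converts to string through alias this.
struct Entry
{
    private string path;
    @property string name() const { return path; }
    alias name this;
}

// Primary overload: accepts genuine input ranges only.
size_t countSlashes(R)(R r)
if (isInputRange!R)
{
    size_t n;
    foreach (c; r)
        if (c == '/') ++n;
    return n;
}

// Forwarding overload: accepts non-range types that convert to string,
// performs the conversion, and calls the primary overload.
size_t countSlashes(R)(auto ref R r)
if (!isInputRange!R && is(R : string))
{
    string s = r; // implicit conversion through alias this
    return countSlashes(s);
}

void main()
{
    assert(countSlashes("a/b/c") == 2);            // plain string, primary overload
    assert(countSlashes(Entry("/usr/lib")) == 2);  // Entry, forwarding overload
}
```

Every function that wants to keep accepting such types needs its own copy of the second overload, which is where the extra surface area comes from.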

It seems like if you see some weird code in Phobos but you don't understand why it was written that way, there are 3 main reasons that account for 99% of these cases:

- string (autodecoding)
- alias this
- enums

1. https://issues.dlang.org/show_bug.cgi?id=15027
2. https://run.dlang.io/is/zMtHs0
March 27, 2019
On 3/27/19 10:45 PM, Seb wrote:
> See https://github.com/dlang/phobos/pull/3770 for the historical reason.

Thanks. Yep, the proverbial good intentions paving the road to hell.

>> I hope there's not much more of this nonsense, because it all should be deprecated with fire.
> 
> Then we would have to deprecate more than half of Phobos, because a lot of similar cruft has accumulated over the last ten years. Much of it can't be deprecated as easily as this example...

"More than half" would be an exaggeration and as such of limited usefulness. I looked at a few more modules and they're in better shape.
March 28, 2019
On 3/27/19 11:22 PM, Meta wrote:
> It seems like if you see some weird code in Phobos but you don't understand why it was written that way, there are 3 main reasons that account for 99% of these cases:
> 
> - string (autodecoding)
> - alias this
> - enums

"Mistakes made by people" is missing from that list, which includes mine, of course.

There's this nice notion of reasoning by first principles vs. reasoning by analogy:

https://fs.blog/2018/04/first-principles/

A very nice essay - recommended outside this discussion's context, too. The levenshteinDistance issue is a direct application of the two kinds of reasoning. This is reasoning by analogy:

"DirEntry converts to string and is used by some people as such. They pass it to functions, and some accept it but some don't. It follows by analogy that the Levenshtein distance algorithm should be worked out to accept it, too."

In contrast, the reasoning from first principles should powerfully override that:

"Levenshtein distance operates on forward ranges. All that stuff that changes the signature of levenshteinDistance to accept other artifacts is nonsense. Whoever wants to use it should carry the conversion to forward range themselves."

It follows that the correct answer to this generalization frenzy should be: "We work with ranges and character types. If you've got enums and alias this and whatnot, more power to you but to use the standard library you must convert those to the stuff we support."
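As a minimal sketch of what carrying the conversion at the call site could look like (Entry is a hypothetical stand-in for DirEntry, not Phobos code):

```d
import std.algorithm.comparison : levenshteinDistance;

// Hypothetical stand-in for DirEntry: converts to string through alias this.
struct Entry
{
    private string path;
    @property string name() const { return path; }
    alias name this;
}

void main()
{
    auto a = Entry("kitten");
    auto b = Entry("sitting");

    // The caller converts explicitly, so only the forward-range
    // overload of levenshteinDistance is involved.
    size_t d = levenshteinDistance(a.name, b.name);
    assert(d == 3);
}
```

The cost to the user is one `.name` per argument; the library signature stays at "two forward ranges".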
March 28, 2019
On Thursday, 28 March 2019 at 04:20:55 UTC, Andrei Alexandrescu wrote:
> There's this nice notion of reasoning by first principles vs. reasoning by analogy:
>
> https://fs.blog/2018/04/first-principles/
>
> A very nice essay - recommended outside this discussion's context, too. The levenshteinDistance issue is a direct application of the two kinds of reasoning. This is reasoning by analogy:
>
> "DirEntry converts to string and is used by some people as such. They pass it to functions, and some accept it but some don't. It follows by analogy that the Levenshtein distance algorithm should be worked out to accept it, too."
>
> In contrast, the reasoning from first principles should powerfully override that:
>
> "Levenshtein distance operates on forward ranges. All that stuff that changes the signature of levenshteinDistance to accept other artifacts is nonsense. Whoever wants to use it should carry the conversion to forward range themselves."
>
> It follows that the correct answer to this generalization frenzy should be: "We work with ranges and character types. If you've got enums and alias this and whatnot, more power to you but to use the standard library you must convert those to the stuff we support."

I agree with your first-principles arguments, but not your conclusion. I believe you are starting with the faulty base assumption that DirEntry is not a range. DirEntry IS a range, by the rules of the current language.

DirEntry states that it is a subtype of `string` by declaring `alias name this`. It follows that a DirEntry may be transparently substituted wherever a string is accepted. As a string is a range, and DirEntry is a subtype of string, then DirEntry is also a range.

`levenshteinDistance` accepts two ranges as its arguments, either of which may be strings; therefore, either of its arguments may also be a DirEntry substituted in the place of a string. By writ, we must support passing a DirEntry to levenshteinDistance.



The real problem here is with `alias this`.

`alias this` claims to allow one type to become a subtype of another, but that's not true; this language feature violates the Liskov Substitution Principle (a fact that I, and likely others, have mentioned before). `alias this` fails the substitutability test - this thread and the defect linked are proof of that.

Because `alias this` claims to allow one type A to subtype another type B, but does not actually make good on that promise and implements this subtyping improperly, it follows that any code working with type B that wants to also support type A has to use these ugly workarounds.

The code in question is working around a defect in the language, and would not be necessary if the language itself were fixed.
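A small sketch of the substitutability gap, under the assumption that the failure comes from `popFront` taking its argument by `ref` (Entry, byteLength and dropFirst are hypothetical names used only for illustration):

```d
// Hypothetical stand-in for DirEntry: converts to string through alias this.
struct Entry
{
    private string path;
    @property string name() const { return path; }
    alias name this;
}

// Takes a string by value: an Entry can be substituted here.
size_t byteLength(string s) { return s.length; }

// Takes a string by ref, like popFront does for arrays.
void dropFirst(ref string s) { s = s[1 .. $]; }

void main()
{
    auto e = Entry("abc");

    // Substitution works where a string is passed by value.
    assert(byteLength(e) == 3);

    // Substitution fails where a string lvalue is required: `name`
    // returns an rvalue, which cannot bind to a ref parameter.
    static assert(!__traits(compiles, dropFirst(e)));

    // A real string works in both places.
    string s = e;
    dropFirst(s);
    assert(s == "bc");
}
```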

March 28, 2019
On 3/28/19 1:36 AM, Meta wrote:
> DirEntry states that it is a subtype of `string`

No, please refer to the "Generality creep" thread.
March 28, 2019
On 3/28/2019 12:32 AM, Andrei Alexandrescu wrote:
> No, please refer to the "Generality creep" thread.

https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html
March 28, 2019
On Thursday, 28 March 2019 at 07:32:33 UTC, Andrei Alexandrescu wrote:
> On 3/28/19 1:36 AM, Meta wrote:
>> DirEntry states that it is a subtype of `string`
>
> No, please refer to the "Generality creep" thread.

DirEntry is implicitly convertible to string because it needs to be usable with std.file, which is a big untyped ball of strings.
Compare with scriptlike (https://github.com/abscissa/scriptlike#filepaths), which has typed wrappers for paths (though I prefer separate wrappers for files and folders).
March 28, 2019
BTW the intended title was "More generality creep in Phobos". Turns out generality creep begets complexity creep, too...