June 06, 2013
On 6/6/2013 9:50 AM, Dylan Knutson wrote:
> Well, it comes down to are we willing to marginally break code for the sake of a
> better API. D and Phobos aren't considered stable by any standard; I don't think
> we should treat them like they're set in stone. Also, deprecation gives
> developers plenty of time to update their code (if they have to at all).

I don't believe that because we broke A, therefore it's ok to break B.

And secondly, it isn't clear that Path is a better API.

I'm not opposed to breakage in all cases. But there needs to be a big win to justify it. I'm not seeing even a small net win for Path types. I'm not talking hypothetically either; as I said, I've tried them several times.

> Projects such as Dub, Vibe, and to an extent Tango disagree.

I agree there's a strong temptation to create a Path object, and I've succumbed to it myself several times. A corollary is that people often wanted to create a String class, too, though that has died out.

You might also consider David Nadlinger's counter example:

"As another data point (which may or may not be relevant for the discussion here), the LLVM system/support library was initially based on Path objects, but recently has been rewritten to use raw strings: http://llvm.org/docs/doxygen/html/namespacellvm_1_1sys_1_1path.html"

I've rewritten my Path code to go back to raw strings, too.
June 06, 2013
On Thursday, 6 June 2013 at 17:28:56 UTC, Steven Schveighoffer wrote:
> On Thu, 06 Jun 2013 13:25:56 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:
>
>> On Thursday, 6 June 2013 at 17:13:10 UTC, Steven Schveighoffer wrote:
>>> On Thu, 06 Jun 2013 12:14:30 -0400, Dylan Knutson <tcdknutson@gmail.com> wrote:
>>>
>>>> It doesn't do any allocations that the user won't have to do anyways. Paths have to be normalized before comparison; not doing so isn't correct behavior. Eg, the strings `foo/../bar` != `bar`, yet they're equivalent paths. Path encapsulates the behavior. So it's the difference between
>>>>
>>>> buildNormalizedPath(s1) == buildNormalizedPath(s2);
>>>>
>>>> and
>>>>
>>>> p1 == p2;
>>>
>>> This can be done without allocations.
>>
>> I know.  There are a few additions that I've been planning to make for std.path for the longest time, I just haven't found the time to do so yet.  Specifically, I want to add a couple of functions that deal with ranges of path segments rather than full path strings.
>>
>> The first one is a lazy "path normaliser":
>>
>>   assert (equal(pathNormalizer(["foo", "bar", "..", "baz"]),
>>                 ["foo", "baz"]));
>>
>> With this, non-allocating path comparison is easy.  The verbose version of p1 == p2, which could be wrapped for convenience, is then:
>>
>>   equal(pathNormalizer(pathSplitter(p1)),
>>         pathNormalizer(pathSplitter(p2)))
>>
>> You can also use filenameCmp() as a predicate to equal() to make the comparison case-insensitive on OSes where this is expected.  Very general and composable, and easily wrappable.
>
> Great!  I'd highly suggest pathEqual which takes two ranges of dchar and does the composition and OS-specific comparison for you.

They don't have to be dchar if all the building blocks are templates (as the existing ones are):

bool pathEqual(CaseSensitive cs = CaseSensitive.osDefault, C1, C2)
              (const(C1)[] p1, const(C2)[] p2)
    if (isSomeChar!C1 && isSomeChar!C2)
{
    return equal!((a, b) => filenameCharCmp!cs(a, b) == 0)
                 (pathNormalizer(pathSplitter(p1)),
                  pathNormalizer(pathSplitter(p2)));
}
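
For readers following along: pathNormalizer above is a proposed function, not something that exists in std.path. A minimal sketch of the same idea, assuming a Phobos release that ships std.path.asNormalizedPath (a lazy normalizer added after this thread), might look like the following. The name pathsEquivalent is hypothetical, and the comparison works per code unit, so case-insensitive matching of non-ASCII names is not handled correctly.

```d
import std.algorithm.comparison : equal;
import std.path : asNormalizedPath, filenameCharCmp, CaseSensitive;

/// Hypothetical helper (not part of Phobos): compares two paths after
/// lazy normalization, without building normalized strings first.
bool pathsEquivalent(CaseSensitive cs = CaseSensitive.osDefault)
                    (const(char)[] p1, const(char)[] p2)
{
    // asNormalizedPath yields the normalized path one code unit at a time;
    // filenameCharCmp applies the OS-specific rules to each pair.
    return equal!((a, b) => filenameCharCmp!cs(a, b) == 0)
                 (asNormalizedPath(p1), asNormalizedPath(p2));
}

unittest
{
    assert(pathsEquivalent("foo/../bar", "bar"));
    assert(pathsEquivalent("foo/./bar", "foo/bar"));
    assert(!pathsEquivalent("foo/bar", "bar"));
}
```

Like the string-based functions, this only looks at the text of the paths; it never touches the filesystem.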
June 06, 2013
On 6/6/2013 8:57 AM, Lars T. Kyllingstad wrote:
> On Thursday, 6 June 2013 at 15:41:51 UTC, Dylan Knutson wrote:
>> FWIW, having Path be an object adds consistency with the rest of Phobos, which
>> has many entities which could be expressed as primitives, expressed as
>> objects. To name a few, DateTime is an object, File is an object, and DirEntry
>> is an object. Yes, they could be described as integers, or a pointer, or a
>> string, but it's less cognitive load on the developer to recognize them as
>> separate types.
>
> "Reducing cognitive load" is not the main reason these are objects.  DateTime
> lumps together no less than six integers. File adds automatic resource
> management via reference counting. DirEntry caches file information to avoid
> repeated filesystem lookups.  And so on.

It's hard to see what value there is in a type that is simply a wrapper around an existing type, and which provides implicit conversions to/from that existing type so that they can be intermixed arbitrarily.

At the end, that's nothing more than:

   alias string Path;
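
To illustrate the point, here is a small sketch (countSeparators is a hypothetical function, made up for this example) showing that an alias adds no type distinction: the compiler treats Path and string as the same type, so the two mix freely.

```d
alias Path = string;   // newer spelling of `alias string Path;`

// The alias introduces no new type at all.
static assert(is(Path == string));

/// Hypothetical function declared in terms of Path.
size_t countSeparators(Path p)
{
    size_t n;
    foreach (char c; p)
        if (c == '/') ++n;
    return n;
}

unittest
{
    string s = "foo/bar/baz";
    Path p = s;                        // no conversion needed
    assert(countSeparators(s) == 2);   // a plain string is accepted
    assert(countSeparators(p) == 2);
}
```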
June 06, 2013
On Thursday, June 06, 2013 19:25:56 Lars T. Kyllingstad wrote:
> I know. There are a few additions that I've been planning to make for std.path for the longest time, I just haven't found the time to do so yet. Specifically, I want to add a couple of functions that deal with ranges of path segments rather than full path strings.

Another thing to consider is overloads of some of the functions which take an output range as their first argument. There has been an increased push lately to cut down on GC allocations in Phobos, so we're probably going to start having more functions overloaded to work with output ranges, giving the folks who want to avoid the GC more control. That's similar to how we have the overload of toString that takes a delegate (though outside of classes, since we can templatize stuff, an output range is more flexible than a delegate, and a delegate apparently does qualify as an output range).

- Jonathan M Davis
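
As a rough sketch of what such an overload could look like (writeJoinedPath is a hypothetical name, not an existing or planned std.path function, and it always joins with '/' for simplicity), the function below writes into whatever output range the caller supplies, so the caller decides where the memory comes from:

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;

/// Hypothetical output-range overload: writes the segments, joined by '/',
/// into the caller-supplied sink instead of returning a new string.
void writeJoinedPath(Sink)(ref Sink sink, const(char[])[] segments)
    if (isOutputRange!(Sink, char))
{
    foreach (i, seg; segments)
    {
        if (i > 0)
            put(sink, '/');
        put(sink, seg);
    }
}

unittest
{
    // An Appender is used here for brevity; any output range of char would do.
    auto app = appender!string();
    writeJoinedPath(app, ["foo", "bar", "baz.d"]);
    assert(app.data == "foo/bar/baz.d");
}
```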
June 06, 2013
On 6/6/13 12:50 PM, Dylan Knutson wrote:
> Well, it comes down to are we willing to marginally break code for the
> sake of a better API.

Well the position of "marginally" in the sentence above may be contested by some.

> D and Phobos aren't considered stable by any
> standard; I don't think we should treat them like they're set in stone.
> Also, deprecation gives developers plenty of time to update their code
> (if they have to at all).

I think this opinion is very unlikely to enjoy popularity. We actively /want/ to make Phobos more stable, so using the argument that it's not yet stable to add more instability is sure to fit the pattern of some list of fallacies. Besides, the corresponding benefits (the best solid argument that could be constructed) are, at least according to some, not large enough to justify the cost of breakage.


Andrei
June 06, 2013
On 6/6/13 1:04 PM, Lars T. Kyllingstad wrote:
> On Thursday, 6 June 2013 at 16:03:15 UTC, Andrei Alexandrescu wrote:
>> [...]
>>
>>> 8. There really isn't any such thing as a portable path representation.
>>> It's more than just \ vs /. There are the drive prefixes in Windows that
>>> have no analog in Linux. Sometimes case matters in Linux, where it would
>>> be ignored under Windows. There are 8.3 issues sometimes. The only thing
>>> you can do is come up with a subset of what works across systems, and
>>> then of course you have to go back to using strings when you need to
>>> access D:\foo\abc.c
>>
>> That is actually an argument in favor of good encapsulation, not against.
>
> The proposed API change does not introduce good encapsulation. It
> introduces a super-thin wrapper around a built-in type, and replaces
> free functions with methods, for what gain?

I was talking in principle. I agree that the argument "it was as easy as wrapping the already existing functions" works against the current proposal, not in favor of it.

Andrei
June 06, 2013
On 6/6/13 1:13 PM, Steven Schveighoffer wrote:
>> buildNormalizedPath(s1) == buildNormalizedPath(s2);
>>
>> and
>>
>> p1 == p2;
>
> This can be done without allocations.

Interesting. "Show me the code!"

Andrei
June 06, 2013
On Thu, 06 Jun 2013 13:40:37 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> On Thursday, 6 June 2013 at 17:28:56 UTC, Steven Schveighoffer wrote:

>> Great!  I'd highly suggest pathEqual which takes two ranges of dchar and does the composition and OS-specific comparison for you.
>
> They don't have to be dchar if all the building blocks are templates (as the existing ones are):
>
> bool pathEqual(CaseSensitive cs = CaseSensitive.osDefault, C1, C2)
>                (const(C1)[] p1, const(C2)[] p2)
>      if (isSomeChar!C1 && isSomeChar!C2)

Actually, all string variants are dchar ranges :)  And your solution is less general: dchar ranges don't have to be arrays.

However, I don't think in practice there are any real non-array dchar ranges...

One thing your version does do is explicitly say the parameters are const, which you couldn't do with a non-array dchar range.

-Steve
June 06, 2013
On 6/6/2013 9:14 AM, Dylan Knutson wrote:
> It doesn't do any allocations that the user won't have to do anyways. Paths have
> to be normalized before comparison; not doing so isn't correct behavior. Eg, the
> strings `foo/../bar` != `bar`, yet they're equivalent paths. Path encapsulates
> the behavior. So it's the difference between
>
> buildNormalizedPath(s1) == buildNormalizedPath(s2);
>
> and
>
> p1 == p2;

I believe it is a mistake to try and automatically hide the difference between ./bar and bar. Paths being == and 'referring to the same file' are different things.

For example, what about symlinks?

Also, for performance reasons, I'd want to normalize sometime after building the entire path; I wouldn't want to normalize at each step. Normalization should be an explicit step, not implicit.
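
A short sketch of what "explicit" means with the existing string-based std.path functions (sameAfterNormalizing is a hypothetical helper name): plain == compares the text as written, and normalization happens only where the caller asks for it, here once after the whole path has been built. Neither comparison resolves symlinks.

```d
import std.path : buildNormalizedPath, buildPath;

/// Hypothetical helper: textual equality after an explicit normalization step.
bool sameAfterNormalizing(string a, string b)
{
    return buildNormalizedPath(a) == buildNormalizedPath(b);
}

unittest
{
    // As strings, these differ; == does not hide that.
    assert("./bar" != "bar");

    // The caller opts in to normalization explicitly.
    assert(sameAfterNormalizing("./bar", "bar"));

    // Build the whole path first, then normalize once at the end.
    auto built = buildPath("foo", "..", "bar");
    assert(buildNormalizedPath(built) == "bar");
}
```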
June 06, 2013
On Thursday, June 06, 2013 10:37:27 Walter Bright wrote:
> On 6/6/2013 9:50 AM, Dylan Knutson wrote:
> > Well, it comes down to are we willing to marginally break code for the sake of a better API. D and Phobos aren't considered stable by any standard; I don't think we should treat them like they're set in stone. Also, deprecation gives developers plenty of time to update their code (if they have to at all).
> I don't believe that because we broke A, therefore it's ok to break B.
> 
> And secondly, it isn't clear that Path is a better API.
> 
> I'm not opposed to breakage in all cases. But there needs to be a big win to justify it. I'm not seeing even a small net win for Path types. I'm not talking hypothetically either; as I said, I've tried them several times.

Some modules have needed to be redone. Some still do. But we already _did_ rework std.path. We agreed that we liked the new API, and it's been working great. It's one thing to revisit an API that's been around since before we had ranges or a review process. It's an entirely different thing to be constantly reworking entire modules. I think that we need _very_ strong justification to redesign a module that we already put through the review process. And I really don't think that we have it here.

- Jonathan M Davis