Jump to page: 1 2
Thread overview
[phobos] RFC: std.path
Jun 16, 2011
Jimmy Cao
Jun 14, 2011
Rainer Schuetze
Jun 14, 2011
Jimmy Cao
Jun 15, 2011
torhu
June 08, 2011
As you may already know, I've had a new version of std.path in the works for a while.  It is ready now, and I am waiting for my turn in the review queue in the main NG.  In the meantime, I just thought I'd run it by this mailing list to iron out the worst wrinkles.

So, if you feel like it, please comment.

Code: https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

Docs: http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html


-Lars


June 12, 2011
On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> As you may already know, I've had a new version of std.path in the works for a while. ?It is ready now, and I am waiting for my turn in the review queue in the main NG. ?In the meantime, I just thought I'd run it by this mailing list to iron out the worst wrinkles.
>
> So, if you feel like it, please comment.
>
> Code: https://github.com/kyllingstad/phobos/blob/std-path/std/path.d
>
> Docs: http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html
>
>
> -Lars
>

Looks good. Some minor comments:

1) At many places, the code seems to be indexing into char and wchar
arrays which have variable size encoding is this an oversight?
2) There is still some code that is not "templetized" and only accepts
one parameter type. Was this on purpose?
June 12, 2011
On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> > As you may already know, I've had a new version of std.path in the works for a while.  It is ready now, and I am waiting for my turn in the review queue in the main NG.  In the meantime, I just thought I'd run it by this mailing list to iron out the worst wrinkles.
> >
> > So, if you feel like it, please comment.
> >
> > Code: https://github.com/kyllingstad/phobos/blob/std-path/std/path.d
> >
> > Docs: http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html
> >
> >
> > -Lars
> >
> 
> Looks good. Some minor comments:
> 
> 1) At many places, the code seems to be indexing into char and wchar arrays which have variable size encoding is this an oversight?

As long as you are searching for a specific character, and that character only consists of one code unit, it is fine.  For instance, '/' will never appear as the second, third or fourth code unit of any UTF-8 encoded character.


> 2) There is still some code that is not "templetized" and only accepts one parameter type. Was this on purpose?

It is on purpose, but it can of course be argued whether my reasons are good enough. ;)  There are four functions which only accept UTF-8 strings:

- glob() (formerly fnmatch())
- expandTilde()

These functions are from the old std.path, and I haven't made any changes to them in my version.

- toAbsolute()
- toCanonical()

These functions call std.file.getcwd(), which returns a char array.  If you want any other width, some transcoding must happen, and that may as well be up to the user.  (The same reasoning applies to expandTilde(), by the way, as that uses the HOME environment variable, and environment variables are assumed to be UTF-8 encoded in Phobos.)

-Lars

June 12, 2011
On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
>> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
> These functions are from the old std.path, and I haven't made any changes to them in my version.
>
> - toAbsolute()
> - toCanonical()
>

In the comments where you say that it doesn't perform any IO you should add these functions. Speaking of which can we add a template called normalize (maybe you can come up with a better name) that does what canonical does but doesn't make it absolute. E.g.:

version(windows) assert(normilize("dir/file") == "dir\\file");
version(windows) assert(normilize("dir/./file") == "dir\\file");
//etc
June 12, 2011
On Sun, 2011-06-12 at 12:41 -0300, Jose Armando Garcia wrote:
> On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> > On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
> >> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
> > These functions are from the old std.path, and I haven't made any changes to them in my version.
> >
> > - toAbsolute()
> > - toCanonical()
> >
> 
> In the comments where you say that it doesn't perform any IO you should add these functions.

Does getcwd() perform any IO on Windows?  AFAIK, on POSIX it just queries /proc/self/cwd, which is a virtual file.


> Speaking of which can we add a template
> called normalize (maybe you can come up with a better name) that does
> what canonical does but doesn't make it absolute. E.g.:
> 
> version(windows) assert(normilize("dir/file") == "dir\\file");
> version(windows) assert(normilize("dir/./file") == "dir\\file");
> //etc

That sounds like a good idea.  Then I guess normalize("../foo") should just return "..\\foo", i.e. leave the ".." unresolved?

-Lars

June 12, 2011
On Sun, Jun 12, 2011 at 12:59 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> On Sun, 2011-06-12 at 12:41 -0300, Jose Armando Garcia wrote:
>> On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
>> > On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
>> >> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
>> > These functions are from the old std.path, and I haven't made any changes to them in my version.
>> >
>> > - toAbsolute()
>> > - toCanonical()
>> >
>>
>> In the comments where you say that it doesn't perform any IO you should add these functions.
>
> Does getcwd() perform any IO on Windows? ?AFAIK, on POSIX it just queries /proc/self/cwd, which is a virtual file.
>

The way I look at IO is anything that is external to the process.
Another way to thinking about it is that
toAbsolute()'s and toCanonical()'s result is dependent on state
outside of the process. While the rest of the templates/functions
aren't.

>
>> Speaking of which can we add a template
>> called normalize (maybe you can come up with a better name) that does
>> what canonical does but doesn't make it absolute. E.g.:
>>
>> version(windows) assert(normilize("dir/file") == "dir\\file");
>> version(windows) assert(normilize("dir/./file") == "dir\\file");
>> //etc
>
> That sounds like a good idea. ?Then I guess normalize("../foo") should just return "..\\foo", i.e. leave the ".." unresolved?
>

It is hard to resolve '..' without looking at the file system when considering soft/sym link due to multiple parents. if 'somedir' is a simlink "somedir/../" != ".".
June 12, 2011
On Sun, 2011-06-12 at 13:36 -0300, Jose Armando Garcia wrote:
> On Sun, Jun 12, 2011 at 12:59 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> > On Sun, 2011-06-12 at 12:41 -0300, Jose Armando Garcia wrote:
> >> On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
> >> > On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
> >> >> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
> >> > These functions are from the old std.path, and I haven't made any changes to them in my version.
> >> >
> >> > - toAbsolute()
> >> > - toCanonical()
> >> >
> >>
> >> In the comments where you say that it doesn't perform any IO you should add these functions.
> >
> > Does getcwd() perform any IO on Windows?  AFAIK, on POSIX it just queries /proc/self/cwd, which is a virtual file.
> >
> 
> The way I look at IO is anything that is external to the process.
> Another way to thinking about it is that
> toAbsolute()'s and toCanonical()'s result is dependent on state
> outside of the process. While the rest of the templates/functions
> aren't.

The way I've interpreted the "no IO" principle of std.path is "no disk/network IO", since those would come with an enormous performance penalty as compared to in-memory operations.  But maybe you are right.


> >> Speaking of which can we add a template
> >> called normalize (maybe you can come up with a better name) that does
> >> what canonical does but doesn't make it absolute. E.g.:
> >>
> >> version(windows) assert(normilize("dir/file") == "dir\\file");
> >> version(windows) assert(normilize("dir/./file") == "dir\\file");
> >> //etc
> >
> > That sounds like a good idea.  Then I guess normalize("../foo") should just return "..\\foo", i.e. leave the ".." unresolved?
> >
> 
> It is hard to resolve '..' without looking at the file system when considering soft/sym link due to multiple parents. if 'somedir' is a simlink "somedir/../" != ".".

That is a matter of choice, I guess.  In both bash and zsh, if I type

   cd some_dir/some_symlink/..

I end up in some_dir, regardless of where some_symlink is pointing.
That is how toCanonical() does things as well, and how I think
normalize() should work if I end up adding that.

-Lars

June 12, 2011
Em 12/06/2011, ?s 14:00, Lars Tandle Kyllingstad <lars at kyllingen.net> escreveu:

> On Sun, 2011-06-12 at 13:36 -0300, Jose Armando Garcia wrote:
>> On Sun, Jun 12, 2011 at 12:59 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
>>> On Sun, 2011-06-12 at 12:41 -0300, Jose Armando Garcia wrote:
>>>> On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
>>>>> On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
>>>>>> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
>>>>> These functions are from the old std.path, and I haven't made any changes to them in my version.
>>>>>
>>>>> - toAbsolute()
>>>>> - toCanonical()
>>>>>
>>>>
>>>> In the comments where you say that it doesn't perform any IO you should add these functions.
>>>
>>> Does getcwd() perform any IO on Windows?  AFAIK, on POSIX it just queries /proc/self/cwd, which is a virtual file.
>>>
>>
>> The way I look at IO is anything that is external to the process.
>> Another way to thinking about it is that
>> toAbsolute()'s and toCanonical()'s result is dependent on state
>> outside of the process. While the rest of the templates/functions
>> aren't.
>
> The way I've interpreted the "no IO" principle of std.path is "no disk/network IO", since those would come with an enormous performance penalty as compared to in-memory operations.  But maybe you are right.
>
>
>>>> Speaking of which can we add a template
>>>> called normalize (maybe you can come up with a better name) that
>>>> does
>>>> what canonical does but doesn't make it absolute. E.g.:
>>>>
>>>> version(windows) assert(normilize("dir/file") == "dir\\file");
>>>> version(windows) assert(normilize("dir/./file") == "dir\\file");
>>>> //etc
>>>
>>> That sounds like a good idea.  Then I guess normalize("../foo")
>>> should
>>> just return "..\\foo", i.e. leave the ".." unresolved?
>>>
>>
>> It is hard to resolve '..' without looking at the file system when considering soft/sym link due to multiple parents. if 'somedir' is a simlink "somedir/../" != ".".
>
> That is a matter of choice, I guess.  In both bash and zsh, if I type
>
>   cd some_dir/some_symlink/..
>
> I end up in some_dir, regardless of where some_symlink is pointing.
> That is how toCanonical() does things as well, and how I think
> normalize() should work if I end up adding that.

But most program dont behave this way. For example ls, less and vim don't do that. I am okay with resolving symlinks but just take note.
June 12, 2011
On Sun, Jun 12, 2011 at 3:26 PM, Jose Armando Garcia <jsancio at gmail.com> wrote:
> Em 12/06/2011, ?s 14:00, Lars Tandle Kyllingstad <lars at kyllingen.net> escreveu:
>
>> On Sun, 2011-06-12 at 13:36 -0300, Jose Armando Garcia wrote:
>>>
>>> On Sun, Jun 12, 2011 at 12:59 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
>>>>
>>>> On Sun, 2011-06-12 at 12:41 -0300, Jose Armando Garcia wrote:
>>>>>
>>>>> On Sun, Jun 12, 2011 at 12:15 PM, Lars Tandle Kyllingstad <lars at kyllingen.net> wrote:
>>>>>>
>>>>>> On Sun, 2011-06-12 at 11:39 -0300, Jose Armando Garcia wrote:
>>>>>>>
>>>>>>> On Wed, Jun 8, 2011 at 4:29 PM, Lars Tandle Kyllingstad
>>>>>>
>>>>>> These functions are from the old std.path, and I haven't made any changes to them in my version.
>>>>>>
>>>>>> - toAbsolute()
>>>>>> - toCanonical()
>>>>>>
>>>>>
>>>>> In the comments where you say that it doesn't perform any IO you should add these functions.
>>>>
>>>> Does getcwd() perform any IO on Windows? ?AFAIK, on POSIX it just queries /proc/self/cwd, which is a virtual file.
>>>>
>>>
>>> The way I look at IO is anything that is external to the process.
>>> Another way to thinking about it is that
>>> toAbsolute()'s and toCanonical()'s result is dependent on state
>>> outside of the process. While the rest of the templates/functions
>>> aren't.
>>
>> The way I've interpreted the "no IO" principle of std.path is "no disk/network IO", since those would come with an enormous performance penalty as compared to in-memory operations. ?But maybe you are right.
>>
>>
>>>>> Speaking of which can we add a template
>>>>> called normalize (maybe you can come up with a better name) that does
>>>>> what canonical does but doesn't make it absolute. E.g.:
>>>>>
>>>>> version(windows) assert(normilize("dir/file") == "dir\\file");
>>>>> version(windows) assert(normilize("dir/./file") == "dir\\file");
>>>>> //etc
>>>>
>>>> That sounds like a good idea. ?Then I guess normalize("../foo") should just return "..\\foo", i.e. leave the ".." unresolved?
>>>>
>>>
>>> It is hard to resolve '..' without looking at the file system when considering soft/sym link due to multiple parents. if 'somedir' is a simlink "somedir/../" != ".".
>>
>> That is a matter of choice, I guess. ?In both bash and zsh, if I type
>>
>> ?cd some_dir/some_symlink/..
>>
>> I end up in some_dir, regardless of where some_symlink is pointing.
>> That is how toCanonical() does things as well, and how I think
>> normalize() should work if I end up adding that.
>
> But most program dont behave this way. For example ls, less and vim don't do that. I am okay with resolving symlinks but just take note.

Err. I am okay with resolving "..".
June 14, 2011
On 08.06.2011 21:29, Lars Tandle Kyllingstad wrote:
> As you may already know, I've had a new version of std.path in the works for a while.  It is ready now, and I am waiting for my turn in the review queue in the main NG.  In the meantime, I just thought I'd run it by this mailing list to iron out the worst wrinkles.
>
> So, if you feel like it, please comment.
>
> Code: https://github.com/kyllingstad/phobos/blob/std-path/std/path.d
>
> Docs: http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html
>
>
> -Lars

Looks good to me. A few notes:

- I agree with Jose that toAbsolute and expandTilde feel a little out-of-place in this module as they are not only string manipulation functions.

- expandTilde is currently not implemented on windows, but I think a similar functionality using environment variables similar to HOME could work, too. I have no idea how to implement it for other users than the current, though.

- \\network drives are currently only respected under windows (at least that's what the documentation says). At work I'm using the same syntax on OSX to mount network drives (with forward slashes), so I guess it is also valid on Posix systems.

- isRelative and isAbsolute both return false on the empty string, while dirName("") returns ".". Wouldn't that imply that isRelative("") is true?

- when transferring data containing file names between different operating systems, it would be convenient to have the path operation of the other system available, too. While you can deal with posix file names on windows without big problems, the reverse is usually not the case (you have to convert backslashes manually, don't know how to strip drive names, etc). Would it be a good idea to have both versions accessible through namespaces, e.g. Posix.isRooted() and Win.isRooted(), and let the global functions dispatch to the implementation that fits the current OS?

Rainer
« First   ‹ Prev
1 2