January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | Would it be enough to have a function that takes a string (or inout(char)[] if possible) and returns a string[] with the path elements? Also, a function to do the reverse.
I agree we need something to abstract as much as possible the path elements (separator, drive, extension, etc.), but the huge benefit of using a string for a path is that when a path struct exists, it's often used as the parameter type for paths. This results in one having to wrap their paths in a Path struct just to call some function, even though they don't care about manipulating it.
I remember when using Tango, there were types to deal with all sorts of things (like files and paths) and I always had to use the docs when using them. It was not very intuitive.
-Steve
----- Original Message ----
> From: Jonathan M Davis <jmdavisProg at gmx.com>
> To: phobos at puremagic.com
> Sent: Sun, January 2, 2011 7:25:08 PM
> Subject: Re: [phobos] next release (meaning of path)
>
> On Sunday 02 January 2011 16:01:47 Andrei Alexandrescu wrote:
> > Let's have a brief vote. Do you think we should have a string-like structure Path in std.path? What primitives should it have?
> >
> > I'm fine using strings, but I could be convinced to use a Path type if it had some compelling advantages.
>
> I'm sure that it depends on the use case, but if you're doing a lot of operations on paths which would involve adding, removing, or renaming directories, then having a Path struct of some kind which essentially held a linked list of the pieces of the path could be beneficial. If you're having to constantly search for the Xth separator in the string and the like - especially if you're then having to create a new string with changes - it could be a bit expensive to deal with just strings. However, any time that you then need to actually use the path - like opening a file or whatnot - you'd need to concatenate the whole thing together, and doing that a lot could get expensive too.
>
> For the general use case, I think that strings work just fine and that having a Path struct would be unnecessary overhead. There are use cases where it could be useful, so it might be useful to have a Path struct for such cases (what Boost has is rather nice from what I recall), but that isn't the typical case.
>
> The one really nice thing about using a Path struct that I can think of is that it makes errors related to different separators less likely. The separator would generally be abstracted away in the user code and then dealt with appropriately by the Path struct when turned into string form for OS calls and the like. It might also help cases where you actually want to use the separator in a file name, though that's generally a bad idea, even if it can be done.
>
> I really liked Boost's path stuff last time I messed with it, and having something similar in Phobos would be cool, but I would worry that that's just overkill for the average case. Certainly, if we have a Path struct of some kind, it needs to work with strings well and easily, or it's going to be a problem.
>
> Personally, I'm not sure how much I care either way. A solid Path struct could be very cool, but it also could be overkill.
>
> - Jonathan M Davis
|
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Schveighoffer | On Monday 03 January 2011 04:42:49 Steve Schveighoffer wrote:
> Would it be enough to have a function that takes a string (or inout(char)[] if possible) and returns a string[] with the path elements? Also, a function to do the reverse.
>
> I agree we need something to abstract as much as possible the path elements (separator, drive, extension, etc.), but the huge benefit of using a string for a path is that when a path struct exists, it's often used as the parameter type for paths. This results in one having to wrap their paths in a Path struct just to call some function, even though they don't care about manipulating it. I remember when using Tango, there were types to deal with all sorts of things (like files and paths) and I always had to use the docs when using them. It was not very intuitive.
One of the major reasons that some folks want a Path struct is so that functions _don't_ take strings for paths. That way, paths are their own type, and you can overload on them and the like. Personally, I think that they have a point, but I'm not sure that I care all that much. However, if all it takes to have a Path struct from a string is Path("my/path/string"), then it's quite straightforward to get a Path struct from a string. Getting a string out of it could then be as simple as toString(), though if you do anything anywhere near as fancy as Boost's path type ( http://is.gd/k0Q3Q ), then you could have it do things like giving you the absolute version of the path, or the canonicalized version, or other similar versions of it, giving you quite a bit more flexibility.
Boost's path stuff is fairly fancy in what it does for you, and encapsulating all of that in a class or struct has its benefits. I would be worried about it being overkill though. If we do go down that path, I don't think that we're going to want something quite as complicated as what Boost has.
Having a function for moving between string and string[] could definitely be useful, but pretty quickly you get into the problem of whether functions should be taking string or string[]. With a separate path type, it's clear. And if it's a separate type, then you could treat it as one entity or as its pieces without having to convert it, since it would all be encapsulated.
Also, without a path type, abstracting away stuff is definitely going to be harder. As it is, I'm willing to bet that there are plenty of programmers who will just use / or \ rather than sep. If you wanted to be purist about it (which isn't very pragmatic and therefore not very D-like), you'd _never_ write paths like "a/b/c" because the separator is hard-coded. Instead, you'd always do something like Path("a", "b", "c") and never care about the separator. In such cases, you ideally wouldn't even have to convert it to a string later, because all of the file functions would take the Path type. But, as I said, that's likely too much of a purist approach for a pragmatic language like D. Certainly, anyone who would want to just use strings for paths would get very annoyed with it very quickly. But at least you'd never have separator problems.
Another approach which D could take would simply be to treat file names like URIs and always use / for the separator. Then any time an OS function is called, the path gets converted to whatever the OS uses. If you're going that far, you can even have it convert file names to file names which are legal for the current OS or file system so that the programmer doesn't have to worry about which characters are legal and which aren't. That would work better with a separate path type, but it could work with just strings.
I see definite benefit in having an actual Path type which abstracts various file system stuff away, but I also see definite benefit in just using strings. So, I don't know which way I'd like std.path to go. If you're doing fancier stuff with paths or if you're really trying to make sure that paths are cross-platform, then a separate path type would likely be the way to go. But for a lot of stuff, simply using strings is good enough. So, I don't know.
If std.path continues to use strings and doesn't have any kind of separate path type, then I expect that someone will create one in a third party library eventually. So, I expect that such a path type will be available in D at some point. The question is whether such a type should be in Phobos and/or whether it should be the normal way of handling paths, or just a nice, alternate way of dealing with paths.
- Jonathan M Davis
|
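The Path("a", "b", "c") idea from the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Path` struct and its `fromString` helper are hypothetical names, not part of Phobos, and `buildPath` and `pathSplitter` come from present-day std.path, which may differ from the std.path being discussed in this thread.

```d
import std.array : array;
import std.path : buildPath, pathSplitter;
import std.stdio : writeln;

/// Hypothetical Path type: it stores the elements and never a hard-coded separator.
struct Path
{
    private string[] elements;

    /// Build from individual elements, e.g. Path("a", "b", "c").
    this(string[] parts...)
    {
        // Copy, because the variadic array may live on the caller's stack.
        elements = parts.dup;
    }

    /// Build from an existing path string by splitting it into its elements.
    static Path fromString(string s)
    {
        Path p;
        p.elements = pathSplitter(s).array;
        return p;
    }

    /// Join the elements with the current platform's separator.
    string toString()
    {
        return buildPath(elements);
    }
}

void main()
{
    auto p = Path("a", "b", "c");
    writeln(p.toString()); // separator depends on the platform
}
```

The separator only appears when `toString` is called, which is the point made in the post: user code never spells it out.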
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | ----- Original Message ----
> From: Jonathan M Davis <jmdavisProg at gmx.com>
>
> One of the major reasons that some folks want a Path struct is so that functions _don't_ take strings for paths.
I understand this, but it's rarely needed. What would the string represent if not a path? My point was that the most intuitive interface is to take a string. This looks very intuitive to me:
auto f = openFile("/my/filename");
So likely one would have to support both overloads with one always forwarding to the other. I could be wrong, maybe it's really important to have a path type, but my experience with Path types is that they just get in the way most of the time.
> Having a function for moving between string and string[] could definitely be useful, but pretty quickly you get into the problem over whether functions should be taking string or string[]. With a separate path type, it's clear. And if it's a separate type, then you could treat it as one entity or as its pieces without having to convert it, since it would all be encapsulated.
My point is, often you have a path as a string already (an argument to a program, or part of a message over the network). Why should the system do extra processing just so it can pass it to another function? Think of this:
import std.stdio : File;
void main(string[] args)
{
    auto f = File(args[1]); // expects args[1] to be a valid filename
}
Why would it make sense to require that File receives a valid path type that parses the string before handing it over wholesale to the open() syscall?
> Also, without a path type, abstracting away stuff is definitely going to be harder. As it is, I'm willing to bet that there are plenty of programmers who will just use / or \ rather than sep. If you wanted to be purist about it (which isn't very pragmatic and therefore not very D-like), you'd _never_ write paths like "a/b/c" because the separator is hard-coded. Instead, you always do something like Path("a", "b", "c") and never care about the separator. In such cases, you ideally wouldn't even have to convert it to a string later, because all of the file functions would take the Path type. But, as I said, that's likely too much of a purist approach for a pragmatic language like D. Certainly, anyone who would want to just use strings for paths would get very annoyed with it very quickly. But at least you'd never have separator problems.
I typically always use / because it mostly works on all OSes. But in any case, I was thinking of a function like this to convert back to a path string:
string toPath(const(char)[][] elements...);
auto pathstr = toPath("a", "b", "c");
> I see definite benefit in having an actual Path type which abstracts various file system stuff away, but I also see definite benefit in just using strings. So, I don't know which way I'd like std.path to go. If you're doing fancier stuff with paths or if you're really trying to make sure that paths are cross-platform, then a separate path type would likely be the way to go. But for a lot of stuff, simply using strings is good enough. So, I don't know. If std.path continues to use strings and doesn't have any kind of separate path type, then I expect that someone will create one in a third party library eventually. So, I expect that such a path type will be available in D at some point. The question is whether such a type should be in Phobos and/or whether it should be the normal way of handling paths, or just a nice, alternate way of dealing with paths.
I don't have a problem with a Path type existing for manipulation (I think you need one for URIs), I just am worried about developers saying "oh, there's a path type? Great, that's what I'll use as my path parameter" instead of a string. Then the library gets unnecessarily complex. Maybe a Path type with a disclaimer "do not use as a parameter".
-Steve
|
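For illustration, the toPath function proposed in the post above could look something like the sketch below. The signature is taken from the post; the body is an assumption that simply forwards to `buildPath` from present-day std.path, which is not necessarily what the author had in mind.

```d
import std.path : buildPath;
import std.stdio : writeln;

/// Signature as proposed in the post; the body is an assumption that
/// forwards to std.path.buildPath, which inserts the platform separator.
string toPath(const(char)[][] elements...)
{
    return buildPath(elements);
}

void main()
{
    // Joined with the platform separator, so the caller never writes it.
    auto pathstr = toPath("a", "b", "c");
    writeln(pathstr);
}
```

The result is still a plain string, so it can be passed directly to any function that takes a path as a string.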
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Schveighoffer | On 2011-01-03 at 9:02, Steve Schveighoffer wrote:
> I understand this, but it's rarely needed. What would the string represent if not a path? My point was that the most intuitive interface is to take a string. This looks very intuitive to me:
>
> auto f = openFile("/my/filename");
I think one reason is that sometimes you have a constructor that takes either a path or a string. For instance, I could have a class TextContent that can be initialized either with a path to a text file, or a string for the content. To implement this I need Path to be of a different type, or I need to introduce a dummy parameter. And it's not like I can give a different name to one of those constructors, all constructors have the same name in D.
There was a discussion about this on d.learn recently, so it's not like it's a made up case. (See "discrimination of constructors with same number of parameters", December 30.)
That said, I agree that generally using a path struct everywhere would be too verbose.
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
|
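The TextContent case described in the post above might be sketched like this. Both `Path` and the layout of `TextContent` are hypothetical, written only to show how a distinct type lets the two constructors be told apart; `readText` is the existing std.file function.

```d
import std.file : readText;

/// Hypothetical wrapper whose only job is to make a path
/// distinguishable from a plain string at the type level.
struct Path
{
    string value;
}

/// Hypothetical class with two constructors that would clash
/// if both took a plain string.
class TextContent
{
    string text;

    /// The argument is the content itself.
    this(string content)
    {
        text = content;
    }

    /// The argument names a text file whose content is loaded.
    this(Path path)
    {
        text = readText(path.value);
    }
}

void main()
{
    auto direct = new TextContent("some text");
    assert(direct.text == "some text");
    // Loading from a file would be: new TextContent(Path("notes.txt"))
    // where "notes.txt" is a placeholder file name.
}
```

Without the wrapper type, the only alternatives are the ones the post mentions: a dummy parameter, or a factory function with a different name.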
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | The path still isn't at the same level as the content; it's an address to the content. Only passing in a path seems limited. What if you wanted a network stream?
Now, a URI might pose a different argument, because it encodes how to open the content as well as the path to the content, and it would seem complex to require someone to open a network stream with a URI before passing it into a function. However, I'd still be hesitant to suggest having the overload be against a URI type. The main purpose of such types is to parse a string into its components, not to provide an overload mechanism. It seems incorrect to use these types of things as parameters to distinguish them from strings.
-Steve
|
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Mon, 3 Jan 2011 05:25:33 -0800
Jonathan M Davis <jmdavisProg at gmx.com> wrote:
> The question is whether such a [Path] type should be in Phobos and/or whether it should be the normal way of handling paths, or just a nice, alternate way of dealing with paths.
I'm all for a Path type to be the standard way of dealing with paths, especially abstracting away OS/FS issues, as long as it's cheap in code and computation, meaning:
* it takes any form of plain single-string path, valid on any OS
* its construction does nearly nothing silently, except maybe splitting the string on seps
Then we could have all kinds of niceties, but only on demand. In particular, with the later format param on writeTo, we could have the path string according to a specified OS or in canonical, OS-independent form (e.g. using '|' as sep), while no format would mean according to the actual underlying OS.
Another option would be a true Directory (or Folder) type, with (recursive) content iteration and such (again, as long as its construction is cheap, and all service computations are on demand).
Denis
-- -- -- -- -- -- --
vit esse estrany ?
spir.wikidot.com
|
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Schveighoffer | On 1/3/11 8:02 AM, Steve Schveighoffer wrote:
> I understand this, but it's rarely needed. What would the string represent if not a path? My point was that the most intuitive interface is to take a string. This looks very intuitive to me:
>
> auto f = openFile("/my/filename");
>
> So likely one would have to support both overloads with one always forwarding to the other. I could be wrong, maybe it's really important to have a path type, but my experience with Path types is that they just get in the way most of the time.
I agree. Again, I've reopened this question in the hope of a decisive argument in favor of Path.
> I typically always use / because it mostly works on all OSes. But in any case, I was thinking of a function like this to convert back to a path string:
>
> string toPath(const(char)[][] elements...);
>
> auto pathstr = toPath("a", "b", "c");
Like http://www.digitalmars.com/d/2.0/phobos/std_path.html#join? :o)
Andrei
|
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu |
----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
>
> On 1/3/11 8:02 AM, Steve Schveighoffer wrote:
> > I typically always use / because it mostly works on all OSes. But in any case,
> > I was thinking of a function like this to convert back to a path string:
> >
> > string toPath(const(char)[][] elements...);
> >
> > auto pathstr = toPath("a", "b", "c");
>
> Like http://www.digitalmars.com/d/2.0/phobos/std_path.html#join? :o)
Yes, exactly like that, but not called join. I think phobos should avoid giving the same name to functions in two different modules; otherwise, you have to start using full module names.
-Steve
|
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Schveighoffer | On 1/3/11 6:42 AM, Steve Schveighoffer wrote:
> Would it be enough to have a function that takes a string (or inout(char)[] if
> possible) and returns a string[] with the path elements? Also, a function to do
> the reverse.
Problem is I've seldom been in a situation in life where I could find use for such a function. Most of the time I want basename, dirname, and if applicable drive. I don't want to dissect the entire dirname.
Andrei
|
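The three pieces mentioned in the post above can be obtained along these lines. Note that this sketch uses the function names from present-day std.path (`baseName`, `dirName`, `driveName`), which is an assumption relative to the thread, where the older spellings basename and dirname are used.

```d
import std.path : baseName, dirName, driveName;
import std.stdio : writeln;

void main()
{
    enum p = "/home/user/notes.txt"; // example path, chosen for illustration

    writeln(baseName(p));  // notes.txt
    writeln(dirName(p));   // /home/user
    writeln(driveName(p)); // empty on POSIX systems, which have no drives
}
```

None of these calls splits the directory part into its individual elements, which matches the use case described in the post.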
January 03, 2011 [phobos] next release (meaning of path) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu |
----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> To: Discuss the phobos library for D <phobos at puremagic.com>
> Sent: Mon, January 3, 2011 10:43:27 AM
> Subject: Re: [phobos] next release (meaning of path)
>
> On 1/3/11 6:42 AM, Steve Schveighoffer wrote:
> > Would it be enough to have a function that takes a string (or inout(char)[] if
> > possible) and returns a string[] with the path elements? Also, a function to do
> > the reverse.
>
> Problem is I've seldom been in a situation in life where I could find use for such a function. Most of the time I want basename, dirname, and if applicable drive. I don't want to dissect the entire dirname.
What about a breadcrumb trail? Or trying to determine if there is a '..' in the path?
In any case, the point of the function is to remove all OS-specific information; leaving the directory intact retains those pesky dir separators. A path in array form is easier to deal with in an OS-agnostic way. Plus, an array is immediately available to perform any algorithmic functions on it (for instance, searching for a '..').
It's worth noting that getting the full directory and an array of the directory names is in direct conflict (not intuitive to have an API that provides both with the same function). This might argue for having a Path type where you could have more flexibility. Or you could just provide multiple functions.
Either way, path manipulation is one of those things IMO that pops up infrequently. Generally you just want to pass around the entire path, and you don't care what's in it.
-Steve
|
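Both uses named in the post above, a breadcrumb trail and a search for '..', can be sketched with the path in element form. The helper names `pathElements` and `hasParentRef` are hypothetical; `pathSplitter` comes from present-day std.path, which may postdate this thread.

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.path : pathSplitter;
import std.stdio : writeln;

/// Hypothetical helper: the elements of `path` as an array,
/// e.g. for building a breadcrumb trail.
string[] pathElements(string path)
{
    return pathSplitter(path).array;
}

/// Hypothetical helper: true if any whole element of `path` is "..".
bool hasParentRef(string path)
{
    return pathSplitter(path).canFind("..");
}

void main()
{
    writeln(pathElements("a/b/c"));     // ["a", "b", "c"]
    writeln(hasParentRef("a/../b"));    // true
    writeln(hasParentRef("a/b..c/d"));  // false: ".." is only part of a name
}
```

Comparing whole elements avoids the false positive that a plain substring search for ".." would give on a name such as "b..c".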