April 26, 2010
Andrei Alexandrescu <andrei at ...> writes:

> 
> 
> I suggest we go with lazy ranges throughout. No memory allocation unless the user asks for it by calling std.array.array(range).  For example, splitter() vs. split() was a huge accelerator in virtually all my text processing programs.
> 
> Andrei
> 

Fine by me as long as it gets in.

I'm a bit curious, though. path2list currently returns an array of slices into the original path. I have a hard time imagining memory allocation this way would be much higher than as a range (unless the original path is also a lazy range).

Next week finals will be over. I'll rewrite the bugger then.


April 26, 2010
The array of slices itself requires allocation.
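
To illustrate the difference, here is a minimal sketch assuming current Phobos (std.array.split and std.algorithm.splitter). The names eagerComponents and lazyComponents are hypothetical and are not the actual path2list implementation:

```d
import std.algorithm : splitter;
import std.array : array, split;

/// Eager: the slices point into `path`, but the string[] that
/// holds them is a fresh allocation.
string[] eagerComponents(string path)
{
    return path.split("/");
}

/// Lazy: returns a range that yields slices of `path` on demand.
/// Nothing is allocated unless the caller asks for an array.
auto lazyComponents(string path)
{
    return path.splitter('/');
}
```

A caller who really wants an array can still write lazyComponents(path).array and pay for the allocation explicitly.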

Andrei

On 04/26/2010 05:15 PM, Ellery Newcomer wrote:
> Andrei Alexandrescu<andrei at ...>  writes:
>
>>
>>
>> I suggest we go with lazy ranges throughout. No memory allocation unless
>> the user asks for it by calling std.array.array(range).  For example,
>> splitter() vs. split() was a huge accelerator in virtually all my text
>> processing programs.
>>
>> Andrei
>>
>
> Fine by me as long as it gets in.
>
> I'm a bit curious, though. path2list currently returns an array of slices into the original path. I have a hard time imagining memory allocation this way would be much higher than as a range (unless the original path is also a lazy range).
>
> Next week finals will be over. I'll rewrite the bugger then.
April 26, 2010
On 04/26/2010 04:08 PM, Lars Tandle Kyllingstad wrote:
> Are you envisioning a system that auto-detects whether something is a Windows or a POSIX path and converts it to some OS-agnostic internal representation? E.g. something like
>
> // Auto-detect Windows path
> auto path = Path("c:\\foo\\bar.baz");

I wasn't thinking of that. More like some common primitives that could deal with Windows paths and Unix paths. The separator is already abstracted, and the remaining difference is the existence of a drive letter in Windows (that I notice some Unix shells are starting to replace with a protocol such as smb: or sftp:).

> The only other option I can see is to have std.path automatically work with Windows paths on Windows and POSIX paths on POSIX -- which is exactly what I'm suggesting.
>
>
> Anyway, I'm not married to this idea, I just think it's a good one. ;)
>
> I still think something needs to be done to std.path, though (and I'm still volunteering to do it). Did any of my other suggestions seem worthwhile, or are people happy with the module the way it is? Are there other suggestions?

I love the way you set up things here:

http://kyllingen.net/code/ltk/doc/path.html

It's just that I so much believe it would sit better elsewhere. It's a great design hitting on an unfit problem. For example, let's assume you convince me there really is a need for Unix path manipulation on Windows (which I'm not convinced of, but let's say I am). Then I see I can use Windows paths on Unix. Yay! That's what I call a cool design. But wait. When would you need that? Well, almost never. That's why I feel there's some lack of fitness in there.

How about this: we focus on an API that allows you to use the alternate separator on Windows ("/") for virtually all Posix primitives. At least in theory, a Windows path without a drive is a Posix path.
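
A rough sketch of what that could look like, assuming current Phobos. The helpers stripDrive, isEitherSeparator and components are hypothetical names made up for illustration, not existing std.path functions:

```d
import std.algorithm : splitter;
import std.ascii : isAlpha;

/// Hypothetical helper: drops a leading drive such as "c:" if present,
/// leaving what is (in theory) a Posix-style path.
string stripDrive(string path)
{
    if (path.length >= 2 && isAlpha(path[0]) && path[1] == ':')
        return path[2 .. $];
    return path;
}

/// Hypothetical helper: accepts both the primary and the alternate separator.
bool isEitherSeparator(dchar c)
{
    return c == '/' || c == '\\';
}

/// Lazily yields the components of a path, treating "/" and "\" alike.
auto components(string path)
{
    return stripDrive(path).splitter!isEitherSeparator;
}
```

With this, "foo\\bar/baz.d" and "foo/bar/baz.d" yield the same components, and the drive is the only part that needs special treatment.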


Andrei
April 27, 2010
On 04/27/2010 12:43 AM, Andrei Alexandrescu wrote:
> How about this: we focus on an API that allows you to use the alternate separator on Windows ("/") for virtually all Posix primitives. At least in theory, a Windows path without a drive is a Posix path.

Alright, I won't push it any further. ;)  What you're suggesting will cover many use cases, and like I said, it's not *that* important to me.

Below is a listing of the API I have in mind.  I think it's worth breaking backwards compatibility for a more unified and coherent naming scheme, and I also think it's better to do it now than later.  Walter seemed to oppose the idea; what do others think?


// System-dependent constants
string dirSeparator
string altDirSeparator
string pathSeparator
string currentDir
string parentDir

// File extensions
string extension(path)
string setExtension(path, ext)
string setDefaultExtension(path, ext)
string appendExtension(path, ext)
string removeExtension(path, ext=null)

// Extracting drive/directory/filename
string drive(path)
string directory(path)
string filename(path)
string basename(path, suffix)

// Relative/absolute/canonical paths
bool isAbsolute(path)
bool isRelative(path)
bool isCanonical(path)
string toAbsolute(path)
string toRelative(path)
string toCanonical(path)

// Joining/splitting paths
string join(pathComponents...)
SomeRange splitter(path)   // cf. Ellery Newcomer's suggestion

// Filename matching
bool wildcardMatch(path, pattern)
bool filenameMatch(path1, path2)
bool filenameCharMatch(char1, char2)


-Lars
April 27, 2010
On 04/27/2010 03:40 AM, Lars Tandle Kyllingstad wrote:
> On 04/27/2010 12:43 AM, Andrei Alexandrescu wrote:
> Below is a listing of the API I have in mind. I think it's worth breaking
> backwards compatibility for a more unified and coherent naming scheme,
> and I also think it's better to do it now than later. Walter seemed to
> oppose the idea, what do others think?
>
>
> // System-dependent constants
> string dirSeparator
> string altDirSeparator
> string pathSeparator
> string currentDir
> string parentDir

I'd replace "currentDir" with e.g. "currentDirSymbol" or something. Otherwise people may actually think currentDir is pwd. Same for parentDir.

> // File extensions
> string extension(path)
> string setExtension(path, ext)
> string setDefaultExtension(path, ext)
> string appendExtension(path, ext)
> string removeExtension(path, ext=null)

I'm not fond of adding support for extensions. On Unix there's no explicit extension. Extension comes from CP/M with 8 characters for name and 3 characters for extension, which is now long defunct.

> // Extracting drive/directory/filename
> string drive(path)
> string directory(path)
> string filename(path)
> string basename(path, suffix)

"dirname" will be instantly recognized by any Unix user.

> // Relative/absolute/canonical paths
> bool isAbsolute(path)
> bool isRelative(path)
> bool isCanonical(path)
> string toAbsolute(path)
> string toRelative(path)
> string toCanonical(path)
>
> // Joining/splitting paths
> string join(pathComponents...)
> SomeRange splitter(path) // cf. Ellery Newcomer's suggestion
>
> // Filename matching
> bool wildcardMatch(path, pattern)
> bool filenameMatch(path1, path2)
> bool filenameCharMatch(char1, char2)

What does the last do?


Andrei
April 28, 2010
Sorry, sent this to Andrei's private address again...

On 04/27/2010 09:51 PM, Andrei Alexandrescu wrote:
> On 04/27/2010 03:40 AM, Lars Tandle Kyllingstad wrote:
>> On 04/27/2010 12:43 AM, Andrei Alexandrescu wrote:
>> Below is a listing of the API I have in mind. I think it's worth breaking
>> backwards compatibility for a more unified and coherent naming scheme,
>> and I also think it's better to do it now than later. Walter seemed to
>> oppose the idea, what do others think?
>>
>>
>> // System-dependent constants
>> string dirSeparator
>> string altDirSeparator
>> string pathSeparator
>> string currentDir
>> string parentDir
>
> I'd replace "currentDir" with e.g. "currentDirSymbol" or something. Otherwise people may actually think currentDir is pwd. Same for parentDir.

Good point.


>> // File extensions
>> string extension(path)
>> string setExtension(path, ext)
>> string setDefaultExtension(path, ext)
>> string appendExtension(path, ext)
>> string removeExtension(path, ext=null)
>
> I'm not fond of adding support for extensions. On Unix there's no explicit extension. Extension comes from CP/M with 8 characters for name and 3 characters for extension, which is now long defunct.

I completely agree with you.

It's not a case of *adding* support, however.  getExt() and defaultExt() are already in the current std.path, so what you're suggesting is *removing* support.

I don't mind, but I'm sure it won't sit well with everyone.  Like it or not -- and I sure don't -- extensions are still the primary way of conveying file type information.


>> // Extracting drive/directory/filename
>> string drive(path)
>> string directory(path)
>> string filename(path)
>> string basename(path, suffix)
>
> "dirname" will be instantly recognized by any Unix user.

Good idea.  I just realised it's kinda pointless to have both filename() and basename() as well.


>> // Relative/absolute/canonical paths
>> bool isAbsolute(path)
>> bool isRelative(path)
>> bool isCanonical(path)
>> string toAbsolute(path)
>> string toRelative(path)
>> string toCanonical(path)
>>
>> // Joining/splitting paths
>> string join(pathComponents...)
>> SomeRange splitter(path) // cf. Ellery Newcomer's suggestion
>>
>> // Filename matching
>> bool wildcardMatch(path, pattern)
>> bool filenameMatch(path1, path2)
>> bool filenameCharMatch(char1, char2)
>
> What does the last do?

The same as the current std.path.fncharmatch():  On POSIX fncharmatch('a','A') is false but on Windows it's true.
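
In other words, something along these lines. This is only a sketch of the behaviour described above, assuming std.uni.toLower for the case folding; it is not the actual fncharmatch() source:

```d
import std.uni : toLower;

/// Compares two filename characters using the platform's rules:
/// case-insensitive on Windows, case-sensitive elsewhere.
bool filenameCharMatch(dchar a, dchar b)
{
    version (Windows)
        return toLower(a) == toLower(b);
    else
        return a == b;
}
```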

I'm not convinced any of these last three are generally useful, but again, I am wary of *removing* functionality.

-Lars
April 28, 2010
Lars Tandle Kyllingstad <lars at ...> writes:

> 
> I'm not convinced any of these last three are generally useful, but again, I am wary of *removing* functionality.
> 
> -Lars
> 


Past the bikeshedding, filenameMatch would be especially useful if it could accept ranges from splitter.
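
A minimal sketch of what I mean, assuming current Phobos (std.algorithm.equal with a custom predicate). The helper names charMatch and componentMatch are hypothetical, and the signature is just one possible shape, not a proposal for the final API:

```d
import std.algorithm : equal, splitter;
import std.uni : toLower;

/// Hypothetical per-character comparison: ignores case on Windows only.
bool charMatch(dchar a, dchar b)
{
    version (Windows)
        return toLower(a) == toLower(b);
    else
        return a == b;
}

/// Hypothetical helper: compares two path components character by character.
bool componentMatch(const(char)[] a, const(char)[] b)
{
    return equal!charMatch(a, b);
}

/// Compares two paths given as ranges of components, e.g. the lazy
/// ranges returned by splitter, without building any intermediate array.
bool filenameMatch(R1, R2)(R1 components1, R2 components2)
{
    return equal!componentMatch(components1, components2);
}
```

Usage would then be filenameMatch(path1.splitter('/'), path2.splitter('/')), with no allocation on either side.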
