[Issue 8967] dirEntries throws when encountering a "long path" on windows
March 18, 2014
https://d.puremagic.com/issues/show_bug.cgi?id=8967


Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla@digitalmars.com


--- Comment #3 from Walter Bright <bugzilla@digitalmars.com> 2014-03-17 19:49:46 PDT ---
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363915(v=vs.85).aspx

This turns out to be not so simple. Look at all the notes and exceptions and caveats here:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------



--- Comment #4 from Vladimir Panteleev <thecybershadow@gmail.com> 2014-03-18 05:29:07 EET ---
It's quite simple, the path simply must be an absolute one with all forward slashes replaced with backslashes (so pretty standard normalization). Which "notes and exceptions and caveats" are you referring to, in particular?




--- Comment #5 from Walter Bright <bugzilla@digitalmars.com> 2014-03-17 21:09:36 PDT ---
(In reply to comment #4)
> It's quite simple, the path simply must be an absolute one with all forward slashes replaced with backslashes (so pretty standard normalization).

That's a big one.

> Which "notes and exceptions and caveats" are you referring to, in particular?

All of them; I quote:

1. The "\\?\" prefix can also be used with paths constructed according to the universal naming convention (UNC). To specify such a path using UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\server\share", where "server" is the name of the computer and "share" is the name of the shared folder. These prefixes are not used as part of the path itself. They indicate that the path should be passed to the system with minimal modification, which means that you cannot use forward slashes to represent path separators, or a period to represent the current directory, or double dots to represent the parent directory.

2. Because you cannot use the "\\?\" prefix with a relative path, relative paths are always limited to a total of MAX_PATH characters.

3. For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system.

4. Because it turns off automatic expansion of the path string, the "\\?\" prefix also allows the use of ".." and "." in the path names, which can be useful if you are attempting to perform operations on a file with these otherwise reserved relative path specifiers as part of the fully qualified path.

5. Many but not all file I/O APIs support "\\?\"; you should look at the reference topic for each API to be sure.

6. The "\\.\" prefix will access the Win32 device namespace instead of the Win32 file namespace.

7. If you're working with Windows API functions, you should use the "\\.\" prefix to access devices only and not files.

8. This was accomplished by adding the symlink named "GLOBALROOT" to the Win32 namespace, which you can see in the "Global??" subdirectory of the WinObj browser tool previously discussed, and can access via the path "\\?\GLOBALROOT". This prefix ensures that the path following it looks in the true root path of the system object manager and not a session-dependent path.


So, no, I don't think this is so simple.




--- Comment #6 from Vladimir Panteleev <thecybershadow@gmail.com> 2014-03-18 06:18:30 EET ---
(In reply to comment #5)
> 1. The "\\?\" prefix can also be used with paths constructed according to the universal naming convention (UNC). To specify such a path using UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\server\share", where "server" is the name of the computer and "share" is the name of the shared folder. These prefixes are not used as part of the path itself. They indicate that the path should be passed to the system with minimal modification, which means that you cannot use forward slashes to represent path separators, or a period to represent the current directory, or double dots to represent the parent directory.

OK, so that is a special case: if the path starts with \\ but not \\?\, use the \\?\UNC\ prefix instead.

> 2. Because you cannot use the "\\?\" prefix with a relative path, relative paths are always limited to a total of MAX_PATH characters.

Covered by using an absolute path, as mentioned earlier.

> 3. For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system.

Does not apply on its own.

> 4. Because it turns off automatic expansion of the path string, the "\\?\" prefix also allows the use of ".." and "." in the path names, which can be useful if you are attempting to perform operations on a file with these otherwise reserved relative path specifiers as part of the fully qualified path.

Covered by path normalization, as mentioned earlier.

> 5. Many but not all file I/O APIs support "\\?\"; you should look at the reference topic for each API to be sure.

I don't see this as a concern. D unit tests will reveal any Windows APIs that don't support this syntax.

> 6. The "\\.\" prefix will access the Win32 device namespace instead of the Win32 file namespace.

Does not apply. Win32 devices are akin to POSIX /dev/ and are rarely accessed directly.

> 7. If you're working with Windows API functions, you should use the "\\.\" prefix to access devices only and not files.

Same as above, does not apply.

> 8. This was accomplished by adding the symlink named "GLOBALROOT" to the Win32 namespace, which you can see in the "Global??" subdirectory of the WinObj browser tool previously discussed, and can access via the path "\\?\GLOBALROOT". This prefix ensures that the path following it looks in the true root path of the system object manager and not a session-dependent path.

Same as above, does not apply.
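
The point-by-point handling above could be sketched roughly as follows. This is only an illustration, not Phobos code: addLongPathPrefix is a hypothetical name, and the function assumes its input is already absolute and free of "." and ".." components (on Windows, std.path.absolutePath and std.path.buildNormalizedPath could be applied first).

```d
import std.algorithm.searching : startsWith;
import std.array : replace;

/// Hypothetical helper (not part of Phobos): adds the extended-length
/// prefix to an absolute Windows path.
/// Assumption: absPath is absolute and contains no "." or ".." parts.
string addLongPathPrefix(string absPath)
{
    // Already extended-length ("\\?\") or device namespace ("\\.\"):
    // leave the path untouched.
    if (absPath.startsWith(`\\?\`) || absPath.startsWith(`\\.\`))
        return absPath;

    // "\\?\" paths are passed on unparsed, so forward slashes must be
    // turned into backslashes here.
    string p = absPath.replace("/", `\`);

    // UNC path "\\server\share\..." becomes "\\?\UNC\server\share\...".
    if (p.startsWith(`\\`))
        return `\\?\UNC\` ~ p[2 .. $];

    // Ordinary drive-letter path such as "C:\dir\file".
    return `\\?\` ~ p;
}
```

For example, addLongPathPrefix("C:/a/b") yields \\?\C:\a\b, while a path that already starts with \\.\ is returned unchanged.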



Jay Norwood <jayn@prismnet.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jayn@prismnet.com


--- Comment #7 from Jay Norwood <jayn@prismnet.com> 2014-03-17 21:41:57 PDT ---
As has been stated in prior comments, it appears the intended solution is to prepend "\\?\" to the path string.

It appears from the documentation that prepending should be done if the path is longer than MAX_PATH-12 characters for directory names or MAX_PATH-1 characters for file paths. To simplify this, maybe just use MAX_PATH-12 in both cases.

In addition, it appears from the documentation that there are two cases in which you would not want to prepend: when the path already begins with "\\?\" or with "\\.\".

The code appears to use toUTF16z(name) pretty consistently wherever path strings are converted to Windows paths for calling the Unicode versions of the functions. I looked through the 26 references to toUTF16z, and the only one I couldn't confirm was a call to _wfopen. So I think these toUTF16z calls could perhaps be changed to call a modified version that checks the path length and prepends the prefix when needed.
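
A minimal sketch of what such a modified conversion might look like, under stated assumptions: prefixIfLong and toLongPathUTF16z are hypothetical names, MAX_PATH is redefined locally as 260 so the block compiles anywhere, and the input is assumed to be an absolute path that already uses backslashes.

```d
import std.algorithm.searching : startsWith;
import std.utf : toUTF16z;

// MAX_PATH is 260 in the Windows headers; redefined here so that the
// sketch is self-contained.
enum size_t maxPath = 260;

// Single threshold for files and directories: MAX_PATH - 12.
enum size_t longPathThreshold = maxPath - 12;

/// Hypothetical helper: prepends the extended-length prefix only when
/// the path is long and does not already carry "\\?\" or "\\.\".
/// Assumption: absPath is absolute and uses backslashes.
string prefixIfLong(string absPath)
{
    // The length is counted in UTF-8 code units, which is never less
    // than the UTF-16 length, so the check errs on the side of prefixing.
    if (absPath.length <= longPathThreshold)
        return absPath;
    if (absPath.startsWith(`\\?\`) || absPath.startsWith(`\\.\`))
        return absPath;
    // UNC paths take the "\\?\UNC\" form instead.
    if (absPath.startsWith(`\\`))
        return `\\?\UNC\` ~ absPath[2 .. $];
    return `\\?\` ~ absPath;
}

/// Hypothetical stand-in for toUTF16z at Windows API call sites.
const(wchar)* toLongPathUTF16z(string absPath)
{
    return toUTF16z(prefixIfLong(absPath));
}
```

Short paths pass through unchanged, so existing call sites would only see different behavior once a path exceeds the threshold.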

We are using toUTF16z in the libraries when calling CreateFileW, SetCurrentDirectoryW, CopyFileW, FindFirstFileW, CreateDirectoryW, GetFileAttributesW, DeleteFileW, MoveFileExW, RemoveDirectoryW, SetFileAttributesW, GetEnvironmentVariableW, and SetEnvironmentVariableW.

Searching for LPCWSTR brings up a number of Windows API definitions that are unused in the libraries but would be candidates for toUTF16z.

tempDir() uses MAX_PATH for the buffer, where the larger 32K limit should be used.

The same goes for thisExePath(), and its call to GetModuleFileNameW should probably use toUTF16z.

WIN32_FIND_DATAW should probably have the 32K buffer to be consistent, but perhaps the operations that use it don't support the longer paths.




--- Comment #8 from Jay Norwood <jayn@prismnet.com> 2014-03-18 09:50:23 PDT ---
> 2. Because you cannot use the "\\?\" prefix with a relative path, relative paths are always limited to a total of MAX_PATH characters.


So, yes, relative paths don't work when you use that prefix (I tried). But the following did work for me, where e is a DirEntry:

nm = r"\\?\" ~ absolutePath(e.name);

There is also the issue of read-only status needing to be cleared if files or directories are to be removed. Our remove() and rmdir() don't take care of this for you, so if you are removing items with long paths, the above expansion needs to be done before the getAttributes and setAttributes calls below can succeed:

uint att = getAttributes(fn);
att &= ~FILE_ATTRIBUTE_READONLY; // clear the read-only bit (^= would toggle it and could set it)
setAttributes(fn, att);
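
As a rough sketch of the attribute handling, the following separates the bit manipulation from the Windows-only calls. The names clearReadOnly and removeEvenIfReadOnly are hypothetical, FILE_ATTRIBUTE_READONLY is redefined locally as 0x1 so the block compiles on any platform, and the name passed in is assumed not to carry a "\\?\" prefix already or to contain "." or ".." components.

```d
// FILE_ATTRIBUTE_READONLY is 0x1 in the Windows headers; redefined here
// so that the sketch compiles on any platform.
enum uint fileAttributeReadOnly = 0x1;

/// Hypothetical helper: returns attrs with the read-only bit cleared.
uint clearReadOnly(uint attrs)
{
    // AND with the complement clears the bit whether or not it was set.
    // XOR would toggle it, which makes a writable file read-only.
    return attrs & ~fileAttributeReadOnly;
}

version (Windows)
{
    import std.file : getAttributes, remove, setAttributes;
    import std.path : absolutePath;

    /// Hypothetical helper: removes a file, clearing read-only first.
    /// Assumption: name has no "\\?\" prefix and no "." or ".." parts.
    void removeEvenIfReadOnly(string name)
    {
        // Expand to an extended-length path before any attribute call.
        string fn = `\\?\` ~ absolutePath(name);
        setAttributes(fn, clearReadOnly(getAttributes(fn)));
        remove(fn);
    }
}
```

Clearing with a mask rather than toggling means the helper is safe to call on files that were never read-only.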




--- Comment #9 from Walter Bright <bugzilla@digitalmars.com> 2014-03-18 12:46:04 PDT ---
I don't agree with the "does not apply" comments. The \\?\ prefix has different semantics, and having those semantics suddenly shift when the path gets long will be a surprising change for a user.

My take is that if users want \\?\, they should prepend it themselves as a deliberate action, rather than having it hidden in a conventional API. After all, the Windows functions themselves don't add it automatically, either. If it were straightforward, they would have done it.




--- Comment #10 from Vladimir Panteleev <thecybershadow@gmail.com> 2014-03-19 01:14:47 EET ---
Just to clarify, there are two prefixes: "\\?\" and "\\.\". The latter is used to access the Win32 device namespace, which is a fairly under-the-hood thing. The only way I see how this applies to our problem is to not prepend "\\?\" if the path already has a "\\.\" prefix.




--- Comment #11 from Jay Norwood <jayn@prismnet.com> 2014-03-18 18:01:59 PDT ---
More surprising is attempting to remove a long directory path and having an exception occur.

The libraries are already copying the user's string and adding the 0 termination prior to calling the Windows API, so that seems to me a reasonable place to make other modifications if they are needed to accomplish the intended operation.
