Thread overview | ||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
February 24, 2010 [Issue 3848] New: functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=3848 Summary: functions in std.file don't take symbolic links into account Product: D Version: 2.040 Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Phobos AssignedTo: nobody@puremagic.com ReportedBy: jmdavisProg@gmail.com --- Comment #0 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-24 00:24:31 PST --- Created an attachment (id=572) Changes for std.file.d to make it work with symlinks. At present, if you call isfile() or isdir() on a symbolic link, it reports on what the link points to rather than the link itself. Presumably this is fine on Windows since as far as I know, Windows doesn't have symbolic links. But it is a big problem for Linux. The other problem is rmdirRecursive(). At present, I believe that it currently would follow symbolic links to directories rather than just deleting the symlink itself. Ideally, it would not follow symlinks. I have included a patch which I believe fixes the problems. Whether it's the best solution to the problem, I don't know. I added getLinkAttributes() which uses lstat instead of stat and make isdir() and isfile() use getLinkAttributes() rather than getAttributes(). I also added issymlink() to go with isdir() and isfile(). DirEntry was changed to use lstat as well. That should fix, isfile(), isdir(), and rmdirRecursive(). Most of the other functions like lastModified() continue to use fstat, so they continue to give information on the file which is pointed to rather than the symlink, which I think is what we want. The one wart is that having DirEntry use lstat means that all of the attributes that it gives are for the symlink itself which is not always what we want. So, I added a followLink() function which returns a DirEntry with the values for stat if it's a symbolic link or just returns a copy of the DirEntry if it's not. followLink() can almost certainly been done in a cleaner manner. I'd have used init, but I don't know how to get dirent for a single file rather than a whole directory. So, it uses stat and then checks the mode to figure out which value to set d_type for. DT_WHT is currently not used (it's commented out) because I have no clue what it represents, so I don't know what mode value from stat it corresponds to. In any case, if my patch isn't quite what you'd want to use, hopefully it's at least a step in the right direction. Hopefully the patch is in an appropriate format. It has changes for std.file.d -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 24, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 Andrei Alexandrescu <andrei@metalanguage.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED CC| |andrei@metalanguage.com --- Comment #1 from Andrei Alexandrescu <andrei@metalanguage.com> 2010-02-24 05:11:32 PST --- Thanks, I'll look into it. Just like you, I'm unclear whether isdir() vs. isfile() should automatically dereference the link. I think we should keep the current behavior and add an islink() test that tells whether the thing is actually a link. I think removal should remove the actual link. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 24, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 --- Comment #2 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-24 08:46:30 PST --- The main problem with having isdir() and isfile() continuing to use stat, and having the programmer just call issymlink() to check whether it's a link is that that's a call to stat _and_ a call to lstat, and from what I understand stat and lstat are a bit expensive (presumably because they have to talk to disk). Of course, if there's a high chance that a programmer will have to check whether a symlink points to a file or a directory, then they'll end up calling both isdir()/isfile() and issymlink() and so will have a call to both stat and lstat anyway, so it may be better to keep isdir() and isfile() using stat. Really, it depends on the most likely use case. rmdirRecursive() is definitely more efficient using just lstat though. Personally, I wish that there were a function which returned information for the file which is pointed to while telling you whether it's a symlink or not, but unfortunately, from what I've found, there is no such beast. From an efficiency perspective, it would be nice if there were a function which returned a DirEntry for a single file so that you can query that as to whether a file is a file, dir, or symlink without calling stat or lstat every time you call isfile(), isdir(), or issymlink(), but I couldn't figure out how to do that, so I don't know if it can be reasonably done (since presumably it would be silly and expensive to get the contents of the directory holding the file just to get the DirEntry for the file itself). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 25, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 --- Comment #3 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-24 23:31:35 PST --- Actually, after playing around with this a bit more, I think that I'm leaning towards it being more useful for isdir() and isfile() to use stat. It can result in less efficient code, but it's a lot easier to deal with the symlinks that way. If isdir() and isfile() use lstat, you have to call getAttributes() on the symlink to find out what it is, and that means anding the return value with the appropriate flags, and that's not terribly user-friendly. In fact, it would actually be nice if getAttributes() returned an enum or struct rather a uint. I believe that it's quite OS-dependent as it is, and you have to look at file.d to have any clue what the return value represents. Ideally, you wouldn't have to look a function's source code to figure out how to deal with its return value. But the whole issue of getAttributes() is more or less a separate (albeit connected) issue. In any case, I think that from a user-friendliness perspective, you're right in your assessment that it would be better to have isdir() and isfile() use stat rather than lstat. It does, however, mean that you have to be that much more careful with isdir() or we risk recursive issues or removing stuff in the directory that it points to rather than just the link in functions like rmdirRecursive(). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 26, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 Jonathan M Davis <jmdavisProg@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #572 is|0 |1 obsolete| | --- Comment #4 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-26 00:32:57 PST --- Created an attachment (id=573) Next attempt at fixing problem. Okay. I tried to fix this more cleanly this time. The result is that DirEntry has changed a fair bit, but now isdir() and isfile() use stat rather than lstat, and you can now get a DirEntry for a specific file without getting all the DirEntries for a directory. I don't know whether it will be to your liking, and I have not tested the Windows code at all, but from the testing I did of the linux code, it appears to work (though it could always use more unittests). By making DirEntry as lazy/efficient as possible and making it possible to get one for any file, it should fix the efficiency issue of calling both stat and lstat as much as is possible (though it would still be nice if there were a posix function which did stat but also told you whether it was a symlink). There are also versions of isdir(), isfile(), and issymlink() which take an attributes (uint) value to make them the attributes values easier to use and facilitate rmdirRecursive() only calling lstat. I didn't add any for block devices and such, but perhaps they would be good to add as well. DirEntry is using properties now with its member variables being private, allowing for guarantees as to the state of its members. I don't know whether you'd consider that a problem or not, but I think that it's cleaner. I have attached the updated std.file.d. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 26, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 --- Comment #5 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-26 00:35:59 PST --- Created an attachment (id=574) Patch file for next attempt at fixing problem (instead of the full file). Here's a patch if you'd prefer that to the fully changed file. You'd need to apply dos2unix to the original file.d first for it work. Hopefully this at least comes close to solving the problem for you and saves you some work. Thanks. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
February 28, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 --- Comment #6 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-02-28 02:43:05 PST --- I would point out that my changes appear to fix bug 1635, so fixing this bug could result in that one getting fixed at the same time. I also noticed that I should have named the direntry() function dirEntry(). I was not paying enough attention to the capitalization for dirEntries() and thought that it was direntries(), so my attempt to make their capitalization match failed. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
March 01, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 --- Comment #7 from Sobirari Muhomori <dfj1esp02@sneakemail.com> 2010-03-01 07:04:11 PST --- (In reply to comment #1) > Thanks, I'll look into it. Just like you, I'm unclear whether isdir() vs. > isfile() should automatically dereference the link. I think we should keep the > current behavior and add an islink() test that tells whether the thing is > actually a link. Actually windows gone the same way: if you want to examine symlink, you should state it explicitly. (In reply to comment #2) > The main problem with having isdir() and isfile() continuing to use stat, and > having the programmer just call issymlink() to check whether it's a link is > that that's a call to stat _and_ a call to lstat, and from what I understand > stat and lstat are a bit expensive (presumably because they have to talk to > disk). They don't talk to disk, they talk to disk cache. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
March 03, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 Jonathan M Davis <jmdavisProg@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #573 is|0 |1 obsolete| | --- Comment #8 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-03-02 23:39:06 PST --- Created an attachment (id=577) Now fixes DirIterator and listdir() as well. I don't know anything about the whole reading from disk reading from disk cache thing, but it's still I/O of some variety and, as I understand it, is not a cheap operation, so it's desirable to avoid extraneous stat and lstat calls regardless. Also, I figured out that DirIterator and listdir() did not deal with symlinks correctly. In particular, they always followed them, which could be very bad in some situations. So, I'm attaching another version which fixes that problem. It also makes DirEntry on Linux a bit better. It occurred to me that we might as well just save the result from stat rather than copying some of its values to member variables to be used for the properties. That way, we can make the resulting struct available. It makes the DirEntry struct a bit larger, but it seems useful enough to me that it should be worth it. However, I didn't add any more of stat's properties (like inode or block size) to DirEntry since those are likely used rarely enough that the programmer can just access the stat struct in such cases. That and it allows DirEntry's API to be mostly the same between platforms. As far as I've determined, this fix works on Linux (certainly, all of the unit tests pass, and all of the other testing that I've done looks good), but I haven't tested the Windows code at all. Most of the changes are to the Posix side though, so odds are that the Windows side is okay. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
March 03, 2010 [Issue 3848] functions in std.file don't take symbolic links into account | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | http://d.puremagic.com/issues/show_bug.cgi?id=3848 Jonathan M Davis <jmdavisProg@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #574 is|0 |1 obsolete| | --- Comment #9 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-03-02 23:40:53 PST --- Created an attachment (id=578) Patch version of "Now fixes DirIterator and listdir() as well." Here's the patch version of the most recent fix. Again, you'll need to run dos2unix on the original std.file.d before applying it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
Copyright © 1999-2021 by the D Language Foundation