temporary files - what is the resolution?
July 26, 2012
From the DMD versioning talk I came to the changelog for 2.060 and found #7537 fixed (http://d.puremagic.com/issues/show_bug.cgi?id=7537), which I had to work around before.
Seeing as this is a basic C library function, I was wondering if D could have a modern version. Looking at MSDN, it seems that Microsoft has added quite a few functions to the Visual Studio C Runtime over the years:

           directory         generates  security enhanced¹

tmpnam_s   current working²  filename   yes
tmpnam     current working²  filename   no
_tempnam   TMP/user defined  filename   no
tmpfile_s  system root       FILE ptr³  yes
tmpfile    system root       FILE ptr³  no

¹ = http://msdn.microsoft.com/en-us/library/8ef0s5kh(v=vs.80).aspx
² = functions that use the current working directory require that you set it to the desired location first, which makes them difficult to use in a library, especially since the cwd is attached to the process and other user threads may not be prepared for sudden changes
³ = these files are automagically deleted when the process exits

It looks like "GetTempFileName" (http://msdn.microsoft.com/en-us/library/windows/desktop/aa364991(v=vs.85).aspx) is the actual operating system function that is used by the C Runtime. I'd like to see a cross-platform solution that uses GetTempFileName on Windows. The result should be usable as a D File. It is currently impossible to create a std.stream wrapped around a File.tmpfile, because it returns a C FILE pointer (and until 2.060, because of the administrator rights issue on Windows).
On Unix, mkstemp (http://pubs.opengroup.org/onlinepubs/009695399/functions/mkstemp.html) looks like a pretty similar function as well. Unlike the Windows equivalent, it returns an open file descriptor/handle in addition to the file name. In both cases the user has to delete the files after use. To me it looks like they are flexible enough to write up a set of cross-platform (Windows/MacOS X/Linux/FreeBSD) functions for temporary files.
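
For illustration, here is a minimal POSIX-only sketch of how mkstemp could be wrapped so that the result is usable as a D File. The helper name openTemp and the "tmp-XXXXXX" template are assumptions made for this example and are not part of Phobos. The descriptor is handed to File.fdopen, and deleting the file remains the caller's job.

```d
// POSIX only: wrap mkstemp so the result is an ordinary D File.
import core.sys.posix.stdlib : mkstemp;
import std.exception : errnoEnforce;
import std.file : tempDir;
import std.path : buildPath;
import std.stdio : File;

/// Hypothetical helper (not a Phobos function): creates and opens a
/// uniquely named file in `dir` and reports its path through `name`.
/// The file is NOT deleted automatically.
File openTemp(out string name, string dir = tempDir())
{
    // mkstemp overwrites the trailing XXXXXX in place, so it needs a
    // mutable, zero-terminated buffer.
    char[] tmpl = (buildPath(dir, "tmp-XXXXXX") ~ '\0').dup;
    immutable fd = mkstemp(tmpl.ptr);
    errnoEnforce(fd != -1, "mkstemp failed");

    name = tmpl[0 .. $ - 1].idup; // drop the terminating zero

    File f;
    f.fdopen(fd, "w+"); // adopt the already open descriptor
    return f;
}

void main()
{
    import std.file : exists, remove;

    string name;
    auto f = openTemp(name);
    scope (exit) remove(name); // the caller cleans up

    f.writeln("scratch data");
    f.close();
    assert(exists(name)); // closing does not delete the file
}
```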

The question is: Is the problem solved with the fixed bug, or is there interest in D functions for temporary files?

-- 
Marco

July 26, 2012
On Thursday, July 26, 2012 03:49:27 Marco Leise wrote:
> The question is: Is the problem solved with the fixed bug, or is there interest in D functions for temporary files?

Of course there is. tmpfile is a horrible function. And actually, pretty much every C function for creating a temp file or even just a temp file name sucks. Most of them are marked as "don't ever use" and the like, and the ones that aren't are still quite poor (e.g. on some systems tmpnam can only generate 26 unique file names with a given prefix). As bad as tmpfile is, at least it _sort of_ works. But it doesn't give you a file name, and it deletes the file when it's closed, both of which make it unusable for a lot of use cases.

I was working on a solution but ran into problems on Windows due to missing C function declarations which were required in order to be able to create a temporary file without introducing a race condition, and I temporarily tabled it. I need to get back to it.

I did create a nice, cross-platform function for generating a random file name which used D's random number generators and created a pull request for it ( https://github.com/D-Programming-Language/phobos/pull/691 ). By default, it puts the file in the directory returned by std.file.tempDir, but you can give it a different directory if you want to.

But there's technically a race condition if you check for the file's existence and then create it if it didn't exist (and generate a new name if it did, check that one for existence, etc.). So, doing

auto file = File(std.file.tempFile());

would have technically had the potential to cause problems (though the file name had enough random characters in it to be as good or better than a UUID for uniqueness, so practically speaking, it was probably okay even if it was theoretically a problem). So, it wasn't merged. If we're going to add something, it will be a function which creates a file with a random name but which doesn't expose the function for generating the random name.
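
To make the race concrete: the usual POSIX way to close the gap between "check" and "create" is to let the operating system do both in one step by opening with O_CREAT | O_EXCL and retrying with a new name on EEXIST. The following is only a sketch under that assumption, not the implementation from the pull request. The names randomName and createTemp, the 16-character name length, and the retry limit are all hypothetical.

```d
// POSIX only: create a randomly named file without a check-then-create race.
import core.stdc.errno : errno, EEXIST;
import core.sys.posix.fcntl : open, O_CREAT, O_EXCL, O_RDWR;
import std.conv : octal;
import std.exception : ErrnoException;
import std.file : tempDir;
import std.path : buildPath;
import std.random : uniform;
import std.stdio : File;
import std.string : toStringz;

/// Hypothetical helper: a random name made of lowercase letters and digits.
string randomName(size_t len = 16)
{
    static immutable chars = "abcdefghijklmnopqrstuvwxyz0123456789";
    auto buf = new char[len];
    foreach (ref c; buf)
        c = chars[uniform(0, $)];
    return buf.idup;
}

/// Hypothetical helper: creates a new file in `dir` and returns it opened.
File createTemp(out string name, string dir = tempDir())
{
    foreach (attempt; 0 .. 100)
    {
        auto candidate = buildPath(dir, randomName());
        // O_EXCL makes open fail with EEXIST if the name is already taken,
        // so the existence check and the creation are one atomic step.
        immutable fd = open(candidate.toStringz,
                            O_RDWR | O_CREAT | O_EXCL, octal!600);
        if (fd != -1)
        {
            name = candidate;
            File f;
            f.fdopen(fd, "w+");
            return f;
        }
        if (errno != EEXIST)
            throw new ErrnoException("cannot create " ~ candidate);
        // Name collision: loop and try another random name.
    }
    throw new Exception("no free temporary file name found");
}

void main()
{
    import std.file : remove;

    string name;
    auto f = createTemp(name);
    f.writeln("hello");
    f.close();
    remove(name); // nothing deletes the file for us
}
```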

I intend to create std.stdio.File.tempFile to replace std.stdio.File.tmpfile. It will create a temporary file using a name generated with the implementation that I put together for generating a random file name and put it in the directory returned by std.file.tempDir (or wherever you tell it), and the File that it returns will be like any other File save for the fact that it generated the name for you. So, it won't get deleted on you or anything annoying like that.

I have a fully working implementation on Linux. I just need to sort out the Windows C function declaration problem before I can create a pull request for it. I expect that it'll be in 2.061.

- Jonathan M Davis
July 27, 2012
On Wed, 25 Jul 2012 19:24:21 -0700, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> […] As bad as tmpfile is, at least it _sort of_ works. But it doesn't give you a file name, and it deletes the file when it's closed, both of which make it unusable for a lot of use cases.
> 
> I was working on a solution but ran into problems on Windows due to missing C function declarations which were required in order to be able to create a temporary file without introducing a race condition, and I temporarily tabled it. I need to get back to it.
> 
> I did create a nice, cross-platform function for generating a random file name which used D's random number generators and created a pull request for it ( https://github.com/D-Programming-Language/phobos/pull/691 ). By default, it puts the file in the directory returned by std.file.tempDir, but you can give it a different directory if you want to.
> 
> But there's technically a race condition if you check for the file's existence and then create it if it didn't exist (and generate a new name if it did, check that one for existence, etc.).
> 
> […]
>
> I have a fully working implementation on Linux. I just need to sort out the Windows C function declaration problem before I can create a pull request for it. I expect that it'll be in 2.061.
> 
> - Jonathan M Davis

I've had to write something up myself, too. But with the closed bug in the Digital Mars C Runtime, I was wondering if there were actual operating system calls that cater for all our needs instead of writing our own functions. And that's where I stumbled upon the similarities between "GetTempFileName" on Windows and "mkstemp" on Posix. What makes me uneasy are the limits of the Windows API. A common wrapper around both would give us:

* free choice of base directory
* optional prefix string (only first 3 letters on Windows)
* can generate unique names (only up to 65,535 on Windows)
* avoids race conditions
* doesn't delete file after program termination

Actually I'd think mkstemp alone, with its random naming scheme and huge limits, is what we want. It returns both the name and an open file descriptor with access for the current user only. In other words, my version(Posix) would be a one-liner ;). It's unfortunate that the Windows API doesn't offer something similarly secure and flexible.

Good luck with the race condition check on Windows!

-- 
Marco

July 27, 2012
P.S.: There is a _mktemp_s in MS CRT, but it allows for only 26 unique names per calling thread, directory and prefix :p

-- 
Marco

July 27, 2012
On Friday, July 27, 2012 17:37:20 Marco Leise wrote:
> P.S.: There is a _mktemp_s in MS CRT, but it allows for only 26 unique names per calling thread, directory and prefix :p

Yes. And some POSIX systems have exactly the same problem with mkstemp. And given how easy it is to write a function which just generates a random file name, I see no reason to deal with nonsense like that. As far as I can tell, _every_ C function for generating either a random file or a random file name has a caveat of some sort.

> Good luck with the race condition check on Windows!

It's easily done with the right function calls. It's just that their declarations are missing. I'll get back to sorting that out soon.

- Jonathan M Davis
July 27, 2012
On Fri, 27 Jul 2012, Jonathan M Davis wrote:

> On Friday, July 27, 2012 17:37:20 Marco Leise wrote:
> > P.S.: There is a _mktemp_s in MS CRT, but it allows for only 26 unique names per calling thread, directory and prefix :p
> 
> Yes. And some POSIX systems have exactly the same problem with mkstemp. And given how easy it is to write a function which just generates a random file name, I see no reason to deal with nonsense like that. As far as I can tell, _every_ C function for generating either a random file or a random file name has a caveat of some sort.

Which has that limit?  I haven't found one yet.
July 27, 2012
On Friday, July 27, 2012 12:44:40 Brad Roberts wrote:
> On Fri, 27 Jul 2012, Jonathan M Davis wrote:
> > On Friday, July 27, 2012 17:37:20 Marco Leise wrote:
> > > P.S.: There is a _mktemp_s in MS CRT, but it allows for only 26 unique names per calling thread, directory and prefix :p
> > 
> > Yes. And some POSIX systems have exactly the same problem with mkstemp. And given how easy it is to write a function which just generates a random file name, I see no reason to deal with nonsense like that. As far as I can tell, _every_ C function for generating either a random file or a random file name has a caveat of some sort.
> 
> Which has that limit?  I haven't found one yet.

Here's one page that talks about it, but I've seen other pages mentioning it when searching previously:

http://www.gnu.org/software/gnulib/manual/html_node/mkstemp.html

I do not believe that any of the platforms that we currently support have that particular problem, but I'm not about to trust an OS function that varies that much from platform to platform when I can easily create a solution which is guaranteed to work across all of the platforms that we support and will do so consistently.

- Jonathan M Davis