using ntfs write_through option to create an efficient unzipped layout

April 21, 2012
Posted by Jay Norwood
Below are measured times for operations on an unzipped 2GB layout. My observation is that using a slightly modified version of std.file.write to create the unzipped files results in a folder that is much more efficient for sequential file system operations. In particular, the ntfs rmdir takes 6 sec to remove the layout written with write-through, versus 161 sec for the layout unzipped by 7zip.

I don't have a clear explanation for why this is faster, but the article below mentions lazy writes by ntfs when write-through is not used, and I suspect that is involved.

http://msdn.microsoft.com/en-us/library/windows/desktop/aa364218(v=vs.85).aspx

All times were measured on a Seagate 7200rpm hard drive on win7-64.

               unzip      rmd2       cpd        rmdir (ntfs)  xcopy (ntfs)
uzp SEQ Par    86 sec     171 sec    21 sec     169 sec       91 sec
uzp NS WT      157 sec    12 sec     13 sec     6 sec         43 sec
uzp NS WT Par  87 sec     16 sec     17 sec     17 sec        48 sec
7zip unzip     127 sec    151 sec    135 sec    161 sec       68 sec
myDefrag       +15 min    3.5 sec    54.3 sec   4.3 sec       90 sec

uzp SEQ Par   uses the current std.file.write, with parallel operations during decompress.
uzp NS WT     uses a modified version of std.file.write: no SEQ (FILE_FLAG_SEQUENTIAL_SCAN), with FILE_FLAG_WRITE_THROUGH added.
uzp NS WT Par is the same as above, but with parallel write operations, 100 files per thread.
myDefrag      uses sortByName to defrag the unzipped folder.
rmd2 is a parallel rmdir, with 100 files per thread.
cpd is a parallel copy from the hard drive to an ssd, with 100 files per thread.
rmdir is the regular file system rmdir /q /s.
xcopy is the ntfs xcopy /q /e /I from the hard drive to an ssd.
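
For readers who want to try the "100 files per thread" idea, here is a minimal sketch of how such batching could look with std.parallelism. It is not the uzp source: the Entry struct and the writeEntries function are hypothetical names, and mapping "100 files per thread" onto a work unit size of 100 is an assumption. It uses the stock std.file.write, so it corresponds to the SEQ case rather than the write-through one.

```d
import std.file : mkdirRecurse, write;
import std.parallelism : parallel;
import std.path : buildPath, dirName;

/// Hypothetical record for one archive member that is already decompressed.
struct Entry
{
    string relPath;       // path relative to the output root
    const(ubyte)[] data;  // decompressed contents
}

/// Writes all entries under root, handing files to worker threads in batches of 100.
void writeEntries(Entry[] entries, string root)
{
    // Create the directories serially so the parallel tasks never race on mkdir.
    foreach (ref e; entries)
        mkdirRecurse(dirName(buildPath(root, e.relPath)));

    // The second argument to parallel() is the work unit size:
    // each task pulls 100 consecutive entries at a time.
    foreach (ref e; parallel(entries, 100))
        write(buildPath(root, e.relPath), e.data);
}
```

Swapping the write call for a write-through variant would give something like the NS WT Par row, again assuming the batching works the way it is guessed here.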

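// Variant of std.file.write that opens the file with FILE_FLAG_WRITE_THROUGH.
// It relies on helpers that are private to std.file (cenforce, useWfuncs,
// writeImpl), so it is meant to live inside a modified copy of std/file.d.
void writeThrough(in char[] name, const void[] buffer)
{
    version(Windows)
    {
        // CreateFile arguments: write access, no sharing, always create or
        // truncate, and write-through instead of lazy cached writes.
        alias TypeTuple!(GENERIC_WRITE, 0, null, CREATE_ALWAYS,
                         FILE_ATTRIBUTE_NORMAL|FILE_FLAG_WRITE_THROUGH,
                         HANDLE.init)
            defaults;
        auto h = useWfuncs
            ? CreateFileW(std.utf.toUTF16z(name), defaults)
            : CreateFileA(toMBSz(name), defaults);

        cenforce(h != INVALID_HANDLE_VALUE, name);
        scope(exit) cenforce(CloseHandle(h), name);
        DWORD numwritten;
        // WriteFile returns a nonzero BOOL on success; also require a full write.
        cenforce(WriteFile(h, buffer.ptr, to!DWORD(buffer.length), &numwritten, null) != 0
                 && buffer.length == numwritten,
                 name);
    }
    else version(Posix)
        // No write-through flag is set here; this is a plain truncating write.
        return writeImpl(name, buffer, O_CREAT | O_WRONLY | O_TRUNC);
}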
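
Since the function above depends on std.file internals, here is a rough standalone sketch of the same idea that should compile in an ordinary program. The name writeThroughSketch is hypothetical, the Win32 declarations are assumed to come from druntime's core.sys.windows.windows, and on non-Windows systems it simply falls back to std.file.write with no write-through behaviour at all.

```d
import std.file : write;

version (Windows)
{
    import core.sys.windows.windows;
    import std.conv : to;
    import std.exception : enforce;
    import std.utf : toUTF16z;
}

/// Creates or truncates `name` and writes `buffer` to it.
/// On Windows the file is opened with FILE_FLAG_WRITE_THROUGH.
void writeThroughSketch(string name, const(void)[] buffer)
{
    version (Windows)
    {
        HANDLE h = CreateFileW(name.toUTF16z, GENERIC_WRITE, 0, null,
                               CREATE_ALWAYS,
                               FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                               null);
        enforce(h != INVALID_HANDLE_VALUE, "cannot create " ~ name);
        scope (exit) CloseHandle(h);

        DWORD written;
        // WriteFile returns nonzero on success; also check for a short write.
        enforce(WriteFile(h, buffer.ptr, buffer.length.to!DWORD, &written, null) != 0
                && written == buffer.length,
                "write failed for " ~ name);
    }
    else
    {
        // Assumption: no attempt is made to emulate write-through elsewhere.
        write(name, buffer);
    }
}
```

A call such as writeThroughSketch("out.bin", data) would then stand in for std.file.write("out.bin", data) in the unzip loop.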