[Issue 18017] [External] [DMC] File.size() uses a 32-bit signed integer for size internally (gives wrong results for files over ≈2.1 GB)
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

Steven Schveighoffer <schveiguy@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy@yahoo.com
           Hardware|x86_64                      |x86
            Summary|File.size() uses a 32-bit   |[External] [DMC]
                   |signed integer for size     |File.size() uses a 32-bit
                   |internally (gives wrong     |signed integer for size
                   |results for files over ≈2.1 |internally (gives wrong
                   |GB)                         |results for files over ≈2.1
                   |                            |GB)

--- Comment #1 from Steven Schveighoffer <schveiguy@yahoo.com> ---
The issue here is that DMC's 32-bit ftell returns a 32-bit signed value, and this translates to int.min here.

Then Phobos converts that to an unsigned long (64-bit).

The workaround is to simply use the 64-bit C runtime (dmd -m64), which should work
properly.

But until DMC's clib can support 64-bit ftell, D can't do anything about this. Sure, we can treat values from int.min to -2 as unsigned, but that doesn't help with 5 GB files, for instance.

One thing we *could* do is throw an error. But I'm not sure that's a "solution". Nor am I sure that this workaround would work for files that are larger than uint.max.
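
To make the arithmetic above concrete, here is a minimal sketch of what happens when a 32-bit signed offset is widened to a 64-bit unsigned value. The helper names (widenTellResult, widenAsUnsigned, truncateTo32) are hypothetical and exist only for this illustration; they are not the Phobos implementation.

```d
import std.stdio : writeln;

// Hypothetical helper: widens a 32-bit signed offset to ulong the way an
// implicit int -> ulong conversion does (the sign bit is extended).
ulong widenTellResult(int tell32)
{
    return tell32;
}

// Hypothetical helper: reinterprets the same 32 bits as unsigned first.
ulong widenAsUnsigned(int tell32)
{
    return cast(uint) tell32;
}

// Hypothetical helper: keeps only the low 32 bits of a real 64-bit offset,
// which is all a 32-bit ftell has room to report.
int truncateTo32(long offset)
{
    return cast(int) offset;
}

void main()
{
    writeln(widenTellResult(int.min));      // 18446744071562067968
    writeln(widenAsUnsigned(int.min));      // 2147483648
    writeln(truncateTo32(5_368_709_120L));  // 1073741824 (a 5 GiB offset)
}
```

The last line shows why reinterpreting negative values as unsigned is not enough: once the offset exceeds uint.max, the truncated value is positive and cannot be told apart from a genuine 1 GiB offset.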

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

Steven Schveighoffer <schveiguy@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #2 from krzaq <issues.dlang.org.kq.ajsx@krzaq.cc> ---
Getting -m64 to work requires non-zero effort and even then isn't hassle-free. I used this as a workaround instead:

ulong getFileSize(const string name)
{
    import std.utf : toUTF16z;
    import core.sys.windows.windows;

    // Open the file only to query its size.
    HANDLE hFile = CreateFileW(name.toUTF16z, GENERIC_READ,
        FILE_SHARE_READ, null, OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL, null);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        // error condition (the original -1 converts to ulong.max),
        // could call GetLastError to find out more
        return ulong.max;
    }
    scope (exit) CloseHandle(hFile); // close the handle on every return path

    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size))
    {
        return ulong.max; // error condition, could call GetLastError to find out more
    }

    return size.QuadPart;
}

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #3 from Steven Schveighoffer <schveiguy@yahoo.com> ---
Nevertheless, it's still a bug, as File.size using ulong as its return type seems to suggest it can handle it.

Note, there's also std.file.getSize: https://dlang.org/phobos/std_file.html#getSize if you aren't actually reading anything in the file.
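
A minimal usage sketch of std.file.getSize, for anyone who only needs the size and never opens the file. The scratch file name is a made-up placeholder for this demonstration, and the file is small here only so the example runs quickly.

```d
import std.file : getSize, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

void main()
{
    // Hypothetical scratch file, created only for this demonstration.
    immutable path = buildPath(tempDir, "issue18017_getsize_demo.bin");
    write(path, new ubyte[](4096));
    scope (exit) remove(path);

    // getSize takes a file name, not an open File, and returns a ulong.
    ulong size = getSize(path);
    writeln(size); // 4096
}
```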

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #4 from krzaq <issues.dlang.org.kq.ajsx@krzaq.cc> ---
std.file.getSize works correctly in my case.

The thing is, I read this big file (using struct File and then byChunk) and I am getting all the data correctly; only the size returned by File.size is wrong. That's why I suggested using the raw WinAPI for this (at least on Win32), but std.file.getSize might work as well.
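
As a rough illustration of the reading pattern described above, the sketch below counts bytes with byChunk and prints the result next to File.size. The function name countBytes and the scratch file are hypothetical; on a runtime without the 32-bit limit the two numbers should agree, and the file here is far too small to trigger the bug.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

// Hypothetical helper: sums the lengths of all chunks read from the file.
ulong countBytes(string path)
{
    ulong total = 0;
    foreach (ubyte[] chunk; File(path, "rb").byChunk(64 * 1024))
        total += chunk.length;
    return total;
}

void main()
{
    // Hypothetical scratch file, created only for this demonstration.
    immutable path = buildPath(tempDir, "issue18017_chunks_demo.bin");
    write(path, new ubyte[](100_000));
    scope (exit) remove(path);

    writeln(countBytes(path));       // bytes actually read
    writeln(File(path, "rb").size);  // size as reported by File.size
}
```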

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #5 from Steven Schveighoffer <schveiguy@yahoo.com> ---
std.file.getSize works because it *does* use WinAPI directly.

std.stdio.File is based completely on libc's FILE * structure. It can only support whatever that supports, and that isn't very much. In the case of 32-bit windows, the library it uses is Digital Mars' C runtime, which has some difficult limitations, this being one of them.

A potential fix here is to get the handle directly from the FILE * and query it using WinAPI. But this doesn't fix File.tell(), which is going to use the libc version.
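
A rough sketch of that idea, assuming File.windowsHandle returns a HANDLE that GetFileSizeEx accepts. The function name sizeViaHandle is hypothetical, this is not what Phobos does today, and it does nothing about File.tell(). Data still sitting in the C runtime's write buffer may not be reflected in the result.

```d
import std.stdio : File, writeln;

version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical helper: asks the OS for the size of an already open File.
    ulong sizeViaHandle(File f)
    {
        LARGE_INTEGER size;
        if (!GetFileSizeEx(f.windowsHandle, &size))
            throw new Exception("GetFileSizeEx failed");
        return cast(ulong) size.QuadPart;
    }
}

void main(string[] args)
{
    version (Windows)
    {
        // Pass a file name on the command line to try it out.
        if (args.length > 1)
            writeln(sizeViaHandle(File(args[1], "rb")));
    }
}
```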

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

kinke@gmx.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kinke@gmx.net

--- Comment #6 from kinke@gmx.net ---
(In reply to Steven Schveighoffer from comment #5)
> std.stdio.File is based completely on libc's FILE * structure. It can only support whatever that supports, and that isn't very much. In the case of 32-bit windows, the library it uses is Digital Mars' C runtime, which has some difficult limitations, this being one of them.
> 
> A potential fix here is to get the handle directly from the FILE * and query it using WinAPI. But this doesn't fix File.tell(), which is going to use the libc version.

Then `-m32mscoff` is another option for Win32.

--
November 28, 2017
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #7 from Steven Schveighoffer <schveiguy@yahoo.com> ---
(In reply to kinke from comment #6)
> Then `-m32mscoff` is another option for Win32.

Yeah, if that defines CRuntime_Microsoft, then it should work. I admit I'm not too familiar with Windows dmd development, and haven't really tried all of these options.

Appropriate version switch is here: https://github.com/dlang/phobos/blob/master/std/stdio.d#L1142
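
To check which C runtime a given set of compiler flags selects, a small sketch like the following can help. CRuntime_Microsoft and CRuntime_DigitalMars are predefined version identifiers; the cRuntime constant is a name made up for this example.

```d
import std.stdio : writeln;

// Pick a label at compile time based on the predefined C runtime versions.
version (CRuntime_Microsoft)
    enum cRuntime = "Microsoft";
else version (CRuntime_DigitalMars)
    enum cRuntime = "DigitalMars";
else
    enum cRuntime = "other";

void main()
{
    writeln("C runtime: ", cRuntime);
}
```

Building this once with each switch under discussion (the default 32-bit build, -m32mscoff, -m64) shows which branch of the Phobos version switch would apply.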

--
December 17, 2022
https://issues.dlang.org/show_bug.cgi?id=18017

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P3

--
December 01
https://issues.dlang.org/show_bug.cgi?id=18017

--- Comment #8 from dlangBugzillaToGithub <robert.schadek@posteo.de> ---
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/phobos/issues/10271

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB

--