Thread overview
std.zip size limit of 2 GB?
Feb 15, 2018
Andre Pany
Feb 15, 2018
Tony
Feb 15, 2018
ag0aep6g
Feb 16, 2018
Andre Pany
Feb 16, 2018
Ali Çehreli
Feb 15, 2018
Stefan Koch
February 15, 2018
Hi,

I just noticed that std.zip will throw an exception if the source files exceed 2 GB.
I am not sure whether this is a limitation of zip version 20 or a bug. On Wikipedia a
size limit of 4 GB is mentioned. Should I open an issue?

Windows 10 with x86_64 architecture.

core.exception.RangeError@std\zip.d(808): Range violation
----------------
0x00007FF7C9B1705C in d_arrayboundsp
0x00007FF7C9B301FF in @safe void std.zip.ZipArchive.putUshort(int, ushort)
0x00007FF7C9B2E634 in void[] std.zip.ZipArchive.build()

    void zipFolder(string archiveFilePath, string folderPath)
    {
        import std.file, std.path, std.zip; // std.path provides baseName

        ZipArchive zip = new ZipArchive();
        string folderName = folderPath.baseName;

        // Add every regular file below folderPath to the archive.
        foreach(entry; dirEntries(folderPath, SpanMode.depth))
        {
            if (!entry.isFile)
                continue;

            ArchiveMember am = new ArchiveMember();
            // Store the path relative to folderPath as the member name.
            am.name = entry.name[folderPath.length + 1..$];
            am.expandedData(cast(ubyte[]) read(entry.name));
            zip.addMember(am);
        }

        void[] compressed_data = zip.build(); // zip.build() will throw
        write(archiveFilePath, compressed_data);
    }

Kind regards
André
February 15, 2018
On 2/15/18 6:56 AM, Andre Pany wrote:
> Hi,
> 
> I just noticed that std.zip will throw an exception if the source files exceed 2 GB.
> I am not sure whether this is a limitation of zip version 20 or a bug. On Wikipedia a
> size limit of 4 GB is mentioned. Should I open an issue?
> 
> Windows 10 with x86_64 architecture.
> 
> core.exception.RangeError@std\zip.d(808): Range violation
> ----------------
> 0x00007FF7C9B1705C in d_arrayboundsp
> 0x00007FF7C9B301FF in @safe void std.zip.ZipArchive.putUshort(int, ushort)
> 0x00007FF7C9B2E634 in void[] std.zip.ZipArchive.build()
> 
>      void zipFolder(string archiveFilePath, string folderPath)
>      {
>          import std.zip, std.file;
> 
>          ZipArchive zip = new ZipArchive();
>          string folderName = folderPath.baseName;
> 
>          foreach(entry; dirEntries(folderPath, SpanMode.depth))
>          {
>              if (!entry.isFile)
>                  continue;
> 
>              ArchiveMember am = new ArchiveMember();
>              am.name = entry.name[folderPath.length + 1..$];
>              am.expandedData(cast(ubyte[]) read(entry.name));
>              zip.addMember(am);
>          }
> 
>          void[] compressed_data = zip.build(); // zip.build() will throw
>          write(archiveFilePath, compressed_data);
>      }
> 
> Kind regards
> André

I think it's inherent in the zlib API. I haven't used all of the library, but the portion I did use (using zstream) uses uint for buffer sizes.

-Steve
February 15, 2018
On Thursday, 15 February 2018 at 18:49:55 UTC, Steven Schveighoffer wrote:

>
> I think it's inherent in the zlib API. I haven't used all of the library, but the portion I did use (using zstream) uses uint for buffer sizes.
>

Wouldn't using a uint for buffer size give a size limit of greater than 4GB? Seems like an int is in the mix somewhere.

February 15, 2018
On 02/15/2018 10:20 PM, Tony wrote:
> Wouldn't using a uint for buffer size give a size limit of greater than 4GB? Seems like an int is in the mix somewhere.

uint gives 4 GB, int gives 2 GB.
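
For reference, the two limits can be checked with a minimal sketch like the one below. The names intLimit, uintLimit and GiB are made up for this illustration and are not part of std.zip or zlib.

```d
import std.stdio : writefln;

// Illustrative constants only (hypothetical names, not from std.zip).
enum ulong GiB       = 1024UL * 1024 * 1024;
enum ulong intLimit  = int.max;   // 2^31 - 1
enum ulong uintLimit = uint.max;  // 2^32 - 1

// Checked at compile time: int tops out just below 2 GiB, uint just below 4 GiB.
static assert(intLimit  == 2 * GiB - 1);
static assert(uintLimit == 4 * GiB - 1);

void main()
{
    writefln("int.max  = %s bytes (just under 2 GiB)", intLimit);
    writefln("uint.max = %s bytes (just under 4 GiB)", uintLimit);
}
```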
February 15, 2018
On 2/15/18 4:20 PM, Tony wrote:
> On Thursday, 15 February 2018 at 18:49:55 UTC, Steven Schveighoffer wrote:
> 
>>
>> I think it's inherent in the zlib API. I haven't used all of the library, but the portion I did use (using zstream) uses uint for buffer sizes.
>>
> 
> Wouldn't using a uint for buffer size give a size limit of greater than 4GB? Seems like an int is in the mix somewhere.
> 

You meant 2GB, I think.

And you are right. I looked into it a bit: this has nothing to do (superficially) with zlib; it has to do with std.zip:

https://github.com/dlang/phobos/blob/0107a6ee09072bda9e486a12caa148dc7af7bb08/std/zip.d#L806

Really, the index i should be size_t in all places; I can't see why it should ever be int.
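
As a rough illustration of why an int index is a problem here, consider the sketch below. It is not the Phobos source: putUshort is a hypothetical free function that only borrows the name from the stack trace, and wrapsNegative is a made-up helper. It shows a little-endian 16-bit write with a size_t index, and how an offset just past int.max turns negative once it is narrowed to int, which is plausibly how the reported RangeError arises.

```d
// Hypothetical helper, not the Phobos implementation: writes a 16-bit
// value in little-endian order at byte offset i of a buffer.
void putUshort(ubyte[] data, size_t i, ushort value)
{
    data[i]     = cast(ubyte) (value & 0xFF); // low byte first
    data[i + 1] = cast(ubyte) (value >> 8);   // then the high byte
}

// Hypothetical helper: reports whether an offset becomes negative
// when it is squeezed into an int.
bool wrapsNegative(ulong offset)
{
    int narrowed = cast(int) offset; // keeps only the low 32 bits
    return narrowed < 0;
}

void main()
{
    auto buf = new ubyte[4];
    putUshort(buf, 2, 0x1234);
    assert(buf[2] == 0x34 && buf[3] == 0x12);

    assert(!wrapsNegative(int.max));      // 2 GiB - 1 still fits
    assert(wrapsNegative(int.max + 1UL)); // 2 GiB wraps to int.min
    // A negative int used as an array index is converted to a huge
    // size_t value, which bounds checking rejects with a RangeError.
}
```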

Please file an issue.

-Steve
February 15, 2018
On Thursday, 15 February 2018 at 11:56:04 UTC, Andre Pany wrote:
> Hi,
>
> I just noticed that std.zip will throw an exception if the source files exceed 2 GB.
> I am not sure whether this is a limitation of zip version 20 or a bug. On Wikipedia a
> size limit of 4 GB is mentioned. Should I open an issue?
>
> [...]

It was partially changed in this PR: https://github.com/dlang/phobos/pull/2914/files
That the put methods were left at int must have been an oversight.
February 16, 2018
On Thursday, 15 February 2018 at 21:57:23 UTC, Steven Schveighoffer wrote:
> Really, i should be size_t in all places, I can't see why it should ever be int.
>
> Please file an issue.
>
> -Steve

Issue created: https://issues.dlang.org/show_bug.cgi?id=18452
Thanks for the analysis.

Kind regards
André
February 16, 2018
On 02/15/2018 01:57 PM, Steven Schveighoffer wrote:

> Really, i should be size_t in all places

size_t or ulong? size_t would constrain 32-bit systems unless they can't handle files over 2G.

Ali

February 16, 2018
On 2/16/18 5:39 PM, Ali Çehreli wrote:
> On 02/15/2018 01:57 PM, Steven Schveighoffer wrote:
> 
>  > Really, i should be size_t in all places
> 
> size_t or ulong? size_t would constrain 32-bit systems unless they can't handle files over 2G.

The code I linked to writes to an array. So it's constrained to size_t.
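
A small sketch of that constraint, assuming nothing beyond the language itself: array lengths and indices in D are size_t, which is as wide as a pointer on the target, while ulong is always 64 bits.

```d
import std.stdio : writefln;

// The length of a D array is a size_t, so an in-memory buffer can never
// be indexed beyond size_t.max, whatever type the caller passes around.
static assert(is(typeof((ubyte[]).init.length) == size_t));

// size_t matches the pointer width: 4 bytes on 32-bit targets, 8 on 64-bit.
static assert(size_t.sizeof == (void*).sizeof);

void main()
{
    writefln("size_t is %s bits; largest array index is %s",
             size_t.sizeof * 8, size_t.max);
    writefln("ulong is always %s bits", ulong.sizeof * 8);
}
```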

I think the zlib library itself doesn't handle anything larger than 4 GB very well.

-Steve