Jump to page: 1 2
Thread overview
[Issue 1832] reading/writing an archive causes data loss; std.zip horribly broken
Nov 16, 2013
Andrea Fontana
Nov 16, 2013
Andrea Fontana
Nov 17, 2013
Andrea Fontana
Nov 24, 2013
Walter Bright
Dec 06, 2013
Martin Nowak
Dec 06, 2013
Andrea Fontana
Dec 06, 2013
Martin Nowak
Dec 06, 2013
Andrea Fontana
Dec 06, 2013
Andrea Fontana
Jan 12, 2014
Martin Nowak
Jan 12, 2014
Martin Nowak
Jan 12, 2014
Martin Krejcirik
Feb 16, 2014
Andrej Mitrovic
Feb 16, 2014
Andrej Mitrovic
November 16, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832


Andrea Fontana <advmail@katamail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |advmail@katamail.com


--- Comment #1 from Andrea Fontana <advmail@katamail.com> 2013-11-16 15:13:36 PST ---
It seems it's a bug on build() function.

build() function assumes that all ArchiveEntry as expandedData to be
compressed.
So it cycle over elements and compress expandedData to compressedData.

If one of ArchiveEntry is just compressed and never expanded, expandedData will be empty, and build() tries to compress empty data.

I've fixed it in this way:

For each element I check if compressedSize > 0. If true it means that we have compressed data available, so we can copy it from buffer to field .compressedData and we can skip compression for that element (usually it read data from .expandedData, compress it and then copy to .compressedData). Moreover we have not to check for expandedSize or CRC: we just have them from header and we can't rely on expandedData field that is empty.

IMHO std.zip design it's not perfect. Every element copy compressData[] from "global" data[] (so we have memory filled x 2). Moreover it's not useful to store expandedData[] inside every element. It could be memory expensive (hey, we're trying go compress it!), and probably it's just a copy of data we store elsewhere.

I think we should compress data on the fly, and don't store internally expandedData. And that we just have to store compressedData one time.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 16, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #2 from Andrea Fontana <advmail@katamail.com> 2013-11-16 15:54:04 PST ---
Created an attachment (id=1290)
My patch as explained in bug's comment

See previous comment

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 17, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832


Andrei Alexandrescu <andrei@erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrei@erdani.com


--- Comment #3 from Andrei Alexandrescu <andrei@erdani.com> 2013-11-16 16:58:10 PST ---
Please redo your patch as a pull request at https://github.com/D-Programming-Language/. Thanks!

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 17, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #4 from Andrea Fontana <advmail@katamail.com> 2013-11-17 00:53:12 PST ---
(In reply to comment #3)
> Please redo your patch as a pull request at https://github.com/D-Programming-Language/. Thanks!

https://github.com/D-Programming-Language/phobos/pull/1697

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 22, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #5 from github-bugzilla@puremagic.com 2013-11-21 20:04:42 PST ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/95af4e3c8c78020109af75c98ed5478c1b813ba7 Fixed phobos bug 1832

https://github.com/D-Programming-Language/phobos/commit/289c10521ae8f4c36cf00cbf71dac8ae9265809e Merge pull request #1697 from trikko/master

Issue 1832 - reading/writing an archive causes data loss; std.zip horribly broken

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 24, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832


Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


--- Comment #6 from Walter Bright <bugzilla@digitalmars.com> 2013-11-24 15:52:09 PST ---
Resolved for D2, not for D1.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832


Martin Nowak <code@dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |code@dawg.eu
         Resolution|FIXED                       |
           Severity|critical                    |regression


--- Comment #7 from Martin Nowak <code@dawg.eu> 2013-12-05 20:46:34 PST ---
I now get a segfault for the following code.

import std.zip;

void main()
{
    auto a1 = new ZipArchive();
    auto m1 = new ArchiveMember();
    m1.expandedData = new ubyte[](10);
    a1.addMember(m1);
    a1.build();
    auto a2 = new ZipArchive();
    a2.addMember(m1);
    a2.build(); // segfaults
}

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #8 from Andrea Fontana <advmail@katamail.com> 2013-12-06 00:47:18 PST ---
This is another bug. m1 is owned by a1. m1._compressedData refers to data inside a1. So you're mixing data from two different objects/archives. I don't think it's a good idea. When you build a2, it tries to read compressed data from a2 data, but it should read from a1.

There's a simply workaround, but again, there's a problem on library design. If m1 is just compressed, we can read content it has inside _compressedData, if any. It filled it before with right reference to a1 data. This solve your bug, but probalby don't solve the mixing problem and of course it's not an elegant solution!


(In reply to comment #7)
> I now get a segfault for the following code.
> 
> import std.zip;
> 
> void main()
> {
>     auto a1 = new ZipArchive();
>     auto m1 = new ArchiveMember();
>     m1.expandedData = new ubyte[](10);
>     a1.addMember(m1);
>     a1.build();
>     auto a2 = new ZipArchive();
>     a2.addMember(m1);
>     a2.build(); // segfaults
> }

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #9 from Martin Nowak <code@dawg.eu> 2013-12-06 01:07:58 PST ---
(In reply to comment #8)
> This is another bug. m1 is owned by a1. m1._compressedData refers to data inside a1. So you're mixing data from two different objects/archives. I don't think it's a good idea. When you build a2, it tries to read compressed data from a2 data, but it should read from a1.
>
I see.

> There's a simply workaround, but again, there's a problem on library design. If m1 is just compressed, we can read content it has inside _compressedData, if any. It filled it before with right reference to a1 data. This solve your bug, but probalby don't solve the mixing problem and of course it's not an elegant solution!
>
I think m1._compressedData is simply a slice to (constant) GC memory which was allocated when reading a1. So here shouldn't be any problem to copy this data into a new zip archive.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2013
https://d.puremagic.com/issues/show_bug.cgi?id=1832



--- Comment #10 from Andrea Fontana <advmail@katamail.com> 2013-12-06 01:22:46 PST ---
(In reply to comment #9)
> (In reply to comment #8)
> > This is another bug. m1 is owned by a1. m1._compressedData refers to data inside a1. So you're mixing data from two different objects/archives. I don't think it's a good idea. When you build a2, it tries to read compressed data from a2 data, but it should read from a1.
> >
> I see.
> 
> > There's a simply workaround, but again, there's a problem on library design. If m1 is just compressed, we can read content it has inside _compressedData, if any. It filled it before with right reference to a1 data. This solve your bug, but probalby don't solve the mixing problem and of course it's not an elegant solution!
> >
> I think m1._compressedData is simply a slice to (constant) GC memory which was allocated when reading a1. So here shouldn't be any problem to copy this data into a new zip archive.

Yes it is. But we're going to use data compressed/stored in/owned by another archive. That seems a mistake to me. I know we can do it, of course. Tonight I'll fix it then.

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
« First   ‹ Prev
1 2