May 22, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html It doesn't look too difficult. If we implement our own LZ4 compressor from scratch, based on that, we can Boost-license it. |
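To give a sense of how small the block format is, here is a minimal sketch of a decoder written from the format description alone. It is an illustration, not the reference implementation or anything proposed for dmd: the function name `lz4_block_decompress` is hypothetical, Python is used only for brevity, and it handles a single raw block with no frame header and only basic error checking.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header). Illustrative sketch only."""
    out = bytearray()
    i = 0
    n = len(src)
    while i < n:
        token = src[i]
        i += 1

        # High nibble: literal length. 15 means more length bytes follow;
        # each is added, and a byte below 255 ends the run.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                b = src[i]
                i += 1
                lit_len += b
                if b != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len

        # The last sequence of a block carries literals only.
        if i >= n:
            break

        # Match offset: 2 bytes, little-endian, counted back from the
        # current end of the output. Zero is invalid.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4 (the minimum match length),
        # extended the same way as the literal length.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                b = src[i]
                i += 1
                match_len += b
                if b != 255:
                    break
        match_len += 4

        # Copy byte by byte so that overlapping matches (offset < length)
        # repeat the most recent output, as the format requires.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)
```

A hand-built block for trying it out: token `0x35` (3 literals, match length 5 + 4), the literals `abc`, offset 3, then a final literals-only sequence `0x50` followed by `xyz!!`.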
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sun, 22 May 2016 23:42:33 -0700, Walter Bright <newshound2@digitalmars.com> wrote: > The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html > > It doesn't look too difficult. If we implement our own LZ4 compressor based on that, from scratch, we can boost license it. Ok, any volunteers? I'm not personally looking forward to reimplementing lz4 from scratch right now. As Stefan Koch said, we should be able to use the existing optimized code, like gcc uses gmp. -- Marco |
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Era Scarecrow | On Monday, 23 May 2016 at 01:46:40 UTC, Era Scarecrow wrote:
> On Sunday, 22 May 2016 at 19:44:08 UTC, Era Scarecrow wrote:
>> ...
>
> Well, here's the rundown of some numbers. min_compress uses a tiny window; big_compress was my original algorithm, but modified to use 16k total for a window; reduced_id_compress is the original, except reduced to a 220 window and 2 byte constant output. Along with the compressed outputs of each.
>
> min_compress: [TickDuration(46410084)] 0.779836
> big_compress: [TickDuration(47998202)] 0.806545
> orig_id_compress: [TickDuration(59519257)] baseline
> reduced_id_compress: [TickDuration(44033192)] 0.739894
> 1001 (original size)
>
> 72 testexpansion.s!(æεó▌æ╗int).så°Resulà≡Ñ╪¢╨╘ÿ¼É ↑─↑╜►╘fñv├ÿ╜ ↑│↑Ä .foo()
> 73 testexpansion.s!(ÅæεÅó▌Åæ╗int).sÅå°ResulÅà≡ÅÑ╪Å¢╨Å╘ÿżÉ₧├ÿÄ╜É╝▓ÿëåâ.foo()
> 67 tes╤xpansion.s!(ÇææÇóóÇææint)∙⌡ResulÇàÅÇѺǢ»Ç╘τǼ∩ë├τü╜∩¢▓τ².foo()
> 78 testexpansion.s!(æ2óCæ2int).så(Resulà0ÑH¢P╘ê¼É╘íñÖ├ê┤ÿσ║ñ¬├ê¼É╘íñÖ├êÉ1å).foo()
>
> min_compress: [TickDuration(29210832)] 0.82391
> big_compress: [TickDuration(31058664)] 0.87601
> orig_id_compress: [TickDuration(35466130)] baseline
> reduced_id_compress: [TickDuration(25032532)] 0.705977
> 629 (original size)
>
> 52 E.s!(à·è⌡àδint).så°Resulà≡ÖΣÅ▄╝╝ö┤ líd _Ög læ◄.foo()
> 61 E.s!(Åà·Åè⌡Åàδint).sÅå°ResulÅà≡ÅÖΣÅÅ▄Å╝╝Åö┤₧ç∞ÄÖΣ¡ó╠ïå╗.foo()
> 52 E.s!(ΣÇèèΣint)∙⌡ResulÇàÅÇÖ¢ÇÅúÇ╝├Çö╦ëçôüÖ¢Æó│².foo()
> 52 E.s!(à&è+à&int).så(Resulà0Ö<ÅD╝döl ┤í╝ ┴Ö╣ ┤æ9.foo()
If you want, you can give the test case for issue 16039 a try. It produces a 300-400 MB binary, so it should be a nice test case :P
|
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Sun, 22 May 2016 23:42:33 -0700, Walter Bright <newshound2@digitalmars.com> wrote: > > The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html > > It doesn't look too difficult. If we implement our own LZ4 compressor based on that, from scratch, we can boost license it. That is right. It's pretty simple. On Monday, 23 May 2016 at 07:30:00 UTC, Marco Leise wrote: > > Ok, any volunteers? Well, I am not a compression expert, but I am already working on optimizing the decompressor. The method for achieving perfect compression is outlined here: https://github.com/Cyan4973/lz4/issues/183 |
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Era Scarecrow | On Saturday, 21 May 2016 at 21:27:37 UTC, Era Scarecrow wrote: > > I assume this is related to compressing symbols thread? I mentioned possibly considering the LZO library. Maybe consider lz4 instead? Tends to be a bit faster, and it's BSD instead of GPL. https://cyan4973.github.io/lz4/ -Wyatt |
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Wyatt | On Monday, 23 May 2016 at 14:47:31 UTC, Wyatt wrote:
> Maybe consider lz4 instead?
Disregard that: I see it's come up already.
|
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On 5/23/2016 5:04 AM, Stefan Koch wrote:
> On Sun, 22 May 2016 23:42:33 -0700,
> Walter Bright <newshound2@digitalmars.com> wrote:
>>
>> The file format: http://cyan4973.github.io/lz4/lz4_Block_format.html
>>
>> It doesn't look too difficult. If we implement our own LZ4 compressor based on
>> that, from scratch, we can boost license it.
>
> That is right. It's pretty simple.
Also, the LZ4 compressor posted here has a 64K string limit, which won't work for D because there are reported 8 MB identifier strings.
|
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote:
>
> Also, the LZ4 compressor posted here has a 64K string limit, which won't work for D because there are reported 8 MB identifier strings.
This is only partially true.
The 64k limit does not apply to the input string. It only applies to the dictionary. It would only be hit if we found 64k of identifier without repetition.
|
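The distinction drawn above can be sketched in a few lines. In the LZ4 block format a match offset is stored in 2 bytes, so a match can reach at most 65535 bytes back from the current position, while the input itself may be any length. The brute-force search below is only an illustration of that sliding window: `longest_match`, `MAX_DISTANCE` and `MIN_MATCH` are hypothetical names, and a real compressor would use a hash table rather than scanning every candidate.

```python
MAX_DISTANCE = 65535  # largest value a 2-byte LZ4 match offset can hold
MIN_MATCH = 4         # shortest match the block format can encode


def longest_match(data: bytes, pos: int, max_distance: int = MAX_DISTANCE):
    """Return (offset, length) of the longest match for data[pos:] that
    starts at most max_distance bytes before pos, or None if there is no
    match of at least MIN_MATCH bytes. Brute force, for illustration."""
    best = None
    best_len = MIN_MATCH - 1
    lowest = max(0, pos - max_distance)
    # Scan candidates from nearest to farthest; the window slides with pos,
    # so the total input length never matters, only the distance back.
    for cand in range(pos - 1, lowest - 1, -1):
        length = 0
        while (pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_len = length
            best = (pos - cand, length)
            if pos + length == len(data):
                break  # matched to the end of the input; cannot improve
    return best
```

With this sketch, an input far larger than 64k still finds matches as long as the repetition is nearby, for example `longest_match(b"foo.bar!" * 20000, 100000)` finds a match 8 bytes back. A repeat that lies more than 65535 bytes behind the current position is simply not seen, and those bytes would be emitted as literals.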
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Monday, 23 May 2016 at 16:00:20 UTC, Stefan Koch wrote:
> On Monday, 23 May 2016 at 15:33:45 UTC, Walter Bright wrote:
>>
>> Also, the LZ4 compressor posted here has a 64K string limit, which won't work for D because there are reported 8 MB identifier strings.
>
> This is only partially true.
> The 64k limit does not apply to the input string. It only applies to the dictionary. It would only be hit if we found 64k of identifier without repetition.
If you want speed: 16k (aka half of the L1D).
|
May 23, 2016 Re: Need a Faster Compressor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Mon, 23 May 2016 12:04:48 +0000, Stefan Koch <uplink.coder@googlemail.com> wrote: > The method for achieving perfect compression is outlined here: https://github.com/Cyan4973/lz4/issues/183 Nice, if it can keep the compression speed up. -- Marco |