May 22, 2016
On Sun, 22 May 2016 10:55:40 +0000,
Era Scarecrow <rtcvb32@yahoo.com> wrote:

> On Sunday, 22 May 2016 at 10:40:46 UTC, qznc wrote:
> > On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> > > [...]
> >
> > It links to SMAZ:
> >
> > https://github.com/antirez/smaz
> >
> > It uses a codebook primed for English text. Since this is for name mangling, priming for Phobos might be a good idea. E.g., put "std.", "core.", "etc." into the codebook.
> 
>   Curiously, pre-making a few primers like that for specific languages
> can give similar performance with zlib and gzip.

LZ4 can seemingly use a dictionary, too.

-- 
Marco

May 22, 2016
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> The current one is effective, but slow:
>
>   https://github.com/dlang/dmd/blob/master/src/backend/compress.c

Yes, LZO, LZ4, or Snappy would be good choices. They are real-time compression algorithms, i.e. compressing and writing to main memory is faster than writing uncompressed.
Please don't write anything of your own. There is a meta-library for those algorithms, http://blosc.org/, for which I added Deimos headers: http://code.dlang.org/packages/d-blosc.
May 22, 2016
On 5/22/16 6:40 AM, qznc wrote:
> On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
>> The current one is effective, but slow:
>>
>> https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>>
>> Anyone want to make a stab at making it faster? Changing the format is
>> fair game, as well as down and dirty assembler if that's what it takes.
>>
>> So really, how good are you at fast code?
>
> A quick search led to this SO thread:
>
> https://stackoverflow.com/questions/1138345/an-efficient-compression-algorithm-for-short-text-strings
>
>
> It links to SMAZ:
>
> https://github.com/antirez/smaz
>
> It uses a codebook primed for English text. Since this is for name
> mangling, priming for Phobos might be a good idea. E.g., put "std.",
> "core.", "etc." into the codebook.

A primed dictionary is a great idea. -- Andrei
May 22, 2016
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> The current one is effective, but slow:
>
>   https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>
> Anyone want to make a stab at making it faster? Changing the format is fair game, as well as down and dirty assembler if that's what it takes.
>
> So really, how good are you at fast code?

Mandatory goldmine on compression:

http://fastcompression.blogspot.fr/
May 22, 2016
On Sunday, 22 May 2016 at 01:36:39 UTC, Walter Bright wrote:
> Just a note: large lookup tables can have cold cache performance problems.

16 KiB lookup tables are the gold standard.

May 22, 2016
On Sunday, 22 May 2016 at 12:12:09 UTC, Martin Nowak wrote:

>
> Yes, LZO, LZ4, or Snappy would be good choices. They are real-time compression algorithms, i.e. compressing and writing to main memory is faster than writing uncompressed.

Yes, the LZ compressors are very good for this use case.

> Please don't write anything on your own. There is a meta library for those algorithms http://blosc.org/, for which I added Deimos headers http://code.dlang.org/packages/d-blosc.

Agreed. The existing implementations have been fine-tuned for performance, and we should leverage them.

May 22, 2016
⬅ Download proof of concept

This is a proof-of-concept micro-benchmark compressing the 37_640 byte symbol from the bug report linked above with both id_compress from the dmd backend and lz4 (without a dictionary). The lz4 result is additionally transformed to 7-bit bytes, similar to base-64.

Original size: 37640 bytes

id_compress : 5243 ms for 10_000 iterations, 800 bytes
lz4         :   71 ms for 10_000 iterations, 389 bytes

That roughly corresponds to a 2x higher compression ratio at 70x faster compression speed.

-- 
Marco


May 22, 2016
On Sunday, 22 May 2016 at 14:05:23 UTC, Marco Leise wrote:
> ⬅ Download proof of concept
>
> This is a proof of concept micro-benchmark compressing the 37_640 bytes symbol from the bug report linked above with both id_compress from the dmd backend and lz4 (without dictionary). The lz4 result is additionally transformed to 7-bits similar to base-64.
>
> Original size: 37640 bytes
>
> id_compress : 5243 ms for 10_000 iterations, 800 bytes
> lz4         :   71 ms for 10_000 iterations, 389 bytes
>
> That roughly corresponds to a 2x higher compression ratio at 70x faster compression speed.

Please submit a PR.
May 22, 2016
On Sun, 22 May 2016 14:06:52 +0000,
Stefan Koch <uplink.coder@googlemail.com> wrote:

> On Sunday, 22 May 2016 at 14:05:23 UTC, Marco Leise wrote:
> > ⬅ Download proof of concept
> >
> > This is a proof of concept micro-benchmark compressing the 37_640 bytes symbol from the bug report linked above with both id_compress from the dmd backend and lz4 (without dictionary). The lz4 result is additionally transformed to 7-bits similar to base-64.
> >
> > Original size: 37640 bytes
> >
> > id_compress : 5243 ms for 10_000 iterations, 800 bytes
> > lz4         :   71 ms for 10_000 iterations, 389 bytes
> >
> > That roughly corresponds to a 2x higher compression ratio at 70x faster compression speed.
> 
> Please submit a PR.

There are a few open questions, for me at least. Are there other proposals with good properties? Are there any objections to the benchmarking method or the license? Can the implementation be in D? If not, should we link against the system liblz4.so/.a or copy the C code into the backend? Is a pre-filled dictionary worthwhile? If so, which words should go into it? The last question matters because changes to the D mangling need to be reflected in other projects outside our direct control, too.

-- 
Marco

May 22, 2016
On Sunday, 22 May 2016 at 16:06:07 UTC, Marco Leise wrote:

>
> There are a few open questions, for me at least. Are there other proposals with good properties? Are there any objections against the benchmarking method or the license? Can the implementation be in D? If not, should we link against the system liblz4.so/a or copy the C code into the backend? Is a pre-filled dictionary worthwhile? If so, which words should go in it? The latter is important because changes to the Dlang mangling need to be reflected in other projects outside our direct control, too.

Here are my answers; please take them with a grain of salt, as I may be ignorant of some details.

I do not think pre-filling the dictionary will do much good, since LZ is all about finding prefixes anyway.

I am for copying the source code into our source tree; we don't want to depend on the LZ4 library being available on the user's system.
I think re-implementing it in D is also fine, as long as the function is extern(C).
However, I agree with Martin's point: it is a waste of resources to re-implement something that already has a perfectly fine implementation.

The only reason I re-implemented the LZ4 decompression was to use it at CTFE.