Jump to page: 1 210  
Page
Thread overview
Need a Faster Compressor
May 21, 2016
Walter Bright
May 21, 2016
Era Scarecrow
May 21, 2016
Walter Bright
May 22, 2016
Era Scarecrow
May 22, 2016
Walter Bright
May 22, 2016
Era Scarecrow
May 22, 2016
Era Scarecrow
May 22, 2016
Era Scarecrow
May 22, 2016
Walter Bright
May 22, 2016
Walter Bright
May 22, 2016
Era Scarecrow
May 22, 2016
Era Scarecrow
May 22, 2016
Era Scarecrow
May 22, 2016
Stefan Koch
May 22, 2016
Era Scarecrow
May 22, 2016
Marco Leise
May 22, 2016
Era Scarecrow
May 22, 2016
Walter Bright
May 22, 2016
Era Scarecrow
May 23, 2016
Era Scarecrow
May 23, 2016
ZombineDev
May 22, 2016
Walter Bright
May 22, 2016
Walter Bright
May 22, 2016
Walter Bright
May 22, 2016
deadalnix
May 22, 2016
Walter Bright
May 23, 2016
Wyatt
May 23, 2016
Wyatt
May 21, 2016
Timon Gehr
May 21, 2016
Timon Gehr
May 21, 2016
Walter Bright
May 21, 2016
Guillaume Boucher
May 21, 2016
Stefan Koch
May 21, 2016
Walter Bright
May 22, 2016
Rainer Schuetze
May 22, 2016
Era Scarecrow
May 22, 2016
Rainer Schuetze
May 22, 2016
Era Scarecrow
May 22, 2016
Rainer Schuetze
May 23, 2016
Marco Leise
May 22, 2016
Walter Bright
May 23, 2016
Timon Gehr
May 23, 2016
Walter Bright
May 23, 2016
Timon Gehr
May 23, 2016
Walter Bright
May 23, 2016
Stefan Koch
May 24, 2016
Walter Bright
May 24, 2016
Timon Gehr
May 24, 2016
Stefan Koch
May 24, 2016
deadalnix
May 24, 2016
Anon
May 24, 2016
Walter Bright
May 25, 2016
Marco Leise
May 24, 2016
Walter Bright
May 24, 2016
Guillaume Boucher
May 24, 2016
Walter Bright
May 24, 2016
Stefan Koch
May 24, 2016
Walter Bright
May 24, 2016
Stefan Koch
May 24, 2016
Walter Bright
May 24, 2016
Guillaume Boucher
May 21, 2016
Stefan Koch
May 22, 2016
Lionello Lunesu
May 21, 2016
Stefan Koch
May 21, 2016
Walter Bright
May 22, 2016
qznc
May 22, 2016
Era Scarecrow
May 22, 2016
Marco Leise
May 22, 2016
Marco Leise
May 22, 2016
Marco Leise
May 22, 2016
Stefan Koch
May 22, 2016
Marco Leise
May 22, 2016
Stefan Koch
May 22, 2016
Marco Leise
May 22, 2016
Stefan Koch
May 22, 2016
Era Scarecrow
May 22, 2016
Walter Bright
May 23, 2016
Marco Leise
May 23, 2016
Walter Bright
May 23, 2016
Walter Bright
May 23, 2016
Marco Leise
May 23, 2016
Stefan Koch
May 23, 2016
Walter Bright
May 23, 2016
Stefan Koch
May 23, 2016
deadalnix
May 23, 2016
Walter Bright
May 24, 2016
Era Scarecrow
May 24, 2016
Walter Bright
May 24, 2016
Era Scarecrow
Oct 17, 2016
Era Scarecrow
May 23, 2016
Marco Leise
May 22, 2016
Martin Nowak
May 22, 2016
Stefan Koch
May 22, 2016
deadalnix
May 21, 2016
The current one is effective, but slow:

  https://github.com/dlang/dmd/blob/master/src/backend/compress.c

Anyone want to make a stab at making it faster? Changing the format is fair game, as well as down and dirty assembler if that's what it takes.

So really, how good are you at fast code?
May 21, 2016
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> So really, how good are you at fast code?

I recall being able to get a marginal boost in speed by assigning the first 1-2 bytes in preassigned buckets (256 & 65536 respectively), so you have an immediate 1-2 byte matches (skipping the chafe so to speak). Unfortunately memory-wise it's highly inefficient and requires 2 passes over the data, and probably doesn't play well with cache.

I assume this is related to compressing symbols thread? I mentioned possibly considering the LZO library.
May 21, 2016
On 21.05.2016 23:12, Walter Bright wrote:
> The current one is effective, but slow:
>
>    https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>
> Anyone want to make a stab at making it faster? Changing the format is
> fair game, as well as down and dirty assembler if that's what it takes.
>
> So really, how good are you at fast code?

Why is longest_match Ω(nm) instead of O(n+m) (e.g. KMP)?
May 21, 2016
On 21.05.2016 23:37, Timon Gehr wrote:
> On 21.05.2016 23:12, Walter Bright wrote:
>> The current one is effective, but slow:
>>
>>    https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>>
>> Anyone want to make a stab at making it faster? Changing the format is
>> fair game, as well as down and dirty assembler if that's what it takes.
>>
>> So really, how good are you at fast code?
>
> Why is longest_match Ω(nm) instead of O(n+m) (e.g. KMP)?

(Those are worst case bounds, not sure what's better on average on D symbol strings.)
May 21, 2016
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> The current one is effective, but slow:
>
>   https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>
> Anyone want to make a stab at making it faster? Changing the format is fair game, as well as down and dirty assembler if that's what it takes.
>
> So really, how good are you at fast code?

Is LZ4 acceptable ?
If so the current C implementation is hard to beat.
And maybe it's best to just copy it in.
May 21, 2016
On 5/21/2016 2:27 PM, Era Scarecrow wrote:
> I assume this is related to compressing symbols thread?

Yes.


> I mentioned possibly considering the LZO library.

Give it a try, see if it pans out.

Other possibilities are the lookback dictionary is fixed at 1023 characters. Experimenting and benchmarking can indicate if it should be more or less. Perhaps 512 will yield just as good compression on real world identifier strings.

Or try another scheme entirely, like huffman coding.

May 21, 2016
On 5/21/2016 2:37 PM, Timon Gehr wrote:
> Why is longest_match Ω(nm) instead of O(n+m) (e.g. KMP)?

I don't understand the terms you use, but as to the "why" it is based on what I knew about LZ77 compression. I don't pretend to be an expert on compression, and this is what I churned out. As mentioned elsewhere, the C++ mangling scheme has a primitive and ineffective LZ77 scheme built in, so I wouldn't waste time on that.

A quick google search on finding the longest match in a string did not turn up anything obvious.

If you want to throw your hat (i.e. expertise) into the ring and post a faster compressor, please do so!
May 21, 2016
On Saturday, 21 May 2016 at 21:12:15 UTC, Walter Bright wrote:
> The current one is effective, but slow:
>
>   https://github.com/dlang/dmd/blob/master/src/backend/compress.c
>
> Anyone want to make a stab at making it faster? Changing the format is fair game, as well as down and dirty assembler if that's what it takes.
>
> So really, how good are you at fast code?

Could you post a link to code you use for benchmarking ?
May 21, 2016
On 5/21/2016 3:13 PM, Stefan Koch wrote:
> Could you post a link to code you use for benchmarking ?

In the comments:

https://github.com/dlang/dmd/pull/5793
May 21, 2016
On Saturday, 21 May 2016 at 22:07:27 UTC, Walter Bright wrote:
> As mentioned elsewhere, the C++ mangling scheme has a primitive and ineffective LZ77 scheme built in, so I wouldn't waste time on that.

Sorry if I didn't memorize everything in this forum from the last 20 years, can you give a link to some reasoning?

The only thing I found are some statements in https://github.com/dlang/dmd/pull/5793:

> Note that the VC++ and g++ name mangling schemes for C++ each use a similar, but primitive (and fairly ineffective) form of compression. It's like people who invent their own crypto algorithms.

The name mangling in the Itanium C++ ABI is similar to LZ77 but heavily optimized for the typical use case in symbol names.  I don't really see how the general LZ77 would be better.  It's almost as if using information about the actual data leads to a better compression scheme.

> This is what C++ does, and results are pretty poor compression. Scanning to determine the backward references would consume the same order of time, anyway.

It uses a constant amount of time if implemented correctly, which is much faster.

« First   ‹ Prev
1 2 3 4 5 6 7 8 9 10