Need a Faster Compressor (page 6)

On 5/22/2016 2:57 AM, Era Scarecrow wrote: > Question Walter (or Andrei): Would it be terrible to have lower characters as > part of the identifier string? I'm referring to characters under 32 (control > characters, especially the \r, \n and \t). Currently there's a bug in id_compress() where it doesn't take into account UTF-8 sequences, as a subset of them are allowed as identifiers. I rue the day I allowed that in D, but we're kinda stuck with it. I kinda hate disallowing control characters, because then the compressor gets a bit specific.

On 5/22/2016 3:32 AM, Era Scarecrow wrote: > Curiously the extra 95 symbols actually is just enough to keep compression up > and bring speed to about 37% speed gain, with it looks like no disadvantages. > Then again I still need a larger set to test things on, but it's very promising! You're doing the gods' work, Era!

On 5/22/2016 3:32 AM, Era Scarecrow wrote: > [...] My idea for speeding things up is every time a new character c is matched in pattern[], a bit is set in mask: ulong mask; ... mask |= 1 << (c & 63); Then, when scanning for longer matches, test: if (!((1 << (id[i + matchlen - 1] & 63)) & mask)) i += matchlen; because we know that id[i + matchlen - 1] cannot match any of the characters in the current match. The two expressions look complex, but they should compile to single instructions on x64.

On Sunday, 22 May 2016 at 19:21:13 UTC, Walter Bright wrote: > On 5/22/2016 3:32 AM, Era Scarecrow wrote: >> Curiously the extra 95 symbols actually is just enough to keep compression up and bring speed to about 37% speed gain, with it looks like no disadvantages. > > You're doing the gods' work, Era! Maybe... I have to wonder if LZ4 from the looks of things will take over instead. Have to see how it performs (and if we can integrate it); Meanwhile the main advantage of id_compress (as I'm tinkering with it) is it makes no changes to the signature/behavior so taking advantage of it's boost in speed can be immediate (although it will no doubt have a bit weaker compression, and demangling needs to be re-written for it). Honestly the majority of speed gains with id_compress are simply from having a much smaller window(1023 to 220), also resulting in a smaller footprint for the compressed output (3 bytes to 2 bytes). Compare to the speed gains from my new algorithm (at 20% faster) is pretty constant, and retains a HUGE window (pre-scans 8k, has a 2k window) that it can find matches from, so it will win in compression. If LZ4 is half as good as the initial results are then I'd go with it instead.

May 22, 2016

Re: Need a Faster Compressor

Posted by Walter Bright
in reply to Rainer Schuetze

Permalink

Walter Bright

Posted in reply to Rainer Schuetze

Permalink

On 5/22/2016 10:44 AM, Rainer Schuetze wrote:
> You are right about the symbols using the VC mangling. The test case
> "1.s.s.s.s.s" in the bugreport translated to C++ yields
>
> ?foo@Result@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@Z@@Z@QAEXXZ

Haha, thus showing why people should never invent their own ad-hoc crypto nor compression algorithms :-)


> _ZZN13testexpansion1sIZNS0_IZNS0_IZNS0_IZNS0_IiEEDaT_E6ResultEES1_S2_E6ResultEES1_S2_E6ResultEES1_S2_E6ResultEES1_S2_EN6Result3fooEv

Note "Result" appearing 5 times. It's still crappy compression. Ironically, this design predates the much worse VC++ one.


> which is only 39 characters longer than the compressed version of the D symbol.
> It uses a much smaller character set, though.

The much smaller character set can be done, if desired, by using base64 encoding. The compressor is careful to not emit 0 bytes, as those definitely would screw up bintools, but the rest seem unperturbed by binary strings.

On 5/22/2016 9:06 AM, Marco Leise wrote: > Are there any objections > against the benchmarking method or the license? BSD may cause problems. > Can the implementation be in D? Not yet. > If not, should we link against the > system liblz4.so/a No. > or copy the C code into the backend? Yes. > Is a pre-filled dictionary worthwhile? The trouble with that is the demangler would need the same dictionary, right? If so, that would be problematic.

On Sunday, 22 May 2016 at 19:44:08 UTC, Era Scarecrow wrote: > ... Well here's the rundown of some numbers. min_compress uses a tiny window, big_compress was my original algorithmn but modified to use 16k total for a window. reduced_id_compress is the original except reduced to a 220 window and 2 byte constant output. Along with the compressed outputs of each. min_compress: [TickDuration(46410084)] 0.779836 big_compress: [TickDuration(47998202)] 0.806545 orig_id_compress: [TickDuration(59519257)] baseline reduced_id_compress: [TickDuration(44033192)] 0.739894 1001 (original size) 72 testexpansion.s!(æεó▌æ╗int).så°Resulà≡Ñ╪¢╨╘ÿ¼É ↑─↑╜►╘fñv├ÿ╜ ↑│↑Ä .foo() 73 testexpansion.s!(ÅæεÅó▌Åæ╗int).sÅå°ResulÅà≡ÅÑ╪Å¢╨Å╘ÿÅ¼É₧├ÿÄ╜É╝▓ÿëåâ.foo() 67 tes╤xpansion.s!(ÇææÇóóÇææint)∙⌡ResulÇàÅÇÑºÇ¢»Ç╘τÇ¼∩ë├τü╜∩¢▓τ².foo() 78 testexpansion.s!(æ2óCæ2int).så(Resulà0ÑH¢P╘ê¼É╘íñÖ├ê┤ÿσ║ñ¬├ê¼É╘íñÖ├êÉ1å).foo() min_compress: [TickDuration(29210832)] 0.82391 big_compress: [TickDuration(31058664)] 0.87601 orig_id_compress: [TickDuration(35466130)] baseline reduced_id_compress: [TickDuration(25032532)] 0.705977 629 (original size) 52 E.s!(à·è⌡àδint).så°Resulà≡ÖΣÅ▄╝╝ö┤ líd _Ög læ◄.foo() 61 E.s!(Åà·Åè⌡Åàδint).sÅå°ResulÅà≡ÅÖΣÅÅ▄Å╝╝Åö┤₧ç∞ÄÖΣ¡ó╠ïå╗.foo() 52 E.s!(ΣÇèèΣint)∙⌡ResulÇàÅÇÖ¢ÇÅúÇ╝├Çö╦ëçôüÖ¢Æó│².foo() 52 E.s!(à&è+à&int).så(Resulà0Ö<ÅD╝döl ┤í╝ ┴Ö╣ ┤æ9.foo()

Am Sun, 22 May 2016 13:02:37 -0700 schrieb Walter Bright <newshound2@digitalmars.com>: > On 5/22/2016 9:06 AM, Marco Leise wrote: > > Are there any objections > > against the benchmarking method or the license? > > BSD may cause problems. > > > Can the implementation be in D? > > Not yet. > > > If not, should we link against the > > system liblz4.so/a > > No. > > > or copy the C code into the backend? > > Yes. Can you elaborate on how copying the source overrules your above license concerns? I would create a directory "lz4" for the sources, compile them like we compile zlib now and install dmd with a backend license stating that it uses lz4 licensed under BSD? > > Is a pre-filled dictionary worthwhile? > > The trouble with that is the demangler would need the same dictionary, right? If so, that would be problematic. Right, it's probably more trouble than worth it. -- Marco

Am Sun, 22 May 2016 20:22:06 +0200 schrieb Rainer Schuetze <r.sagitario@gmx.de>: > The 3 consecutive ?s are also in the object file, so they are actually part of the mangled symbol. VC++ uses only ASCII characters for mangling (which D should do, too, IMO). In general I agree. The extended characters may be swallowed and the representation not copyable. In the worst case we may uncover bugs in debugging software. On the other hand, base-64 adds ~17% to the symbol length compared to base-128 and the linker in particular should only be interested in a terminating \0. -- Marco

On 5/22/2016 10:52 PM, Marco Leise wrote: > Can you elaborate on how copying the source overrules your > above license concerns? It does not. It just avoids having to link in a library that may or may not exist on every machine we want to create demanglers on. > I would create a directory "lz4" for the sources, compile > them like we compile zlib now and install dmd with a backend > license stating that it uses lz4 licensed under BSD? How many lines of code are we talking about here? The post you made makes it look like just a handful - why go to all the trouble of creating more directories and build complexity?

Forums