May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Observer | On Saturday, 21 May 2016 at 01:29:18 UTC, Observer wrote:
> My recollection is that successively compiled binaries are rarely directly comparable, because of timestamps embedded by compilers.
> So I have to ask: are there standard tools to understand enough
> of the ELF binary format (or whatever, for the given platform) to compare everything except the timestamps?
I just looked at an ELF format document, and didn't see any
reference to timestamps. Maybe I'm confusing that with compression
formats like gzip produces.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bill Hicks | On Saturday, 21 May 2016 at 01:35:05 UTC, Bill Hicks wrote:
> On Friday, 20 May 2016 at 19:41:16 UTC, Walter Bright wrote:
>> Hashing produces reproducible unique values. Random will not be reproducible and may not even be unique.
>
> Just a troll here, but Quasirandom numbers are reproducible and unique. e.g., Sobol in base 2 can be very efficient and fast, for any output length.
Unless there's a way to seed it consistently so order doesn't matter, it won't work. Although a fast CRC32 on the source could then give you a seed to work with, I'd prefer a hash over PRNG.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Era Scarecrow | On Saturday, 21 May 2016 at 02:16:30 UTC, Era Scarecrow wrote:
>
> Unless there's a way to seed it consistently so order doesn't matter, it won't work. Although a fast CRC32 on the source could then give you a seed to work with, I'd prefer a hash over PRNG.
There is no seed; it's a low-discrepancy sequence, and each element of the sequence can be computed independently.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Thursday, 19 May 2016 at 13:45:18 UTC, Andrei Alexandrescu wrote:
> On 05/19/2016 08:38 AM, Steven Schveighoffer wrote:
>> Yep. chain uses voldemort type, map does not.
>
> We definitely need to fix Voldemort types. Walter and I bounced a few ideas during DConf.
>
> 1. Do the cleartext/compress/hash troika everywhere (currently we do it on Windows). That should help in several places.
>
> 2. For Voldemort types, simply generate a GUID-style random string of e.g. 64 bytes. Currently the name of the type is composed out of its full scope, with some exponentially-increasing repetition of substrings. That would compress well, but it's just a lot of work to produce and then compress the name. A random string is cheap to produce and adequate for Voldemort types (you don't care for their name anyway... Voldemort... get it?).
>
> I very much advocate slapping a 64-long random string for all Voldermort returns and calling it a day. I bet Liran's code will get a lot quicker to build and smaller to boot.
>
>
> Andrei
I have an Idea of reproducible, less-than-exponential time mangling, although I don't know how actionable.
Leave mangling as is, but pretend that you are mangling something different, for example when the input is
foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int))
pretend that you are mangling
foo!(boo!(bar!(baz!(int))), #1, #2)
Where #1 and #2 are special symbols that refer to stuff that was **already in the name**, particularly:
#1: bar!(baz!(int))
#2: baz!(int)
Now, this is an reduction of the exponential computation time only if you can detect repetitions before you go inside them, for example, in case of #1, detect the existence of bar!(baz!(int)) without looking inside it and seeing baz!(int). Do you think it could be done?
You would also need a deterministic way to assign those symbols #1, #2, #3... to the stuff in the name.
Compression can also be done at the end if necessary to reduce the polynomial growth.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to poliklosio | On Saturday, 21 May 2016 at 06:18:05 UTC, poliklosio wrote: > > I have an Idea of reproducible, less-than-exponential time mangling, although I don't know how actionable. > > Leave mangling as is, but pretend that you are mangling something different, for example when the input is > foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int)) > pretend that you are mangling > foo!(boo!(bar!(baz!(int))), #1, #2) > Where #1 and #2 are special symbols that refer to stuff that was **already in the name**, particularly: > #1: bar!(baz!(int)) > #2: baz!(int) See http://forum.dlang.org/post/szodxrizfmufqdkpdryc@forum.dlang.org |
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Friday, 20 May 2016 at 19:37:23 UTC, Andrei Alexandrescu wrote:
> Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- Andrei
Equivalent to not mangling return type at all. But this leaves you with 2^^n growth, still exponential. Also is there any evidence that compression is faster than hashing?
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On Saturday, 21 May 2016 at 17:34:19 UTC, Kagamin wrote:
> On Friday, 20 May 2016 at 19:37:23 UTC, Andrei Alexandrescu wrote:
>> Was talking to Walter on the phone and he just had one of those great ideas: encode in the function mangle that it returns "auto". I thought that was fantastic. Would that work? -- Andrei
>
> Equivalent to not mangling return type at all. But this leaves you with 2^^n growth, still exponential. Also is there any evidence that compression is faster than hashing?
Would it be practical to use type-erasure in the middle of a call chain? It would cut 2^^n right at the knees at the possibly negligible runtime cost.
It would impose a burden on the client (as I don't think a compiler should do it on its own - or even if it could), but if it's really Pareto distributed then it shouldn't be that difficult to find the hot spots.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On 05/21/2016 01:34 PM, Kagamin wrote:
> But this leaves you with 2^^n growth, still exponential
He said that that won't happen any longer, the growth was because of the return type. Is that correct? -- Andrei
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to poliklosio | On 5/20/2016 11:18 PM, poliklosio wrote:
> I have an Idea of reproducible, less-than-exponential time mangling, although I
> don't know how actionable.
>
> Leave mangling as is, but pretend that you are mangling something different, for
> example when the input is
> foo!(boo!(bar!(baz!(int))), bar!(baz!(int)), baz!(int))
> pretend that you are mangling
> foo!(boo!(bar!(baz!(int))), #1, #2)
> Where #1 and #2 are special symbols that refer to stuff that was **already in
> the name**, particularly:
> #1: bar!(baz!(int))
> #2: baz!(int)
This is what LZW compression does, except that LZW does it in general rather than just for identifiers.
|
May 21, 2016 Re: DMD producing huge binaries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Saturday, 21 May 2016 at 18:18:21 UTC, Andrei Alexandrescu wrote: > He said that that won't happen any longer, the growth was because of the return type. Is that correct? https://issues.dlang.org/show_bug.cgi?id=15831#c4 Looks like growth is due to the fact that the voldemort type is in the scope of a function and function is fun!(T).fun(T) - hence each level multiplies by 2. And return type is not part of the mangled name already - that would cause circular dependency, you would need the voldemort type mangling to generate it. |
Copyright © 1999-2021 by the D Language Foundation