August 28, 2007
Why doesn't DMD create any redundant symbols?
This is a problem that comes up for me again and again in making DSSS 
work everywhere. When DMD is being used to compile several modules with 
-c, it never creates any redundant data, and it also doesn't mark any 
data which could be redundant as common as far as I can tell. This means 
that DSSS has to build one file at a time with DMD. This makes certain 
obnoxious people complain about DSSS being slow, because it takes an 
incredible ten seconds to compile a fairly large library. When I 
switched it to compiling multiple files simultaneously, it took <1 
second, but the result was wrong for reasons described below.

When DMD comes across typeinfo (for example), it only puts the typeinfo 
symbol into one .o file it is generating, even if it's used within 
several. On the surface, this seems like a good idea, but in reality it 
causes a whole slew of problems with bogus intermodule dependencies. 
With this, foo.io.output could arbitrarily depend on foo.net.ipvsix.udp, 
because some piece of typeinfo was put there.

First, libraries. I don't know precisely how .lib files work on Windows, 
but linking .a files will pick-and-choose only those .o files that are 
used. With these bogus inter-module dependencies, it will often be 
forced to drag in the whole library, even though only a small chunk of 
it is actually necessary. This just causes big binaries, except when 
libraries have conditional dependencies - if module foo.a depends on 
another library, but module foo.b does not, it is now unpredictable 
which libraries are necessary. Oof.
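
To make the library problem concrete, here is a toy model of how a linker picks members out of a .a file. It is only a sketch: the member names, the symbol names (TypeInfo_X, libsocket_connect) and the link() helper are all made up for illustration, and real linkers are more involved than this.

```python
# Toy model of static-archive linking (hypothetical names throughout).
# A member (.o file) is pulled in only if it defines a symbol that is
# currently undefined; pulling it in adds its own needs to the list.

def link(roots, archive):
    """roots: symbols the program needs.
    archive: {member: (symbols it defines, symbols it needs)}.
    Returns (members pulled in, symbols still unresolved)."""
    undefined = set(roots)
    defined = set()
    pulled = []
    changed = True
    while changed:
        changed = False
        for member, (defines, needs) in archive.items():
            if member in pulled:
                continue
            if undefined & defines:
                pulled.append(member)
                defined |= defines
                undefined = (undefined | needs) - defined
                changed = True
    return pulled, undefined

# Built all at once: the typeinfo happened to land in udp.o only.
shared = {
    "output.o": ({"foo.io.output.write"}, {"TypeInfo_X"}),
    "udp.o": ({"foo.net.ipvsix.udp.send", "TypeInfo_X"},
              {"libsocket_connect"}),
}

# Built with redundant typeinfo: every member that uses it also has it.
redundant = {
    "output.o": ({"foo.io.output.write", "TypeInfo_X"}, {"TypeInfo_X"}),
    "udp.o": ({"foo.net.ipvsix.udp.send", "TypeInfo_X"},
              {"libsocket_connect"}),
}

print(link({"foo.io.output.write"}, shared))
print(link({"foo.io.output.write"}, redundant))
```

In the first case a program that only wants foo.io.output ends up with udp.o and an unresolved reference into some other library; in the second it gets output.o and nothing else.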

Second, incremental compilation. This is one I didn't realize was a 
problem until recently. DSSS will perform incremental compilation when 
only one file has changed by only compiling that file. However, that 
causes more issues with these common data problems. Now, typeinfo could 
be doubly defined but not marked common, or (by means I don't quite 
understand) not defined at all. So, I now have to compile one file at a 
time, even when building binaries.
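
Here is a rough model of how both failures could come about. It assumes that the compiler emits a shared symbol once, into the first module on the command line that uses it; that placement rule is my guess, not something I know about DMD, and the module names and TypeInfo_X are made up.

```python
# Toy model of incremental rebuilds (hypothetical rule and names).

def compile_together(modules, uses):
    """Emit each shared symbol once, into the first listed module that
    uses it (assumed placement rule). Returns {module: symbols defined}."""
    objects = {m: set() for m in modules}
    emitted = set()
    for m in modules:
        for sym in sorted(uses[m]):
            if sym not in emitted:
                objects[m].add(sym)
                emitted.add(sym)
    return objects

def check(objects, uses):
    """Return (symbols defined more than once, symbols used but undefined)."""
    counts = {}
    for syms in objects.values():
        for s in syms:
            counts[s] = counts.get(s, 0) + 1
    duplicates = {s for s, c in counts.items() if c > 1}
    missing = set().union(*uses.values()) - set(counts)
    return duplicates, missing

uses = {"a": {"TypeInfo_X"}, "b": {"TypeInfo_X"}}

# Case 1: full build, then only b changes and is recompiled alone.
objs = compile_together(["a", "b"], uses)     # TypeInfo_X lands in a.o
objs.update(compile_together(["b"], uses))    # now b.o has it too
print(check(objs, uses))                      # doubly defined

# Case 2: full build, then a is edited so it no longer uses TypeInfo_X.
new_uses = {"a": set(), "b": {"TypeInfo_X"}}
objs = compile_together(["a", "b"], uses)     # TypeInfo_X lands in a.o
objs.update(compile_together(["a"], new_uses))  # a.o loses it, b.o is stale
print(check(objs, new_uses))                  # not defined at all
```

If every object file carried its own copy marked as common, neither case would matter: duplicates would be merged and nothing could go missing.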

The solution to all of this is simple: Create redundant symbols in the 
object files, marked as common. I know this can be done because it's 
done properly with one file at a time. This increases the size of the 
object files, but since it reduces bogus intermodule dependencies and 
sections marked as common will be merged anyway, it actually reduces the 
size of produced binaries, as well as making linking a significantly 
less complex problem.

I have to assume there's a reason for this, so, to summarize: Why 
doesn't DMD create any redundant symbols in .o files?

 - Gregor Richards
August 28, 2007
Re: Why doesn't DMD create any redundant symbols?
Gregor Richards wrote:
> I have to assume there's a reason for this, so, to summarize: Why 
> doesn't DMD create any redundant symbols in .o files?

It can improve build speed a lot. With C++, which doesn't do this, huge 
.obj files can be generated.

The compiler assumes that if there are multiple modules on the command 
line, they'll all be linked together, so why generate redundant output?
August 28, 2007
Re: Why doesn't DMD create any redundant symbols?
Walter Bright wrote:
> Gregor Richards wrote:
>> I have to assume there's a reason for this, so, to summarize: Why 
>> doesn't DMD create any redundant symbols in .o files?
> 
> It can improve build speed a lot. With C++, which doesn't do this, huge 
> .obj files can be generated.
> 
> The compiler assumes that if there are multiple modules on the command 
> line, they'll all be linked together, so why generate redundant output?

OK, so how about for those willing to (or required to) take the 
performance penalty, adding an option to create redundant data? I 
imagine the speed difference between compiling one file at a time and 
compiling all at once but with redundant data is greater than the speed 
difference between compiling all at once with and without redundant 
data, so your improvement to build speed significantly hinders my build 
speed.

 - Gregor Richards
August 28, 2007
Re: Why doesn't DMD create any redundant symbols?
Gregor Richards wrote:
> Walter Bright wrote:
>> Gregor Richards wrote:
>>> I have to assume there's a reason for this, so, to summarize: Why 
>>> doesn't DMD create any redundant symbols in .o files?
>>
>> It can improve build speed a lot. With C++, which doesn't do this, 
>> huge .obj files can be generated.
>>
>> The compiler assumes that if there are multiple modules on the command 
>> line, they'll all be linked together, so why generate redundant output?
> 
> OK, so how about for those willing to (or required to) take the 
> performance penalty, adding an option to create redundant data? I 
> imagine the speed difference between compiling one file at a time and 
> compiling all at once but with redundant data is greater than the speed 
> difference between compiling all at once with and without redundant 
> data, so your improvement to build speed significantly hinders my build 
> speed.

It's a good idea, but it would be a fair bit of work the way dmd is 
designed.
February 14, 2008
Re: Why doesn't DMD create any redundant symbols?
Walter Bright wrote:
> Gregor Richards wrote:
>> Walter Bright wrote:
>>> Gregor Richards wrote:
>>>> I have to assume there's a reason for this, so, to summarize: Why 
>>>> doesn't DMD create any redundant symbols in .o files?
>>>
>>> It can improve build speed a lot. With C++, which doesn't do this, 
>>> huge .obj files can be generated.
>>>
>>> The compiler assumes that if there are multiple modules on the 
>>> command line, they'll all be linked together, so why generate 
>>> redundant output?
>>
>> OK, so how about for those willing to (or required to) take the 
>> performance penalty, adding an option to create redundant data? I 
>> imagine the speed difference between compiling one file at a time and 
>> compiling all at once but with redundant data is greater than the 
>> speed difference between compiling all at once with and without 
>> redundant data, so your improvement to build speed significantly 
>> hinders my build speed.
> 
> It's a good idea, but it would be a fair bit of work the way dmd is 
> designed.

DSSS has the option oneatatime=on as the default now, to avoid problems. 
But the compile time is no longer acceptable.

Several people complained that they canceled compilation of DWT after 
15 min. With oneatatime=off the same build took <15 sec.

See also http://d.puremagic.com/issues/show_bug.cgi?id=1838
February 15, 2008
Re: Why doesn't DMD create any redundant symbols?
Frank Benoit wrote:
> Walter Bright wrote:
>> Gregor Richards wrote:
>>> Walter Bright wrote:
>>>> Gregor Richards wrote:
>>>>> I have to assume there's a reason for this, so, to summarize: Why 
>>>>> doesn't DMD create any redundant symbols in .o files?
>>>>
>>>> It can improve build speed a lot. With C++, which doesn't do this, 
>>>> huge .obj files can be generated.
>>>>
>>>> The compiler assumes that if there are multiple modules on the 
>>>> command line, they'll all be linked together, so why generate 
>>>> redundant output?
>>>
>>> OK, so how about for those willing to (or required to) take the 
>>> performance penalty, adding an option to create redundant data? I 
>>> imagine the speed difference between compiling one file at a time and 
>>> compiling all at once but with redundant data is greater than the 
>>> speed difference between compiling all at once with and without 
>>> redundant data, so your improvement to build speed significantly 
>>> hinders my build speed.
>>
>> It's a good idea, but it would be a fair bit of work the way dmd is 
>> designed.
> 
> DSSS has the option oneatatime=on as the default now, to avoid problems. 
> But the compile time is no longer acceptable.
> 
> Several people complained that they canceled compilation of DWT after 
> 15 min. With oneatatime=off the same build took <15 sec.
> 
> See also http://d.puremagic.com/issues/show_bug.cgi?id=1838
> 

The problem being that, without those possibly redundant symbols, you 
get stuff dying at link time because DMD never bothered to include the 
symbol anywhere?

Performance is secondary to correctness.
February 16, 2008
Re: Why doesn't DMD create any redundant symbols?
On Tue, 28 Aug 2007 10:36:03 -0700, Walter Bright wrote:

> Gregor Richards wrote:
>> I have to assume there's a reason for this, so, to summarize: Why 
>> doesn't DMD create any redundant symbols in .o files?
> 
> It can improve build speed a lot. With C++, which doesn't do this, huge 
> .obj files can be generated.
> 
> The compiler assumes that if there are multiple modules on the command 
> line, they'll all be linked together, so why generate redundant output?

However, this assumption is not a valid one. There are valid reasons to
compile a set of files (all named on the one command line) that are not
necessarily going to be linked together.

Also, tools such as make, rebuild and bud can determine which subset of a
set of files has been changed and thus recompile only that subset. I have
found that doing this sometimes causes conflicting object file definitions
between the subset object files and previously compiled object files from
others in the full set.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
February 16, 2008
Re: Why doesn't DMD create any redundant symbols?
Walter Bright wrote:
> Gregor Richards wrote:
>> I have to assume there's a reason for this, so, to summarize: Why 
>> doesn't DMD create any redundant symbols in .o files?
> 
> It can improve build speed a lot. With C++, which doesn't do this, huge 
> .obj files can be generated.

Pardon my ignorance, but, who cares? DMD is fast enough that such a time 
penalty for doing something correctly is "excusable" (in other words: 
needed).

Requiring [forcing] the developers of build tools to work around that 
problem seems kinda weird, to me.