August 04, 2004
In article <ceq9g5$25mh$1@digitaldaemon.com>, Stewart Gordon says...
>
>Arcane Jill wrote:
><snip>
>> On the other hand, a Char implementation would have to call /every/ Unicode function, and hence the linker would drag in /every/ function, including the exotic ones that hardly anyone's ever going to use,
><snip>
>> That's two megabytes. Although that may be justified in a DLL, no-one is going to want to add that to their executable.
><snip>
>
>Dead code elimination is one of the most fundamental optimisations ever invented.

From application source code, maybe, but not from a library. When you build a library which contains an indivisible module which calls hundreds of external functions, the linker has no idea whether an application which (later) links against that library is going to call a given function, so it has to link in all of them.

In etc.unicode, the various functions are each in their own module, so if an application is built directly against etc.unicode then the linker can link in only those modules which are necessary to complete the link.

However, suppose that a single, indivisible module within dool.lib were to call every function in etc.unicode, and suppose that an application linked against dool.lib and used a Char object. The linker would drag in the dool.lib module which defines Char. This would create hundreds of unresolved references, which would in turn cause the whole of etc.unicode to be dragged in to resolve them. By the time the link stage is reached, it is too late to discover which of those functions the application will actually use.
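The scenario above can be modelled with a small sketch. It is a toy, not a description of how optlink or any real linker is implemented: the ObjectFile struct, the objectsPulledIn function and the object names are all made up for illustration, and the only assumption built in is the one under discussion, namely that an object file is included whole or not at all.

```d
import std.algorithm.searching : canFind;
import std.stdio : writeln;

/// Toy model of an object file: the symbols it defines and the external
/// symbols it references. All names here are hypothetical.
struct ObjectFile
{
    string name;
    string[] defines;
    string[] needs;
}

/// Returns the library objects pulled in at whole-object granularity,
/// starting from the application's unresolved symbols.
string[] objectsPulledIn(string[] unresolved, ObjectFile[] library)
{
    string[] included;
    bool[string] resolved;             // symbols already satisfied
    string[] pending = unresolved.dup; // symbols still to be resolved

    while (pending.length > 0)
    {
        string sym = pending[$ - 1];
        pending = pending[0 .. $ - 1];
        if (sym in resolved)
            continue;

        foreach (obj; library)
        {
            if (!obj.defines.canFind(sym))
                continue;
            // The whole object comes in, so everything it references
            // becomes a new unresolved symbol.
            if (!included.canFind(obj.name))
            {
                included ~= obj.name;
                pending ~= obj.needs;
            }
            foreach (d; obj.defines)
                resolved[d] = true;
            break;
        }
        // A symbol that no object defines is simply dropped in this model.
    }
    return included;
}

void main()
{
    auto lib = [
        ObjectFile("isLetter.obj", ["isLetter"], []),
        ObjectFile("isDigit.obj", ["isDigit"], []),
        ObjectFile("char.obj", ["Char"], ["isLetter", "isDigit"]),
    ];

    // Calling isLetter directly needs a single object.
    writeln(objectsPulledIn(["isLetter"], lib));

    // Using Char brings in char.obj and everything char.obj references.
    writeln(objectsPulledIn(["Char"], lib));
}
```

With only three objects the difference is small, but the same traversal applied to a module that references hundreds of functions gives the effect described above.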

I'd be more than happy if I turned out to be wrong on this one, but I don't think I am. Walter can hopefully confirm or deny this.

Arcane Jill


August 04, 2004
Arcane Jill wrote:

<snip>
> From application source code, maybe, but not from a library. When you build a library which contains an indivisible module which calls hundreds of external functions, the linker has no idea whether an application which (later) links against that library is going to call a given function, so it has to link in all of them.
<snip>

DCE applies to both compiling and linking.  I don't know if this is platform dependent, but AIUI a lib is basically a collection of object files.  Each object file has a symbol table that lists the functions it defines and the external functions on which it depends.  I'd guess that this can be determined for each function too, but I'm not sure.

Linking is then a matter of traversing the dependency graph through the object files of both the project itself and the libs it uses.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
August 04, 2004
In article <ceqfqu$284l$1@digitaldaemon.com>, Stewart Gordon says...
>
>Arcane Jill wrote:

>DCE [Dead Code Elimination] applies to both compiling and linking. I don't know if this is platform dependent, but AIUI a lib is basically a collection of object files.

..with, in the D paradigm, 1 object file = 1 module...

>Each object file has a symbol table that lists the functions it defines and the external functions on which it depends.

Actually, I think it lists the assembler /symbols/ it defines, and the symbols upon which it depends. Each symbol is an assembler label or an assembler EQU directive. Each symbol will correspond to either a function, a global variable, or a const variable.
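As a rough illustration of the point that functions and global variables each end up as a named symbol, current D compilers expose the linker-level name through the .mangleof property. This is a sketch only: the names foo and counter are placeholders, and the exact strings printed depend on the module name and the compiler.

```d
import std.stdio : writeln;

int counter;   // a module-level variable (placeholder name)
void foo() {}  // a plain function (placeholder name)

void main()
{
    // .mangleof yields the name the compiler assigns to each symbol;
    // the function and the variable get distinct names.
    writeln(foo.mangleof);
    writeln(counter.mangleof);
}
```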


>I'd guess that this can be determined for each function too, but I'm not sure.

Nor I


>Linking is then a matter of traversing the dependency graph through the object files of both the project itself and the libs it uses.

True, but this process only determines which object files are needed and which are not. It does not (so far as I know) permit the linker to slice up an object file into smaller parts and discard some of those fragments. Either the whole obj file (read D module) is included, or none of it is. There is no in between.

At least, that's what I've always believed. (I wrote my own linker back in the
'80s, but that was for the Commodore Amiga, and I guess linker formats have
changed since then). Anyway, Hauke did the experiment, and concluded (to his
surprise) that D (or its linker) does in fact work the way I described.

If you prove me wrong, I will be very, very happy.

Arcane Jill



August 04, 2004
Arcane Jill wrote:

<snip>
>> Linking is then a matter of traversing the dependency graph through the object files of both the project itself and the libs it uses.
> 
> True, but this process only determines which object files are needed and which are not. It does not (so far as I know) permit the linker to slice up an object file into smaller parts and discard some of those fragments. Either the whole obj file (read D module) is included, or none of it is. There is no in between.

Unless the fragments have their individual symbol tables.  Even if the obj file format doesn't support this, then maybe someone could invent a new obj file format that does....

Obviously, any lib component that is overly bloated and not going to be used by everyone and everything would have a module to itself.  Of course, letting modules be as independent of each other as possible does help, at least for the time being....

> At least, that's what I've always believed. (I wrote my own linker back in the '80s, but that was for the Commodore Amiga, and I guess linker formats have changed since then). Anyway, Hauke did the experiment, and concluded (to his surprise) that D (or its linker) does in fact work the way I described.

Yes, the DMD linker isn't state of the art just yet.  For example, whatever I add to SDWF, it seems that my skeleton program gets bigger. When a better optimising linker finally comes along, it'll get smaller again.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
August 04, 2004
In article <cer8l1$aqq$1@digitaldaemon.com>, Stewart Gordon says...

>Obviously, any lib component that is overly bloated and not going to be used by everyone and everything would have a module to itself.

Right - which brings me back to my original point, because you see, in D (unlike C++), every member function of a class /must/ be defined within a single module.


I achieved the one-function-per-module separation in etc.unicode precisely because the functions are good old-fashioned plain functions (not class member functions), each taking a dchar as its only parameter. If you were to make them all member functions of a single class, the resulting (indivisible, so far as we know) obj file would be two megabytes in size, and if an application called a single function of that class, the whole two megs would get added to the application. This is not good.

But I don't think that having them as ordinary functions is necessarily a
problem. To me, isLetter(c) is just as readable as c.isLetter(). And, although I
/am/ a great supporter of the OO paradigm in general, I still don't see the need
to wrap the primitive types in structs/classes. If dool lets us write sin(x)
instead of x.sin(), then I don't see why it would have a problem letting us
write isLetter(c) instead of c.isLetter(). There simply is no need to turn
primitive types into objects - and hence, (if you accept this,) no problem to
solve.
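For readers using a current D compiler, the two spellings need not be a trade-off: uniform function call syntax (a feature added to the language well after this thread) lets a plain function be called with member syntax. The sketch below assumes a stand-in isLetter built on Phobos's std.uni.isAlpha; it is not the etc.unicode implementation.

```d
import std.stdio : writeln;
import std.uni : isAlpha;

/// Stand-in for a plain, non-member function taking a dchar.
bool isLetter(dchar c)
{
    return isAlpha(c);
}

void main()
{
    dchar c = '\u00E9'; // LATIN SMALL LETTER E WITH ACUTE

    writeln(isLetter(c));  // ordinary function call
    writeln(c.isLetter()); // the same function, called with member syntax
}
```

Because isLetter remains an ordinary function, it can still live in a module of its own, so the one-function-per-module layout is unaffected by which call syntax the application prefers.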


>Yes, the DMD linker isn't state of the art just yet.  For example, whatever I add to SDWF, it seems that my skeleton program gets bigger. When a better optimising linker finally comes along, it'll get smaller again.

That's acceptable for most projects, but not for a two megabyte library. Telling people "it will get smaller once someone writes a better linker" is not an argument I could credibly hold. So it's organized the way it is, to keep executables as small as possible, with the linker we have right now.

Arcane Jill


August 04, 2004
It seems that someone believes that the D linker is an immature product. The linker used by DMD is not D specific, but is also used by DMC and it is called 'optlink' (http://www.digitalmars.com/ctg/optlink.html). It is a highly optimized linker (hence the name), at least when it comes to execution speed of the linker itself. Size optimization as requested in this thread might be done through /EXEPACK, /PACKFUNCTIONS and /WINPACK (see http://www.digitalmars.com/ctg/ctgLinkSwitches.html). These switches can be passed to the linker through the DMD -L switch. Read more on how optlink works at http://www.digitalmars.com/ctg/ctgLinkOps.html

Whether DMD does the COMDAT stuff necessary for 'smart linking' or not, I don't know.

Also note that DMD uses gcc to link on Linux.

Lars Ivar Igesund
August 04, 2004
In article <cerfpb$f73$1@digitaldaemon.com>, Lars Ivar Igesund says...
>
>It seems that someone believes that the D linker is an immature product. The linker used by DMD is not D specific, but is also used by DMC and it is called 'optlink' (http://www.digitalmars.com/ctg/optlink.html). It is a highly optimized linker (hence the name), at least when it comes to execution speed of the linker itself. Size optimization as requested in this thread might be done through /EXEPACK, /PACKFUNCTIONS and /WINPACK (see http://www.digitalmars.com/ctg/ctgLinkSwitches.html). These switches can be passed to the linker through the DMD -L switch. Read more on how optlink works at http://www.digitalmars.com/ctg/ctgLinkOps.html

Wow! Okay - I'm impressed.

Just as an aside, I would never have discovered those pages by accident, had you not given us the URLs. It might be useful to add links to linker-related pages to the Table-of-Contents frame of the D site.



>Whether DMD does the COMDAT stuff necessary for 'smart linking' or not, I don't know.

I'd certainly be interested to find out.

Thanks for all that brilliant information.
Jill


August 05, 2004
Arcane Jill wrote:
> In article <cerfpb$f73$1@digitaldaemon.com>, Lars Ivar Igesund says...
> 
>>It seems that someone believes that the D linker is an immature product. The linker used by DMD is not D specific, but is also used by DMC and it is called 'optlink' (http://www.digitalmars.com/ctg/optlink.html). It is a highly optimized linker (hence the name), at least when it comes to execution speed of the linker itself. Size optimization as requested in this thread might be done through /EXEPACK, /PACKFUNCTIONS and /WINPACK (see http://www.digitalmars.com/ctg/ctgLinkSwitches.html). These switches can be passed to the linker through the DMD -L switch. Read more on how optlink works at http://www.digitalmars.com/ctg/ctgLinkOps.html
> 
> 
> Wow! Okay - I'm impressed.
> 
> Just as an aside, I would never have discovered those pages by accident, had you
> not given us the URLs. It might be useful to add links to linker-related pages
> to the Table-of-Contents frame of the D site.

That's a good idea. I wish I had known about the linker documentation earlier than I did.

On a related note, there's already a wiki page that links to several of these tools:

http://www.prowiki.org/wiki4d/wiki.cgi?ReferenceForTools

> 
> 
>>Whether DMD does the COMDAT stuff necessary for 'smart linking' or not, I don't know.
> 
> 
> I'd certainly be interested to find out.
> 
> Thanks for all that brilliant information.
> Jill

-- 
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
August 05, 2004
"Arcane Jill" <Arcane_member@pathlink.com> wrote in message
news:cer3bp$6nk$1@digitaldaemon.com
|
| True, but this process only determines which object files are needed and which
| are not. It does not (so far as I know) permit the linker to slice up an object
| file into smaller parts and discard some of those fragments. Either the whole
| obj file (read D module) is included, or none of it is. There is no in between.
|

Say I have this:

module a;
void foo() { }

module b;
void bar() { }

module c;
import a, b;

And put the three of them in mylib.lib. Then I do this:

module app;
import c;
void main () { foo(); }

Compile it, and link it with mylib.lib. Will the final exe also have module b?

-----------------------
Carlos Santander Bernal


August 05, 2004
In article <ces64k$rfd$1@digitaldaemon.com>, Carlos Santander B. says...
>
>Say I have this:
>
>module a;
>void foo() { }
>
>module b;
>void bar() { }
>
>module c;
>import a, b;
>
>And put the three of them in mylib.lib. Then I do this:
>
>module app;
>import c;
>void main () { foo(); }
>
>Compile it, and link it with mylib.lib. Will the final exe also have module b?

No. You'd just get main and a. And that's the principle on which I separated my functions.

Since my earlier post, however, Lars has indicated that the linker is actually cleverer than I had thought, and might be able to do that sort of thing at the function level. It doesn't appear to do that automatically, but - if I've understood the documentation correctly - it can be made to do that with some command line parameters. More experimentation is needed. When someone gets function-body-elimination working, I hope they post an example.

Jill