September 18, 2009
Walter Bright wrote:
> Tom S wrote:
>> Personally I'm of the opinion that functions should be explicitly marked for CTFE, and this is just another reason for such. I'm using a patched DMD with added pragma(ctfe) which instructs the compiler not to run any codegen or generate debug info for functions/aggregates marked as such. This trick alone can slim an executable down by a good megabyte, which sometimes is a life-saver with OPTLINK.
> 
> If you are compiling files with -lib, and nobody calls those CTFE functions at runtime, then they should never be linked in. (Virtual functions are always linked in, as they have a reference to them even if they are never called.)
> 
> Executables built this way shouldn't have dead functions in them.

It could be debug info, because with -g something definitely is linked in whether it's -lib or not (except with -lib there's way more of it). With ctfe-mixin-based metaprogramming, you also end up with string literals that don't seem to get optimized away by the linker.
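
For anyone unfamiliar with the pattern, here is a minimal sketch of what ctfe-mixin-based metaprogramming looks like. The names makeGetter and Point are made up for illustration and are not from my project. The helper is only ever called at compile time, but it is still an ordinary function as far as the compiler is concerned, so its code and the string literals it uses can still end up in the object file.

```d
// Hypothetical CTFE-only helper: returns D source code as a string.
// Nothing calls it at runtime; it exists only to feed mixin().
string makeGetter(string name)
{
    return "int " ~ name ~ "() { return _" ~ name ~ "; }";
}

struct Point
{
    int _x = 3;
    int _y = 4;

    // Each mixin runs makeGetter at compile time and pastes the result in.
    mixin(makeGetter("x")); // expands to: int x() { return _x; }
    mixin(makeGetter("y")); // expands to: int y() { return _y; }
}

void main()
{
    Point p;
    assert(p.x() == 3);
    assert(p.y() == 4);
}
```

Whether the literals "int " and "() { return _" actually survive into the final executable depends on the compiler and linker, so this only shows where they come from.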


-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode
September 18, 2009
Walter Bright wrote:
> Tom S wrote:
>> When building my second-largest project, DMD eats up about 1.2GB of memory and dies (even without -g). Luckily, xfBuild allows me to set the limit of modules to be compiled at a time, so when I capped it at 200, it compiled... but didn't link :( Somewhere in the process a library is created that confuses OPTLINK as well as "lib -l". There's one symbol in it that neither of these is able to see, and it results in an undefined reference when linking. The symbol is clearly there when using a lib dumping tool from DDL or "libunres -d -c". I've dropped the lib at http://h3.team0xf.com/strangeLib.7z . The symbol in question is compressed and this newsgroup probably won't chew the non-ANSI chars well, but it can be found via a regex "D2xf3omg4core.*ctFromRealVee0P0Z".
> 
> Please post to bugzilla.

http://d.puremagic.com/issues/show_bug.cgi?id=3327


>> One thing slowing this tool down is the need to call the librarian multiple times. DMD -lib will sometimes generate multiple objects with the same name
> 
> Please post to bugzilla.

http://d.puremagic.com/issues/show_bug.cgi?id=3328


-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode
September 18, 2009
Tom S wrote:
> Walter Bright wrote:
>> Tom S wrote:
>>> Personally I'm of the opinion that functions should be explicitly marked for CTFE, and this is just another reason for such. I'm using a patched DMD with added pragma(ctfe) which instructs the compiler not to run any codegen or generate debug info for functions/aggregates marked as such. This trick alone can slim an executable down by a good megabyte, which sometimes is a life-saver with OPTLINK.
>>
>> If you are compiling files with -lib, and nobody calls those CTFE functions at runtime, then they should never be linked in. (Virtual functions are always linked in, as they have a reference to them even if they are never called.)
>>
>> Executables built this way shouldn't have dead functions in them.
> 
> It could be debug info, because with -g something definitely is linked in whether it's -lib or not (except with -lib there's way more of it). 

The linker doesn't pull in obj modules based on symbolic debug info. You can find out what is pulling in a particular module by deleting it from the library, linking, and seeing what undefined symbol message the linker produces.


> With ctfe-mixin-based metaprogramming, you also end up with string literals that don't seem to get optimized away by the linker.

The linker has no idea what a string literal is, or what any other literals are, either. It doesn't know what a type is. It doesn't know what language the source code was. It only knows about symbols, sections, and bytes of binary data. The object module format offers no way to mark a piece of data as a string literal.

I do think it is possible, though, for the compiler to do a better job of not putting unneeded literals into the obj file.
September 18, 2009
Walter Bright wrote:
> Tom S wrote:
>> Walter Bright wrote:
>>> Tom S wrote:
>>>> Personally I'm of the opinion that functions should be explicitly marked for CTFE, and this is just another reason for such. I'm using a patched DMD with added pragma(ctfe) which instructs the compiler not to run any codegen or generate debug info for functions/aggregates marked as such. This trick alone can slim an executable down by a good megabyte, which sometimes is a life-saver with OPTLINK.
>>>
>>> If you are compiling files with -lib, and nobody calls those CTFE functions at runtime, then they should never be linked in. (Virtual functions are always linked in, as they have a reference to them even if they are never called.)
>>>
>>> Executables built this way shouldn't have dead functions in them.
>>
>> It could be debug info, because with -g something definitely is linked in whether it's -lib or not (except with -lib there's way more of it). 
> 
> The linker doesn't pull in obj modules based on symbolic debug info.

I wasn't implying that.


> You can find out what is pulling in a particular module by deleting it from the library, linking, and seeing what undefined symbol message the linker produces.

I tested it on a single-module program before posting. Basically void main() {} and a single unused function void fooBar() {}. With -g, something with the function's mangled name ended up in the executable. Without -g, the linker was able to remove the function (I ran a diff against an executable built with the function removed from the source altogether).


>> With ctfe-mixin-based metaprogramming, you also end up with string literals that don't seem to get optimized away by the linker.
> 
> The linker has no idea what a string literal is, or what any other literals are, either. It doesn't know what a type is. It doesn't know what language the source code was. It only knows about symbols, sections, and bytes of binary data. The object module format offers no way to mark a piece of data as a string literal.

I wasn't implying that either and I'm well aware of it :S I thought it would be easier for everyone to understand than any blurbing about LEDATA/LED386 and static data segments.


> I do think it is possible, though, for the compiler to do a better job of not putting unneeded literals into the obj file.

That would be nice and perhaps might make OPTLINK crash less.


-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode
September 18, 2009
Tom S wrote:
> I tested it on a single-module program before posting. Basically void main() {} and a single unused function void fooBar() {}. With -g, something with the function's mangled name ended up in the executable. Without -g, the linker was able to remove the function (I ran a diff against an executable built with the function removed from the source altogether).

The best way to determine what is linked in to an executable is to generate a map file with -L/map, and examine it. It will list all the symbols in it.

Also, if you specify a .obj file directly to the linker, it will put all of the symbols and data in that .obj file into the executable. The linker does NOT remove functions.

What it DOES do is pull obj files out of a library to resolve unresolved symbols from other obj files already linked in.

In other words, it's an additive process, not a subtractive one.
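
To make the additive model concrete, here is a rough sketch in D. It is a toy model and not how OPTLINK is actually implemented: ObjModule, linkModules and the symbol names are all invented for illustration, and real object modules carry far more than a list of names. Modules named on the command line always go in; library modules go in only when they define a symbol that something already linked still needs.

```d
import std.algorithm : canFind;

// Hypothetical, heavily simplified stand-in for an object module.
struct ObjModule
{
    string name;
    string[] defines;    // public symbols the module provides
    string[] references; // external symbols the module needs
}

// Returns the names of the modules that end up in the "executable".
string[] linkModules(ObjModule[] objs, ObjModule[] library)
{
    string[] linked;
    bool[string] defined;
    string[] unresolved;

    void add(ObjModule m)
    {
        linked ~= m.name;
        foreach (sym; m.defines)
            defined[sym] = true;
        unresolved ~= m.references;
    }

    // Modules given directly to the linker are added unconditionally.
    foreach (m; objs)
        add(m);

    // Library modules are added only to satisfy unresolved symbols.
    while (unresolved.length > 0)
    {
        string sym = unresolved[$ - 1];
        unresolved = unresolved[0 .. $ - 1];
        if (sym in defined)
            continue;

        bool found = false;
        foreach (m; library)
        {
            if (m.defines.canFind(sym))
            {
                add(m);
                found = true;
                break;
            }
        }
        if (!found)
            throw new Exception("undefined symbol: " ~ sym);
    }
    return linked;
}

void main()
{
    auto app = ObjModule("app", ["_Dmain"], ["used"]);
    auto lib = [
        ObjModule("a", ["used"], ["helper"]),
        ObjModule("b", ["helper"], null),
        ObjModule("c", ["unusedFn"], null),
    ];

    // Module c is never referenced, so it is never pulled from the library.
    assert(linkModules([app], lib) == ["app", "a", "b"]);

    // Named directly, c goes in even though nothing uses it.
    assert(linkModules([app, lib[2]], lib) == ["app", "c", "a", "b"]);
}
```

In this model nothing is ever taken back out once it has been added, which is the sense in which the process is additive.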
September 18, 2009
Walter Bright wrote:
> Also, if you specify a .obj file directly to the linker, it will put all of the symbols and data in that .obj file into the executable. The linker does NOT remove functions.
> 
> What it DOES do is pull obj files out of a library to resolve unresolved symbols from other obj files already linked in.
> 
> In other words, it's an additive process, not a subtractive one.

Tests seem to indicate otherwise. By the way, the linker in gcc can also remove unused sections (--gc-sections, which works best with -ffunction-sections).

----

>cat foo.d
void main() {
}

version (WithFoo) {
        void foo() {
        }
}
>dmd foo.d -c -of1.obj

>dmd foo.d -version=WithFoo -c -of2.obj

>diff 1.obj 2.obj
Files 1.obj and 2.obj differ

>lib -l 1.obj   1>NUL  && cat 1.lst

Publics by name         module
__Dmain                          1
_D3foo12__ModuleInfoZ            1


Publics by module
1
        __Dmain                           _D3foo12__ModuleInfoZ

>lib -l 2.obj   1>NUL  && cat 2.lst

Publics by name         module
__Dmain                          2
_D3foo12__ModuleInfoZ            2
_D3foo3fooFZv                    2


Publics by module
2
        __Dmain                           _D3foo12__ModuleInfoZ
        _D3foo3fooFZv

>dmd -L/M 1.obj -of1.exe

>dmd -L/M 2.obj -of2.exe

>diff 1.exe 2.exe

>diff 1.map 2.map

>

----

-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode