Removing RTTI from binaries
January 11, 2015
I'm building some code that is heavily templated.  Therefore, I have many very small classes.  I was surprised to see my binaries growing very large, disproportionately to the amount of code I was adding.  I inspected the binaries with objdump and found contents of the .rodata section like the following:

 801fa00 6572616c 2e526567 69737465 72212830  eral.Register!(0
 801fa10 2c206361 73742841 63636573 73293729  , cast(Access)7)
 801fa20 2e526567 69737465 722e4269 74212831  .Register.Bit!(1
 801fa30 312c2063 61737428 4d757461 62696c69  1, cast(Mutabili
 801fa40 74792932 292e4269 74000000 6d6d696f  ty)2).Bit...mmio
 801fa50 2e506572 69706865 72616c21 28414842  .Peripheral!(AHB
 801fa60 312c2031 35333630 292e5065 72697068  1, 15360).Periph
 801fa70 6572616c 2e526567 69737465 72212830  eral.Register!(0
 801fa80 2c206361 73742841 63636573 73293729  , cast(Access)7)
 801fa90 2e526567 69737465 722e4269 74212831  .Register.Bit!(1
 801faa0 322c2063 61737428 4d757461 62696c69  2, cast(Mutabili
 801fab0 74792930 292e4269 74000000 6275732e  ty)0).Bit...bus.
 801fac0 41504232 00000000 6275732e 41504231  APB2....bus.APB1
 801fad0 00000000 6275732e 41484233 00000000  ....bus.AHB3....
 801fae0 6275732e 41484232 00000000 6275732e  bus.AHB2....bus.
 801faf0 41484231 00000000 6275732e 436f7265  AHB1....bus.Core
 801fb00 50657269 70686572 616c7300 54797065  Peripherals.Type
 801fb10 496e666f 5f690000 54797065 496e666f  Info_i..TypeInfo
 801fb20 5f456e75 6d000000 54797065 496e666f  _Enum...TypeInfo
 801fb30 5f417272 61790000 54797065 496e666f  _Array..TypeInfo

Most of my code just uses classes as namespaces calling static methods and properties.  The amount of code in my .text segment is only a few hundred bytes, but the .rodata section is several thousand bytes.

I'm guessing this is RTTI.  Is there any way, either through linker scripting, or the compiler to keep this stuff out of my binary?

Thanks,
Mike

using GDC 4.9 arm-none-eabi cross-compiler.

compiler flags:
arm-none-eabi-gdc -O3 -nophoboslib -nostdinc -nodefaultlibs -nostdlib -fno-emit-moduleinfo -ffunction-sections -fdata-sections -Wl,-Tsource/linker/linker.ld -Wl,--gc-sections
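To see how much of a dump like the one above is plain name strings, the hex words can be decoded and split at their NUL terminators. A minimal Python sketch, assuming the `objdump -s` layout shown above (address, up to four 8-digit hex words, then an ASCII column); the helper names `dump_to_bytes` and `c_strings` are made up for illustration:

```python
import re

# A few lines copied from the .rodata dump above.
SAMPLE = """\
 801fac0 41504232 00000000 6275732e 41504231  APB2....bus.APB1
 801fad0 00000000 6275732e 41484233 00000000  ....bus.AHB3....
 801fae0 6275732e 41484232 00000000 6275732e  bus.AHB2....bus.
 801faf0 41484231 00000000 6275732e 436f7265  AHB1....bus.Core
 801fb00 50657269 70686572 616c7300 54797065  Peripherals.Type
"""

# Address, then one to four space-separated 8-digit hex words.
LINE = re.compile(r"^\s*([0-9a-fA-F]+)((?:\s[0-9a-fA-F]{8}){1,4})")


def dump_to_bytes(text):
    """Concatenate the hex words of each dump line into raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            out += bytes.fromhex("".join(m.group(2).split()))
    return bytes(out)


def c_strings(data, min_len=4):
    """Return printable ASCII runs of at least min_len bytes that end in a NUL."""
    pattern = rb"[\x20-\x7e]{%d,}(?=\x00)" % min_len
    return [s.decode("ascii") for s in re.findall(pattern, data)]


for name in c_strings(dump_to_bytes(SAMPLE)):
    print(len(name) + 1, name)  # byte count including the NUL terminator
```

Run over a full dump of .rodata, the summed lengths give a rough estimate of how many bytes are taken up by type and template names.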
January 11, 2015
Mike:

> I'm building some code that is heavily templated.  Therefore, I have many very small classes.

This is a non sequitur.


> Most of my code just uses classes as namespaces calling static methods and properties.

Aren't structs better for that?

Bye,
bearophile
January 11, 2015
On Sunday, 11 January 2015 at 15:02:07 UTC, bearophile wrote:
> Mike:
>
>> I'm building some code that is heavily templated.  Therefore, I have many very small classes.
>
> This is a non sequitur.

I believe it is because nearly every one of the instantiated template names appears in the .rodata section, thus causing the binary's size to inflate.

>
>> Most of my code just uses classes as namespaces calling static methods and properties.
>
> Aren't structs better for that?

Not in my current design, as I also make use of inheritance.  If you're curious, you can see the code here: https://github.com/JinShil/stm32f42_discovery_demo, specifically the stm32f42 folder.

Mike

January 11, 2015
On Sun, 11 Jan 2015 15:15:38 +0000
"Mike" <none@none.com> wrote:

> On Sunday, 11 January 2015 at 15:02:07 UTC, bearophile wrote:
> > Mike:
> >
> >> I'm building some code that is heavily templated.  Therefore, I have many very small classes.
> >
> > This is a non sequitur.
> 
> I believe it is because nearly every one of the instantiated template names appears in the .rodata section, thus causing the binary's size to inflate.
> 

That's likely used/caused by the TypeInfo.name property.

> >
> >> Most of my code just uses classes as namespaces calling static methods and properties.
> >
> > Aren't structs better for that?
> 
> Not in my current design, as I also make use of inheritance.  If you're curious, you can see the code here: https://github.com/JinShil/stm32f42_discovery_demo, specifically the stm32f42 folder.
> 
> Mike
> 

I guess you'd see the same problem with structs.

There's no standard way to avoid TypeInfo right now. -fno-rtti would disable TypeInfo completely but it's not implemented in upstream GDC.

If you can disable TypeInfo for all classes, open gcc/d/d-objfile.cc,
search for
"// Put out the TypeInfo"
in ClassDeclaration::toObjFile

and comment out this line:
"type->getTypeInfo (NULL);"


If you only want to disable TypeInfo for some classes, that's more
difficult:
https://github.com/D-Programming-microD/GDC/commit/f0614bc9480dacd1ec6bb75277d280afa96e08bb
January 11, 2015
Johannes Pfau:

> If you only want to disable TypeInfo for some classes that's more difficult:

This seems like a feature that could be useful in standard D (all compilers), with an annotation of some kind like @nortti.

Bye,
bearophile
January 13, 2015
On Sunday, 11 January 2015 at 16:57:41 UTC, Johannes Pfau wrote:
>
> That's likely used/caused by the TypeInfo.name property.
>

Judging by what I'm seeing, I think you're right.

But I'm compiling with -fdata-sections and -Wl,--gc-sections, so shouldn't that put each TypeInfo.name in its own section and strip it out?

Here's what I'm seeing:

--------------------
arm-none-eabi-objdump -t binary/firmware

binary/firmware:     file format elf32-littlearm

SYMBOL TABLE:
08000000 l    d  .text  00000000 .text
08000a44 l    d  .rodata        00000000 .rodata
00000000 l    df *ABS*  00000000 start.d
0800001c l       .text  00000000 handler_address
00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr0
00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr1
00000000 l    df *ABS*  00000000
10010000 l       *ABS*  00000000 _stackStart
08000034 g     F .text  0000007e memcpy
08000010 g     F .text  00000014 _D5start7OnResetFZv
080202d4 g       .rodata        00000000 __text_end__
08000004 g     O .text  00000004 ResetHandler
20000000 g       .rodata        00000000 __data_end__
20000000 g       .rodata        00000000 __bss_start__
20000000 g       .rodata        00000000 __bss_end__
08000024 g     F .text  00000010 memset
20000000 g       .rodata        00000000 __data_start__
0800000c g     O .text  00000004 HardFaultHandler
080000b4 g     F .text  0000093c main
08000a28 g     F .text  0000001c _D5start11OnHardFaultFZv


I don't see anything in the symbol table, but...

---------------------
arm-none-eabi-readelf -S binary/firmware
There are 6 section headers, starting at offset 0x28300:

Section Headers:
[Nr] Name       Type      Addr     Off    Size   ES Flg
[0]             NULL      00000000 000000 000000 00
[1] .text       PROGBITS  08000000 008000 000a44 00  AX
[2] .rodata     PROGBITS  08000a44 008a44 01f890 00   A
[3] .shstrtab   STRTAB    00000000 0282d4 000029 00
[4] .symtab     SYMTAB    00000000 0283f0 000270 10
[5] .strtab     STRTAB    00000000 028660 000128 00
Key to Flags:
 W (write), A (alloc), X (execute), M (merge), S (strings)


You can see the .rodata section (0x1f890 bytes) is roughly 50 times larger than .text (0xa44 bytes) and dwarfs every other section.

Mike
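The comparison can be made concrete by converting the Size column to decimal. A minimal Python sketch, assuming the `readelf -S` column layout shown above; the function name `section_sizes` is made up for illustration:

```python
import re

# Section header rows copied from the readelf output above.
READELF = """\
[0]             NULL      00000000 000000 000000 00
[1] .text       PROGBITS  08000000 008000 000a44 00  AX
[2] .rodata     PROGBITS  08000a44 008a44 01f890 00   A
[3] .shstrtab   STRTAB    00000000 0282d4 000029 00
[4] .symtab     SYMTAB    00000000 0283f0 000270 10
[5] .strtab     STRTAB    00000000 028660 000128 00
"""

# [Nr] Name Type Addr Off Size; the unnamed NULL row does not match and is skipped.
ROW = re.compile(
    r"^\s*\[\s*\d+\]\s+(\S+)\s+([A-Z_]+)\s+"
    r"([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)"
)


def section_sizes(text):
    """Map section name to its size in bytes (Size column, parsed as hex)."""
    sizes = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            sizes[m.group(1)] = int(m.group(5), 16)
    return sizes


sizes = section_sizes(READELF)
ratio = sizes[".rodata"] / sizes[".text"]
print(sizes[".text"], sizes[".rodata"], round(ratio, 1))
```

The two loaded sections together account for 0xa44 + 0x1f890 = 0x202d4 bytes, which matches the offset of `__text_end__` (0x080202d4) from the start of .text in the symbol table above.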

January 13, 2015
On Tuesday, 13 January 2015 at 14:20:43 UTC, Mike wrote:
> On Sunday, 11 January 2015 at 16:57:41 UTC, Johannes Pfau wrote:
>>
>> That's likely used/caused by the TypeInfo.name property.
>>
>
> Judging by what I'm seeing, I think you're right.
>
> But I'm compiling with -fdata-sections and -Wl,--gc-sections, so shouldn't that put each TypeInfo.name in its own section and strip it out?

I remember speaking about it with Martin and Daniel during DConf 2014 and I think it was Daniel who mentioned that by default TypeInfo/ModuleInfo is emitted in some weird packed way. When LDC announced using --gc-sections by default it was mentioned they had to change ModuleInfo emitting to make it actually work.

Can it be the same issue?
January 14, 2015
On Tuesday, 13 January 2015 at 14:20:43 UTC, Mike wrote:
>
> Here's what I'm seeing:
>
> --------------------
> arm-none-eabi-objdump -t binary/firmware
>
> binary/firmware:     file format elf32-littlearm
>
> SYMBOL TABLE:
> 08000000 l    d  .text  00000000 .text
> 08000a44 l    d  .rodata        00000000 .rodata
> 00000000 l    df *ABS*  00000000 start.d
> 0800001c l       .text  00000000 handler_address
> 00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr0
> 00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr1
> 00000000 l    df *ABS*  00000000
> 10010000 l       *ABS*  00000000 _stackStart
> 08000034 g     F .text  0000007e memcpy
> 08000010 g     F .text  00000014 _D5start7OnResetFZv
> 080202d4 g       .rodata        00000000 __text_end__
> 08000004 g     O .text  00000004 ResetHandler
> 20000000 g       .rodata        00000000 __data_end__
> 20000000 g       .rodata        00000000 __bss_start__
> 20000000 g       .rodata        00000000 __bss_end__
> 08000024 g     F .text  00000010 memset
> 20000000 g       .rodata        00000000 __data_start__
> 0800000c g     O .text  00000004 HardFaultHandler
> 080000b4 g     F .text  0000093c main
> 08000a28 g     F .text  0000001c _D5start11OnHardFaultFZv
>

I just wanted to show off our shiny new demangle support in
binutils for comparison with my previous post.

-----------------------------
arm-none-eabi-objdump --demangle=dlang -t binary/firmware

binary/firmware:     file format elf32-littlearm

SYMBOL TABLE:
08000000 l    d  .text  00000000 .text
08000380 l    d  .rodata        00000000 .rodata
00000000 l    df *ABS*  00000000 start.d
0800001c l       .text  00000000 handler_address
00000000 l       *UND*  00000000 __aeabi_unwind_cpp_pr0
00000000 l    df *ABS*  00000000
10010000 l       *ABS*  00000000 _stackStart
08000034 g     F .text  0000007e memcpy
08000010 g     F .text  00000014 start.OnReset()
0801fb78 g       .rodata        00000000 __text_end__
08000004 g     O .text  00000004 ResetHandler
20000000 g       .rodata        00000000 __data_end__
20000000 g       .rodata        00000000 __bss_start__
20000000 g       .rodata        00000000 __bss_end__
08000024 g     F .text  00000010 memset
0800032c  w    F .text  00000038 trace.writeLine!(immutable(char)[]).writeLine(const(immutable(char)[]))
20000000 g       .rodata        00000000 __data_start__
0800000c g     O .text  00000004 HardFaultHandler
080000b4 g     F .text  00000278 main
08000364 g     F .text  0000001c start.OnHardFault()

Nice! Thanks Iain.
January 14, 2015
On Tuesday, 13 January 2015 at 14:36:15 UTC, Dicebot wrote:
>
> I remember speaking about it with Martin and Daniel during DConf 2014 and I think it was Daniel who mentioned that by default TypeInfo/ModuleInfo is emitted in some weird packed way. When LDC announced using --gc-sections by default it was mentioned they had to change ModuleInfo emitting to make it actually work.
>
> Can it be the same issue?

Thanks, Dicebot, for bringing this to my attention.  That would
explain what I'm seeing.

Is this something unique to GDC, or is it an artifact inherited
from DMD?

Mike
January 14, 2015
On 14 January 2015 at 04:00, Mike via D.gnu <d.gnu@puremagic.com> wrote:
> On Tuesday, 13 January 2015 at 14:36:15 UTC, Dicebot wrote:
>>
>>
>> I remember speaking about it with Martin and Daniel during DConf 2014 and I think it was Daniel who mentioned that by default TypeInfo/ModuleInfo is emitted in some weird packed way. When LDC announced using --gc-sections by default it was mentioned they had to change ModuleInfo emitting to make it actually work.
>>
>> Can it be the same issue?
>
>
> Thanks, Dicebot, for bringing this to my attention.  That would explain what I'm seeing.
>
> Is this something unique to GDC, or is it an artifact inherited from DMD?
>
> Mike

It's an artifact inherited from DMD.

ModuleInfo is of a dynamic size, depending on what is implemented in the module.

See: https://github.com/D-Programming-Language/druntime/blob/081591237ee7d666ffd81463dac1b7f38e7d9798/src/object_.d#L1589

However, its size is correctly recorded before being sent to be written.

The ModuleInfo symbols themselves aren't put into any particular section. They also can't go in rodata because of how the D runtime start-up works, so they end up in the same section as __gshared data.

The same is also true of TypeInfo_Class (alias ClassInfo), where interface vtables are written packed immediately after the data structure ends.  Again, its size is treated as dynamic and is correctly recorded before being written, and again it cannot be in rodata because the __monitor field is directly written to.

Iain.
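
As a rough illustration of what "dynamic size" means for a record like this: a fixed header carries flag bits, and each set flag adds one optional pointer-sized field after the header. The Python sketch below is illustrative only; the flag names and values, the two-word header, and the 4-byte pointer size are assumptions made for the example, not druntime's actual definitions (those are in the object_.d file linked above).

```python
import struct

# Hypothetical flag bits, for illustration only (not druntime's real values).
HAS_CTOR = 0x1
HAS_DTOR = 0x2
HAS_UNITTEST = 0x4
ALL_OPTIONAL = HAS_CTOR | HAS_DTOR | HAS_UNITTEST

HEADER_SIZE = struct.calcsize("<II")  # assumed fixed header: two 32-bit words
POINTER_SIZE = 4                      # assumed 32-bit target


def record_size(flags):
    """Size of a record whose optional pointer fields follow a fixed header."""
    optional_fields = bin(flags & ALL_OPTIONAL).count("1")
    return HEADER_SIZE + optional_fields * POINTER_SIZE


print(record_size(0))                        # header only
print(record_size(HAS_CTOR | HAS_UNITTEST))  # header plus two optional fields
```

Because the size differs from module to module, the compiler has to compute it per symbol before emitting the data, which is the step described above as "correctly recorded before being sent to be written".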