Thread overview
--gc-sections and GDC
Jul 16, 2014
Mike
Jul 16, 2014
Mike
Jul 16, 2014
Iain Buclaw
Jul 16, 2014
Johannes Pfau
Jul 16, 2014
Iain Buclaw
Jul 17, 2014
David Nadlinger
Jul 17, 2014
Johannes Pfau
Jul 16, 2014
Jacob Carlborg
Jul 17, 2014
Daniel Murphy
July 16, 2014
I received a question from Dicebot at the end of my presentation.  He asked about the --gc-sections linker flag breaking code from GDC.

I recently discovered how one can see why this is occurring, and I hope this will help identify the problem and lead to a solution.

Compile any simple hello world program with the following gcc command:
gcc --verbose -Wl,--verbose test.c

Part of the output is GCC's internal linker script as shown below.  I believe this is the source of the problem.  Here's my theory.

D is not C, and GDC is likely generating code that the GCC internal linker script doesn't know about.  Because nothing references this code directly, it appears unused and may be incorrectly identified as dead code.  If D or GDC is generating any code like this, it needs to be marked as KEEP in the linker script.  You can see examples of this in GCC's internal linker script below.

If my theory is correct, GDC may have to make an internal linker script specifically for D's code generation that marks such code as KEEP.
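
As a rough illustration of what such an entry might look like, here is a minimal sketch modeled on the .preinit_array entry in the script below. The section name .d_moduleinfo and the two __d_moduleinfo_* symbols are made up for this example; they are not names GDC is known to emit.

```
  /* Hypothetical fragment; it would sit inside the SECTIONS { ... } block.
     .d_moduleinfo is an assumed section name used only for illustration.  */
  .d_moduleinfo   :
  {
    /* Start marker so a runtime could walk the entries.  */
    PROVIDE_HIDDEN (__d_moduleinfo_start = .);
    /* KEEP prevents --gc-sections from discarding these input sections
       even though nothing references them directly.  */
    KEEP (*(.d_moduleinfo))
    PROVIDE_HIDDEN (__d_moduleinfo_end = .);
  }
```

With an entry like this, the input sections survive garbage collection in the same way .init_array and .ctors do today.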

I hope I'm not just blowing smoke.

Mike

==================================================
/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
              "elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("/usr/x86_64-unknown-linux-gnu/lib64"); SEARCH_DIR("/usr/x86_64-unknown-linux-gnu/lib"); SEARCH_DIR("/usr/lib"); SEARCH_DIR("/usr/local/lib");
SECTIONS
{
  /* Read-only sections, merged into text segment: */
  PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
  .interp         : { *(.interp) }
  .note.gnu.build-id : { *(.note.gnu.build-id) }
  .hash           : { *(.hash) }
  .gnu.hash       : { *(.gnu.hash) }
  .dynsym         : { *(.dynsym) }
  .dynstr         : { *(.dynstr) }
  .gnu.version    : { *(.gnu.version) }
  .gnu.version_d  : { *(.gnu.version_d) }
  .gnu.version_r  : { *(.gnu.version_r) }
  .rela.dyn       :
    {
      *(.rela.init)
      *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
      *(.rela.fini)
      *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
      *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
      *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
      *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
      *(.rela.ctors)
      *(.rela.dtors)
      *(.rela.got)
      *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
      *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*)
      *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*)
      *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*)
      *(.rela.ifunc)
    }
  .rela.plt       :
    {
      *(.rela.plt)
      PROVIDE_HIDDEN (__rela_iplt_start = .);
      *(.rela.iplt)
      PROVIDE_HIDDEN (__rela_iplt_end = .);
    }
  .init           :
  {
    KEEP (*(SORT_NONE(.init)))
  }
  .plt            : { *(.plt) *(.iplt) }
  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  }
  .fini           :
  {
    KEEP (*(SORT_NONE(.fini)))
  }
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);
  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
  .rodata1        : { *(.rodata1) }
  .eh_frame_hdr : { *(.eh_frame_hdr) }
  .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) }
  .gcc_except_table   : ONLY_IF_RO { *(.gcc_except_table
  .gcc_except_table.*) }
  /* These sections are generated by the Sun/Oracle C++ compiler.  */
  .exception_ranges   : ONLY_IF_RO { *(.exception_ranges
  .exception_ranges*) }
  /* Adjust the address for the data segment.  We want to adjust up to
     the same address within the page on the next page up.  */
  . = ALIGN (CONSTANT (MAXPAGESIZE)) - ((CONSTANT (MAXPAGESIZE) - .) & (CONSTANT (MAXPAGESIZE) - 1)); . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE));
  /* Exception handling  */
  .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) }
  .gcc_except_table   : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
  .exception_ranges   : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) }
  /* Thread Local Storage sections  */
  .tdata          : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
  .tbss           : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  }
  .init_array     :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*)))
    KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors))
    PROVIDE_HIDDEN (__init_array_end = .);
  }
  .fini_array     :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*)))
    KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors))
    PROVIDE_HIDDEN (__fini_array_end = .);
  }
  .ctors          :
  {
    /* gcc uses crtbegin.o to find the start of
       the constructors, so we make sure it is
       first.  Because this is a wildcard, it
       doesn't matter if the user does not
       actually link against crtbegin.o; the
       linker won't look for a file to match a
       wildcard.  The wildcard also means that it
       doesn't matter which directory crtbegin.o
       is in.  */
    KEEP (*crtbegin.o(.ctors))
    KEEP (*crtbegin?.o(.ctors))
    /* We don't want to include the .ctor section from
       the crtend.o file until after the sorted ctors.
       The .ctor section from the crtend file contains the
       end of ctors marker and it must be last */
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }
  .dtors          :
  {
    KEEP (*crtbegin.o(.dtors))
    KEEP (*crtbegin?.o(.dtors))
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .dtors))
    KEEP (*(SORT(.dtors.*)))
    KEEP (*(.dtors))
  }
  .jcr            : { KEEP (*(.jcr)) }
  .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) }
  .dynamic        : { *(.dynamic) }
  .got            : { *(.got) *(.igot) }
  . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
  .got.plt        : { *(.got.plt)  *(.igot.plt) }
  .data           :
  {
    *(.data .data.* .gnu.linkonce.d.*)
    SORT(CONSTRUCTORS)
  }
  .data1          : { *(.data1) }
  _edata = .; PROVIDE (edata = .);
  . = .;
  __bss_start = .;
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* Align here to ensure that the .bss section occupies space up to
      _end.  Align after .bss to ensure correct alignment even if the
      .bss section disappears because there are no input sections.
      FIXME: Why do we need it? When there is no .bss section, we don't
      pad the .data section.  */
   . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  .lbss   :
  {
    *(.dynlbss)
    *(.lbss .lbss.* .gnu.linkonce.lb.*)
    *(LARGE_COMMON)
  }
  . = ALIGN(64 / 8);
  . = SEGMENT_START("ldata-segment", .);
  .lrodata   ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
  {
    *(.lrodata .lrodata.* .gnu.linkonce.lr.*)
  }
  .ldata   ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
  {
    *(.ldata .ldata.* .gnu.linkonce.l.*)
    . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  . = ALIGN(64 / 8);
  _end = .; PROVIDE (end = .);
  . = DATA_SEGMENT_END (.);
  /* Stabs debugging sections.  */
  .stab          0 : { *(.stab) }
  .stabstr       0 : { *(.stabstr) }
  .stab.excl     0 : { *(.stab.excl) }
  .stab.exclstr  0 : { *(.stab.exclstr) }
  .stab.index    0 : { *(.stab.index) }
  .stab.indexstr 0 : { *(.stab.indexstr) }
  .comment       0 : { *(.comment) }
  /* DWARF debug sections.
     Symbols in the DWARF debugging sections are relative to the beginning
     of the section so we begin them at 0.  */
  /* DWARF 1 */
  .debug          0 : { *(.debug) }
  .line           0 : { *(.line) }
  /* GNU DWARF 1 extensions */
  .debug_srcinfo  0 : { *(.debug_srcinfo) }
  .debug_sfnames  0 : { *(.debug_sfnames) }
  /* DWARF 1.1 and DWARF 2 */
  .debug_aranges  0 : { *(.debug_aranges) }
  .debug_pubnames 0 : { *(.debug_pubnames) }
  /* DWARF 2 */
  .debug_info     0 : { *(.debug_info .gnu.linkonce.wi.*) }
  .debug_abbrev   0 : { *(.debug_abbrev) }
  .debug_line     0 : { *(.debug_line .debug_line.* .debug_line_end ) }
  .debug_frame    0 : { *(.debug_frame) }
  .debug_str      0 : { *(.debug_str) }
  .debug_loc      0 : { *(.debug_loc) }
  .debug_macinfo  0 : { *(.debug_macinfo) }
  /* SGI/MIPS DWARF 2 extensions */
  .debug_weaknames 0 : { *(.debug_weaknames) }
  .debug_funcnames 0 : { *(.debug_funcnames) }
  .debug_typenames 0 : { *(.debug_typenames) }
  .debug_varnames  0 : { *(.debug_varnames) }
  /* DWARF 3 */
  .debug_pubtypes 0 : { *(.debug_pubtypes) }
  .debug_ranges   0 : { *(.debug_ranges) }
  /* DWARF Extension.  */
  .debug_macro    0 : { *(.debug_macro) }
  .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) }
  /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) }
}


==================================================

July 16, 2014
On Wednesday, 16 July 2014 at 13:52:57 UTC, Mike wrote:
> If my theory is correct, GDC may have to make an internal linker script specifically for D's code generation that marks such code as KEEP.
>
> I hope I'm not just blowing smoke.
>
> Mike

And I just checked with GDC...
gdc --verbose -Wl,--verbose test.d

... and the internal linker script is exactly the same as the C version.  That doesn't seem right to me.  I would expect them to be at least a little different.

Mike
July 16, 2014
On 16 July 2014 15:12, Mike via D.gnu <d.gnu@puremagic.com> wrote:
> And I just checked with GDC...
> gdc --verbose -Wl,--verbose test.d
>
> ... and the internal linker script is exactly the same as the C version. That doesn't seem right to me.  I would expect them to be at least a little different.
>
> Mike

Using a D-specific linker script is outside the scope of GDC itself.  I'd have to make a patch to Binutils.

And yes, a bespoke linker script would solve many problems that are currently managed by the compiler.

Regards
Iain
July 16, 2014
On 16/07/14 15:52, Mike wrote:
> I received a question from Dicebot at the end of my presentation.  He
> asked about the --gc-sections linker flag breaking code from GDC.

Have you seen this post: http://forum.dlang.org/thread/lbrfycmutwrrghtzazin@forum.dlang.org ?

-- 
/Jacob Carlborg
July 16, 2014
On Wed, 16 Jul 2014 15:35:09 +0100, "Iain Buclaw via D.gnu" <d.gnu@puremagic.com> wrote:

> Using a D-specific linker script is outside the scope of GDC itself.  I'd have to make a patch to Binutils.
> 
> And yes, a bespoke linker script would solve many problems that are currently managed by the compiler.
> 
> Regards
> Iain

Please don't start working on a D specific linker script, cause I'm already working on that ;-) I've only done moduleinfo so far, but TLS is next, then shared library support.
July 16, 2014
On 16 July 2014 21:03, Johannes Pfau via D.gnu <d.gnu@puremagic.com> wrote:
> Please don't start working on a D specific linker script, cause I'm already working on that ;-) I've only done moduleinfo so far, but TLS is next, then shared library support.

I wish I could clone you.
July 17, 2014
"Mike"  wrote in message news:cqzazaqxpwezignixuds@forum.dlang.org...

> If my theory is correct, GDC may have to make an internal linker script specifically for D's code generation that marks such code as KEEP.

There was some discussion of this at DConf, and a custom linker script is both straightforward and correct.

However, it is non-portable (AIUI) and non-scalable (i.e. you can't link with C/C++ code that has its own custom linker script).

An alternative solution was to emit dummy references to the D symbols (that would otherwise be marked with KEEP) from one of the existing KEEP sections (e.g. init_array).

This is what LDC is already using and what was attempted with DMD.

It may be possible to do in a nice way with linker scripts, but I would not personally go down that road.  I know others on here have a better understanding of this area and might be able to find a way.

July 17, 2014
On Wednesday, 16 July 2014 at 20:05:37 UTC, Johannes Pfau wrote:
> Please don't start working on a D specific linker script, cause I'm already working on that ;-) I've only done moduleinfo so far, but TLS is next, then shared library support.

Instead of a fully custom linker script, I'd go for extending the existing one using the INSERT AFTER/BEFORE commands. This way, there should be less potential for breaking any weird system-specific stuff. Within limits, such a script would also work fine with any custom scripts some weird C libraries might be using.
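
A minimal sketch of what such an extension script might look like. The file name d-sections.ld, the section name .d_moduleinfo and the __d_moduleinfo_* symbols are all hypothetical and chosen only for illustration; the choice of .data as the insertion point is likewise an assumption.

```
/* d-sections.ld: hypothetical add-on script, not an existing GDC file.  */
SECTIONS
{
  .d_moduleinfo :
  {
    PROVIDE_HIDDEN (__d_moduleinfo_start = .);
    /* KEEP protects the entries from --gc-sections.  */
    KEEP (*(.d_moduleinfo))
    PROVIDE_HIDDEN (__d_moduleinfo_end = .);
  }
}
/* Splice the section above into the default script after .data
   instead of replacing the default script.  */
INSERT AFTER .data;
```

Something like gdc test.d -Wl,-T,d-sections.ld would pass it to the linker. Per the ld documentation, a -T script that contains INSERT augments the default linker script instead of replacing it, which is what keeps the system-specific parts intact.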

But still, the problem of making this transparent to the user remains. There is a bit of trickery you can do with implicit linker scripts, but ultimately I couldn't get it to behave nicely, i.e. be consistently linkable using "gcc".

In the end, this seemed far more troublesome than just working around the problem, especially since you have to make it work on all platforms. Even if you restrict yourself to common x86_64 Linux distros, you have to support various versions of both ld.bfd and ld.gold, and the latter doesn't natively use linker scripts (there is an emulation layer, which mostly works, but seemed to behave slightly differently in my tests).

David
July 17, 2014
On Thu, 17 Jul 2014 11:29:56 +0000, "David Nadlinger" <code@klickverbot.at> wrote:

> On Wednesday, 16 July 2014 at 20:05:37 UTC, Johannes Pfau wrote:
> > Please don't start working on a D specific linker script, cause I'm already working on that ;-) I've only done moduleinfo so far, but TLS is next, then shared library support.
> 
> Instead of a fully custom linker script, I'd go for extending the existing one using the INSERT AFTER/BEFORE commands. This way, there should be less potential for breaking any weird system-specific stuff. Within limits, such a script would also work fine with any custom scripts some weird C libraries might be using.

My idea is actually different: instead of working against the standard linker scripts, I'll modify them to support D. They already have C++-specific support, and D support can be added in a way that leaves non-D programs completely unaffected. The binutils maintainers were also cooperative when Iain added D support to binutils.

So I'll first finish a proof of concept, then ask on the binutils mailing list whether they'd accept D-specific changes. After that I'll ask you guys to make sure we have a solution for all compilers, and finally provide a patch for binutils.
> 
> But still, the problem of making this transparent to the user remains. There is a bit of trickery you can do with implicit linker script, but ultimately I couldn't get it to behave nicely, i.e. be consistently linkable using "gcc".

Linking with gcc is impossible, even with modified standard linker scripts. The reason is that you can't bracket the TLS sections with the linker script; you can only tell it to move symbols from certain object files to the start/end of the section. So you'll always need to pass dstart.o and dend.o to the linker for shared libraries and the main application.

I don't think this is a problem, g++ does exactly the same and therefore suffers from the same problem.
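
Moving the contents of particular object files to the start or end of a section is the same technique the default script already uses for crtbegin.o and crtend.o in .ctors. A rough sketch of how that might look for TLS data, assuming dstart.o and dend.o each contribute a marker to the plain .tdata input section (an assumption made for illustration; the real layout of those files may differ):

```
  .tdata          :
  {
    /* The marker from dstart.o goes first.  */
    KEEP (*dstart.o(.tdata))
    /* All other TLS data, skipping dend.o so its marker can go last.  */
    *(EXCLUDE_FILE (*dend.o) .tdata)
    *(.tdata.* .gnu.linkonce.td.*)
    /* The marker from dend.o goes last.  */
    KEEP (*dend.o(.tdata))
  }
```

Because the file patterns are wildcards, the linker does not care which directory dstart.o and dend.o live in, but they still have to appear on the link line, which is why a plain "gcc" link would not pick them up.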
> 
> In the end, this seemed far more troublesome than just working around the problem, especially since you have to make it work on all platforms. Even if you restrict yourself to common x86_64 Linux distros, you have to support various versions of both ld.bfd and ld.gold, and the latter doesn't natively use linker scripts (there is an emulation layer, which mostly works, but seemed to behave slightly differently in my tests).
> 
The standard linker scripts are generated from a template. Modify once,
benefit for (almost) all architectures. Only drawback: This will then
of course require a very new binutils version.
Gold should support standard linker scripts, AFAIK. I only use
features also used by the C++ runtime so it has to work ;-)