Compiler-generated implicit symbols and --gc-sections
January 03, 2014
I ran into a problem recently that resulted in a segmentation fault in my program whenever I called a member function of one of my classes.  Sometimes it occurred and sometimes it didn't depending on the order of certain things in my code.


I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...

.data._D38TypeInfo_E14TypeInfo_Class10ClassFlags6__initZ
.data._D40TypeInfo_E15TypeInfo_Struct11StructFlags6__initZ

... were being discarded.  I'm assuming this is the mangled .init values of these types, yes?
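For reference, these names can be taken apart by hand. The sketch below is a minimal splitter, not a full D demangler: it assumes the symbol is `_D` followed by length-prefixed identifiers, and it returns whatever trails the last identifier unparsed. The function name `split_d_symbol` is made up for this illustration.

```python
def split_d_symbol(mangled):
    """Split a simple D mangled name into its length-prefixed identifiers.

    Returns (parts, rest), where rest is the unparsed trailing text.
    This is an illustrative sketch and handles only the simple form
    "_D" + (<decimal length><identifier>)* + <suffix>.
    """
    if not mangled.startswith("_D"):
        raise ValueError("not a D mangled name: %r" % mangled)
    pos = 2
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        # Read the decimal length prefix.
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        # Take exactly that many characters as the identifier.
        parts.append(mangled[end:end + length])
        pos = end + length
    return parts, mangled[pos:]


if __name__ == "__main__":
    for sym in (
        "_D38TypeInfo_E14TypeInfo_Class10ClassFlags6__initZ",
        "_D40TypeInfo_E15TypeInfo_Struct11StructFlags6__initZ",
    ):
        parts, rest = split_d_symbol(sym)
        print(".".join(parts), rest)
```

Both symbols split into two identifiers, the second being `__init`. The first identifier begins with `TypeInfo_E`, which appears to be the TypeInfo for an enum type (`ClassFlags` nested in `TypeInfo_Class`, and `StructFlags` nested in `TypeInfo_Struct`), so reading these as initializer data for those types looks reasonable.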



My linker script contained...

.data : AT (__data_rom_begin)
    {
	. = ALIGN(4);
	__data_ram_begin = .;
	
	. = ALIGN(4);
	*(.data)
	*(.data*)

	. = ALIGN(4);
	__data_ram_end = .;
    } >SRAM

... so I was forced to conclude that they were being discarded because the linker couldn't find any code that reaches these symbols.



After adding...

KEEP(*(.data.*init*))

... to my linker script, the problem was resolved.
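Put together, the modified section might look like the sketch below. The post does not say where the KEEP line was added, so its position here, inside the .data output section and ahead of the broader `*(.data*)` pattern, is an assumption. Everything else is copied from the script above.

```
.data : AT (__data_rom_begin)
{
    . = ALIGN(4);
    __data_ram_begin = .;

    /* Assumed placement: retain the compiler-generated init data even
       when --gc-sections finds no reference to it. */
    KEEP(*(.data.*init*))

    *(.data)
    *(.data*)

    . = ALIGN(4);
    __data_ram_end = .;
} >SRAM
```

The pattern `.data.*init*` is broad: it keeps every data section with "init" anywhere in its name, not only the two symbols listed earlier.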

I'm guessing these are generated implicitly by the GDC compiler, but it does appear that my code never reaches these symbols, so discarding them should be OK.  However, it seems discarding them causes dislocation in memory.

I'm still a novice with GCC-based toolchains, so forgive the ignorance of this question, but is this to be expected, or is this an indication of a problem with the compiler?

Mike

Compiler:
Latest GDC 4.8 compiled for arm-none-eabi (ARM Cortex-M)
January 04, 2014
On Friday, 3 January 2014 at 18:14:58 UTC, Mike wrote:
> I ran into a problem recently that resulted in a segmentation fault in my program whenever I called a member function of one of my classes.  Sometimes it occurred and sometimes it didn't depending on the order of certain things in my code.
>
>
> I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...
>
> .data._D38TypeInfo_E14TypeInfo_Class10ClassFlags6__initZ
> .data._D40TypeInfo_E15TypeInfo_Struct11StructFlags6__initZ
>
> ... were being discarded.  I'm assuming this is the mangled .init values of these types, yes?
>
>
>
> My linker script contained...
>
> .data : AT (__data_rom_begin)
>     {
> 	. = ALIGN(4);
> 	__data_ram_begin = .;
> 	
> 	. = ALIGN(4);
> 	*(.data)
> 	*(.data*)
>
> 	. = ALIGN(4);
> 	__data_ram_end = .;
>     } >SRAM
>
> ... so I was forced to conclude that the reason they were being discarded was because it couldn't find any code that was reaching these symbols.
>
>
>
> After adding...
>
> KEEP(*(.data.*init*))
>
> ... to my linker script, the problem was resolved.
>
> I'm guessing these are generated implicitly by the GDC compiler, but it does appear that my code never reaches these symbols, so discarding them should be OK.  However, it seems discarding them causes dislocation in memory.
>
> I'm still a novice with GCC-based toolchains, so forgive the ignorance of this question, but is this to be expected, or is this an indication of a problem with the compiler?
>
> Mike
>
> Compiler:
> Latest GDC 4.8 compiled for arm-none-eabi (ARM Cortex-M)

Again, I am guessing a little, but...

In dmd and IDEs it is common to compile and link everything at once. The compiler has all the information available and may remove unused code and data.

The gcc system is made for separate compilation. When compiling a file, the compiler has no idea how other files call functions and objects in this file, so it has to emit at least the default set of resources. Whether they are used is known only at link time. I do not know if the linker is able to remove unused code or data and what flags are needed.

Because the data is referenced from other files, there has to be a common naming system. Maybe it would be possible to use named variables, but for some reason they have decided to name a separate section for every piece of info. Every class, struct, etc. will have its own sections, and there will be lots of them. I have just included all of them without thinking.

It may also be possible that the code or data is in use. In the asm file there is a table of data after each function. The code may get a word from the table. This may be a pointer to another table in another function in another module. There may be an offset to another place in the table and there may be a pointer to this strange section. Without looking at the whole program in a debugger it is impossible to say whether the code and data are actually used or not.
January 06, 2014
On Saturday, 4 January 2014 at 07:59:55 UTC, Timo Sintonen wrote:
> In dmd and IDEs it is common to compile and link everything at once. The compiler has all the information available and may remove unused code and data.

Actually no D compiler does it out of the box as far as I am aware. It is a big long-standing problem.
January 06, 2014
On Friday, 3 January 2014 at 18:14:58 UTC, Mike wrote:
> I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...

I never got --gc-sections to work reliably with D without going dirty, crashes were somewhat common for any non-trivial program. Don't think this particular use case is tested by anyone at all, you are on your own once you get here.
January 06, 2014
On 6 Jan 2014 13:45, "Dicebot" <public@dicebot.lv> wrote:
>
> On Friday, 3 January 2014 at 18:14:58 UTC, Mike wrote:
>>
>> I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...
>
>
> I never got --gc-sections to work reliably with D without going dirty, crashes were somewhat common for any non-trivial program. Don't think this particular use case is tested by anyone at all, you are on your own once you get here.

Of course! --gc-sections is just a dirty hack.  If you want smaller binaries, then you are better off aiding the shared library support. :)

I don't ever recall any of the core maintainers ever endorsing that switch anyway....


January 07, 2014
On Monday, 6 January 2014 at 18:59:00 UTC, Iain Buclaw wrote:
> On 6 Jan 2014 13:45, "Dicebot" <public@dicebot.lv> wrote:
>>
>> On Friday, 3 January 2014 at 18:14:58 UTC, Mike wrote:
>>>
>>> I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...
>>
>>
>> I never got --gc-sections to work reliably with D without going dirty, crashes were somewhat common for any non-trivial program. Don't think this particular use case is tested by anyone at all, you are on your own once you get here.
>
> Of course ! --gc-sections is just a dirty hack.  If you want smaller
> binaries, then you are better off aiding the shared library support. :)
>
> I don't ever recall any of the core maintainers ever endorsing that switch
> anyway....

I agree that the --gc-sections method is hackish, but I wouldn't say it's dirty.  And, in the absence of a better method, it is *essential* in the embedded world, and was likely added specifically to make the GNU toolchain a feasible alternative for the embedded market.  I doubt the Arduino, with its 32KB of flash memory, would have even been created without it.

The STM32 processors that I use have 16 to 1024KB of flash on them, and --gc-sections is essential to get some programs to fit.  Furthermore, it saves my employer tens of thousands of dollars in hardware costs for mass-produced devices.  With --gc-sections, these devices can be built with C/C++, libsupc++, newlib, and libc++ quite effectively.  Without it, this would be impossible.

Shared library support just doesn't apply in this world.  Most of the devices I build are single-threaded, and much of the code in the libraries is just never called, and hacking the library's source code with #defines to strip out stuff is a non-solution.

I'm interested in knowing why --gc-sections works well for C/C++ programs but not D, and I hope the compilers will eventually emit code that can support it.
It would be sad if D fragmented into D and embedded-D.  I don't think that would serve the D language well.

I'm liking D so far, and I'm very interested in seeing D become an alternative for the embedded world.  I'm willing to help in any way I can.

January 07, 2014
On Monday, 6 January 2014 at 18:59:00 UTC, Iain Buclaw wrote:
> Of course ! --gc-sections is just a dirty hack.  If you want smaller
> binaries, then you are better off aiding the shared library support. :)
>
> I don't ever recall any of the core maintainers ever endorsing that switch
> anyway....

Hack or not, it is pretty much the only existing solution for binary bloat, which is very strong in D. Shared library support is completely irrelevant here - it does not fix the problem of compilers generating a lot of code that is never actually used.
January 07, 2014
On Tuesday, 7 January 2014 at 02:17:46 UTC, Mike wrote:
> On Monday, 6 January 2014 at 18:59:00 UTC, Iain Buclaw wrote:
>> On 6 Jan 2014 13:45, "Dicebot" <public@dicebot.lv> wrote:
>>>
>>> On Friday, 3 January 2014 at 18:14:58 UTC, Mike wrote:
>>>>
>>>> I eventually tracked it down to the fact that I was compiling with -ffunction-sections and -fdata-sections and linking with --gc-sections and symbols like...
>>>
>>>
>>> I never got --gc-sections to work reliably with D without going dirty, crashes were somewhat common for any non-trivial program. Don't think this particular use case is tested by anyone at all, you are on your own once you get here.
>>
>> Of course ! --gc-sections is just a dirty hack.  If you want smaller
>> binaries, then you are better off aiding the shared library support. :)
>>
>> I don't ever recall any of the core maintainers ever endorsing that switch
>> anyway....
>
> I agree that the --gc-sections method is hackish, but I wouldn't say it's dirty.  And, in the absence of a better method, it is *essential* in the embedded world, and was likely added specifically to make the GNU toolchain a feasible alternative for the embedded market.  I doubt the Arduino, with its 32KB of flash memory, would have even been created without it.
>
> The STM32 processors that I use have 16 to 1024KB of flash on them, and --gc-sections is essential to get some programs to fit.  Furthermore, it saves my employer tens of thousands of dollars in hardware costs for mass-produced devices.  With --gc-sections, these devices can be built with C/C++, libsupc++, newlib, and libc++ quite effectively.  Without it, this would be impossible.
>
> Shared library support just doesn't apply in this world.  Most of the devices I build are single-threaded, and much of the code in the libraries is just never called, and hacking the library's source code with #defines to strip out stuff is a non-solution.
>
> I'm interested in knowing why --gc-sections works well for C/C++ programs but not D, and I hope the compilers will eventually emit code that can support it.
> It would be sad if D fragmented into D and embedded-D.  I don't think that would serve the D language well.
>
> I'm liking D so far, and I'm very interested in seeing D become an alternative for the embedded world.  I'm willing to help in any way I can.

I ran into this recently when compiling for Android/x86, as the Android NDK enables --gc-sections at link time by default.  I was able to reproduce the segfault with dmd by building sieve.d from the samples as a linux/x86 executable with the --gc-sections flag added to the linker command.  I think sieve.d was working fine when I removed the recent patches for shared library support on linux, in sections_linux.d, so this incompatibility might be related to the shared library work.  I'm not sure if you're even using that work though, so maybe that's just one of the ways that gc-sections trips up.
January 09, 2014
On Tuesday, 7 January 2014 at 11:04:45 UTC, Joakim wrote:
> I ran into this recently when compiling for Android/x86, as the Android NDK linker calls --gc-sections by default.  I was able to reproduce the segfault with dmd compiling a linux/x86 executable with the --gc-sections flag added to the linker command, when compiling sieve.d from the samples.  I think sieve.d was working fine when I removed the recent patches for shared library support on linux, in sections_linux.d, so this incompatibility might be related to the shared library work.  I'm not sure if you're even using that work though, so maybe that's just one of the ways that gc-sections trips up.

Interesting!  I'd like to take the current 4.8 backport and compile it without the shared library stuff to test this out.  But I don't know how.  Would you mind giving me a quick explanation on how to remove these patches using git?  I'm really quite new to some of these tools.
January 09, 2014
On Thursday, 9 January 2014 at 07:51:48 UTC, Mike wrote:
> On Tuesday, 7 January 2014 at 11:04:45 UTC, Joakim wrote:
>> I ran into this recently when compiling for Android/x86, as the Android NDK linker calls --gc-sections by default.  I was able to reproduce the segfault with dmd compiling a linux/x86 executable with the --gc-sections flag added to the linker command, when compiling sieve.d from the samples.  I think sieve.d was working fine when I removed the recent patches for shared library support on linux, in sections_linux.d, so this incompatibility might be related to the shared library work.  I'm not sure if you're even using that work though, so maybe that's just one of the ways that gc-sections trips up.
>
> Interesting!  I'd like to take the current 4.8 backport and compile it without the shared library stuff to test this out.  But I don't know how.  Would you mind giving me a quick explanation on how to remove these patches using git?  I'm really quite new to some of these tools.

Never mind that last post. I thought you were talking about code in GDC, not the runtime.  My runtime is only about 400 lines total, and I'm not anywhere near sections.d.