Thread overview
[dmd-internals] How to avoid 4-word padding for symbols?
Nov 08, 2010
Michel Fortin
Nov 08, 2010
Walter Bright
Nov 08, 2010
Michel Fortin
Nov 08, 2010
Michel Fortin
Nov 08, 2010
Walter Bright
Nov 08, 2010
Michel Fortin
Nov 08, 2010
Walter Bright
Nov 09, 2010
Michel Fortin
November 07, 2010
It's probably a small thing that I'm missing, but I have a small problem where the code below causes undesired padding to be added between two symbols in the object file. This code writes 5 words to the object file:

    symbol = symbol_name(sname, SCstatic, type_fake(TYnptr));
    symbol->Sdt = dt;
    symbol->Sseg = objc_getsegment(SEGprotocol);
    outdata(symbol);

If you call it 2 times, it generates 5*2 words plus 3 words of padding between the two (dashes added to show the padding):

[25] 002d2 0034 000da6    0 0f94   2 10000000 00 00 __protocol __OBJC
 0000:   0  0  0  0  6  3  0  0  0  0  0  0  0  0  0  0   ................
 0010:   0  0  0  0 -0--0--0--0--0--0--0--0--0--0--0--0   ................
 0020:   0  0  0  0 36  3  0  0  0  0  0  0  0  0  0  0   ....6...........
 0030:   0  0  0  0                     ....

This padding is causing a crash when starting the Objective-C runtime because it expects the two to be contiguous.

What needs to be changed to remove this padding?

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



November 07, 2010
Symbols are aligned when written out to the data segment.

To defeat the alignment, symbols that should be adjacent should be merged into one symbol.
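
The padding in the dump above follows from simple round-up arithmetic. The following is a minimal sketch, not dmd backend code: `align_up` and `padding_between` are hypothetical helper names, and the 16-byte alignment is an assumption read off the dump (second symbol starting at offset 0x20).

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `alignment`.
   `alignment` must be a power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Padding bytes that appear after a symbol of `size` bytes when the
   next symbol must start on an `alignment`-byte boundary. */
static size_t padding_between(size_t size, size_t alignment)
{
    return align_up(size, alignment) - size;
}
```

With 20-byte (5-word) symbols, `padding_between(20, 16)` is 12 bytes, i.e. the 3 words of dashes shown in the dump, while `padding_between(20, 4)` is 0.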

November 07, 2010
I hoped it wouldn't have to come to that. I'm a little surprised that symbols have to be 16-byte aligned.

But thanks for confirming what I'll have to do.


On 2010-11-07 at 21:05, Walter Bright wrote:

> Symbols are aligned when written out to the data segment.
> 
> To defeat the alignment, symbols that should be adjacent should be merged into one symbol.
> 
> _______________________________________________
> dmd-internals mailing list
> dmd-internals at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



November 08, 2010
Turns out it doesn't work because those symbols refer to each other, so by coalescing them in one symbol I'm creating a self-referenced symbol and the backend doesn't like that.

I resorted to adding a 'noalign' flag to outdata() in the backend. It's a small change and it works, but it feels like a hack. Ideally there should be a way to specify the alignment per segment (there might already be one, but I can't figure it out). Since the Objective-C stuff belongs to separate segments it'd make things really easy.

I know there is the 'align' parameter in mach_getsegment(), but it doesn't seem to have any effect other than changing alignment value for the section in the object file.
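
To make the idea of a 'noalign' flag concrete, here is a toy model of an emitter. It is only a sketch under stated assumptions: `toy_segment`, `toy_outdata` and `DEFAULT_ALIGN` are hypothetical names, and this is not the actual outdata() change in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: default symbol alignment of 16 bytes, as seen in the dump. */
enum { DEFAULT_ALIGN = 16 };

/* Hypothetical stand-in for a segment's write cursor. */
struct toy_segment { size_t offset; };

/* Reserve `size` bytes in `seg` and return the offset where the symbol
   starts. When `noalign` is nonzero, the round-up step is skipped, so the
   symbol is placed directly after the previous one. */
static size_t toy_outdata(struct toy_segment *seg, size_t size, int noalign)
{
    assert(seg != NULL);
    if (!noalign)
        seg->offset = (seg->offset + DEFAULT_ALIGN - 1)
                      & ~(size_t)(DEFAULT_ALIGN - 1);
    size_t start = seg->offset;
    seg->offset += size;
    return start;
}
```

Emitting two 20-byte symbols with `noalign` set places them at offsets 0 and 20; without it, the second one lands at offset 32.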


On 2010-11-07 at 21:05, Walter Bright wrote:

> Symbols are aligned when written out to the data segment.
> 
> To defeat the alignment, symbols that should be adjacent should be merged into one symbol.
> 

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



November 08, 2010

Michel Fortin wrote:
> Turns out it doesn't work because those symbols refer to each other, so by coalescing them in one symbol I'm creating a self-referenced symbol and the backend doesn't like that.
>
> I resorted to adding a 'noalign' flag to outdata() in the backend. It's a small change and it works, but it feels like a hack. Ideally there should be a way to specify the alignment per segment (there might already be one, but I can't figure it out). Since the Objective-C stuff belongs to separate segments it'd make things really easy.
>
> I know there is the 'align' parameter in mach_getsegment(), but it doesn't seem to have any effect other than changing alignment value for the section in the object file.
> 

Do you mean alignment per segment, or per symbol?

November 08, 2010
On 2010-11-08 at 16:01, Walter Bright wrote:

> Michel Fortin wrote:
>> Turns out it doesn't work because those symbols refer to each other, so by coalescing them in one symbol I'm creating a self-referenced symbol and the backend doesn't like that.
>> 
>> I resorted to adding a 'noalign' flag to outdata() in the backend. It's a small change and it works, but it feels like a hack. Ideally there should be a way to specify the alignment per segment (there might already be one, but I can't figure it out). Since the Objective-C stuff belongs to separate segments it'd make things really easy.
>> 
>> I know there is the 'align' parameter in mach_getsegment(), but it doesn't seem to have any effect other than changing alignment value for the section in the object file.
> 
> Do you mean alignment per segment, or per symbol?

I mean the alignment of the symbols in a specific segment. In this case I need symbols inside the __protocol segment of the __OBJC section to have a 4-byte alignment. I'd have expected the align parameter of mach_getsegment() to do that, but setting it to 2 doesn't do anything.

Since all the symbols are 20 bytes, I can just emit them with no alignment at all and get the same result; that's what my small hack does.
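
The reasoning here can be checked with a few lines of C. This is a sketch with a hypothetical function name, assuming the align value is a power-of-two exponent (so 2 means 4 bytes), which is how the message above uses it.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if `count` symbols of `size` bytes, packed back to back from
   offset 0, all start on a (1 << log2_align)-byte boundary; 0 otherwise. */
static int packed_offsets_are_aligned(size_t size, size_t count,
                                      unsigned log2_align)
{
    assert(log2_align < sizeof(size_t) * 8);
    size_t alignment = (size_t)1 << log2_align;
    for (size_t i = 0; i < count; i++)
        if ((i * size) % alignment != 0)
            return 0;
    return 1;
}
```

Because 20 is a multiple of 4, packed 20-byte symbols are always 4-byte aligned (`log2_align` of 2), but not 16-byte aligned (`log2_align` of 4), which is why the default alignment inserts padding.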

If you clone the Git repository you'll see it's the last commit (01c7708c82).

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



November 08, 2010

Michel Fortin wrote:
> I mean the alignment of the symbols in a specific segment. In this case I need symbols inside the __protocol segment of the __OBJC section to have a 4-byte alignment. I'd have expected the align parameter of mach_getsegment() to do that, but setting it to 2 doesn't do anything.
> 

The segment alignment is for the whole segment, not for individual data items in it.
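
A small sketch of that distinction, using hypothetical helpers and an arbitrary starting position (0x102): the segment's alignment only moves where the segment begins, while the gap between two items inside it depends solely on their offsets within the segment.

```c
#include <assert.h>
#include <stddef.h>

/* Place a segment at the first (1 << log2_align)-byte boundary at or
   after `cursor`. Assumes the alignment is given as an exponent. */
static size_t segment_base(size_t cursor, unsigned log2_align)
{
    assert(log2_align < sizeof(size_t) * 8);
    size_t alignment = (size_t)1 << log2_align;
    return (cursor + alignment - 1) & ~(alignment - 1);
}

/* Distance between two items in the same segment. The base cancels out,
   so the segment's alignment has no effect on it. */
static size_t item_gap(size_t base, size_t first_offset, size_t second_offset)
{
    return (base + second_offset) - (base + first_offset);
}
```

Changing `log2_align` from 4 to 2 moves the base from 0x110 to 0x104 in this example, but two items at offsets 0 and 32 are still 32 bytes apart.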

> Since all the symbols are 20 bytes, I can just emit them with no alignment at all and get the same result; that's what my small hack does.
>
> If you clone the Git repository you'll see it's the last commit (01c7708c82).
>
> 
November 08, 2010
On 2010-11-08 at 18:14, Walter Bright wrote:

> Michel Fortin wrote:
>> I mean the alignment of the symbols in a specific segment. In this case I need symbols inside the __protocol segment of the __OBJC section to have a 4-byte alignment. I'd have expected the align parameter of mach_getsegment() to do that, but setting it to 2 doesn't do anything.
> 
> The segment alignment is for the whole segment, not for individual data items in it.

Indeed, that makes sense.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/