January 08, 2016 Implementing native TLS on OS X in DMD

This might be a bit odd to ask in the LDC newsgroup, but since LDC already supports native TLS on OS X I was hoping to get some help here.

I've implemented native TLS on OS X in DMD to the best of my knowledge. The data in the sections looks correct, the assembly looks correct, and I've updated druntime to use the same code, in this regard, as LDC does. Everything seems to work correctly in the simple cases I've tried.

But I have an issue when the garbage collector is run, in particular when running the DMD test suite. The failing test is this one [1]. I get a segmentation fault (in the debugger, a range error) here [2], after executing the outer loop once. I highly suspect that the garbage collector collects "_chars" [3] (or its content) too early, since the destructor of SomeClass [4] is executed. If I make "_chars" __gshared it doesn't crash. If I remove the call to the GC [5], it doesn't crash.

I've been trying to debug this but I don't have much knowledge in this area. What I have found out is that "_chars" is included in the range returned by _d_dyld_getTLSRange [6]. I've been trying to debug the GC, and it looks like "_chars" is marked twice before crashing. Or at least a range where "_chars" is included.

One thing that worries me, though, is that the range returned by _d_dyld_getTLSRange for LDC is quite a lot larger (around 3500) than for DMD (around 650). But I noticed that LDC has a couple of additional TLS symbols that DMD doesn't have. If I recall correctly, they looked like they were related to exception handling.

Any ideas what can be wrong, or suggestions for how to debug this further?

[1] https://github.com/D-Programming-Language/dmd/blob/7a7687e6e5b46ab9629bcdddb3061478c504ae49/test/runnable/testaa.d#L401
[2] https://github.com/D-Programming-Language/dmd/blob/7a7687e6e5b46ab9629bcdddb3061478c504ae49/test/runnable/testaa.d#L410
[3] https://github.com/D-Programming-Language/dmd/blob/7a7687e6e5b46ab9629bcdddb3061478c504ae49/test/runnable/testaa.d#L388
[4] https://github.com/D-Programming-Language/dmd/blob/7a7687e6e5b46ab9629bcdddb3061478c504ae49/test/runnable/testaa.d#L372
[5] https://github.com/D-Programming-Language/dmd/blob/7a7687e6e5b46ab9629bcdddb3061478c504ae49/test/runnable/testaa.d#L413
[6] https://github.com/ldc-developers/druntime/blob/ldc/src/rt/sections_ldc.d#L432

-- 
/Jacob Carlborg
January 08, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Jacob Carlborg

On 8 Jan 2016, at 8:37, Jacob Carlborg via digitalmars-d-ldc wrote:
> I've been trying to debug this but I don't have much knowledge in this area. What I have found out is that "_chars" is included in the range returned by _d_dyld_getTLSRange [6]. I've been trying to debug the GC, and it looks like "_chars" is marked twice before crashing. Or at least a range where "_chars" is included.

It's been a while since I initially looked into getting the TLS to work, but did you check that _chars is properly aligned (i.e. to 8 bytes on x86_64)? This would be one way the GC could miss the pointer even though the global is contained in a root range.

If that's not it, I'd just continue trying to figure out exactly which objects are collected (not marked) and why.

> If I recall correctly, they looked like they were related to exception handling.

There is currently a per-thread cache for exception handling metadata, yes. It contains a subtle bug, though (related to moving fibers between threads), and will probably go away.

-- David
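To illustrate David's point, here is a minimal sketch in C of how a scan that reads a root range one pointer-sized word at a time can miss a pointer stored at a misaligned offset. It is a simplified model written for this thread, not druntime's actual marking code; the function names, the 32-byte range, and the target value are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a conservative scan: read the range one
   pointer-sized word at a time, starting at offset 0, and count the
   words whose value equals `target`. */
size_t count_hits(const unsigned char *range, size_t len, uintptr_t target)
{
    size_t hits = 0;
    for (size_t off = 0; off + sizeof(uintptr_t) <= len; off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, range + off, sizeof word); /* safe unaligned read */
        if (word == target)
            hits++;
    }
    return hits;
}

/* Store `target` at byte offset `off` of a zeroed 32-byte range,
   then scan the range for it. */
size_t hits_when_stored_at(size_t off, uintptr_t target)
{
    unsigned char range[32] = {0};
    memcpy(range + off, &target, sizeof target);
    return count_hits(range, sizeof range, target);
}
```

In this model, a value stored at offset 0 or `sizeof(uintptr_t)` is found, while the same value stored at offset `sizeof(uintptr_t) / 2` straddles two scanned words and is never seen, even though it lies inside the scanned range.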
January 08, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to David Nadlinger

On 2016-01-08 16:32, David Nadlinger via digitalmars-d-ldc wrote:
> It's been a while since I initially looked into getting the TLS to work,
> but did you check that _chars is properly aligned (i.e. to 8 bytes on
> x86_64)? This would be one way the GC could miss the pointer even
> though the global is contained in a root range.

That seemed to be the issue, it works now. Awesome :) thanks.

A couple of follow-up questions:

* I'm looking at the assembly output of LDC, and it looks like LDC aligns to the size of the type, i.e. "int" to 4 and "long" to 8 and so on. Is that the case?

* It looks like it only uses the above form of alignment if the symbol is placed in the __thread_bss section, i.e. doesn't have an initializer. Does that make sense? If it has an initializer and is placed in the __thread_data section it will have an alignment of 3 or 4, depending on the size of the variable.

-- 
/Jacob Carlborg
January 08, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Jacob Carlborg

On 2016-01-08 17:40, Jacob Carlborg wrote:

Adding the assembly for convenience.

> * I'm looking at the assembly output of LDC, and it looks like LDC aligns
> to the size of the type, i.e. "int" to 4 and "long" to 8 and so on. Is
> that the case?

Without initializer:

    .tbss __D4main1ai$tlv$init, 4, 3

BTW, do you know what the above 3 is?

> * It looks like it only uses the above form of alignment if the symbol
> is placed in the __thread_bss section, i.e. doesn't have an initializer.
> Does that make sense? If it has an initializer and is placed in the
> __thread_data section it will have an alignment of 3 or 4, depending on
> the size of the variable.

With initializer:

    .section __DATA,__thread_data,thread_local_regular
    .align 3
    __D4main1ai$tlv$init:
    .long 4

-- 
/Jacob Carlborg
January 09, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Jacob Carlborg

Jacob Carlborg <doob@me.com> writes:

> On 2016-01-08 17:40, Jacob Carlborg wrote:
>
> Adding the assembly for convenience.
>
>> * I'm looking at the assembly output of LDC, and it looks like LDC aligns to the size of the type, i.e. "int" to 4 and "long" to 8 and so on. Is that the case?
>
> Without initializer:
>
> .tbss __D4main1ai$tlv$init, 4, 3
>
> BTW, do you know what the above 3 is?

3 is the alignment, like .p2align (power-of-2 alignment): 2^3 in this case (8 bytes).

>> * It looks like it only uses the above form of alignment if the symbol is placed in the __thread_bss section, i.e. doesn't have an initializer. Does that make sense? If it has an initializer and is placed in the __thread_data section it will have an alignment of 3 or 4, depending on the size of the variable.
>
> With initializer:
>
> .section __DATA,__thread_data,thread_local_regular
> .align 3
> __D4main1ai$tlv$init:
> .long 4

Same 8-byte alignment (on OS X, .align is a synonym for .p2align).

The tbss and tdata declarations match.
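The power-of-2 convention Dan describes can be written out as a small C sketch. The helper names below are hypothetical and exist only to make the arithmetic explicit: an operand of 3 means 2^3 = 8 bytes, and an operand of 2 means 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a power-of-2 alignment operand, as in `.align 3`,
   to an alignment in bytes. */
size_t p2align_to_bytes(unsigned p2align)
{
    return (size_t)1 << p2align;
}

/* Inverse direction: the smallest exponent e such that 2^e >= bytes.
   For a 4-byte int this gives 2; for an 8-byte long it gives 3. */
unsigned bytes_to_p2align(size_t bytes)
{
    unsigned e = 0;
    while (((size_t)1 << e) < bytes)
        e++;
    return e;
}
```

For example, `p2align_to_bytes(3)` is 8 and `bytes_to_p2align(4)` is 2.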
January 09, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Dan Olson

Dan Olson <gorox@comcast.net> writes:

> Jacob Carlborg <doob@me.com> writes:
>
>> On 2016-01-08 17:40, Jacob Carlborg wrote:
>>
>> Adding the assembly for convenience.
>>
>>> * I'm looking at the assembly output of LDC, and it looks like LDC aligns to the size of the type, i.e. "int" to 4 and "long" to 8 and so on. Is that the case?
>>
>> Without initializer:
>>
>> .tbss __D4main1ai$tlv$init, 4, 3
>>
>> BTW, do you know what the above 3 is?
>
> 3 is the alignment, like .p2align (power-of-2 alignment):
> 2^3 in this case (8 bytes).
>
>>> * It looks like it only uses the above form of alignment if the symbol is placed in the __thread_bss section, i.e. doesn't have an initializer. Does that make sense? If it has an initializer and is placed in the __thread_data section it will have an alignment of 3 or 4, depending on the size of the variable.
>>
>> With initializer:
>>
>> .section __DATA,__thread_data,thread_local_regular
>> .align 3
>> __D4main1ai$tlv$init:
>> .long 4
>
> Same 8-byte alignment (on OS X, .align is a synonym for .p2align).
>
> The tbss and tdata declarations match.

Just re-reading, and it looks like the alignments in your example are too big for a 4-byte type, assuming var is an int. .align only needs to be 2 here.

    $ cat tls.c
    __thread int x;
    __thread int y = 42;

    $ clang -S tls.c
    $ cat tls.s
            .section __TEXT,__text,regular,pure_instructions
            .macosx_version_min 10, 10
            .section __DATA,__thread_data,thread_local_regular
            .align 2                        ## @y
    _y$tlv$init:
            .long 42                        ## 0x2a

            .section __DATA,__thread_vars,thread_local_variables
            .globl _y
    _y:
            .quad __tlv_bootstrap
            .quad 0
            .quad _y$tlv$init

    .tbss _x$tlv$init, 4, 2                 ## @x

            .globl _x
    _x:
            .quad __tlv_bootstrap
            .quad 0
            .quad _x$tlv$init

    .subsections_via_symbols
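A runtime check of the same property might look like the following sketch. It reuses the two declarations from tls.c and tests whether their addresses in the current thread are aligned to 2^2 = 4 bytes. The helper names are hypothetical, `__thread` is the GCC/Clang extension used in the original example, and the result assumes a platform where int is 4 bytes and the compiler aligns it as such.

```c
#include <assert.h>
#include <stdint.h>

/* Same declarations as in tls.c. */
__thread int x;
__thread int y = 42;

/* Nonzero if `p` is aligned to 2^p2align bytes. */
int is_p2_aligned(const void *p, unsigned p2align)
{
    uintptr_t mask = ((uintptr_t)1 << p2align) - 1;
    return ((uintptr_t)p & mask) == 0;
}

/* Nonzero if both thread-local ints, as seen by the calling thread,
   sit on a 4-byte boundary (power-of-2 exponent 2). */
int tls_ints_are_4_byte_aligned(void)
{
    return is_p2_aligned(&x, 2) && is_p2_aligned(&y, 2);
}
```

The same helper with an exponent of 3 would check the 8-byte alignment that David suggested verifying for a pointer-holding TLS variable on x86_64.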
January 09, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Dan Olson

On Saturday, 9 January 2016 at 20:07:34 UTC, Dan Olson wrote:
> Just re-reading, and it looks like the alignments in your example are too big for a 4-byte type, assuming var is an int. .align only needs to be 2 here.

This is probably due to https://github.com/kinke/ldc/commit/a39997d326f0d3da353d8b9f27ffd559e6fcc5d7.
January 10, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Dan Olson

On 2016-01-09 20:48, Dan Olson wrote:

>> .tbss __D4main1ai$tlv$init, 4, 3
>>
>> BTW, do you know what the above 3 is?
>
> 3 is the alignment, like .p2align (power-of-2 alignment):
> 2^3 in this case (8 bytes).

I thought the four was the alignment. If the three is the alignment, then what is the four? The size of the variable?

> Same 8-byte alignment (on OS X, .align is a synonym for .p2align).
>
> The tbss and tdata declarations match.

Ah, ok. If the second number (3) above is the alignment then it makes sense.

-- 
/Jacob Carlborg
January 10, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to Dan Olson

On 2016-01-09 21:07, Dan Olson wrote:

> Just re-reading, and it looks like the alignments in your example are too big
> for a 4-byte type, assuming var is an int. .align only needs to be 2 here.

The output was from LDC. I noticed that Clang and LDC behave differently.

-- 
/Jacob Carlborg
January 11, 2016 Re: Implementing native TLS on OS X in DMD

Posted in reply to kinke

kinke <noone@nowhere.com> writes:

> On Saturday, 9 January 2016 at 20:07:34 UTC, Dan Olson wrote:
>> Just re-reading, and it looks like the alignments in your example are too big for a 4-byte type, assuming var is an int. .align only needs to be 2 here.
>
> This is probably due to https://github.com/kinke/ldc/commit/a39997d326f0d3da353d8b9f27ffd559e6fcc5d7.

I haven't carefully read the commit yet. Is the extra alignment intended for all variable declarations? It probably is not a big issue, but the following:

    ubyte a,b,c,d,e,f,g,h;

uses 64 bytes versus the 8 bytes from before.

-- 
Dan
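Dan's 64-versus-8 figure follows from simple padding arithmetic, sketched below in C. The model is an assumption made for illustration: each variable is taken to occupy its own slot rounded up to the alignment, which may not match the exact layout the compiler and linker produce. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round `n` up to the next multiple of 2^p2align. */
size_t round_up_p2(size_t n, unsigned p2align)
{
    size_t mask = ((size_t)1 << p2align) - 1;
    return (n + mask) & ~mask;
}

/* Hypothetical model: `count` variables of `size` bytes each, every
   one placed in its own slot padded to 2^p2align bytes. */
size_t padded_footprint(size_t count, size_t size, unsigned p2align)
{
    return count * round_up_p2(size, p2align);
}
```

Under this model, eight 1-byte variables with 8-byte alignment give `padded_footprint(8, 1, 3)`, which is 64, while natural 1-byte alignment gives `padded_footprint(8, 1, 0)`, which is 8.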
Copyright © 1999-2021 by the D Language Foundation