[dmd-internals] Building Phobos and druntime as a dylib on Mac OS X
November 17, 2011
From time to time I've been trying to build Phobos and druntime as a dynamic library on Mac OS X. Every time I try I hit some problem or there's a question I can't answer. Now I'm planning to give it another try and I have collected a couple of problems/questions below I'm hoping to get help with.

1. One of my goals was to remove all global variables referring to symbols in the object files. I've removed all occurrences of "_minfo_beg", "_minfo_end" and similar to instead use the proper API to get data from a section of the object file. This has been working fine except in one case and that is with "_tls_beg" and "_tls_end".

The thing is that accessing TLS variables seems to work fine when I do a simple test, but when I run the unit tests for druntime they fail. The reason for the failure is this line:

https://github.com/jacob-carlborg/druntime/blob/master/src/core/thread.d#L3972

What I don't get is this: I thought ___tls_get_addr was called every time a TLS variable was accessed. That would mean that my simple test should fail as well, but it doesn't.

2. What should happen to the _tls_beg and _tls_end sections when a library is dynamically loaded, i.e. using dlopen? Since _moduleinfo_array, which is wrapped by _minfo_beg and _minfo_end, is an array I'm just appending to the array when a library is dynamically loaded.

3. I think that _moduleinfo_array needs to be an associative array or similar, using a mach_header, corresponding to a dynamic library, as the key and the current content of _moduleinfo_array as the value. This would make it easy to add and remove module infos when dynamic libraries are loaded and unloaded.

But if an associative array should be used, we probably can't use the built-in one, because the runtime may not be initialized far enough to use it yet. In that case we need some kind of lightweight associative array that can work when the runtime isn't fully initialized. I think that creating an associative array for that purpose is beyond my current capabilities, but I know Sean has mentioned that he has been thinking about creating an associative array for this purpose.

It would be nice to have a low-level array type that can be used in the runtime as well. It's easier to create, but it would still be nice to have a properly working array type in the runtime.
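
A minimal sketch of the kind of lightweight table discussed above, written in C so that it needs nothing from the D runtime. It assumes a fixed upper bound on the number of loaded images and uses only static storage, so it could run before the GC is available. All names here (image_entry, image_register, image_unregister, image_find, total_modules, MAX_IMAGES) are hypothetical, module_info is only a stand-in for druntime's real ModuleInfo, and the key is an opaque pointer standing in for a mach_header address. It is not thread-safe and is not how druntime actually does it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a module info record. */
typedef struct { const char *name; } module_info;

/* One entry per loaded image. The key is the image header address
   (a mach_header pointer on Mac OS X); the value is the range of
   module infos found in that image. */
typedef struct {
    const void *header;
    module_info *const *modules;
    size_t count;
} image_entry;

enum { MAX_IMAGES = 64 };              /* assumed upper bound */
static image_entry images[MAX_IMAGES]; /* static storage, no allocation */
static size_t image_count;

/* Linear search is enough for a handful of images. */
const image_entry *image_find(const void *header)
{
    for (size_t i = 0; i < image_count; i++)
        if (images[i].header == header)
            return &images[i];
    return NULL;
}

/* Called when an image is loaded. Returns 0 on success, -1 if the
   image is already registered or the table is full. */
int image_register(const void *header, module_info *const *modules, size_t count)
{
    if (image_find(header) != NULL || image_count == MAX_IMAGES)
        return -1;
    images[image_count++] = (image_entry){ header, modules, count };
    return 0;
}

/* Called when an image is unloaded. Returns 0 on success, -1 if the
   image was never registered. */
int image_unregister(const void *header)
{
    for (size_t i = 0; i < image_count; i++) {
        if (images[i].header == header) {
            images[i] = images[--image_count]; /* move last entry into the hole */
            return 0;
        }
    }
    return -1;
}

/* Total number of module infos across all registered images. */
size_t total_modules(void)
{
    size_t total = 0;
    for (size_t i = 0; i < image_count; i++)
        total += images[i].count;
    return total;
}
```

Compared with appending everything to one flat array, keeping a separate range per image makes unloading a matter of dropping one entry instead of searching for and removing individual module infos.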

-- 
/Jacob Carlborg

November 17, 2011

On 11/17/2011 9:28 AM, Jacob Carlborg wrote:
> From time to time I've been trying to build Phobos and druntime as a dynamic library on Mac OS X. Every time I try I hit some problem or there's a question I can't answer. Now I'm planning to give it another try and I have collected a couple of problems/questions below I'm hoping to get help with.
>
> 1. One of my goals was to remove all global variables referring to symbols in the object files. I've removed all occurrences of "_minfo_beg", "_minfo_end" and similar to instead use the proper API to get data from a section of the object file. This has been working fine except in one case and that is with "_tls_beg" and "_tls_end".
>
> The thing is that accessing TLS variables seems to work fine when I do a simple test, but when I run the unit tests for druntime they fail. The reason for the failure is this line:
>
> https://github.com/jacob-carlborg/druntime/blob/master/src/core/thread.d#L3972
>
> What I don't get is this: I thought ___tls_get_addr was called every time a TLS variable was accessed.

It is. You can verify by using obj2asm to look at the asm output of dmd.

>   Which would mean that my simple test should fail as well, but it doesn't.
>
> 2. What should happen to the _tls_beg and _tls_end sections when a library is dynamically loaded, i.e. using dlopen?

I don't really know.

>   Since _moduleinfo_array, which is wrapped by _minfo_beg and _minfo_end, is an array I'm just appending to the array when a library is dynamically loaded.
>
> 3. I think that _moduleinfo_array needs to be an associative array or similar, using a mach_header, corresponding to a dynamic library, as the key and the current content of _moduleinfo_array as the value. This would make it easy to add and remove module infos when dynamic libraries are loaded and unloaded.
>
> But if an associative array should be used, we probably can't use the built-in one, because the runtime may not be initialized far enough to use it yet. In that case we need some kind of lightweight associative array that can work when the runtime isn't fully initialized. I think that creating an associative array for that purpose is beyond my current capabilities, but I know Sean has mentioned that he has been thinking about creating an associative array for this purpose.
>
> It would be nice to have a low-level array type that can be used in the runtime as well. It's easier to create, but it would still be nice to have a properly working array type in the runtime.
>
November 20, 2011
On 17 nov 2011, at 20:58, Walter Bright wrote:
> 
> It is. You can verify by using obj2asm to look at the asm output of dmd.

That's a good idea.

>>  Which would mean that my simple test should fail as well, but it doesn't.
>> 
>> 2. What should happen to the _tls_beg and _tls_end sections when a library is dynamically loaded, i.e. using dlopen?
> 
> I don't really know.

No offense, but I thought if anyone would know this it would be you. I don't know if anyone else, like Sean, would know this but I really think we need to figure this out if we want to support dynamic libraries.

-- 
/Jacob Carlborg

November 20, 2011

On 11/20/2011 5:58 AM, Jacob Carlborg wrote:
> Which would mean that my simple test should fail as well, but it doesn't.
>>> 2. What should happen to the _tls_beg and _tls_end sections when a library is dynamically loaded, i.e. using dlopen?
>> I don't really know.
> No offense, but I thought if anyone would know this it would be you. I don't know if anyone else, like Sean, would know this but I really think we need to figure this out if we want to support dynamic libraries.
>

One reason dmd doesn't support dynlib yet is because I haven't done much research into how dynlib actually works.
November 21, 2011
On 20 nov 2011, at 20:30, Walter Bright wrote:

> 
> 
> On 11/20/2011 5:58 AM, Jacob Carlborg wrote:
>> Which would mean that my simple test should fail as well, but it doesn't.
>>>> 2. What should happen to the _tls_beg and _tls_end sections when a library is dynamically loaded, i.e. using dlopen?
>>> I don't really know.
>> No offense, but I thought if anyone would know this it would be you. I don't know if anyone else, like Sean, would know this but I really think we need to figure this out if we want to support dynamic libraries.
>> 
> 
> One reason dmd doesn't support dynlib yet is because I haven't done much research into how dynlib actually works.


Ok, I see. I thought that you might know since you have developed a C++ compiler as well. I assume dynamic libraries can be used with DMC. Note that when I say "dylib" I mean the general term "dynamic library" and not the Mac OS X specific implementation.

-- 
/Jacob Carlborg

November 21, 2011

On 11/21/2011 9:17 AM, Jacob Carlborg wrote:
>
>> One reason dmd doesn't support dynlib yet is because I haven't done much research into how dynlib actually works.
>
> Ok, I see. I thought that you might know since you have developed a C++ compiler as well. I assume dynamic libraries can be used with DMC. Note that when I say "dylib" I mean the general term "dynamic library" and not the Mac OS X specific implementation.
>

DLLs on Windows work very differently from dynlibs on other systems. You have to approach each as its own animal.
November 30, 2011
On 21 nov 2011, at 20:12, Walter Bright wrote:
> 
> On 11/21/2011 9:17 AM, Jacob Carlborg wrote:
>> 
>>> One reason dmd doesn't support dynlib yet is because I haven't done much research into how dynlib actually works.
>> 
>> Ok, I see. I thought that you might know since you have developed a C++ compiler as well. I assume dynamic libraries can be used with DMC. Note that when I say "dylib" I mean the general term "dynamic library" and not the Mac OS X specific implementation.
>> 
> 
> DLLs on Windows work very differently from dynlibs on other systems. You have to approach each as its own animal.


I've done some research about how TLS is implemented in the ELF format. I don't understand everything, but I think I now have at least a somewhat better understanding of TLS.

I've started to think about whether it's possible to implement TLS on Mac OS X in the same way as it's implemented on Linux, but just with the help of the compiler and druntime. From what I've read and understood, what basically happens, and what's different compared to regular variables, is:

* Some form of relocation happens
* The TLS sections are initialized
* The regular sections are not used
* The regular symbol table is not used

I'm not sure if it's possible to do the relocation, but the initialization shouldn't be any problem (I think). I'm also not sure about the second symbol table and whether that can be made to work. This is how I'm thinking:

There are a couple of things that need to be done at program start to have TLS working. It shouldn't matter if that's done by the dynamic linker or the application itself (druntime). That's assuming an application can do everything that needs to be done, i.e. relocation.

Of course this is just how I'm thinking and I can be completely wrong. I also have no idea how close your implementation of TLS on Mac OS X is to the implementation on Linux.

Now about getting TLS to work with dynamic libraries.

What's happening now in the ___tls_get_addr function is that there is only one TLS section/segment, bracketed by the __tls_beg and __tls_end segments. The problem is that there is only one pair of these begin and end segments. According to the TLS reference I read, a thread-local variable is identified by a reference to the object and the offset of the variable in the thread-local storage section. So the problem now is how to get the object in which this variable is defined.

I don't fully understand how this object is accessed. I think it's either passed to the __tls_get_addr function or accessed inside the function using assembly instructions. What's passed to __tls_get_addr is an argument of the type "tls_index". The type is defined as follows for the IA-32 ABI:

typedef struct
{
    unsigned long int ti_module;
    unsigned long int ti_offset;
} tls_index;

Using the GNU variant of the ABI, the parameter is passed to the function in the %eax register. The reference says that to load the thread pointer in the %eax register the following code would be used:

movl %gs:0, %eax

I don't know if the object is the thread pointer or if it's the ti_module field in the tls_index struct. The name would suggest it's the field in the struct.
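
A minimal sketch of how a lookup keyed on tls_index could work when there is more than one module, assuming (as the TLS reference appears to describe) that ti_module is a module ID used to index a per-thread vector of TLS blocks, and that the thread pointer is only the way to find that vector. Here pthread_getspecific stands in for the thread pointer. The names tls_module, tls_register_module, thread_vector, sketch_tls_get_addr and MAX_MODULES are hypothetical; this is not dmd's or druntime's actual ___tls_get_addr, and the module registry is not thread-safe.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned long int ti_module;  /* module ID, starting at 1 */
    unsigned long int ti_offset;  /* offset of the variable in that module's TLS block */
} tls_index;

/* Initialization image of one module's TLS section (hypothetical registry). */
typedef struct { const void *image; size_t size; } tls_module;

enum { MAX_MODULES = 16 };                  /* assumed upper bound */
static tls_module modules[MAX_MODULES + 1]; /* slot 0 is unused */
static unsigned long module_count;

/* Would be called when an executable or library is loaded.
   Returns the new module ID, or 0 if the table is full. */
unsigned long tls_register_module(const void *image, size_t size)
{
    if (module_count == MAX_MODULES)
        return 0;
    modules[++module_count] = (tls_module){ image, size };
    return module_count;
}

/* Per-thread vector: one lazily allocated block per module. */
typedef struct { void *block[MAX_MODULES + 1]; } thread_vector;

static pthread_key_t vector_key;
static pthread_once_t vector_once = PTHREAD_ONCE_INIT;

/* Runs at thread exit and releases that thread's blocks. */
static void vector_free(void *p)
{
    thread_vector *v = p;
    for (int i = 1; i <= MAX_MODULES; i++)
        free(v->block[i]);
    free(v);
}

static void vector_make_key(void)
{
    pthread_key_create(&vector_key, vector_free);
}

/* Returns the address of the variable for the calling thread,
   or NULL if the index does not name a registered module. */
void *sketch_tls_get_addr(const tls_index *ti)
{
    pthread_once(&vector_once, vector_make_key);

    thread_vector *v = pthread_getspecific(vector_key);
    if (v == NULL) {
        v = calloc(1, sizeof *v);
        if (v == NULL)
            abort();
        pthread_setspecific(vector_key, v);
    }

    if (ti->ti_module == 0 || ti->ti_module > module_count)
        return NULL;
    const tls_module *m = &modules[ti->ti_module];
    if (ti->ti_offset >= m->size)
        return NULL;

    void **slot = &v->block[ti->ti_module];
    if (*slot == NULL) {
        /* First access from this thread: copy the initialization image. */
        *slot = malloc(m->size);
        if (*slot == NULL)
            abort();
        memcpy(*slot, m->image, m->size);
    }
    return (char *)*slot + ti->ti_offset;
}
```

With a layout like this, a library loaded through dlopen would only need to register its own begin/end range and receive a new module ID; existing threads would pick up their copy of the block on first access.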

To call the __tls_get_addr function the following assembly instructions are used for the general dynamic model for the IA-32 ABI, the GNU version:

0x00 leal x@tlsgd(,%ebx,1),%eax
0x07 call ___tls_get_addr@plt

In the above code, "x" in the "leal" instruction, is the variable to be accessed. Since you have already implemented TLS for Linux, I assume according to this reference, you already know how to call this function (depending on what TLS model is used).

The general dynamic TLS model can be used everywhere and can access variables defined anywhere else. There are other models available but these are limited in different ways compared to the general model.

I've also read your article at Dr. Dobb's about implementing TLS on Mac OS X. You write:

"my benchmarks show it to be 10 times slower than a simple access to a shared global."

I don't understand why it has to be like this. If TLS is implemented in the same way as on Linux, but in druntime instead of the dynamic linker (as suggested in the beginning), I don't see why it would be any slower than on Linux. I mean, the same tasks need to be performed regardless of whether it's done by the dynamic linker or druntime.

Also note that TLS is really fast on Mac OS X: pthread_getspecific is implemented using three assembly instructions and pthread_self using two instructions. There are inline versions of these functions available to remove the function call overhead.

BTW, according to this:

http://stackoverflow.com/questions/2436772/thread-local-storage-macosx

GCC 4.5+ on Mac OS X supports the __thread keyword but it's emulated.

TLS reference: http://www.akkadia.org/drepper/tls.pdf

-- 
/Jacob Carlborg

November 30, 2011
I'm waiting for gcc on the Mac to support TLS, which it must do soon. When it does, I'll change dmd to match it.

November 30, 2011
Don't expect the Apple-supplied version of GCC to ever get this. Apple stopped merging the GNU trunk of GCC into their own tree after it was relicensed under GPLv3. They're still maintaining their own 4.2.x branch, but most of their efforts are going to LLVM and Clang.

On 2011-11-30 at 16:46, Walter Bright wrote:

> I'm waiting for gcc on the Mac to support TLS, which it must do soon. When it does, I'll change dmd to match it.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



December 01, 2011
On 30 nov 2011, at 22:46, Walter Bright wrote:

> I'm waiting for gcc on the Mac to support TLS, which it must do soon. When it does, I'll change dmd to match it.


That was kind of a disappointing answer.

-- 
/Jacob Carlborg
