May 11, 2015
On Sun, 10 May 2015 19:51:26 +0000,
"Dicebot" <public@dicebot.lv> wrote:

> On Friday, 8 May 2015 at 05:26:01 UTC, Benjamin Thaut wrote:
> > Pro:
> > - It's the plain Windows shared library mechanism in all its
> > ugliness.
> 
> I wonder if anyone can provide more "Pro" input :)

Yep, this is an area where I have no expertise and what you provided made me wonder if it is a technical analysis or a sales pitch for unique symbols.

Why did Microsoft go with that approach, why did it work for them and why does it not map well to D?

-- 
Marco

May 11, 2015
> Why did Microsoft go with that approach,

Maybe they didn't know better back then. Historically, DLLs didn't support data symbols at all; only functions were supported. For functions it's not a problem if they are duplicated, because you rarely compare function pointers. Later they added support for data symbols, building on what they had. I assume the system in place now is a result of that.

> why did it work for them
Because C/C++ are not as template-heavy as D, and you basically try to avoid cross-DLL templates in C++ at all costs when developing for Windows. If you do use templates across DLL boundaries and are not super careful, you get a lot of issues due to duplicate symbols (e.g. static variables existing twice). MSVC gets around the casting issue by essentially doing string comparisons for dynamic casts, which comes with a significant performance impact. On the other hand, you don't use dynamic casts in C++ a lot (if you care about performance).

> and why does it not map well to D?
D uses tons of templates everywhere. Even type information for non-templated types is generated on demand and stored in comdats, which can lead to duplicate symbols the same way it does for templates. In D the dynamic cast is basically the default, and you have to force the compiler not to use a dynamic cast if you care about performance.


It's not as if the Linux approach doesn't have issues as well. I have heard of cases where people put large parts of Boost into a shared library and the Linux loader took multiple minutes to load the shared library into the program. This, however, is mostly due to the fact that on Linux all symbols of a shared library are visible by default. Later versions of GCC (4+) added an option to make all symbols hidden by default (-fvisibility=hidden), so you can make visible only those you need. This significantly speeds up loading of shared libraries because the number of symbols that need to be resolved is greatly decreased.

On the other hand, the Linux approach has an additional advantage I haven't mentioned yet. You can use the LD_PRELOAD feature to "inject" shared libraries into processes, e.g. to inject a better malloc library to speed up your favorite program. This is not easily possible with the Windows approach to shared libraries.
May 11, 2015
On Friday, 8 May 2015 at 05:26:01 UTC, Benjamin Thaut wrote:
> And Step 2) at program start up time. This means that symbols don't have identity. If different shared libraries provide the same symbol it may exist multiple times and multiple instances might be in use.

Can you elaborate a bit on that?
How would you run into such an ODR violation, by linking against multiple import libraries that contain the same symbol?

> Any opinions on this? As both options would be quite some work, I don't want to start blindly with one and risk it being rejected later in the PR.

Last time we thought about this we came to the conclusion that global uniqueness for symbols isn't possible, even on Unix, when you have 2 comdat/weak typeinfos for template classes in 2 different shared libraries but not in the executable. I suggested that we could wrap typeinfos for template types in something like TypeInfo_Comdat that would do an equality comparison based on name and type size.
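
To make the suggestion concrete, here is a rough sketch of what such a comparison could look like. Everything in it is an assumption for illustration: `TypeInfoComdat` is a stand-in class, not an existing druntime type, and the name and size values are made up. It only shows equality by name and size instead of by object identity.

```d
/// Hypothetical stand-in for the suggested TypeInfo_Comdat wrapper.
class TypeInfoComdat
{
    string name;  // fully qualified type name (made up below)
    size_t tsize; // size of the described type in bytes

    this(string name, size_t tsize)
    {
        this.name = name;
        this.tsize = tsize;
    }

    // Two instances are equal if they describe the same type,
    // even when they are distinct objects at different addresses.
    override bool opEquals(Object o)
    {
        if (this is o)
            return true;
        auto rhs = cast(TypeInfoComdat) o;
        return rhs !is null && tsize == rhs.tsize && name == rhs.name;
    }
}

void main()
{
    // Two copies of "the same" type info, as two shared libraries might each carry one.
    auto inLibA = new TypeInfoComdat("lib.Container!int", 16);
    auto inLibB = new TypeInfoComdat("lib.Container!int", 16);

    assert(inLibA !is inLibB); // distinct instances, identity comparison fails
    assert(inLibA == inLibB);  // equal by name and size
}
```

The cheap identity check comes first, so the string comparison is only paid when the two objects really are different copies.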
May 11, 2015
Thanks for the insight into how this affects MSVC++, too.

How much work do you think would have to be done at startup of an application like Firefox or QtCreator if it were written in D rather than C++?

Most of us have no idea what the algorithm would look like and what data sets to expect.

I guess you'd have to collect all the imported symbols from all exe/dll modules and put the list of addresses for each unique symbol into some multimap that maps symbol names to a list of addresses:

"abc" -> [a.dll @ 0x359428F0, b.dll @ 0x5E30A410]
"def" -> [b.dll @ 0x38C3D200]

Then the symbol name is no longer relevant, so it can be thought of as an array of address arrays:

[
  [0x359428F0, 0x5E30A410],
  [0x38C3D200]
]

where you pick one item from each of the arrays (e.g. the
first one) and map all others to that:

0x359428F0 -> 0x359428F0
0x5E30A410 -> 0x359428F0
0x38C3D200 -> 0x38C3D200

Then you go through all import address tables and perform the above remapping to make symbols unique.
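
The steps above could be sketched roughly like this in D. The `Import` struct, `buildRemap` and `patch` are hypothetical names for illustration only, and the import address table is simulated with a plain array; a real implementation would have to walk the import tables of every loaded module instead.

```d
/// Hypothetical record: a symbol name, the module providing it, and its address.
struct Import
{
    string symbol;
    string mod;
    size_t address;
}

/// For every symbol name the first address seen becomes canonical;
/// every other address of that symbol is mapped onto it.
size_t[size_t] buildRemap(Import[] imports)
{
    size_t[string] canonical; // symbol name -> chosen address
    size_t[size_t] remap;     // any address  -> chosen address
    foreach (imp; imports)
    {
        if (auto chosen = imp.symbol in canonical)
            remap[imp.address] = *chosen;
        else
        {
            canonical[imp.symbol] = imp.address;
            remap[imp.address] = imp.address;
        }
    }
    return remap;
}

/// Rewrites every slot of a (simulated) import address table in place.
void patch(size_t[] importTable, size_t[size_t] remap)
{
    foreach (ref slot; importTable)
        if (auto target = slot in remap)
            slot = *target;
}

void main()
{
    auto remap = buildRemap([
        Import("abc", "a.dll", 0x359428F0),
        Import("abc", "b.dll", 0x5E30A410),
        Import("def", "b.dll", 0x38C3D200),
    ]);

    size_t[] table = [0x5E30A410, 0x38C3D200];
    patch(table, remap);
    assert(table[0] == 0x359428F0); // duplicate redirected to the first copy
    assert(table[1] == 0x38C3D200); // unique symbol left unchanged
}
```

Both passes are linear in the number of imported symbols, apart from the hash lookups.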

Is that what would happen?

-- 
Marco

May 11, 2015
On Monday, 11 May 2015 at 14:57:46 UTC, Marco Leise wrote:
>
> Is that what would happen?

Yes, that's exactly what would happen. You could go one step further and not do it for all symbols; instead you make the compiler emit an additional section with references to all relevant data symbols. Then you only do the patching operation on the data symbols and leave all other symbols as is. This would greatly reduce the number of symbols that require patching.

The expected data set size should be significantly smaller than on Linux, because currently on Linux D simply exports all symbols, which means that the Linux loader does this patching for all symbols. On Windows only symbols with the "export" protection level get exported. That means the set of symbols this patching has to be done for is a lot smaller to begin with. The additional optimization would reduce the number of symbols to patch once again. So even if the custom implementation is vastly inferior to what the Linux loader does (which I don't think it will be), it should still be fast enough not to influence program startup time a lot.
May 11, 2015
On 11.05.2015 at 16:21, Martin Nowak wrote:
>
> Can you elaborate a bit on that?
> How would you run into such an ODR violation, by linking against
> multiple import libraries that contain the same symbol?

I will post some code examples later. Code usually shows the issue best.

>
> Last time we thought about this we came to the conclusion that global
> uniqueness for symbols isn't possible, even on Unix when you have 2
> comdat/weak typeinfos for template classes in 2 different shared
> libraries but not in the executable. I suggested that we could wrap
> typeinfos for template types in something like TypeInfo_Comdat that
> would do an equality comparison based on name and type size.

Do you have a code example for this issue? I haven't been able to produce a duplicate symbol with Linux shared libraries yet.

May 11, 2015
On Monday, 11 May 2015 at 12:54:09 UTC, Benjamin Thaut wrote:
>> and why does it not map well to D?
> D uses tons of templates everywhere. Even type information for non-templated types is generated on demand and stored in comdats, which can lead to duplicate symbols the same way it does for templates. In D the dynamic cast is basically the default, and you have to force the compiler not to use a dynamic cast if you care about performance.

Sorry for the rookie question, but my background is C rather than C++. How do I force a static cast, and, to a rough order of magnitude, how big is the cost of a dynamic cast?

Do you mean, for example, taking the address and casting the pointer rather than casting a char[] to a string?
May 11, 2015
On Monday, 11 May 2015 at 15:32:47 UTC, Benjamin Thaut wrote:
> On Monday, 11 May 2015 at 14:57:46 UTC, Marco Leise wrote:
>>
>> Is that what would happen?
>
> Yes, that's exactly what would happen. You could go one step further and not do it for all symbols; instead you make the compiler emit an additional section with references to all relevant data symbols. Then you only do the patching operation on the data symbols and leave all other symbols as is. This would greatly reduce the number of symbols that require patching.
>
> The expected data set size should be significantly smaller than on Linux, because currently on Linux D simply exports all symbols, which means that the Linux loader does this patching for all symbols. On Windows only symbols with the "export" protection level get exported. That means the set of symbols this patching has to be done for is a lot smaller to begin with. The additional optimization would reduce the number of symbols to patch once again. So even if the custom implementation is vastly inferior to what the Linux loader does (which I don't think it will be), it should still be fast enough not to influence program startup time a lot.


Just for information: Windows is not alone.

There are a few other systems that follow the same process.

For example, AIX used to be Windows-like and nowadays has a mix of ELF and Windows modes.

http://www.ibm.com/developerworks/aix/library/au-aix-symbol-visibility/

Symbian, although dead, also used the Windows approach if I remember correctly.

I expect other non-POSIX OSes not to follow the ELF way.

--
Paulo
May 11, 2015
On Sunday, 10 May 2015 at 19:27:03 UTC, Benjamin Thaut wrote:
> Does nobody have an opinion on this?

Sorry for being an extreme noob in the matter.

Probably only Manu has fought with Windows DLLs for real.
As a user I would say I want short startup times, as I change/execute the active application *very* often. However, I'm not sure whether I'm hitting an HDD seek-time penalty or system loader activity.

TBH I think Linux is more sluggish, which I don't like (but again, this may be a prefetch problem, I don't know).

And by maintenance overhead for the 1st option, do you mean explicit handling in the library source code? Isn't that the job of the compiler/linker?

Piotrek
May 11, 2015
On 11.05.2015 at 21:39, Laeeth Isharc wrote:
> On Monday, 11 May 2015 at 12:54:09 UTC, Benjamin Thaut wrote:
>>> and why does it not map well to D?
>> D uses tons of templates everywhere. Even type information for
>> non-templated types is generated on demand and stored in comdats, which
>> can lead to duplicate symbols the same way it does for templates. In D
>> the dynamic cast is basically the default, and you have to force the
>> compiler not to use a dynamic cast if you care about performance.
>
> Sorry for the rookie question, but my background is C rather than C++.
> How do I force a static cast, and, to a rough order of magnitude, how big
> is the cost of a dynamic cast?
>
> Do you mean, for example, taking the address and casting the pointer
> rather than casting a char[] to a string?

Dynamic casts only apply to classes. They don't apply to basic types.

Example:

Object o = instance;
SomeClass c = cast(SomeClass)o; // dynamic cast, checks type info, yields null on mismatch
SomeClass c2 = cast(SomeClass)cast(void*)o; // unsafe cast, simply assumes o is a SomeClass

If you do the cast in a tight loop it can have quite some performance impact, because it walks the type info chain. Walking the type info hierarchy may cause multiple cache misses. The unsafe cast literally does nothing besides copying the pointer.
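
For anyone who wants to try it, here is a small self-contained version of the two casts. The class names `Base`, `Derived` and `Other` are made up for this sketch; it only demonstrates the behaviour and says nothing about the actual cost.

```d
class Base {}
class Derived : Base { int value = 42; }
class Other : Base {}

void main()
{
    Base b = new Derived;
    Base o = new Other;

    // Dynamic cast: checked against the run-time type info, null on mismatch.
    assert(cast(Derived) b !is null);
    assert(cast(Derived) o is null);

    // Unchecked cast: goes through void*, so no type info is consulted
    // and the reference is simply reinterpreted.
    auto d = cast(Derived) cast(void*) b;
    assert(d is b);
    assert(d.value == 42);

    // The same unchecked cast on `o` would also compile, but using the
    // result as a Derived would be undefined behaviour.
}
```

The unchecked form is only safe when the dynamic type is already known by other means.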