August 29, 2013
On 29.08.2013 21:12, Martin Nowak wrote:
> On 08/29/2013 12:03 PM, Benjamin Thaut wrote:
>> But what if you import a module that is linked statically? That would
>> mean export would be treated as dllimport and it will fail to link
>> because the _imp_ symbols are missing when linking statically?
>
> Could we create alias symbols?

Yes, but that would add an indirection to all data accesses even if you link the library statically. That would basically remove the benefit of linking statically. If we want control over what gets exported and what does not, we will need export anyway, and in my eyes an additional compiler switch for Windows is not bad enough to justify adding a level of indirection to data accesses into static libraries.

Kind Regards
Benjamin Thaut
August 29, 2013
On 08/29/2013 09:12 PM, Martin Nowak wrote:
> On 08/29/2013 12:03 PM, Benjamin Thaut wrote:
>> But what if you import a module that is linked statically? That would
>> mean export would be treated as dllimport and it will fail to link
>> because the _imp_ symbols are missing when linking statically?
>
> Could we create alias symbols?

Indeed this seems to work.
OMF has an ALIAS record (http://www.azillionmonkeys.com/qed/Omfg.pdf)
and COFF has weak externals (http://blog.omega-prime.co.uk/?p=121).

So we could add _imp_* aliases for every exported symbol.
When someone links against the import library, the aliases are never used; when someone links against a static library, they redirect to the actual definitions. Would that work?
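
To make the idea of an alias symbol concrete, here is a minimal sketch using the GCC/Clang `alias` attribute on an ELF target. This is only an analogue: it is not the OMF ALIAS record or the COFF weak external mentioned above, and the names `lib_answer` and `lib_answer_alias` are hypothetical.

```c
#include <assert.h>

/* The real definition, as it would exist in the static library. */
int lib_answer(void)
{
    return 42;
}

/* A second symbol name for the same code. This relies on the GCC/Clang
   alias attribute and an object format that supports aliases (e.g. ELF);
   it is an assumption of this sketch, not something the thread tested. */
int lib_answer_alias(void) __attribute__((alias("lib_answer")));

/* Both names resolve to the same function. */
void check_alias(void)
{
    assert(lib_answer_alias == lib_answer);
    assert(lib_answer_alias() == lib_answer());
}
```

Calling `check_alias()` passes when both symbols end up at the same address, which is the "redirect to the actual definition" behaviour described for the static-library case.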
August 29, 2013
On 29.08.2013 21:20, Martin Nowak wrote:
> On 08/29/2013 09:12 PM, Martin Nowak wrote:
>> On 08/29/2013 12:03 PM, Benjamin Thaut wrote:
>>> But what if you import a module that is linked statically? That would
>>> mean export would be treated as dllimport and it will fail to link
>>> because the _imp_ symbols are missing when linking statically?
>>
>> Could we create alias symbols?
>
> Indeed this seems to work.
> OMF has an ALIAS record (http://www.azillionmonkeys.com/qed/Omfg.pdf)
> and COFF has weak externals (http://blog.omega-prime.co.uk/?p=121).
>
> So we could add _imp_* aliases for every exported symbol.
> When someone links against the import library they are never used, when
> someone links against a static library they redirect to the actual
> definitions. Would that work?

Well, no, it's not that simple. You can't just alias it. You still have to create an import table. That would mean ALL accesses to data symbols would have another level of indirection. Always. The _imp_ symbols don't refer to the data directly; they refer to a slot that holds the address of the data. As explained in the article I linked (http://blog.omega-prime.co.uk/?p=115), this requires different assembly to be generated. To make this work we would always have to generate assembly with one level of indirection added for every data symbol access, because we cannot know whether the symbol might be imported from a DLL or not. As D tries to achieve near-C++ performance, I would rather avoid that.
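
The difference between the two access patterns can be sketched in portable C. This is only an illustration: `imp_var` is a hypothetical stand-in for the import-table slot, written by hand here, whereas a real toolchain would generate it.

```c
#include <assert.h>

/* Stand-in for an exported data symbol that lives in the library. */
int var = 42;

/* Hypothetical stand-in for the import-table slot (_imp_var in the thread):
   it holds the address of the data, not the data itself. */
int *imp_var = &var;

/* Static-link style access: a single load from the symbol's own address. */
int read_direct(void)
{
    return var;
}

/* dllimport style access: load the pointer from the slot first, then load
   the value it points to, which is one extra memory access. */
int read_through_slot(void)
{
    return *imp_var;
}

/* Both paths must observe the same object. */
void check_same_object(void)
{
    assert(imp_var == &var);
    assert(read_direct() == read_through_slot());
}
```

Without whole-program optimization the compiler has to pick one of the two forms when it compiles the importing module, which is why it needs to know up front whether the symbol comes from a DLL.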
August 29, 2013
On 08/29/2013 09:28 PM, Benjamin Thaut wrote:
> Well, no, it's not that simple. You can't just alias it. You still have to
> create an import table. That would mean ALL data accesses to data symbols
> would have another level of indirection. Always. The _imp_ symbols don't
> refer to the data directly, they refer to the location the data is
> stored at. As explained in the article I linked
> (http://blog.omega-prime.co.uk/?p=115) this requires different assembly
> to be generated. To make this work we would always have to generate
> assembly with one level of indirection added for every data symbol
> access, because we can not know if the symbol might be imported from a
> DLL or not. As D tries to achieve near C++ performance I would rather
> want to avoid that.

So the alias trick would only work for functions.
For data the _imp_* symbol would need to be a pointer to the data?
How about LTO? When linking statically, it should be possible to optimize away the indirection. Also, are there special relocations?

module libA;
export int var;
int* _imp_var = &var; // created by compiler

module foo;
import libA;

void bar()
{
    auto val = var; // creates val = *_imp_var;
}

August 29, 2013
On 29.08.2013 22:04, Martin Nowak wrote:
>
> So the alias trick would only work for functions.

yes

> For data the _imp_* symbol would need to be a pointer to the data?

yes

> How about LTO, when statically linking it should be possible to optimize
> away the indirection.

Rainer Schuetze stated that some linkers are capable of doing this optimization. But I don't know anything further about this topic.

>
> module libA;
> export int var;
> int* _imp_var = &var; // created by compiler
>
> module foo;
> import libA;
>
> void bar()
> {
>      auto val = var; // creates val = *_imp_var;
> }
>

Yes, that would work. Still, there must be a reason why Microsoft doesn't do stuff like that in their C++ toolchain. It's certainly going to cost performance.
August 29, 2013
On 08/29/2013 10:17 PM, Benjamin Thaut wrote:
>> How about LTO, when statically linking it should be possible to optimize
>> away the indirection.
>
> Rainer Schuetze stated that some linkers are capable of doing this
> optimization. But I don't know anything further about this topic.
>
>>
>> module libA;
>> export int var;
>> int* _imp_var = &var; // created by compiler
>>
>> module foo;
>> import libA;
>>
>> void bar()
>> {
>>      auto val = var; // creates val = *_imp_var;
>> }
>>
>
> Yes, that would work. Still, there must be a reason why Microsoft doesn't
> do stuff like that in their C++ toolchain. It's certainly going to cost
> performance.

I just tested it and it works.

lib.c

int var = 0xdeadbeaf;
int* _imp_var = &var;

main.c

#include <stdio.h>

extern int* _imp_var;
int main(void)
{
    printf("%d\n", *_imp_var);
    return 0;
}

cl /c /O2 /GL lib.c
cl /O2 /GL main.c lib.obj

Get objconv from http://www.agner.org/optimize/

objconv -fasm main.exe

Search for deadbeaf in main.asm to get the symbol number (?_1176).
It directly loads the variable.

mov edx, dword ptr [?_1176]
August 30, 2013
On Thursday, 29 August 2013 at 21:40:08 UTC, Martin Nowak wrote:
> On 08/29/2013 10:17 PM, Benjamin Thaut wrote:
>>> How about LTO, when statically linking it should be possible to optimize
>>> away the indirection.
>>
>> Rainer Schuetze stated that some linkers are capable of doing this
>> optimization. But I don't know anything further about this topic.
>>
>>>
>>> module libA;
>>> export int var;
>>> int* _imp_var = &var; // created by compiler
>>>
>>> module foo;
>>> import libA;
>>>
>>> void bar()
>>> {
>>>     auto val = var; // creates val = *_imp_var;
>>> }
>>>
>>
>> Yes, that would work. Still, there must be a reason why Microsoft doesn't
>> do stuff like that in their C++ toolchain. It's certainly going to cost
>> performance.
> I just tested it and it works.
>
> lib.c
>
> int var = 0xdeadbeaf;
> int* _imp_var = &var;
>
> main.c
>
> #include <stdio.h>
>
> extern int* _imp_var;
> void main()
> {
>     printf("%d\n", *_imp_var);
> }
>
> cl /c /O2 /GL lib.c
> cl /O2 /GL main.c lib.obj
>
> get objconv from http://www.agner.org/optimize/
>
> objconv -fasm main.exe
>
> Search for deadbeaf in main.asm to get the symbol number (?_1176).
> It directly loads the variable.
>
> mov edx, dword ptr [?_1176]

I was doubting my idea, but you convinced me :P
August 30, 2013

On 29.08.2013 23:40, Martin Nowak wrote:
> On 08/29/2013 10:17 PM, Benjamin Thaut wrote:
>>> How about LTO, when statically linking it should be possible to optimize
>>> away the indirection.
>>
>> Rainer Schuetze stated that some linkers are capable of doing this
>> optimization. But I don't know anything further about this topic.
>>
>>>
>>> module libA;
>>> export int var;
>>> int* _imp_var = &var; // created by compiler
>>>
>>> module foo;
>>> import libA;
>>>
>>> void bar()
>>> {
>>>      auto val = var; // creates val = *_imp_var;
>>> }
>>>
>>
>> Yes, that would work. Still, there must be a reason why Microsoft doesn't
>> do stuff like that in their C++ toolchain. It's certainly going to cost
>> performance.
> I just tested it and it works.
>
> lib.c
>
> int var = 0xdeadbeaf;
> int* _imp_var = &var;
>
> main.c
>
> #include <stdio.h>
>
> extern int* _imp_var;
> void main()
> {
>      printf("%d\n", *_imp_var);
> }
>
> cl /c /O2 /GL lib.c
> cl /O2 /GL main.c lib.obj
>
> get objconv from http://www.agner.org/optimize/
>
> objconv -fasm main.exe
>
> Search for deadbeaf in main.asm to get the symbol number (?_1176).
> It directly loads the variable.
>
> mov edx, dword ptr [?_1176]

It's a bit easier to see the code by adding debug symbols and using dumpbin /disasm main.exe.

Unfortunately, this won't help us a lot, because the intermediate object files have some unknown format and in fact just transfer some code representation to the linker, which then invokes the compiler and optimizer on the full source code. This is very specific to the C/C++ toolchain and I don't think we can take advantage of it.
August 30, 2013

On 29.08.2013 21:09, Martin Nowak wrote:
> On 08/29/2013 08:30 PM, Rainer Schuetze wrote:
>> I meant the import part. How are accesses or data references to data
>> inside another shared library implemented?
>
> References from the data segment use absolute relocations.
> References from PIC code use the GOT.

So an indirection through the GOT is added to every access to a non-static global if you compile with -fPIC?
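
One way to check this on a given toolchain is to compile a small file like the sketch below and read the generated assembly. The variable and function names are hypothetical, and the visibility attribute is a GCC/Clang extension; the behaviour described in the comments is what those compilers typically do on x86-64 ELF, not a guarantee.

```c
#include <assert.h>

/* Default visibility: in a shared library built with -fPIC this symbol may
   be interposed, so its address is typically loaded from the GOT. */
int exported_counter = 0;

/* Hidden visibility (GCC/Clang attribute): the symbol cannot be referenced
   from outside the library, so it is typically addressed PC-relative. */
__attribute__((visibility("hidden"))) int hidden_counter = 0;

/* static: local to this translation unit, no GOT entry is needed. */
static int local_counter = 0;

int bump_exported(void) { return ++exported_counter; }
int bump_hidden(void)   { return ++hidden_counter; }
int bump_local(void)    { return ++local_counter; }

/* Sanity check that all three functions increment their own counter. */
void check_counters(void)
{
    int before = exported_counter;
    assert(bump_exported() == before + 1);
    before = hidden_counter;
    assert(bump_hidden() == before + 1);
    assert(bump_local() >= 1);
}
```

Compiling with something like `gcc -O2 -fPIC -S` and comparing the three `bump_*` functions shows which accesses go through the GOT on that particular compiler and target.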
August 30, 2013
So let me summarize the currently discussed solutions:

1) Use alias symbols
Pros:
 - Better usability (no additional command line parameters needed when compiling / linking against DLLs)

Cons:
 - Less efficient code for cross-DLL function calls (one additional call instruction)
 - Less efficient code for all accesses to global data, no matter if it will end up in a DLL or not. That means __gshared variables, global shared variables, hidden global data like module info, type info, vtables, etc (one additional level of indirection). It might be possible to avoid this in most cases using link time optimization. But it is unlikely that we can get LTO to work easily.
 - Additional level of indirection might confuse current debuggers

2) Use an additional command line switch
Pros:
 - More efficient code

Cons:
 - Additional command line parameters needed when compiling / linking against a DLL. (can be hidden away inside sc.ini for phobos / druntime)
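
For comparison, option 2 resembles the macro pattern commonly used in C and C++ headers, where a define passed on the command line decides whether a symbol is treated as dllimport. The sketch below is only an illustration of that pattern: `LIBA_STATIC`, `LIBA_BUILDING_DLL`, `LIBA_API` and `liba_var` are hypothetical names, and `LIBA_STATIC` is defined in the file itself only so the example builds on its own.

```c
#include <assert.h>

/* Normally passed on the command line (e.g. -DLIBA_STATIC or /DLIBA_STATIC);
   defined here only to keep the sketch self-contained. */
#define LIBA_STATIC 1

#if defined(_WIN32) && !defined(LIBA_STATIC)
#  ifdef LIBA_BUILDING_DLL
#    define LIBA_API __declspec(dllexport)   /* building the DLL itself */
#  else
#    define LIBA_API __declspec(dllimport)   /* using the DLL: indirect access */
#  endif
#  define LIBA_USES_IMPORT_TABLE 1
#else
#  define LIBA_API                           /* static link: direct access */
#  define LIBA_USES_IMPORT_TABLE 0
#endif

/* What the library header would declare. */
LIBA_API extern int liba_var;

/* What the library source would define. */
int liba_var = 42;

/* What client code would do; the generated access depends on LIBA_API. */
int read_liba_var(void)
{
    return liba_var;
}

/* With LIBA_STATIC set, no import table is involved. */
void check_static_build(void)
{
    assert(LIBA_USES_IMPORT_TABLE == 0);
    assert(read_liba_var() == 42);
}
```

The cost is the one listed above: every client has to pass the right define for the way it links, which is the usability price of keeping direct data access in the static case.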