On Monday, 26 February 2024 at 13:36:42 UTC, thumbgun wrote:
> I'm currently trying to call some C++ functions that were compiled by g++ (mingw). However g++ uses the Itanium ABI name mangling rules. dmd on Windows tries to link functions based on the MSVC name mangling rules.
> [...]
> Is there any way I can make dmd link to symbols mangled according to the Itanium ABI's rules on Windows?
Here's a simple way to do this with no change to source code in either the dynamic library or the DMD project that is supposed to dynamically link to it. Of course this doesn't resolve potential C++ calling convention issues (which don't exist for C), but it puts anyone in a position to investigate them when they arise.
I made a tiny proof of concept and it works. For concreteness, suppose the dynamic library is libx.dll, built with the (mingw64) gcc installed with the latest MSYS2, as that gave me all the utilities I needed and a bash command line.
Suppose also that the DMD project's executable, when built, will be main.exe, compiled from main.d and other.d and a D interface file header.di containing the necessary declarations for using libx.dll. I'll state the obvious below to make the explanation complete and to allow snag-free experimentation.
Suppose for a moment there's no mangling problem because libx.dll is compiled from C source, not C++. I'll describe that setup exactly and then show how to adapt it for C++ so that the mangling problem is solved.
To dynamically link to libx.dll from a DMD executable main.exe, DMD needs to link an implib (import library) during the build of main.exe. An implib can be made from a def file (module definition file), which is a text file, using a library manager that knows about the MSVC world that DMD inhabits. Let libx.def be a def file for libx.dll, and let libx.lib be an implib for libx.dll.
The def file would usually be created by gcc given -Wl,--output-def=libx.def when libx.dll is linked. An implib can then be created from it using dlltool, which is distributed with that mingw64 gcc.¹
$ dlltool -D libx.dll -d libx.def -l libx.lib -m i386:x86-64
Alternatively, the MS librarian lib can be used.² ³
$ lib -nologo -machine:x64 -def:libx.def -out:libx.lib
Now when main.exe is built, it just needs to link to that import library and we're in business.
$ dmd main.d other.d header.di libx.lib
$ ./main #works
Now suppose we move to C++. If we make an import library as above, then a build of main.exe will not link, because the gcc-mangled names in the implib libx.lib do not match the MSVC-mangled names supplied by DMD.
We can fix this by modifying the def file and producing an implib containing the MSVC-mangled names in place of the corresponding gcc-mangled names!
An implib contains each name to link to, paired with the location in the dynamic library of the function that name refers to. Concretely, libx.lib contains each gcc-mangled name paired with the location in libx.dll of the corresponding function. So the problem is solved if the gcc-mangled names are replaced by the corresponding MSVC-mangled names in the implib libx.lib.
There are many ways to do this! However, there's a mechanism in a def file to do just that.
Here's the def file libx.def for my toy libx.dll, generated by g++ -shared libx.o -o libx.dll -Wl,--output-def=libx.def.
EXPORTS
_Z11complicatedi @1
Here _Z11complicatedi is the gcc mangling of int complicated(int), and @1 is its export ordinal. Unfortunately, other.d expects this function to be mangled as ?complicated@@YAHH@Z, as this is the MSVC mangling of int __cdecl complicated(int)⁴ and comes from extern(C++) int complicated(int); in header.di.
Editing libx.def into
EXPORTS
?complicated@@YAHH@Z=_Z11complicatedi @1
substitutes the MSVC-mangled name on the left for the gcc-mangled name on the right when generating the implib libx.lib. Using the MS librarian as before and building main.exe removes the linking error and the result just works.
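The def file edit above can be sketched as a small script. The following Python is only an illustration and was not part of my proof of concept: the function name rewrite_def and its mapping argument are hypothetical, and it assumes every export line has the simple form name @ordinal, as in the toy def file.

```python
def rewrite_def(def_text, msvc_for_gcc):
    """Prefix each known gcc-mangled export with its MSVC-mangled alias.

    msvc_for_gcc maps a gcc-mangled name to the MSVC-mangled name that
    DMD expects. Assumption: export lines look like `name @ordinal`;
    every other line is passed through unchanged.
    """
    out = []
    in_exports = False
    for line in def_text.splitlines():
        stripped = line.strip()
        if stripped.upper() == "EXPORTS":
            in_exports = True
        elif in_exports and stripped and not stripped.startswith(";"):
            parts = stripped.split(None, 1)
            name = parts[0]
            rest = parts[1] if len(parts) > 1 else ""
            if name in msvc_for_gcc:
                # Keep the original indentation, then emit alias=name.
                indent = line[: len(line) - len(line.lstrip())]
                line = f"{indent}{msvc_for_gcc[name]}={name} {rest}".rstrip()
        out.append(line)
    return "\n".join(out) + "\n"
```

Called with the toy def file text and the mapping {"_Z11complicatedi": "?complicated@@YAHH@Z"}, it yields the edited def file shown above, which can then be handed to lib as before.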
However, while using dlltool or llvm-dlltool as before produces implibs that satisfy the linker, the resulting main.exe did nothing when run in my toy example, simply returning to the prompt with no output (as of 2024-02-29).
A libx.def, and hence a libx.lib, for any main.exe and libx.dll, with many substitution lines placed in the def file, could be mechanically generated once and for all. Or libx.def and libx.lib could be rebuilt on the fly as new symbols are used while the DMD project is being written.
Running the MS dumpbin tool produces text from which MSVC-mangled symbols can be extracted, along with their demanglings. So if the DMD project is compiled to a lib using the -lib option (so that it builds even though linkage would fail), then a table of (unmangled, MSVC-mangled) name pairs can be constructed automatically by running dumpbin on the resulting main.lib and picking apart the resulting text. Similarly, the utility nm can be used to produce a table of (unmangled, gcc-mangled) pairs from libx.dll, and the two tables combined with the text of libx.def to produce the modified libx.def with the necessary additional qualifiers, as in the example above.
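As a rough sketch of how the two tables might be combined, here is some Python that joins them on the unmangled name. I have not run this against real dumpbin or nm output. The function names are hypothetical, and the normalization is an assumption that only covers simple free functions: it presumes the MSVC demangling looks like int __cdecl complicated(int) and the gcc demangling like complicated(int). Namespaces, pointers, const qualifiers and templates are printed differently by the two demanglers and would need more work.

```python
def normalize_msvc(signature):
    """Reduce 'int __cdecl complicated(int)' to 'complicated(int)'.

    Assumption: a simple free function, so the word immediately before
    the first '(' is the function name, and everything before that
    (return type, calling convention) can be dropped.
    """
    head, paren, args = signature.partition("(")
    name = head.split()[-1]
    # MSVC prints an empty parameter list as (void); gcc prints ().
    return (name + paren + args).replace("(void)", "()")


def pair_mangled_names(msvc_pairs, gcc_pairs):
    """Map each gcc-mangled name to the matching MSVC-mangled name.

    msvc_pairs: iterable of (unmangled, MSVC-mangled) pairs.
    gcc_pairs:  iterable of (unmangled, gcc-mangled) pairs.
    Names with no counterpart in the other table are left out.
    """
    # Ignore spacing differences between the two demanglers.
    gcc_by_sig = {sig.replace(" ", ""): mangled for sig, mangled in gcc_pairs}
    mapping = {}
    for sig, msvc in msvc_pairs:
        key = normalize_msvc(sig).replace(" ", "")
        if key in gcc_by_sig:
            mapping[gcc_by_sig[key]] = msvc
    return mapping
```

For the toy example, the pair ("int __cdecl complicated(int)", "?complicated@@YAHH@Z") from the MSVC side and ("complicated(int)", "_Z11complicatedi") from the gcc side would give a mapping from _Z11complicatedi to ?complicated@@YAHH@Z, which is exactly the substitution needed in libx.def.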
A script could do this and then run lib to build the import library on the fly during a build. Or, if the library's bindings are all in a D header file already, say header.di, then that could be used to produce the pairs of unmangled and MSVC-mangled names once and for all, and the corresponding libx.def file then used to produce the implib libx.lib, which could be used with libx.dll indefinitely.
Lots of possibilities here!
There is a library distributed with mingw64 gcc to demangle MSVC-mangled names, though I did not use it. So in principle the substitutive def file could be made using just nm to dump the MSVC-mangled binaries, so that no MS tools are needed to make it.
Of course what we really need to know is the extent to which cross calling actually works for various C++ constructs. I'd be grateful if anyone who finds this out would post it here. I'm not a C++ fan, so I'm not the person to do this.
[1] This worked with my toy example, but there are claims online that dlltool is unreliable, in which case llvm-dlltool might be better. They both have the same command line, and I could see no difference between them in my toy examples.
[2] A bash script that puts the directory containing lib.exe at the front of your MSYS2 path before executing it is handy, as it avoids polluting that deliberately isolated path with MS-related executables. The same technique can be used for the other MS tools mentioned above. For example, ~/bin/lib, made executable, could contain the following, with VCBIN set appropriately in ~/.bash_profile as an MSYS2 path obtained from the Windows path to lib.exe's directory using the cygpath utility that comes with MSYS2. Note that the script says lib.exe, not lib, to avoid accidental recursion.
#!/bin/bash
PATH="$VCBIN:$PATH"
lib.exe "$@"
[3] Avoid DOS-style options such as /nologo in favor of Unix-style options such as -nologo, because MSYS2 tries to helpfully modify command lines and regards /nologo as an MSYS2 path, which it converts to a Windows path before executing the command.
[4] It seems this is because __cdecl is the default calling convention for (mingw64) gcc and DMD's extern(C++) assumes this, and MSVC mangling always includes the calling convention in the signature being mangled, whereas gcc mangling omits it when it is the default of __cdecl.