Thread overview
Dynamically binding to D code using extern(D)
Sep 30, 2021
Hipreme
Sep 30, 2021
jfondren
Sep 30, 2021
Hipreme
Oct 01, 2021
Mike Parker
September 30, 2021

I write this post as both a learning tool, a question and an inquiry.

There are just a lot of drawbacks in trying to do function exporting while using D.

That interface is absurdly confusing, and that is probably why I've never seen a project here that uses extern(D) with a DLL.

While generating my DLLs, I found a lot of pitfalls that I could fall into.

Simple Function


module something;
extern(D) export int sum(int a, int b){return a + b;}

The correct way to bind to that function would be:


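module app;
import core.demangle;
import core.sys.windows.windows; // HMODULE, LoadLibraryA, GetProcAddress
import std.string : toStringz;

int function(int a, int b) sum;

void main()
{
    // The DLL file name is an assumption; use the path of your own library.
    HMODULE someDll = LoadLibraryA("something.dll");
    // mangleFunc returns a char[]; GetProcAddress needs a zero-terminated C string.
    sum = cast(typeof(sum))GetProcAddress(someDll, mangleFunc!(typeof(sum))("something.sum").toStringz);
}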

And that should be it for loading a simple function.
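To see what is actually being looked up, here is a minimal sketch that computes the symbol name at compile time. The alias name SumFn and the enum name sumSymbol are only illustrative; the signature is the one from module something above.

```d
import core.demangle : mangleFunc;
import std.stdio : writeln;

// Function pointer type matching the exported `something.sum`.
alias SumFn = int function(int, int);

// mangleFunc can run at compile time, so the name can be stored in an enum.
enum string sumSymbol = mangleFunc!SumFn("something.sum").idup;

// _D, then each length-prefixed name part, then F (D linkage),
// the parameter types (i = int), Z, and the return type.
static assert(sumSymbol == "_D9something3sumFiiZi");

void main()
{
    writeln(sumSymbol);
}
```

Because the name is a compile-time constant, it can be passed to GetProcAddress without calling the mangler at runtime.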

Now, let's make our case a bit more complicated:

Overloaded function


module something;

extern(D) export int add(int a, int b)
{
    return a + b;
}

extern(D) export float add(float a, float b)
{
   return a+b;
}

For loading those functions, the correct way would be


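module app;
import core.demangle;
import core.sys.windows.windows; // HMODULE, LoadLibraryA, GetProcAddress
import std.string : toStringz;

// One function pointer per overload of something.add
int function(int a, int b) sumInt;
float function(float a, float b) sumFloat;

// Wrappers that rebuild the overload set on the application side
int sum(int a, int b){return sumInt(a, b);}
float sum(float a, float b){return sumFloat(a,b);}

void main()
{
    // The DLL file name is an assumption; use the path of your own library.
    HMODULE dll = LoadLibraryA("something.dll");
    // The exported functions are named add, so that is the name to mangle.
    sumInt = cast(typeof(sumInt))GetProcAddress(dll, mangleFunc!(typeof(sumInt))("something.add").toStringz);
    sumFloat = cast(typeof(sumFloat))GetProcAddress(dll, mangleFunc!(typeof(sumFloat))("something.add").toStringz);
}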

Notice how quickly the overall complexity increases: there seems to be no way to get the overloads as a set, so each one needs its own pointer and its own lookup, and there doesn't seem to be any advantage in using extern(D).
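When the declarations are visible at compile time, the overloads can at least be enumerated with __traits(getOverloads) and each one's .mangleof. The following is only a sketch: the struct Lib and the helper overloadSymbols are hypothetical names, and the overloads are grouped in a struct purely to keep the example in one file.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the library's overload set.
struct Lib
{
    static int add(int a, int b) { return a + b; }
    static float add(float a, float b) { return a + b; }
}

// Returns the mangled symbol name of every overload of `name` in `Aggregate`.
string[] overloadSymbols(Aggregate, string name)()
{
    string[] symbols;
    static foreach (fn; __traits(getOverloads, Aggregate, name))
        symbols ~= fn.mangleof;
    return symbols;
}

void main()
{
    // Prints one distinct symbol name per overload.
    foreach (symbol; overloadSymbols!(Lib, "add")())
        writeln(symbol);
}
```

Each overload gets its own symbol because the parameter and return types are part of the mangled name, so a loader could loop over these names instead of spelling out every lookup by hand.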

Static Methods

The only difference from free functions is that the class name has to be included in the qualified name, as if it were part of the module path.
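A short sketch of that rule, using a hypothetical module counter with a class Counter and a static method next that returns int:

```d
import core.demangle : mangleFunc;

// The class name is just one more length-prefixed part of the qualified name.
enum string nextSymbol = mangleFunc!(int function())("counter.Counter.next").idup;

static assert(nextSymbol == "_D7counter7Counter4nextFZi");

void main()
{
    import std.stdio : writeln;
    writeln(nextSymbol);
}
```

This only covers static methods. A non-static method also encodes its hidden this parameter in the type, so a plain function pointer type would not produce the right name for it.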

Static Methods returning user data

This is the main reason I'm writing this post. It made me wonder whether I should use extern(D) at all.

This section uses 3 files because there really is a (consistency?) problem:

module supertest;
import ultratest;

class SuperTest
{
   extern(D) export static SuperTest getter(){return new SuperTest();}
   extern(D) export static UltraTest ultraGetter(){return new UltraTest();}

   import core.demangle;

   pragma(msg, mangleFunc!(typeof(&SuperTest.getter))("supertest.SuperTest.getter"));
   //Prints _D9supertest9SuperTest6getterFZCQBeQx
   pragma(msg, mangleFunc!(typeof(&SuperTest.ultraGetter))("supertest.SuperTest.ultraGetter"));
   //Prints _D9supertest9SuperTest11ultraGetterFZC9ultratest9UltraTest

}

module ultratest;
class UltraTest{}

module app;
import core.demangle;

void main()
{
   //???

}

As you can see in module supertest, the pattern seems to break when returning user data from another module. As far as I know, there is no obvious way to get this function, especially because you need to know both the module you're importing the function from and the module where the user data is defined.

It seems pretty insane to work with that.
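One way to inspect the two names is to run them back through core.demangle.demangle. As far as I can tell, QBe and Qx in the first name are back references: D's mangling replaces a repeated identifier with Q plus an offset, here pointing back at 9supertest and 9SuperTest, while the second name spells out 9ultratest9UltraTest because it has not appeared before. The helper name readable below is only illustrative.

```d
import core.demangle : demangle;
import std.stdio : writeln;

// Turns a mangled D symbol back into a readable signature.
string readable(string symbol)
{
    return demangle(symbol).idup;
}

void main()
{
    // The two symbol names printed by pragma(msg) in module supertest.
    writeln(readable("_D9supertest9SuperTest6getterFZCQBeQx"));
    writeln(readable("_D9supertest9SuperTest11ultraGetterFZC9ultratest9UltraTest"));
}
```

If that reading is right, both names follow the same rule: the return type is encoded with its fully qualified name. In module app, with the declarations of both modules visible, the symbol could then be obtained the same way as inside supertest, with mangleFunc on typeof(&SuperTest.ultraGetter) or with SuperTest.ultraGetter.mangleof.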

extern(D) advantages:

extern(D) disadvantages:

  • Code only callable from D (probably no other language has a mangler for D names)
  • I don't remember seeing any other code doing that before, so, no documentation at all
  • You need to call the mangler (core.demangle) to bind to a symbol; in my project, each call with a unique type could cost 15KB
  • You need to know the module your function comes from
  • If your function returns user data from another module, there doesn't seem to be any workaround
  • It doesn't provide any support for binding overloads, even though the language supports overloading

extern(C) advantages:

  • Code callable from any language as it is absolutely intuitive
  • Well documented

extern(C) disadvantages:

  • You will need to declare your function pointer as extern(C) or it will swap the arguments order.
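A small sketch of that linkage requirement, using a local extern(C) function as a stand-in for one loaded from a library (the alias name CSum is illustrative):

```d
import std.stdio : writeln;

// Stand-in for a function exported by a library with C linkage.
extern(C) int sum(int a, int b) { return a + b; }

// The pointer type has to carry the same linkage as the function.
extern(C) alias CSum = int function(int, int);

// Linkage is part of the type: a default extern(D) pointer is rejected.
static assert(!__traits(compiles, { int function(int, int) p = &sum; }));

void main()
{
    CSum p = &sum;
    writeln(p(2, 3));
}
```

The compiler only catches the mismatch when it can see the function's type. A pointer obtained through a cast from GetProcAddress or dlsym bypasses that check, so the declaration has to be right by hand.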

I have not even gotten into the case where I tried overloading static methods, which I think would require declaring aliases for the static methods' types in order to generate a mangled name.

I want to know if extern(D) is actually meant to not be touched. adr said that his use for that was actually when doing

extern(C):
//Funcs defined here

extern(D): //Resets the linkage to the default one

So, there are just too many disadvantages to using extern(D) for binding to any code. I would like to know where we can get more documentation than what I posted here (really, I've never seen any code binding to extern(D) code). And I do believe that this is the main reason why people usually don't use dynamic libs in D: it is just unviable, as you would need to regenerate the whole API yourself.

September 30, 2021

On Thursday, 30 September 2021 at 18:09:46 UTC, Hipreme wrote:

> I write this post as both a learning tool, a question and an inquiry.
>
> There are just a lot of drawbacks in trying to do function exporting while using D.

The terms that people use are a bit sloppy. There are three kinds of 'linking' here:

  1. static linking, performed during compilation, once. If linking fails, the compile fails.
  2. dynamic linking (option 1), performed when an executable starts up, before your program gains control, by the system linker. If linking fails, your program never gets control.
  3. dynamic linking (option 2), performed arbitrarily at runtime, by your program. If linking fails, you can do whatever you want about that.

All of the loadSymbol and 'userdata module' hassle that you're frustrated by is from option 2. Option 1 is really the normal way to link large shared libraries and there's nothing to it. What your code looks like that loads a shared library is just import biglib;, and the rest of the work is in dub, pkg-config, LD_LIBRARY_PATH, etc. Phobos is commonly linked in this way.

Pretty much anything that isn't a plugin in a plugin directory can use option 1 instead of option 2.
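The "you can do whatever you want about that" part of runtime loading can be sketched as follows. The helper names openLibrary and findSymbol and the plugin path are hypothetical; the system calls are dlopen/dlsym on Posix and LoadLibraryA/GetProcAddress on Windows.

```d
import std.stdio : writeln;
import std.string : toStringz;

version (Windows)
    import core.sys.windows.windows;
else
    import core.sys.posix.dlfcn;

/// Opens a shared library at runtime; returns null if it cannot be loaded.
void* openLibrary(string path)
{
    version (Windows)
        return cast(void*) LoadLibraryA(path.toStringz);
    else
        return dlopen(path.toStringz, RTLD_LAZY);
}

/// Looks up an (already mangled) symbol name; returns null if it is missing.
void* findSymbol(void* library, string symbol)
{
    version (Windows)
        return cast(void*) GetProcAddress(cast(HMODULE) library, symbol.toStringz);
    else
        return dlsym(library, symbol.toStringz);
}

void main()
{
    // Hypothetical plugin path, used here only to show the failure branch.
    auto library = openLibrary("./plugins/does_not_exist.so");
    if (library is null)
    {
        writeln("plugin not available, continuing without it");
        return;
    }
    auto symbol = findSymbol(library, "_D9something3sumFiiZi");
    writeln(symbol is null ? "symbol missing" : "symbol found");
}
```

With the startup-time kind of dynamic linking, none of this code exists in the program: a missing library means the process never reaches main.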

> extern(C) advantages:
>
>   • Code callable from any language as it is absolutely intuitive
>   • Well documented

You can call scalding water 'hot' even when you're fresh from observing a lava flow. People still find the C ABI frustrating in a lot of ways, and especially when they encounter it for the first time.

But the C ABI rules the world right now, yes. The real advantages are

  • it 'never' changes
  • 'everyone' already makes it easy to use

> extern(C) disadvantages:
>
>   • You will need to declare your function pointer as extern(C) or it will swap the arguments order.

  • you're limited to using C's types
  • you can't use overloading, lazy parameters, default values; you can't rely on scope parameters, etc., etc.
  • you can't casually hand over GC-allocated data and expect the other side to handle it right, or structs with lifetime functions that you expect to be called
  • very little of importance is statically checked: to use a C ABI right you need to very carefully read documentation that needs to exist to even know who is expected to clean up a pointer and how, how large buffers should be. (I wasn't feeling a lot of the C ABI's "absolute intuitiveness" when I was passing libpcre an ovector sized to the number of pairs I wanted back rather than the correct number of pairs*3/2)

Option 2 dynamic linking of D libraries sounds pretty frustrating. Even with a plugin architecture, maybe I'd prefer just recompiling the application each time the plugins change to retain option 1 dynamic linking. Using a C ABI instead is a good idea if just to play nice with other languages.

And if you were wanting something like untrusted plugins, a way to respond to a segfault in a plugin, like I think you mentioned in Discord, then I'd still suggest not linking at all but having separate applications and some form of interprocess communication (pipes, unix sockets, TCP sockets) instead of function calls. This is something that you could design, or with D's reflection, generate code for against the function calls you already have. But this is even more work that you'll have to do. If we add "a separate process telling you what to do with some kind of protocol" as a fourth kind of linking, then the respective effort is

  1. free! it compiles, it's probably good!
  2. free! if the program starts, it's probably good!
  3. wow, why don't you just write your own loadSymbol DSL?
  4. wow, why don't you just reimplement Erlang/OTP and call it std.distributed? maybe protobufs will be enough.
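To give an idea of the smallest possible version of the fourth option, here is a sketch of a plugin process that speaks a made-up line protocol over stdin and stdout. The protocol ("add 2 3" answered with "ok 5") and the function name handleRequest are assumptions for illustration only.

```d
import std.array : split;
import std.conv : to, ConvException;

/// Handles one request line of the hypothetical protocol, e.g. "add 2 3".
string handleRequest(string line)
{
    auto parts = line.split();
    if (parts.length == 3 && parts[0] == "add")
    {
        try
        {
            return "ok " ~ to!string(parts[1].to!int + parts[2].to!int);
        }
        catch (ConvException)
        {
            return "error bad-argument";
        }
    }
    return "error unknown-command";
}

void main()
{
    import std.stdio : stdin, stdout, writeln;
    import std.string : strip;

    // Read one request per line and answer on stdout until input ends.
    foreach (line; stdin.byLineCopy)
    {
        writeln(handleRequest(line.strip));
        stdout.flush();
    }
}
```

The host application could start such a process with std.process.pipeProcess and exchange lines over the pipes. A crash in the plugin then shows up as a closed pipe in the host rather than a segfault in it.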
September 30, 2021

Okay, I do agree with you that I may have exaggerated with absolute intuitiveness, but I was talking about that intuitiveness for loading a symbol from a shared library.

> You're limited to using C's types

  • I don't think I understood what you meant by that: if the data type is known beforehand, it is possible to just declare it on the other side

On Thursday, 30 September 2021 at 22:30:30 UTC, jfondren wrote:

>   • you can't use overloading, lazy parameters, default values; you can't rely on scope parameters, etc., etc.

  • That seems to be more of a problem for dynamically loading a function, although default values can be mirrored in the D API.
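A minimal sketch of mirroring a default value on the D side. All names here (c_scale, scaleImpl, scale) are hypothetical, and scaleImpl only stands in for a function that would normally come from the loaded library.

```d
// Function pointer that a loader would fill in from the shared library.
extern(C) int function(int value, int factor) c_scale;

// Stand-in implementation so the sketch runs without a library.
extern(C) int scaleImpl(int value, int factor) { return value * factor; }

// D-side wrapper: the default argument lives here, not in the exported symbol.
int scale(int value, int factor = 2)
{
    return c_scale(value, factor);
}

void main()
{
    c_scale = &scaleImpl; // a real loader would assign the looked-up address
    assert(scale(21) == 42);
    assert(scale(3, 5) == 15);
}
```

The same wrapper layer is where overloads can be rebuilt on top of distinctly named C exports.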
>   • you can't casually hand over GC-allocated data and expect the other side to handle it right, or structs with lifetime functions that you expect to be called

  • That is another problem that doesn't seem related to the external linkage either: handling GC-allocated data with extern(D) doesn't stop it from being garbage collected. I'm fixing that kind of error right now, again.
> separate applications and some form of interprocess communication (pipes, unix sockets, TCP sockets) instead of function calls.

  • I'm pretty interested in how to make that work, but I think it would change a lot in how I'm designing my code, and that way it would probably become absolutely data oriented, right?
October 01, 2021

On Thursday, 30 September 2021 at 22:30:30 UTC, jfondren wrote:

>   3. dynamic linking (option 2), performed arbitrarily at runtime, by your program. If linking fails, you can do whatever you want about that.

That's actually "dynamic loading".

https://en.wikipedia.org/wiki/Dynamic_loading