D lacks syntax for initializing the uninitialized. We can do this:
T stuff = T(args); // or new T(args);
but this?..
T* ptr = allocateForT();
// now what?.. Can't just do *ptr = T(args) - that's an assignment, not initialization!
// is T a struct? A union? A class? An int?.. Is it even a constructor call?..
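To make the assignment-versus-initialization point concrete, here is a minimal sketch. The Tracked struct and both helper functions are made up for illustration; the only library call is core.lifetime.emplace. It assumes the usual behaviour of the compiler-generated opAssign for structs with destructors, which destroys the previous contents of the target.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : malloc, free;

// Hypothetical type for illustration: counts how often its destructor runs.
struct Tracked
{
    static int destroyed; // thread-local counter
    int value;
    this(int v) { value = v; }
    ~this() { ++destroyed; }
}

// Assignment into raw memory: the generated opAssign destroys the "old"
// value, which here is whatever garbage malloc handed back.
int dtorRunsOnAssign()
{
    auto p = cast(Tracked*) malloc(Tracked.sizeof);
    assert(p !is null);
    scope (exit) free(p);
    Tracked.destroyed = 0;
    *p = Tracked(1); // assignment, not initialization
    return Tracked.destroyed;
}

// Initialization in place: no previous value is assumed to exist.
int dtorRunsOnEmplace()
{
    auto p = cast(Tracked*) malloc(Tracked.sizeof);
    assert(p !is null);
    scope (exit) free(p);
    Tracked.destroyed = 0;
    emplace(p, 1); // constructs Tracked(1) in the raw memory
    return Tracked.destroyed; // read before any cleanup of *p
}
```

Here the destructor only bumps a counter, so running it on garbage is harmless. With a destructor that frees a pointer or closes a handle, the assignment version is a bug waiting to happen.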
This is, uh, "solved", using library functions - emplaceInitializer, emplace, copyEmplace, moveEmplace. The fact that there are four functions to do this should already ring a bell, but if one were to look at how e.g. emplace is implemented, there's lots and lots more to it - classes or structs? Constructor or no constructor? Postblit? Copy?.. And all the delegation... A single call to emplace may copy the bits around more than once. Talk about initializing a static array... Or look at emplaceInitializer, which the other three all depend upon: it is, currently, built on a hack just to avoid blowing up the stack (which is, ostensibly, what the previous, less hacky hack led to). The upcoming __traits(initSymbol) would help in removing the hack, but won't help CTFE any. At various points of their lives, these things even explicitly called memcpy, which is just... argh! And some still do (copyEmplace, I'm looking at you). Call into the CRT to blit an 8-byte struct? With statically known size and alignment? Just to sidestep the type system? Eh??? Much fun for copying arrays!
...And still, none of them would work in CTFE for many types, due to various implementation quirks (which include those very calls to memcpy, or reinterpret casts). This one could, potentially, be solved with more barbed wire and swear words, that is, code, but...
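For reference, this is roughly what the library route looks like at runtime today. It is a sketch under assumptions: Point and allocateFor are invented stand-ins (allocateFor plays the role of allocateForT), and emplace with no arguments is used in place of emplaceInitializer, since the latter is not part of core.lifetime's public surface as far as I know.

```d
import core.lifetime : emplace, copyEmplace, moveEmplace;
import core.stdc.stdlib : malloc, free;

// Hypothetical POD type with non-zero defaults, so writing .init is visible.
struct Point
{
    int x = 1;
    int y = 2;
}

// Hypothetical helper standing in for the post's allocateForT().
T* allocateFor(T)()
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null);
    return p;
}

int[4] emplaceTour()
{
    auto raw0 = allocateFor!Point();
    auto raw1 = allocateFor!Point();
    auto raw2 = allocateFor!Point();
    auto raw3 = allocateFor!Point();
    scope (exit) { free(raw0); free(raw1); free(raw2); free(raw3); }

    emplace(raw0);             // write Point.init
    emplace(raw1, 10, 20);     // fieldwise initialization, as in Point(10, 20)
    copyEmplace(*raw1, *raw2); // copy *raw1 into uninitialized *raw2
    const copiedX = raw2.x;    // read before *raw2 is moved from
    moveEmplace(*raw2, *raw3); // move *raw2 into uninitialized *raw3

    int[4] seen = [raw0.x, raw1.y, copiedX, raw3.y];
    return seen;
}
```

Three different entry points, each with its own argument order and conventions, for what is conceptually one operation: initialize this lvalue.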
Thing is, all those functions are re-implementing what the compiler can already do, but in a library. Or rather, come very close to doing that, but still don't really get there. C++ with its library solution does this better!
What if the language specified a "magic" function, called, say, __initialize, that would just do the right thing (tm)? Given an lvalue, it would instruct the compiler to generate code writing the initializer, blitting, copying, or calling the appropriate constructor with the arguments. And most importantly, it would work in CTFE regardless of type, and not require weird dances around T.init, dummy types involving extra argument copies, or manual fieldwise and elementwise blits (which is what one would have to do in order to e.g. make copyEmplace CTFE-able).
I.e.:
// Write .init
T* raw0 = allocateForT();
// currently - emplaceInitializer(raw0);
(*raw0).__initialize;

// Initialize fields or call constructor, whichever is applicable for T(arg1, arg2)
T* raw1 = allocateForT();
// currently - raw1.emplace(forward!(arg1, arg2));
(*raw1).__initialize(forward!(arg1, arg2));

// Copy
T* raw2 = allocateForT();
// currently - copyEmplace(*raw1, *raw2);
(*raw2).__initialize(*raw1);

// Move
T* raw3 = allocateForT();
// currently - moveEmplace(*raw2, *raw3);
(*raw3).__initialize(move(*raw2));

// Could be called at runtime or during CTFE
auto createArray()
{
    // big array, don't initialize
    const(T)[1000] result = void;
    // exception handling omitted for brevity
    foreach (i, ref it; result)
    {
        // currently - `emplace`, which may fail to compile in CTFE
        it.__initialize(createIthElement(i));
    }
    return result;
}

// CTFE use case:
static auto array = createArray();
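For comparison, a present-day version of createArray that runs at runtime might look like the sketch below. Item and this createIthElement are invented placeholders, the array is small, and the elements are mutable rather than const(T) to keep things simple. I make no claim about whether this particular call compiles in CTFE; that depends on the type and the compiler version, which is exactly the complaint.

```d
import core.lifetime : emplace;

// Hypothetical element type, standing in for T.
struct Item
{
    size_t index;
    this(size_t i) { index = i * 2; }
}

// Hypothetical factory, standing in for the post's createIthElement.
Item createIthElement(size_t i)
{
    return Item(i);
}

Item[8] createArrayToday()
{
    // leave the storage uninitialized, then construct each element in place
    Item[8] result = void;
    foreach (i, ref it; result)
    {
        emplace(&it, createIthElement(i));
    }
    return result;
}
```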
The wins are obvious - unified syntax, better error messages, CTFE support, less library voodoo failing at mimicking the compiler. The losses? I don't see any.
Note that I am not talking about yet another library function. This would not be a symbol in druntime, this would be compiler magic. Having that, emplaceInitializer, emplace and copyEmplace could be re-implemented in terms of __initialize, and eventually deprecated and removed. moveEmplace could linger until DIP1040 is implemented, tried, and proven. The move example, verbatim, would be pessimized compared to moveEmplace due to moving twice, which hopefully DIP1040 could solve.
I'm a bit hesitant to suggest how this should interact with @safe. On one hand, the established precedent is in emplace - it infers, and I'm leaning towards that, even though it can potentially invalidate existing state. On the other hand, because it can indeed invalidate existing state, it should be @system. But then it would require some additional facility just for inference, so it could be called @trusted correctly, otherwise it'd be useless. And that facility, whatever it is, better not be another library reincarnation of all required semantics. For example, something like a __traits(isSafeToInitWith, T, args). Whichever the approach, it should definitely infer all other attributes.
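The inference precedent mentioned above can be probed with a small compile-time check. This is a sketch, not a statement about every druntime version: SafeCtor, SystemCtor and emplaceIsSafeFor are names made up here, and the check assumes emplace's safety is inferred from the constructor it ends up calling.

```d
import core.lifetime : emplace;

// Hypothetical types: identical except for the safety of their constructors.
struct SafeCtor
{
    int v;
    this(int x) @safe { v = x; }
}

struct SystemCtor
{
    int v;
    this(int x) @system { v = x; }
}

// True when emplace(p, 1) can be called from @safe code for this T.
enum bool emplaceIsSafeFor(T) = __traits(compiles, () @safe {
    auto p = new T; // *p is already initialized: emplace overwrites live state
    emplace(p, 1);
});

bool safeCtorInfers()   { return emplaceIsSafeFor!SafeCtor; }
bool systemCtorInfers() { return emplaceIsSafeFor!SystemCtor; }
```

If emplace infers the way I believe it does, the first probe is true and the second is false. Note that the @safe lambda happily overwrites an object that was already initialized, which is the "invalidate existing state" concern in a nutshell.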
There are undoubtedly other things to consider. For example - classes. On one hand, it would seem prudent for this hypothetical __initialize to call class ctors. On the other hand, a reference itself is just a POD, and generic code might indeed want to write null as opposed to attempting to call a default constructor. Then again, generic code still would have to specialize for classes... Thoughts welcome.
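For what it's worth, the current emplace already splits these two cases across overloads, which a short sketch can show. Node is a made-up class; the buffer comes from the GC here simply because that gives suitable alignment without extra ceremony.

```d
import core.lifetime : emplace;

// Hypothetical class for illustration.
class Node
{
    int id;
    this(int id) { this.id = id; }
}

// Pointer-to-reference overload: treats the reference as plain data
// and writes null, no constructor involved.
bool referenceBecomesNull()
{
    Node slot = void; // uninitialized class reference
    emplace(&slot);
    return slot is null;
}

// void[] overload: constructs the class instance itself inside the buffer.
int constructsInBuffer()
{
    void[] buffer = new ubyte[](__traits(classInstanceSize, Node));
    Node n = emplace!Node(buffer, 7); // runs Node's constructor in place
    return n.id;
}
```

So generic code today has to pick the overload by knowing whether it holds a reference or storage for an instance. A built-in __initialize would need an answer to the same question.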
What do you think? DIP this, yay or nay? Suggestions?..