Improve the OOP ABI

I have been arguing for quite some time about the fact that there is a well of improvement to make to D that are not necessarily breaking changes or languages addition - and that, in fact, D having more feature than most language, adding feature is clearly not where the impact is.

Today I would like to provide a concrete example of things that could be dramatically improved, with no breakage except at the ABI level - which we already do not guarantee. Let's go over these changes.

1/ Downcast to final classes.

class A {}
final class B : A {}

auto foo(A a) {
    return cast(B) a;
}

This generates a call into the runtime, making things completely opaque to the optimizer. The algorithm used by the runtime is linear. But there is a dumb constant time solution. because B is final, a is either an instance of B, or it is not (as in, it cannot be an instance of a subclass of B). Therefore, we expect the codegen to look like the following instead:

auto foo(A a) {
    return typeid(a) is typeid(B) ? a : null;
}

This obviously won't compile but you get the idea. Not only this is constant time, but very simple too, and visible to the optimizer, which means the check can be folded by the opitimizer after other transforms, for instance inlining.

2/ Inline ClassInfo with the vtbl.

The current layout of objects is as follow:

+-----------+    +-----------+    +-----------+
|  __vtbl   +--->|  typeid   +--->| ClassInfo |
+-----------+    +-----------+    |           |
| __monitor |    |  method0  |    |    ...    |
+-----------+    +-----------+    |           |
|  field0   |    |  method1  |    +-----------+
+-----------+    +-----------+
|  field1   |    |    ...    |
+-----------+    +-----------+
|    ...    |
+-----------+

This causes a ton of extra indirections that are not necessary. instead the following layout ought to be used:

+-----------+    +-----------+
|  __vtbl   +--->| ClassInfo |
+-----------+    |           |
| __monitor |    |    ...    |
+-----------+    |           |
|  field0   |    +-----------+
+-----------+    |  method0  |
|  field1   |    +-----------+
+-----------+    |  method1  |
|    ...    |    +-----------+
+-----------+    |    ...    |
                 +-----------+

Alternatively, it is possible to have the pointer point at the first method and subtract to get the typeid. The important part is that there is only one indirection now.

3/ Downcast.

Currently, we use a linear algorithm to downcast and we reach this algorithm via a call int he runtime. This prevent the optimizer from doing anything about it, in addition to be slow. The problem is similar to 1/. The whole check can be made trivial if we let the compiler prebake some data for us, namely an array of primary parents. Let's see what it looks like in the first example, when b is not final.

class A {}
class B : A {}

auto foo(A a) {
    return cast(B) a;
}

Which we can do as follow:

auto foo(A a) {
    // The primaries generated by the compiler for us.
    auto tidA = typeid(A);
    assert(tidA.primaries = [tidA]);
    auto tidB = typeid(B);
    assert(tidB.primaries = [tidA, tidB]);

    auto t = typeid(a);
    auto depth = tidB.primaries.length - 1;

    if (t.primary.length <= depth) {
        return null;
    }

    if (t.primary[depth] == tidB) {
        return a;
    }

    return null;
}

This is starting to look good, now we have general downcast in a form that is not only really fast to perform, but also completely transparent to the optimizer.

4/ Downcast to interfaces.

This is getting a bit more tricky, so I won't expand this one fully, but it is also possible to do much better on this front as well. The obvious complication that that interfaces allow for multiple inheritance.

The first obvious optimization is to consider the chain of leftmost (or alternatively the longest chain, which allows to cull more candidates faster) inheritance the primaries for the interface and eliminate the whole branch at once using the same strategy as 3/.

Now we are left with possible secondaries match. Here, no possibility to remain O(1), we'll have to loop over the other interfaces and repeat the matching process. Thankfully very branch hierarchy are uncommon in practice, but even then, we can use this one weird trick to dramatically cull out the search space: bloom filters.

Make the bloom filter 64 bits and simply cull via (targetBloom & aggregateBloom) == targetBloom and you can eliminate a good chunk of the search right there.

I hope this is the kind of change we can see in D in the future. This is the kind of changes that would make D better in practice, not the next cool feature, but that the current, existing feature, have an quality of implementation that is in line with what can be expected in a modern language.

On 29/09/2023 1:42 AM, deadalnix wrote: > 1/ Downcast to final classes. > ```d > auto foo(A a) { > return typeid(a) is typeid(B) ? a : null; > } > ``` TypeInfo isn't guaranteed to be unique to a process when dealing with shared libraries. There is a strong chance it'll be duplicative. So this won't work. > 2/ Inline ClassInfo with the vtbl. ClassInfo is an alias to TypeInfo_Class now. This indirection shouldn't exist. https://github.com/dlang/dmd/blob/a6285c26c0e04b9b3b1d40ae1c3200c834df1480/druntime/src/object.d#L1737

On Thursday, 28 September 2023 at 13:05:01 UTC, Richard (Rikki) Andrew Cattermole wrote: > TypeInfo isn't guaranteed to be unique to a process when dealing with shared libraries. There is a strong chance it'll be duplicative. This should be fixed, but even if it isn't, that just means using the == operator instead of the is operator; the concept i believe is still sound. (the == operator sux tho, this really should be fixed at the source) > ClassInfo is an alias to TypeInfo_Class now. This indirection shouldn't exist. this means emplacing the typeinfo_class in the same memory block as the vtbl instead of linking to it. so the virtual table exists at *(instance + __traits(classInstanceSize, TypeInfo_Class))

On Thursday, 28 September 2023 at 13:05:01 UTC, Richard (Rikki) Andrew Cattermole wrote: > On 29/09/2023 1:42 AM, deadalnix wrote: >> 1/ Downcast to final classes. > >> ```d >> auto foo(A a) { >> return typeid(a) is typeid(B) ? a : null; >> } >> ``` > > TypeInfo isn't guaranteed to be unique to a process when dealing with shared libraries. There is a strong chance it'll be duplicative. > > So this won't work. > lol, wat? That sounds crazy. >> 2/ Inline ClassInfo with the vtbl. > > ClassInfo is an alias to TypeInfo_Class now. This indirection shouldn't exist. > > https://github.com/dlang/dmd/blob/a6285c26c0e04b9b3b1d40ae1c3200c834df1480/druntime/src/object.d#L1737 It does exist.

On Thursday, 28 September 2023 at 14:31:33 UTC, deadalnix wrote: >> >> TypeInfo isn't guaranteed to be unique to a process when dealing with shared libraries. There is a strong chance it'll be duplicative. >> >> So this won't work. >> > > lol, wat? > > That sounds crazy. I think it's the same in C++, and sometimes dynamic_cast ends up making string comparisons.

On Thursday, 28 September 2023 at 14:41:16 UTC, Guillaume Piolat wrote: > On Thursday, 28 September 2023 at 14:31:33 UTC, deadalnix wrote: >>> >>> TypeInfo isn't guaranteed to be unique to a process when dealing with shared libraries. There is a strong chance it'll be duplicative. >>> >>> So this won't work. >>> >> >> lol, wat? >> >> That sounds crazy. > > I think it's the same in C++, and sometimes dynamic_cast ends up making string comparisons. What's the reason for this madness?

On Thursday, 28 September 2023 at 19:25:15 UTC, Richard (Rikki) Andrew Cattermole wrote: > I said this on Discord, but that means all TypeInfo has to be exported, even for executables. There might be an issue when your lib uses two other libs, and both of them use lib X but of different versions. This might cause issues with exported structures when they differ from version to version, and etc. Could this also be an issue for exported TypeInfos?

Forums