Thread overview
Need a way to get compressed mangling of a symbol.
Jul 16, 2013
QAston
Jul 16, 2013
Adam D. Ruppe
Jul 16, 2013
QAston
July 16, 2013
I'd like to dynamically load procedures from a DLL in my app. To load a symbol from a DLL I need its mangled name. D offers .mangleof, which I currently use to generate the name. It works very well, but only for short symbols. Is there any way to get the final (compressed) mangled name of a symbol during compilation, or is there some documentation describing how the compression is done (code would be fine too)? http://dlang.org/abi.html doesn't cover the compression scheme.
July 16, 2013
I'm looking at the dmd source now...

The compression is done in the backend, in the file cgobj.c.

The conditions are:


#define LIBIDMAX 128
    if (len > LIBIDMAX)
    {
        // Attempt to compress the name
        name2 = id_compress(name, len);
        // snip
        if (len2 > LIBIDMAX)            // still too long
        {
            /* Form md5 digest of the name and store it in the
             * last 32 bytes of the name.
             */

// snip impl, open the source to see specific details




/******************************************
 * Compress an identifier.
 * Format: if ASCII, then it's just the char
 *      if high bit set, then it's a length/offset pair
 * Returns:
 *      malloc'd compressed identifier
 */

char *id_compress(char *id, int idlen)
{



The implementation, in the same source file, appears to compress by finding the longest duplicated substrings and replacing each repeat with a length/offset pair that points back at the earlier occurrence.
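To illustrate the general idea only, here is a minimal C sketch of a "longest earlier repeat becomes a length/offset pair" scheme. It is not dmd's code and does not reproduce dmd's byte layout: the names sketch_compress and sketch_expand, the constants, and the two-byte pair encoding (0x80 | length, then distance) are all assumptions made up for this example. It also assumes the input is plain ASCII, so a set high bit can mark a pair.

```c
#include <stddef.h>

/* Hypothetical layout, NOT dmd's real format: a literal is one ASCII byte;
 * a back-reference is two bytes: (0x80 | length), then the distance back. */
enum { MIN_MATCH = 3, MAX_MATCH = 127, MAX_DIST = 255 };

/* Compress id[0..len) into out, which must hold at least len bytes.
 * Returns the number of bytes written. */
size_t sketch_compress(const char *id, size_t len, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < len)
    {
        size_t best_len = 0, best_dist = 0;
        size_t start = i > MAX_DIST ? i - MAX_DIST : 0;

        /* Search the text already seen for the longest repeat of id[i..]. */
        for (size_t j = start; j < i; j++)
        {
            size_t k = 0;
            while (i + k < len && k < MAX_MATCH && id[j + k] == id[i + k])
                k++;
            if (k > best_len)
            {
                best_len = k;
                best_dist = i - j;
            }
        }

        if (best_len >= MIN_MATCH)
        {
            /* A pair costs 2 bytes, so only repeats of 3+ chars pay off. */
            out[o++] = (unsigned char)(0x80 | best_len);
            out[o++] = (unsigned char)best_dist;
            i += best_len;
        }
        else
            out[o++] = (unsigned char)id[i++];
    }
    return o;
}

/* Inverse of sketch_compress; out must hold the full expanded name.
 * Returns the number of bytes written. */
size_t sketch_expand(const unsigned char *in, size_t n, char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (in[i] & 0x80)
        {
            size_t length = in[i] & 0x7F;
            size_t dist = in[++i];
            /* Copy byte by byte so overlapping repeats expand correctly. */
            for (size_t k = 0; k < length; k++, o++)
                out[o] = out[o - dist];
        }
        else
            out[o++] = (char)in[i];
    }
    return o;
}
```

Fed a made-up name with a repeated run such as "_D3foo3bar3foo3bar3foo3barFZv", the repeats collapse into a single pair and sketch_expand restores the original. Matching dmd's real output would still require checking the actual encoding against the compiler, as suggested below.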



The reason I snipped the implementations here is the backend is under a more restrictive license so I don't want to get into copying that. But with just what I've said here combined with guess+check against dmd's output it might be enough to do a clean room implementation.



Or, if Walter can give us permission to copy/paste this into a D file, we could use it directly.
July 16, 2013
On Tuesday, 16 July 2013 at 13:58:10 UTC, Adam D. Ruppe wrote:
> The reason I snipped the implementations here is the backend is under a more restrictive license so I don't want to get into copying that. But with just what I've said here combined with guess+check against dmd's output it might be enough to do a clean room implementation.

Thank you for the reply!

My current clean room implementation is limited to:

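const(char)* mangledSymbol(alias symbol)()
{
    // Zero-terminated mangled name, built at compile time.
    enum name = symbol.mangleof ~ "\0";
    // Refuse names long enough to hit dmd's compression (limit is 128).
    static assert(name.length < 128, "long names won't be available in a library!");
    return name.ptr;
}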

as it turned out, i_didn't_need_that_descriptive_names. I'm posting it here so it may be easier for others to follow the currently bumpy road of DLLs in D :)