GCC has a non-standard extension to do this, I've used it to get great performance wins writing emulators. I love this feature, love to see it in D!

On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex@lycus.org> wrote:
Hi,

I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter.

The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions

The idea is to be able to write code like this:

----

import std.algorithm;

enum Op : ubyte
{
    imm,
    add,
    sub,
    // ...
    ret,
}

final class Insn
{
    Op op;
    size_t[] args;
    void* lbl;
    Insn next;
}

final class State
{
    Insn pc;
    size_t[64] regs;
}

size_t interp(Insn[] code)
{
    // Set up the instruction stream with label addresses
    // the first time that it is executed. Label addresses
    // are stable, so we only do this once.

    foreach (insn; code.filter!(x => !x.lbl)())
    {
        void* lbl;

        with (Op)
        {
            final switch (insn.op)
            {
                case imm: lbl = &&handle_imm; break;
                case add: lbl = &&handle_add; break;
                case sub: lbl = &&handle_sub; break;
                // ...
                case ret: lbl = &&handle_ret; break;
            }
        }

        insn.lbl = lbl;
    }

    // Start interpreting the entry instruction.

    auto state = new State;
    state.pc = code[0];
    goto *state.pc.lbl;

    // Interpreter logic follows...

    // The whole point is to avoid unnecessary function
    // calls, so we use a mixin template for this stuff.
    enum advance = "state.pc = state.pc.next;" ~
                   "goto *state.pc.lbl;";

    handle_imm:
    {
        state.regs[state.pc.args[0]] = state.pc.args[1];
        mixin(advance);
    }

    handle_add:
    {
        state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]];
        mixin(advance);
    }

    handle_sub:
    {
        state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]];
        mixin(advance);
    }

    // ...

    handle_ret:
    {
        return state.regs[state.pc.args[0]];
    }

    assert(false);
}

----

Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it.

Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too.

--
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org