June 02, 2013
On 01-06-2013 07:41, Manu wrote:
> GCC has a non-standard extension to do this, I've used it to get great
> performance wins writing emulators. I love this feature, love to see it
> in D!

Yes, it's basically essential for high-perf interpreters. It's a feature that a systems language like D must have.

>
> On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex@lycus.org> wrote:
>
>     Hi,
>
>     I'm sure this has been brought up before, but I feel I need to bring
>     it up again (because I'm going to be writing a threaded-code
>     interpreter):
>     http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
>
>     This is an incredibly important extension. The final switch
>     statement is not a replacement because it doesn't allow the
>     programmer to store a label address directly into a code stream,
>     which is what's essential to write a threaded-code interpreter.
>
>     The Erlang folks went through hell just to use this feature; see the
>     5th Q at:
>     http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions
>
>     The idea is to be able to write code like this:
>
>     ----
>
>     import std.algorithm;
>
>     enum Op : ubyte
>     {
>          imm,
>          add,
>          sub,
>          // ...
>          ret,
>     }
>
>     final class Insn
>     {
>          Op op;
>          size_t[] args;
>          void* lbl;
>          Insn next;
>     }
>
>     final class State
>     {
>          Insn pc;
>          size_t[64] regs;
>     }
>
>     size_t interp(Insn[] code)
>     {
>          // Set up the instruction stream with label addresses
>          // the first time that it is executed. Label addresses
>          // are stable, so we only do this once.
>
>          foreach (insn; code.filter!(x => !x.lbl)())
>          {
>              void* lbl;
>
>              with (Op)
>              {
>                  final switch (insn.op)
>                  {
>                      case imm: lbl = &&handle_imm; break;
>                      case add: lbl = &&handle_add; break;
>                      case sub: lbl = &&handle_sub; break;
>                      // ...
>                      case ret: lbl = &&handle_ret; break;
>                  }
>              }
>
>              insn.lbl = lbl;
>          }
>
>          // Start interpreting the entry instruction.
>
>          auto state = new State;
>          state.pc = code[0];
>          goto *state.pc.lbl;
>
>          // Interpreter logic follows...
>
>          // The whole point is to avoid unnecessary function
>          // calls, so we use a string mixin for this stuff.
>          enum advance = "state.pc = state.pc.next;" ~
>                         "goto *state.pc.lbl;";
>
>          handle_imm:
>          {
>              state.regs[state.pc.args[0]] = state.pc.args[1];
>              mixin(advance);
>          }
>
>          handle_add:
>          {
>              state.regs[state.pc.args[0]] =
>                  state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]];
>              mixin(advance);
>          }
>
>          handle_sub:
>          {
>              state.regs[state.pc.args[0]] =
>                  state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]];
>              mixin(advance);
>          }
>
>          // ...
>
>          handle_ret:
>          {
>              return state.regs[state.pc.args[0]];
>          }
>
>          assert(false);
>     }
>
>     ----
>
>     Notice that there are no function calls going on. Just plain jumps
>     all the way through. This is a technique that many real world
>     interpreters use to get significant speedups, and I for one think we
>     desperately need it.
>
>     Implementing it should be trivial as both LLVM and GCC support
>     taking the address of a block. I'm sure the DMD back end could be
>     extended to support it too.
>
>     --
>     Alex Rønne Petersen
>     alex@alexrp.com / alex@lycus.org
>     http://alexrp.com / http://lycus.org
>
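For comparison, here is a minimal GNU C sketch of the same direct-threaded design, using the labels-as-values extension linked in the quote. It is only an illustration: the names (`struct insn`, `interp`, `run_demo`) are made up to mirror the D proposal, it assumes GCC or Clang (`&&label` and `goto *ptr` are not standard C), and it does no bounds checking on register indices.

```c
#include <stddef.h>

/* Opcodes mirror the Op enum from the quoted proposal (names are hypothetical). */
enum op { OP_IMM, OP_ADD, OP_SUB, OP_RET };

struct insn {
    enum op op;
    size_t args[3];
    void *lbl;          /* handler address, filled in on first run */
    struct insn *next;
};

/* Runs an instruction stream of n entries; requires the GNU C extension. */
size_t interp(struct insn *code, size_t n)
{
    /* Label addresses are only meaningful inside this function. */
    static void *const handlers[] = {
        [OP_IMM] = &&handle_imm,
        [OP_ADD] = &&handle_add,
        [OP_SUB] = &&handle_sub,
        [OP_RET] = &&handle_ret,
    };
    size_t regs[64] = {0};
    struct insn *pc;

    /* Patch each instruction with its handler address once. */
    for (size_t i = 0; i < n; i++)
        if (!code[i].lbl)
            code[i].lbl = handlers[code[i].op];

    pc = &code[0];
    goto *pc->lbl;

/* Plays the role of the "advance" string mixin in the D version. */
#define ADVANCE() do { pc = pc->next; goto *pc->lbl; } while (0)

handle_imm:
    regs[pc->args[0]] = pc->args[1];
    ADVANCE();
handle_add:
    regs[pc->args[0]] = regs[pc->args[1]] + regs[pc->args[2]];
    ADVANCE();
handle_sub:
    regs[pc->args[0]] = regs[pc->args[1]] - regs[pc->args[2]];
    ADVANCE();
handle_ret:
    return regs[pc->args[0]];
#undef ADVANCE
}

/* Builds r0 = 40, r1 = 2, r2 = r0 + r1, then returns r2. */
size_t run_demo(void)
{
    struct insn code[4] = {
        { OP_IMM, {0, 40, 0}, NULL, NULL },
        { OP_IMM, {1, 2, 0},  NULL, NULL },
        { OP_ADD, {2, 0, 1},  NULL, NULL },
        { OP_RET, {2, 0, 0},  NULL, NULL },
    };
    for (size_t i = 0; i + 1 < 4; i++)
        code[i].next = &code[i + 1];
    return interp(code, 4);
}
```

Each handler ends with its own indirect jump, so there is no central dispatch loop and no function call per instruction.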


-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
On 01-06-2013 15:17, Diggory wrote:
> On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote:
>> As an example, see the very small but fast virtual machine written in GNU C++
>> from the International Contest on Functional Programming 2006:
>>
>> http://codepad.org/iibBeWKw
>>
>> It's faster than the same code without computed gotos.
>>
>> Bye,
>> bearophile
>
> Would be cool if there was a platform independent way in D to construct
> a sequence of "call" instructions in memory. Then it would be possible
> to write a JIT compiler in the exact same style as that example.

At that point, you just want a simple in-memory assembler. Think asmjit.

>
> It would be relatively easy to add to phobos although you'd have to be
> careful about DEP. Hmm, perhaps I will try writing something like this...

Probably a bit too domain-specific (not to say it isn't useful).

-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
On 01-06-2013 11:43, bearophile wrote:
> Alex Rønne Petersen:
>
>>             final switch (insn.op)
>>             {
>>                 case imm: lbl = &&handle_imm; break;
>>                 case add: lbl = &&handle_add; break;
>>                 case sub: lbl = &&handle_sub; break;
>>                 // ...
>>                 case ret: lbl = &&handle_ret; break;
>
> Regarding the syntax, why do you use "&&"? Isn't a single "&" enough?
>
>                  case imm: lbl = &handle_imm; break;
>
> If such gotos become a natural part of the D syntax then it's not
> necessary to copy the GNU C syntax.
>
> Bye,
> bearophile

I just used the GNU C syntax because I was familiar with it. I don't particularly care how it ends up looking in D. But Timon makes a good point about namespaces.
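To make the namespace point concrete, here is a small GNU C sketch (the function name is made up, and it assumes GCC or Clang). Labels and ordinary symbols live in separate namespaces, so a variable and a label can share a name, and a single `&` already means "address of the variable":

```c
/* Returns 1 when &foo (the variable) and &&foo (the label) differ. */
int label_and_variable_are_distinct(void)
{
    int foo = 0;
    void *var_addr = &foo;      /* standard C: address of the variable foo */
    void *lbl_addr = &&foo;     /* GNU C extension: address of the label foo */
foo:
    return var_addr != lbl_addr;
}
```

With a plain `&` for labels, `&foo` here would be ambiguous, which is presumably why a distinct spelling is needed in some form.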

-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
On 01-06-2013 09:59, bearophile wrote:
> Alex Rønne Petersen:
>
>> I'm sure this has been brought up before, but I feel I need to bring
>> it up again (because I'm going to be writing a threaded-code
>> interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
>>
>> This is an incredibly important extension.
>
> This was discussed more than once, and I think it's a valuable
> improvement for certain kinds of low level D programming.
>
> Walter says that he has not added this feature to D because it's useful
> only to write interpreters and the like. I have found it useful in GCC
> to also write very fast finite state machines to analyse genomic
> strings. But even if Walter is right, writing interpreters (and
> emulators) is an important purpose for a systems language like D. You
> can't write them efficiently in scripting languages, and even Haskell/F#
> are not good for writing them at all. Languages like C/D/C++ (and a few
> others, probably Rust later on) are the few natural choices for them.
>
> "Recently" the Python C interpreter was modified and sped up thanks to
> this non-standard feature. CPython source code has two versions, one
> with computed gotos and one without, to compile it even if your C
> compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
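The finite-state-machine use mentioned in the quote can be sketched the same way. This is only an illustration in GNU C (GCC or Clang assumed); the function names and the choice of counting "CG" pairs are made up for the example, not taken from anyone's real code:

```c
#include <stddef.h>

/* Maps a base to a class index: 0 = 'C', 1 = 'G', 2 = anything else. */
static int base_class(char c)
{
    return c == 'C' ? 0 : c == 'G' ? 1 : 2;
}

/* Counts occurrences of "CG" in a NUL-terminated string. */
size_t count_cg(const char *s)
{
    /* One table per state, one entry per character class. */
    static void *const from_start[] = { &&seen_c, &&start, &&start };
    static void *const from_c[]     = { &&seen_c, &&hit,   &&start };
    size_t count = 0;

start:
    if (*s == '\0') return count;
    goto *from_start[base_class(*s++)];
seen_c:
    if (*s == '\0') return count;
    goto *from_c[base_class(*s++)];
hit:
    count++;
    goto start;
}
```

Each state is a label and each transition is a single indirect jump, so the current state never has to be stored in a variable and re-tested.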

>
>
>> Implementing it should be trivial as both LLVM and GCC support taking
>> the address of a block. I'm sure the DMD back end could be extended to
>> support it too.
>
> Even if the DMD back-end never implements it, I think it's important to
> define it formally and officially in the D language and add the relevant
> syntax to the front-end (plus a standard version identifier that allows
> writing programs that have both a routine that uses computed gotos and
> one that doesn't use them).
>
> This has the big advantage that all future compilers that will want to
> implement it will use the same nice standard syntax. If D doesn't define
> this syntax, then maybe GDC will add it, and maybe later LDC will add
> it, and later maybe other future compilers will add it, and we can't be
> sure they will share the same syntax.
>
> Another advantage of having this syntax in D, is that even if a compiler
> back-end doesn't support computed gotos, the program compiles without
> syntax errors when you use conditional compilation to disable the piece
> of code that uses computed gotos.

Yes, good points.
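A rough illustration of the two-version approach in GNU C, in the spirit of what is described for CPython above (this is not CPython's actual code). The opcode names, `run_bytecode`, and the `NO_COMPUTED_GOTO` / `USE_COMPUTED_GOTO` macros are all hypothetical:

```c
#include <stddef.h>

enum { BC_INC, BC_DEC, BC_HALT };

/* Use computed gotos only where the compiler is known to support them. */
#if defined(__GNUC__) && !defined(NO_COMPUTED_GOTO)
#define USE_COMPUTED_GOTO 1
#else
#define USE_COMPUTED_GOTO 0
#endif

/* Executes bytecode until BC_HALT and returns the accumulator. */
long run_bytecode(const unsigned char *code)
{
    long acc = 0;
    size_t pc = 0;
#if USE_COMPUTED_GOTO
    /* Table order must match the enum; assumes well-formed bytecode. */
    static void *const table[] = { &&do_inc, &&do_dec, &&do_halt };
#define DISPATCH() goto *table[code[pc++]]
    DISPATCH();
do_inc:  acc++; DISPATCH();
do_dec:  acc--; DISPATCH();
do_halt: return acc;
#undef DISPATCH
#else
    /* Portable fallback: ordinary switch inside a loop. */
    for (;;) {
        switch (code[pc++]) {
        case BC_INC:  acc++; break;
        case BC_DEC:  acc--; break;
        case BC_HALT: return acc;
        default:      return acc; /* unknown opcode: stop */
        }
    }
#endif
}
```

A standard `version` identifier in D would play the role of the `#if` here, so the same source builds on compilers with and without the feature.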

>
> Bye,
> bearophile


-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
Timon Gehr:

> D (like C) uses a different namespace for labels and symbols that are not labels.
>
> This compiles today:
>
> void main(){
>     int foo;
>     foo: auto b = &foo;
> }

On a related topic I wrote this:
http://d.puremagic.com/issues/show_bug.cgi?id=4902

Bye,
bearophile
June 02, 2013
On 6/1/13, Timon Gehr <timon.gehr@gmx.ch> wrote:
> D (like C) uses a different namespace for labels and symbols that are
> not labels.

Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.:

> void main(){
>      int foo;
>      foo: auto b = __traits(gotoAddr, foo);
> }

And if it's successful, we add language support.
June 02, 2013
On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
> On 01-06-2013 09:59, bearophile wrote:
>> "Recently" the Python C interpreter was modified and sped up thanks to
>> this non-standard feature. CPython source code has two versions, one
>> with computed gotos and one without, to compile it even if your C
>> compiler doesn't support them or their GNU-C syntax.
>
> I don't think there's any question as to the usefulness (and essentialness) of
> this feature. I'm very close to just writing most of the interpreter in C over a
> triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension.

Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.
June 02, 2013
On 02-06-2013 05:18, Andrej Mitrovic wrote:
> On 6/1/13, Timon Gehr <timon.gehr@gmx.ch> wrote:
>> D (like C) uses a different namespace for labels and symbols that are
>> not labels.
>
> Perhaps for experimenting purposes (before resorting to language
> changes), a trait could be introduced. E.g.:
>
>> void main(){
>>       int foo;
>>       foo: auto b = __traits(gotoAddr, foo);
>> }
>
> And if it's successful, we add language support.
>

You need a way to jump to an arbitrary address too.
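In GNU C terms those are the two halves of the extension, and an address-of-label trait alone would only cover the first. A tiny sketch, assuming GCC or Clang (the function name `pick` is made up):

```c
/* Hypothetical example: returns 10 for even i and 20 for odd i. */
int pick(int i)
{
    /* Half one: take label addresses (GNU C `&&label`). */
    static void *const targets[] = { &&even, &&odd };

    /* Half two: jump to a computed address (GNU C `goto *expr`). */
    goto *targets[i & 1];

even:
    return 10;
odd:
    return 20;
}
```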

-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
On 02-06-2013 06:49, Walter Bright wrote:
> On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
>> On 01-06-2013 09:59, bearophile wrote:
>>> "Recently" the Python C interpreter was modified and sped up thanks to
>>> this non-standard feature. CPython source code has two versions, one
>>> with computed gotos and one without, to compile it even if your C
>>> compiler doesn't support them or their GNU-C syntax.
>>
>> I don't think there's any question as to the usefulness (and
>> essentialness) of
>> this feature. I'm very close to just writing most of the interpreter
>> in C over a
>> triviality like this.
>
> To be pedantic, C and C++ don't have that feature. Some compilers add it
> as an extension.

I know, I just don't always remember to type out "GNU C" instead of "C".

>
> Also, such a construct could not be made @safe. The trouble is you could
> pass those addresses anywhere, and goto them from anywhere.

I don't particularly care about its safety (it's insanely unsafe). It's all about performance with a feature like this.

-- 
Alex Rønne Petersen
alex@alexrp.com / alex@lycus.org
http://alexrp.com / http://lycus.org
June 02, 2013
Am 02.06.2013 06:49, schrieb Walter Bright:
> On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
>> On 01-06-2013 09:59, bearophile wrote:
>>> "Recently" the Python C interpreter was modified and sped up thanks to
>>> this non-standard feature. CPython source code has two versions, one
>>> with computed gotos and one without, to compile it even if your C
>>> compiler doesn't support them or their GNU-C syntax.
>>
>> I don't think there's any question as to the usefulness (and
>> essentialness) of
>> this feature. I'm very close to just writing most of the interpreter
>> in C over a
>> triviality like this.
>
> To be pedantic, C and C++ don't have that feature. Some compilers add it
> as an extension.

I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler-defined behaviour.

Not many people seem to care to read language standards. :)

--
Paulo