June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 01-06-2013 07:41, Manu wrote: > GCC has a non-standard extension to do this, I've used it to get great > performance wins writing emulators. I love this feature, love to see it > in D! Yes, it's basically essential for high-perf interpreters. It's a feature that a systems language like D must have. > > On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex@lycus.org > <mailto:alex@lycus.org>> wrote: > > Hi, > > I'm sure this has been brought up before, but I feel I need to bring > it up again (because I'm going to be writing a threaded-code > interpreter): > http://gcc.gnu.org/onlinedocs/__gcc/Labels-as-Values.html > <http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html> > > This is an incredibly important extension. The final switch > statement is not a replacement because it doesn't allow the > programmer to store a label address directly into a code stream, > which is what's essential to write a threaded-code interpreter. > > The Erlang folks went through hell just to use this feature; see the > 5th Q at: > http://www.erlang.org/doc/__installation_guide/INSTALL-__WIN32.html#Frequently-Asked-__Questions > <http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions> > > The idea is to be able to write code like this: > > ---- > > import std.algorithm; > > enum Op : ubyte > { > imm, > add, > sub, > // ... > ret, > } > > final class Insn > { > Op op; > size_t[] args; > void* lbl; > Insn next; > } > > final class State > { > Insn pc; > size_t[64] regs; > } > > size_t interp(Insn[] code) > { > // Set up the instruction stream with label addresses > // the first time that it is executed. Label addresses > // are stable, so we only do this once. > > foreach (insn; code.filter!(x => !x.lbl)()) > { > void* lbl; > > with (Op) > { > final switch (insn.op) > { > case imm: lbl = &&handle_imm; break; > case add: lbl = &&handle_add; break; > case sub: lbl = &&handle_sub; break; > // ... > case ret: lbl = &&handle_ret; break; > } > } > > insn.lbl = lbl; > } > > // Start interpreting the entry instruction. > > auto state = new State; > state.pc = code[0]; > goto *state.pc.lbl; > > // Interpreter logic follows... > > // The whole point is to avoid unnecessary function > // calls, so we use a mixin template for this stuff. > enum advance = "state.pc = state.pc.next;" ~ > "goto *state.pc.lbl;"; > > handle_imm: > { > state.regs[state.pc.args[0]] = state.pc.args[1]; > mixin(advance); > } > > handle_add: > { > state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] > + state.regs[state.pc.args[2]]; > mixin(advance); > } > > handle_sub: > { > state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] > - state.regs[state.pc.args[2]]; > mixin(advance); > } > > // ... > > handle_ret: > { > return state.regs[state.pc.args[0]]; > } > > assert(false); > } > > ---- > > Notice that there are no function calls going on. Just plain jumps > all the way through. This is a technique that many real world > interpreters use to get significant speedups, and I for one think we > desperately need it. > > Implementing it should be trivial as both LLVM and GCC support > taking the address of a block. I'm sure the DMD back end could be > extended to support it too. > > -- > Alex Rønne Petersen > alex@alexrp.com <mailto:alex@alexrp.com> / alex@lycus.org > <mailto:alex@lycus.org> > http://alexrp.com / http://lycus.org > -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Diggory | On 01-06-2013 15:17, Diggory wrote: > On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote: >> As example the very small but fast virtual machine written in GNU C++ >> from the International Contest on Functional Programming 2006: >> >> http://codepad.org/iibBeWKw >> >> It's faster than the same code without computed gotos. >> >> Bye, >> bearophile > > Would be cool if there was a platform independent way in D to construct > a sequence of "call" instructions in memory. Then it would be possible > to write a JIT compiler in the exact same style as that example. At that point, you just want a simple in-memory assembler. Think asmjit. > > It would be relatively easy to add to phobos although you'd have to be > careful about DEP. Hmm, perhaps I will try writing something like this... Probably a bit too domain-specific (not to say it isn't useful). -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 01-06-2013 11:43, bearophile wrote: > Alex Rønne Petersen: > >> final switch (insn.op) >> { >> case imm: lbl = &&handle_imm; break; >> case add: lbl = &&handle_add; break; >> case sub: lbl = &&handle_sub; break; >> // ... >> case ret: lbl = &&handle_ret; break; > > Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? > > case imm: lbl = &handle_imm; break; > > If such gotos become a natural part of the D syntax then it's not > necessary to copy the GNU C syntax. > > Bye, > bearophile I just used the GNU C syntax because I was familiar with it. I don't particularly care how it ends up looking in D. But Timon makes a good point about namespaces. -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 01-06-2013 09:59, bearophile wrote: > Alex Rønne Petersen: > >> I'm sure this has been brought up before, but I feel I need to bring >> it up again (because I'm going to be writing a threaded-code >> interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html >> >> This is an incredibly important extension. > > This was discussed more than once, and I think it's a valuable > improvement for certain kinds of low level D programming. > > Walter says that he has not added this feature to D because it's useful > only to write interpreters and the like. I have found it useful in GCC > to also write very fast finite state machines to analyse genomic > strings. But even if Walter is right, writing interpreters (and > emulators) is an important purpose for a system language as D. You can't > write them efficient in scripting languages, and even Haskell/F# are not > good at all to write them. Languages like C/D/C++ (and few others, as > later probably Rust) are the only few natural languages to write them. > > "Recently" the Python C interpreter was modified and speed up thanks to > this non-standard feature. CPython source code has two versions, one > with computed gotos and one without, to compile it even if your C > compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. > > >> Implementing it should be trivial as both LLVM and GCC support taking >> the address of a block. I'm sure the DMD back end could be extended to >> support it too. > > Even if DMD back-end will never implement it, I think it's important to > define it formally and officially in the D language and add the relative > syntax to the front-end (plus a standard version identifier that allows > to write programs that have both a routine that uses computed gotos and > one that doesn't use them). > > This has the big advantage that all future compilers that will want to > implement it will use the same nice standard syntax. If D doesn't define > this syntax, then maybe GDC will add it, and maybe later LDC will add > it, and later maybe other future compilers will add it, and we can't be > sure they will share the same syntax. > > Another advantage of having this syntax in D, is that even if a compiler > back-end doesn't support computed gotos, the program compiles without > syntax errors when you use conditional compilation to disable the piece > of code that uses computed gotos. Yes, good points. > > Bye, > bearophile -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | Timon Gehr: > D (like C) uses a different namespace for labels and symbols that are not labels. > > This compiles today: > > void main(){ > int foo; > foo: auto b = &foo; > } On a related topic I wrote this: http://d.puremagic.com/issues/show_bug.cgi?id=4902 Bye, bearophile |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 6/1/13, Timon Gehr <timon.gehr@gmx.ch> wrote: > D (like C) uses a different namespace for labels and symbols that are > not labels. Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.: > void main(){ > int foo; > foo: auto b = __traits(gotoAddr, foo); > } And if it's successful, we add language support. |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alex Rønne Petersen | On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
> On 01-06-2013 09:59, bearophile wrote:
>> "Recently" the Python C interpreter was modified and speed up thanks to
>> this non-standard feature. CPython source code has two versions, one
>> with computed gotos and one without, to compile it even if your C
>> compiler doesn't support them or their GNU-C syntax.
>
> I don't think there's any question as to the usefulness (and essentialness) of
> this feature. I'm very close to just writing most of the interpreter in C over a
> triviality like this.
To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension.
Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.
|
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrej Mitrovic | On 02-06-2013 05:18, Andrej Mitrovic wrote: > On 6/1/13, Timon Gehr <timon.gehr@gmx.ch> wrote: >> D (like C) uses a different namespace for labels and symbols that are >> not labels. > > Perhaps for experimenting purposes (before resorting to language > changes), a trait could be introduced. E.g.: > >> void main(){ >> int foo; >> foo: auto b = __traits(gotoAddr, foo); >> } > > And if it's successful, we add language support. > You need a way to jump to an arbitrary address too. -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 02-06-2013 06:49, Walter Bright wrote: > On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: >> On 01-06-2013 09:59, bearophile wrote: >>> "Recently" the Python C interpreter was modified and speed up thanks to >>> this non-standard feature. CPython source code has two versions, one >>> with computed gotos and one without, to compile it even if your C >>> compiler doesn't support them or their GNU-C syntax. >> >> I don't think there's any question as to the usefulness (and >> essentialness) of >> this feature. I'm very close to just writing most of the interpreter >> in C over a >> triviality like this. > > To be pedantic, C and C++ don't have that feature. Some compilers add it > as an extension. I know, I just don't always remember to type out "GNU C" instead of "C". > > Also, such a construct could not be made @safe. The trouble is you could > pass those addresses anywhere, and goto them from anywhere. I don't particularly care about its safety (its insanely unsafe). It's all about performance with a feature like this. -- Alex Rønne Petersen alex@alexrp.com / alex@lycus.org http://alexrp.com / http://lycus.org |
June 02, 2013 Re: Labels as values and threaded-code interpretation | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Am 02.06.2013 06:49, schrieb Walter Bright:
> On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
>> On 01-06-2013 09:59, bearophile wrote:
>>> "Recently" the Python C interpreter was modified and speed up thanks to
>>> this non-standard feature. CPython source code has two versions, one
>>> with computed gotos and one without, to compile it even if your C
>>> compiler doesn't support them or their GNU-C syntax.
>>
>> I don't think there's any question as to the usefulness (and
>> essentialness) of
>> this feature. I'm very close to just writing most of the interpreter
>> in C over a
>> triviality like this.
>
> To be pedantic, C and C++ don't have that feature. Some compilers add it
> as an extension.
I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour.
Not much people seem to care to read language standards. :)
--
Paulo
|
Copyright © 1999-2021 by the D Language Foundation