April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Paul Findlay wrote:

> > Particular generators that spark my interest tend to have to do with x87,
> > SSE extensions, and x86-64.  Most compilers to date don't properly use
> > these functionalities.  Having them integrated into D Agner Fog-optimally
> > would replace a lot of C++ gaming, video, rendering and graphics
> > engine code.
> And it's dawning on me that some text processing can take advantage of doing
> 64-bit/128-bit chunks at a time

Yeah, pretty much any large (>1 KB) buffer copy algorithm has run fastest through SSE2 ever since 128-bit registers came out.  The raw throughput justifies hand-tuning the loop instead of just using rep movsd.

I think the asm guys already have the best code we could hope for written up on a few pages: Agner Fog, Paul Hsieh et al.  If we could borrow their strategies (assuming that's legal), then I can see a good use for such a thing.  I guess we just ought to wait for AST reflection.
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Don Clugston wrote:
> The problem I was referring to, is: how to store both values, and 
> functions/operators, inside the tree. It seems to get messy very quickly.

It can, especially with the additional code you need to treat all those templates as specialized data types.

> I meant for this application. There's no doubt they're indispensable in 
> other contexts.

Oh, my bad.  Yeah, it would probably be overkill for BLADE. :)

>> Basically what you see here
>> is a chunk of the as-of-yet-experimental compile-time Enki parser.  
>> This piece parses the Zero-or-more expression part of the EBNF variant 
>> that Enki supports.
>>
>> The templates evaluate to CTFE's that in turn make up the parser when 
>> it's executed.  So there's several layers of compile-time evaluation 
>> going on here.  Also, what's not obvious is that "char[] tokens" is 
>> actually an *array of strings* that is stored in a single char[] 
>> array; each string's length data is encoded as a size_t mapped onto 
>> the appropriate number of chars.  The Bind!() expressions also map 
>> parsed out data to a key/value set which is stored in a similar 
>> fashion (char[] bindings).
> 
> Seriously cool! Seems like you're generating a tree of nested mixins?

Almost.  generate.ZeroOrMore returns an arbitrary string that may be another "map" or a chunk of runtime code; the 
value is placed in the "value" string that's passed in.  The *rootmost* generator is what compiles all this stuff into 
an actual chunk of mixin-able code.  It's analogous to how the current rendition of Enki works.

It seems overkill, but it's needed so I can do some basic semantic analysis, and do things like declare binding vars. 
I'm still looking for ways to simplify the process.

> Anyway, I suspect this will really benefit from any compiler 
> improvements, eg CTFE support for AAs or nested functions. And obviously 
> AST macros.

Definitely for AA's but not so much for AST macros - I get the impression that AST manipulation will only be useful for 
D code, and not for completely arbitrary grammars like EBNF.  Getting compile-time AA support would cut Enki's code size 
down by almost a third:

const char[] hackedArray = "\x00\x00\x00\x05hello\x00\x00\x00\x05cruel\x00\x00\x00\x05world";

Under the hood, this is what most of my data looks like.  Dropping the support routines for manipulating such structures 
would help make things a lot less cumbersome. :(
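
A minimal sketch of the packed layout described above, written in present-day D rather than the 2007 dialect. The helper names packStrings and unpackStrings are hypothetical, not Enki's actual support routines, and the fixed 4-byte big-endian length prefix is an assumption read off the hackedArray literal (the post itself describes a size_t-sized prefix).

```d
import std.stdio;

// Hypothetical helper: store each entry as a 4-byte big-endian length
// followed by the entry's bytes, all inside a single string.
string packStrings(string[] items)
{
    string result;
    foreach (item; items)
    {
        immutable uint len = cast(uint) item.length;
        // Emit the length as four chars, most significant byte first.
        result ~= cast(char)((len >> 24) & 0xFF);
        result ~= cast(char)((len >> 16) & 0xFF);
        result ~= cast(char)((len >> 8) & 0xFF);
        result ~= cast(char)(len & 0xFF);
        result ~= item;
    }
    return result;
}

// Hypothetical helper: walk the packed string and slice the entries back out.
string[] unpackStrings(string packed)
{
    string[] items;
    size_t pos = 0;
    while (pos + 4 <= packed.length)
    {
        immutable uint len = (cast(uint) packed[pos] << 24)
                           | (cast(uint) packed[pos + 1] << 16)
                           | (cast(uint) packed[pos + 2] << 8)
                           |  cast(uint) packed[pos + 3];
        pos += 4;
        items ~= packed[pos .. pos + len];
        pos += len;
    }
    return items;
}

// Both helpers are plain functions, so the round trip can be checked via CTFE.
enum packed = packStrings(["hello", "cruel", "world"]);
static assert(packed == "\x00\x00\x00\x05hello\x00\x00\x00\x05cruel\x00\x00\x00\x05world");
static assert(unpackStrings(packed) == ["hello", "cruel", "world"]);

void main()
{
    writeln(unpackStrings(packed));
}
```

With compile-time associative arrays, routines like these would not be needed at all, which is the saving the post is pointing at.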

> I really think the new mixins + CTFE was a breakthrough. Publish some of 
> this stuff, and I think we'll see an exodus of many Boost developers 
> into D.

Agreed.  CTFE frees us from many limitations imposed by templates, and the added flexibility of mixin() pretty much 
gives us the power of JavaScript's eval() at compile time.  It's a huge deal.
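
To make the eval() comparison concrete, here is a small sketch in present-day D. The function makeAdder and the generated addFive are made-up names for illustration; nothing here comes from BLADE or Enki.

```d
import std.stdio;

// Hypothetical generator: builds the source text of a function as a string.
// It is an ordinary function, so the compiler can run it at compile time (CTFE).
string makeAdder(string name, int amount)
{
    import std.conv : to;
    return "int " ~ name ~ "(int x) { return x + " ~ amount.to!string ~ "; }";
}

// mixin() compiles the generated string as if it had been typed at this spot.
mixin(makeAdder("addFive", 5));

// The generated function exists at compile time and can itself be run by CTFE.
static assert(addFive(37) == 42);

void main()
{
    writeln(addFive(37));
}
```

Unlike eval(), all of this is resolved during compilation; no string is interpreted at run time.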

It's possible that we might find a handful of interested devs in their camp, but I think we're still back to the same 
old problem as with the rest of C++: momentum.  I sincerely doubt we'll see a mass exodus from one camp to this one 
regardless of how good a job is done here.  Of course, I'll be happy to be wrong about that. :)
-- 
- EricAnderton at yahoo
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
KlausO wrote:
> 
> Hey pragma,
> 
> really cool, I've come up with a similar structure while experimenting
> with a PEG parser in D (see attachment)
> after I've read this article series on
> Codeproject:
> 
> http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
> http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
> http://www.codeproject.com/cpp/crafting_interpreter_p3.asp
> 
> The template system of D does an awesome job in keeping
> templated PEG grammars readable.

Yes it does!  Thanks for posting this - I didn't even know that article was there.

> BTW: If you turn Enki into a PEG style parser I definitely throw
> my attempts into the dustbin :-)
> Greets

Wow, that's one heck of an endorsement.  Thanks, but don't throw anything out yet.  This rendition of Enki is still a 
ways off though.  FYI I plan on keeping Enki's internals as human-readable as possible by keeping it self-hosting.  So 
there'll be two ways to utilize the toolkit: EBNF coding and "by hand".

> alias   Action!(
>           And!(
>             PlusRepeat!(EmailChar),
>             Char!('@'), 
>             PlusRepeat!(
>               And!(
>                 Or!(
>                   In!(
>                     Range!('a', 'z'),
>                     Range!('A', 'Z'),
>                     Range!('0', '9'),
>                     Char!('_'),
>                     Char!('%'),
>                     Char!('-')
>                   ),
>                   Char!('.')
>                 ),
>                 Not!(EmailSuffix)
>               )
>             ),
>             Char!('.'),
>             EmailSuffix
>           ),
>           delegate void(char[] email) { writefln("<email:", email, ">"); }
>         )
>         Email;

I had an earlier cut that looked a lot like this. :)  But there's a very subtle problem lurking in there.  By making 
your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*.  As a 
result, it failed when I tried to transcribe Enki's EBNF definition.  That's why I wrap each rule as a CTFE-style 
function: it side-steps that issue rather nicely, without generating too many warts.
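
A rough way to see the growth being described, as a sketch in present-day D. The combinator structs below are empty stand-ins invented for this example (they are not KlausO's dpeg templates), and .mangleof is used only as a proxy for the internal symbol name; the exact lengths depend on the compiler version and its mangling scheme.

```d
import std.stdio;

// Hypothetical, empty stand-ins for parser combinator templates.
struct Char(char c) {}
struct PlusRepeat(T) {}
struct And(T...) {}

// A single leaf rule versus a rule that nests several others.
alias Small = Char!('a');
alias Nested = And!(PlusRepeat!(Char!('a')),
                    Char!('@'),
                    PlusRepeat!(And!(Char!('b'), Char!('.'))));

// The nested instance carries all of its arguments in its symbol name,
// so its mangled name is longer than the leaf's.
static assert(Nested.mangleof.length > Small.mangleof.length);

void main()
{
    writeln(Small.mangleof.length, " vs ", Nested.mangleof.length);
}
```

A function's name, by contrast, stays the same length no matter how large the grammar it implements becomes.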

-- 
- EricAnderton at yahoo
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Pragma wrote:
> KlausO wrote:
>>
>> Hey pragma,
>>
>> really cool, I've come up with a similar structure while experimenting
>> with a PEG parser in D (see attachment)
>> after I've read this article series on
>> Codeproject:
>>
>> http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
>> http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
>> http://www.codeproject.com/cpp/crafting_interpreter_p3.asp
>>
>> The template system of D does an awesome job in keeping
>> templated PEG grammars readable.
> 
> Yes it does!  Thanks for posting this - I didn't even know that article 
> was there.
> 
>> BTW: If you turn Enki into a PEG style parser I definitely throw
>> my attempts into the dustbin :-)
>> Greets
> 
> Wow, that's one heck of an endorsement.  Thanks, but don't throw 
> anything out yet.  This rendition of Enki is still a ways off though.  
> FYI I plan on keeping Enki's internals as human-readable as possible by 
> keeping it self-hosting.  So there'll be two ways to utilize the 
> toolkit: EBNF coding and "by hand".

Nice to hear it's to your taste :-)

> 
>> alias   Action!(
>>           And!(
>>             PlusRepeat!(EmailChar),
>>             Char!('@'),
>>             PlusRepeat!(
>>               And!(
>>                 Or!(
>>                   In!(
>>                     Range!('a', 'z'),
>>                     Range!('A', 'Z'),
>>                     Range!('0', '9'),
>>                     Char!('_'),
>>                     Char!('%'),
>>                     Char!('-')
>>                   ),
>>                   Char!('.')
>>                 ),
>>                 Not!(EmailSuffix)
>>               )
>>             ),
>>             Char!('.'),
>>             EmailSuffix
>>           ),
>>           delegate void(char[] email) { writefln("<email:", email, ">"); }
>>         )
>>         Email;
> 
> I had an earlier cut that looked a lot like this. :)  But there's a very 
> subtle problem lurking in there.  By making your entire grammar one 
> monster-sized template instance, you'll run into DMD's identifier-length 
> limit *fast*.  As a result it failed when I tried to transcribe Enki's 
> EBNF definition.  That's why I wrap each rule as a CTFE-style function 
> as it side-steps that issue rather nicely, without generating too many 
> warts.
> 

Another issue I ran into is circular template dependencies. You could 
get nasty error messages like

dpeg.d(282): Error: forward reference to 
'And!(Alternative,OptRepeat!(And!(WS,Char!('/'),WS,Alternative)),WS)'

Any idea how they could be resolved?


FYI:

Other good PEG references:
http://pdos.csail.mit.edu/~baford/packrat/

Cat is an open-source interpreter that uses a rule-based parser written in C#:
http://www.codeproject.com/csharp/cat.asp
This page was very interesting:
http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks

C++ templated parsers
http://www.codeproject.com/cpp/yard-xml-parser.asp
http://www.codeproject.com/cpp/biscuit.asp
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
KlausO wrote:
> Pragma wrote:
>> KlausO wrote:
>>
>>> alias   Action!(
>>>           And!(
>>>             PlusRepeat!(EmailChar),
>>>             Char!('@'),
>>>             PlusRepeat!(
>>>               And!(
>>>                 Or!(
>>>                   In!(
>>>                     Range!('a', 'z'),
>>>                     Range!('A', 'Z'),
>>>                     Range!('0', '9'),
>>>                     Char!('_'),
>>>                     Char!('%'),
>>>                     Char!('-')
>>>                   ),
>>>                   Char!('.')
>>>                 ),
>>>                 Not!(EmailSuffix)
>>>               )
>>>             ),
>>>             Char!('.'),
>>>             EmailSuffix
>>>           ),
>>>           delegate void(char[] email) { writefln("<email:", email, ">"); }
>>>         )
>>>         Email;
>>
>> I had an earlier cut that looked a lot like this. :)  But there's a 
>> very subtle problem lurking in there.  By making your entire grammar 
>> one monster-sized template instance, you'll run into DMD's 
>> identifier-length limit *fast*.  As a result it failed when I tried to 
>> transcribe Enki's EBNF definition.  That's why I wrap each rule as a 
>> CTFE-style function as it side-steps that issue rather nicely, without 
>> generating too many warts.
>>
> 
> Another issue I ran into is circular template dependencies. You could 
> get nasty error messages like
> 
> dpeg.d(282): Error: forward reference to 
> 'And!(Alternative,OptRepeat!(And!(WS,Char!('/'),WS,Alternative)),WS)'
> 
> Any idea how they could be resolved?

Yep, that one bit me too; just use the CTFE trick I mentioned.  Wrapping each rule in a function fixes this, since DMD can resolve 
forward references to functions but not to template instances.
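
As an illustration of why functions help here, the following sketch (present-day D, with a toy grammar and rule names invented for the example, not taken from Enki or dpeg) has two rules that refer to each other. Because each rule is an ordinary function, the circular reference resolves without trouble, and the parser still runs at compile time through CTFE.

```d
import std.stdio;

// Toy grammar (hypothetical):
//   Expr <- Term ('+' Term)*
//   Term <- [0-9] / '(' Expr ')'
// Each rule advances pos past what it matched and returns true on success.

bool parseExpr(string s, ref size_t pos)
{
    // parseTerm is declared further down; forward references to functions are fine.
    if (!parseTerm(s, pos)) return false;
    while (pos < s.length && s[pos] == '+')
    {
        ++pos;
        if (!parseTerm(s, pos)) return false;
    }
    return true;
}

bool parseTerm(string s, ref size_t pos)
{
    if (pos >= s.length) return false;
    if (s[pos] >= '0' && s[pos] <= '9') { ++pos; return true; }
    if (s[pos] == '(')
    {
        ++pos;
        // Circular reference back to Expr.
        if (!parseExpr(s, pos)) return false;
        if (pos >= s.length || s[pos] != ')') return false;
        ++pos;
        return true;
    }
    return false;
}

// True only if the whole input is a valid Expr.
bool matches(string s)
{
    size_t pos = 0;
    return parseExpr(s, pos) && pos == s.length;
}

// The same functions run at compile time.
static assert(matches("1+(2+3)"));
static assert(!matches("1+"));
static assert(!matches("(1+2"));

void main()
{
    writeln(matches("1+(2+3)"));
}
```

The equivalent pair of mutually referencing template aliases is the shape that produces the "forward reference" error quoted above.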

> 
> FYI:
> 
> Other good PEG references:
> http://pdos.csail.mit.edu/~baford/packrat/
> 
> Cat is an open-source interpreter that uses a rule-based parser written in C#:
> http://www.codeproject.com/csharp/cat.asp
> This page was very interesting:
> http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks
> 
> C++ templated parsers
> http://www.codeproject.com/cpp/yard-xml-parser.asp
> http://www.codeproject.com/cpp/biscuit.asp

Thanks.  I can always use a little more research to lean on.

-- 
- EricAnderton at yahoo
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Don Clugston wrote:
> I've begun a draft. A historical question for you --
> 
> On this page,
> http://www.artima.com/cppsource/top_cpp_software.html
> Scott Meyers says that g++ was the first compiler to generate native 
> code. But I thought Zortech was older than g++. Is that correct?

Depends on how you look at it. g++ was first 'released' in December, 
1987. But the release notes for it say: "The GNU C++ Compiler is still 
in test release, and is NOT ready for everyday use" so I don't consider 
that a real release. Oregon Software released their native C++ compiler 
(not for the PC) in Jan or Feb 1988, nobody seems to recall exactly. 
Zortech's first release was in Jun 1988. Michael Tiemann, author of g++, 
calls the Sep 1988 release the "first really stable version."
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Pragma wrote:
> By making your entire grammar one 
> monster-sized template instance, you'll run into DMD's identifier-length 
> limit *fast*.  As a result it failed when I tried to transcribe Enki's 
> EBNF definition.  That's why I wrap each rule as a CTFE-style function 
> as it side-steps that issue rather nicely, without generating too many 
> warts.

One of the motivations for CTFE was to address that exact problem.
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
> KlausO wrote:
> 
>> Pragma wrote:
>>>
>>> I had an earlier cut that looked a lot like this. :)  But there's a 
>>> very subtle problem lurking in there.  By making your entire grammar 
>>> one monster-sized template instance, you'll run into DMD's 
>>> identifier-length limit *fast*.  As a result it failed when I tried 
>>> to transcribe Enki's EBNF definition.  That's why I wrap each rule as 
>>> a CTFE-style function as it side-steps that issue rather nicely, 
>>> without generating too many warts.
>>>
>>
>> Another issue I ran into is circular template dependencies. You could 
>> get nasty error messages like
>>


The way my dparse sidesteps this (both the id length limit and forward 
references) is to have the parser functions specialized only on the name 
of the reduction; the grammar is carried in an outer scope (the mixed-in 
template, which is never used as an identifier). This also allows the 
inner template to specialize on anything that is defined.