View mode: basic / threaded / horizontal-split · Log in · Help
April 09, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,
Dan wrote:
> Pragma Wrote:
>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what 
>> kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which 
>> have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier 
>> length built into the compiler.  Strings are no where near as flexible out of the box, but pose no storage limitations, 
>> which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to 
>> store just about any structure imaginable (even if it does make for some ugly code).
> 
> Well said.  Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.
> 
> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64.  Most compilers to date don't properly use these functionalities.  Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code.
> 
> *bigsmile

Hmmm - I wonder if a lot of what Don's been doing is serving as a "proof of concept" for the D 2.0 
compiler, to be written in D of course <g>

(It just seems like a lot of the new D features fit so well with compiler development. Especially 
for things like "proving constness", etc.)

- Dave
April 09, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,
Dan wrote:
> Pragma Wrote:
>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what 
>> kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which 
>> have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier 
>> length built into the compiler.  Strings are no where near as flexible out of the box, but pose no storage limitations, 
>> which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to 
>> store just about any structure imaginable (even if it does make for some ugly code).
> 
> Well said.  Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.

Thanks.  But evaluating the AST will only take things so far.  There is a huge gulf of functionality that stands between 
where we are now (at compile time) and what D would be like as a full-on scripting language (not that I'm advocating 
that it be pushed that far).  Much of what sits in the middle (namely structs, multi-dimensional arrays and maps) can go 
a long way towards *analyzing* that tree for code generation and more.

> 
> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64.  Most compilers to date don't properly use these functionalities.  Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code.
> 

Anyway, I'm grinning too.  I can't wait to see what comes up next.  So far, converting Enki to a compile-time generator 
is proving to be a huge challenge, but workable.

-- 
- EricAnderton at yahoo
April 09, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Pragma wrote:
> Bruno Medeiros wrote:
>> Pragma wrote:
>>> Don Clugston wrote:
>>>> I have been trying to come up with a convincing use case for the new 
>>>> mixins (and for metaprogramming in general). My best effort to date 
>>>> can be found at:
>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>
>>>>
>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector 
>>>> operations. 32, 64 and 80 bit vectors are all supported.
>>>>
>>>> Compile with -version=BladeDebug to see the asm code which is 
>>>> generated.
>>>>
>>>> Typical usage:
>>>>
>>>> void main()
>>>> {
>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>     real d = dot(r, p+r+r);
>>>>     ireal e = dot(r, z);
>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>     d = dot(r, p+r+r);
>>>> }
>>>>
>>>> Notice that mixed-length operations (real[] + float[] - double[]) 
>>>> are supported.
>>>>
>>>> Like the C++ Blitz++ library, expression templates are used to 
>>>> convert vector expressions into efficient element-wise operations. 
>>>> Unlike that library, however, there is no reliance on the compiler's 
>>>> optimiser. Instead, the expression template is manipulated as text, 
>>>> converted into postfix, and then passed to a simple CTFE 
>>>> compile-time assembler, which creates highly efficient asm code 
>>>> which is used as a mixin.
>>>> To understand the later parts of the code, you need some knowledge 
>>>> of x87 assembler. In fact, you probably need to have read Agner 
>>>> Fog's superb Pentium optimisation manual (www.agner.org).
>>>>
>>>> Some observations:
>>>> * I was amazed at how simple the expression template code is (it is 
>>>> somewhat cluttered by the code to check for real/imaginary type 
>>>> mismatch errors).
>>>> * I've often read that the x87 floating-point stack is notoriously 
>>>> difficult for compilers to write code for, but it works quite well 
>>>> in this case.
>>>> * The major workarounds are:
>>>> - inability to use a tuple element directly from asm code (bug #1028);
>>>> - inability to define operators for built-in arrays (hence the use 
>>>> of 'Vec' wrappers).
>>>> - inability to index through a tuple in a CTFE function (solved by 
>>>> converting types into a string).
>>>> * There have been mutterings about how unhygenic/dangerous the new 
>>>> mixins are. In this case, the mixin forms the _entire_ body of the 
>>>> function. This is an interesting situation which I think a language 
>>>> purist will find more palatable.
>>>>
>>>> Enjoy.
>>>
>>> This is a work of art Don - it's practically a compiler extension.  
>>> Nice job. :)
>>>
>>> For others that are interested in how this actually gets the job 
>>> done, I found this in your documentation:
>>>
>>> * THEORY:
>>> * Expression templates are used to create an expression string of the 
>>> form "(a+b*c)+d"
>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>> * This string is converted to postfix. The postfix string is 
>>> converted to
>>> * a string containing x87 asm, which is then mixed into a function 
>>> which accepts the tuple.
>>>
>>
>> Hum, a minor question, is a string representation of the expressions 
>> better (easier to use, manipulate, etc.) than a tree representation?
>>
> 
> I know Don wrote this lib, but I think I can answer this.
> 
> In short: no.  But it's really an unintentionally loaded question: there 
> are some rather severe limitations as to what kind of data structures 
> you can create at compile time.  You're basically limited to tuples and 
> strings, each of which have drawbacks of their own.
> 
> You can create a tree using tuples, but then you run into problems with 
> over-running the limitations on identifier length built into the 
> compiler. 
Actually I don't know how to make a tree nicely with tuples.

 Strings are no where near as flexible out of the box, but
> pose no storage limitations, which gives a slight edge to string-based 
> representations of data.  With some clever coding, strings can be made 
> to store just about any structure imaginable (even if it does make for 
> some ugly code).

There's a couple of minor things, though: (1) in the case of generating 
D code to mixin, no tree is required (the string just gets mixed back in 
again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can 
be optimised without needing to move any of the values around; this 
creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string, than a tree.
(4) It's more natural for D coders to think in terms of arrays, than 
lists. (and there is much more language support).

(5) A string representation can easily express things like c=a+3*(a+b);
where 'a' appears twice. (At present, there's no way to generate it, but 
theoretically the compiler could provide it many cases).

So -- if trees were available, I'm not sure if I'd use them, or not.
April 09, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Don Clugston wrote:
> Pragma wrote:
>> Bruno Medeiros wrote:
>>> Pragma wrote:
>>>> Don Clugston wrote:
>>>>> I have been trying to come up with a convincing use case for the 
>>>>> new mixins (and for metaprogramming in general). My best effort to 
>>>>> date can be found at:
>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>>
>>>>>
>>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector 
>>>>> operations. 32, 64 and 80 bit vectors are all supported.
>>>>>
>>>>> Compile with -version=BladeDebug to see the asm code which is 
>>>>> generated.
>>>>>
>>>>> Typical usage:
>>>>>
>>>>> void main()
>>>>> {
>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>     real d = dot(r, p+r+r);
>>>>>     ireal e = dot(r, z);
>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>     d = dot(r, p+r+r);
>>>>> }
>>>>>
>>>>> Notice that mixed-length operations (real[] + float[] - double[]) 
>>>>> are supported.
>>>>>
>>>>> Like the C++ Blitz++ library, expression templates are used to 
>>>>> convert vector expressions into efficient element-wise operations. 
>>>>> Unlike that library, however, there is no reliance on the 
>>>>> compiler's optimiser. Instead, the expression template is 
>>>>> manipulated as text, converted into postfix, and then passed to a 
>>>>> simple CTFE compile-time assembler, which creates highly efficient 
>>>>> asm code which is used as a mixin.
>>>>> To understand the later parts of the code, you need some knowledge 
>>>>> of x87 assembler. In fact, you probably need to have read Agner 
>>>>> Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>
>>>>> Some observations:
>>>>> * I was amazed at how simple the expression template code is (it is 
>>>>> somewhat cluttered by the code to check for real/imaginary type 
>>>>> mismatch errors).
>>>>> * I've often read that the x87 floating-point stack is notoriously 
>>>>> difficult for compilers to write code for, but it works quite well 
>>>>> in this case.
>>>>> * The major workarounds are:
>>>>> - inability to use a tuple element directly from asm code (bug #1028);
>>>>> - inability to define operators for built-in arrays (hence the use 
>>>>> of 'Vec' wrappers).
>>>>> - inability to index through a tuple in a CTFE function (solved by 
>>>>> converting types into a string).
>>>>> * There have been mutterings about how unhygenic/dangerous the new 
>>>>> mixins are. In this case, the mixin forms the _entire_ body of the 
>>>>> function. This is an interesting situation which I think a language 
>>>>> purist will find more palatable.
>>>>>
>>>>> Enjoy.
>>>>
>>>> This is a work of art Don - it's practically a compiler extension.  
>>>> Nice job. :)
>>>>
>>>> For others that are interested in how this actually gets the job 
>>>> done, I found this in your documentation:
>>>>
>>>> * THEORY:
>>>> * Expression templates are used to create an expression string of 
>>>> the form "(a+b*c)+d"
>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>> * This string is converted to postfix. The postfix string is 
>>>> converted to
>>>> * a string containing x87 asm, which is then mixed into a function 
>>>> which accepts the tuple.
>>>>
>>>
>>> Hum, a minor question, is a string representation of the expressions 
>>> better (easier to use, manipulate, etc.) than a tree representation?
>>>
>>
>> I know Don wrote this lib, but I think I can answer this.
>>
>> In short: no.  But it's really an unintentionally loaded question: 
>> there are some rather severe limitations as to what kind of data 
>> structures you can create at compile time.  You're basically limited 
>> to tuples and strings, each of which have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems 
>> with over-running the limitations on identifier length built into the 
>> compiler. 
> Actually I don't know how to make a tree nicely with tuples.

Here's an example (for grins):

import std.stdio;

template Node(char[] Data,Nodes...){
	alias Data data;
	alias Nodes nodes;
}

template PrintData(char[] parent){
	const char[] PrintData = "";
}

template PrintData(char[] parent,alias Node){
	static if(parent == ""){
		const char[] PrintData = Node.data ~ "\n" ~
			PrintData!(Node.data,Node.nodes);
	}
	else{
		const char[] PrintData = parent ~ "." ~ Node.data ~ "\n" ~
			PrintData!(parent ~ "." ~ Node.data,Node.nodes);
	}
}

template PrintData(char[] parent,alias Node,V...){
	const char[] PrintData = PrintData!(parent,Node) ~ PrintData!(parent,V);
}

// example "tree" structure
alias Node!("1",
	Node!("1"),
	Node!("2",
		Node!("1"),
		Node!("2")
	)
) dataset;

void main(){
	writefln("%s",PrintData!("",dataset));
}

> 
>  Strings are no where near as flexible out of the box, but
>> pose no storage limitations, which gives a slight edge to string-based 
>> representations of data.  With some clever coding, strings can be made 
>> to store just about any structure imaginable (even if it does make for 
>> some ugly code).
> 
> There's a couple of minor things, though: (1) in the case of generating 
> D code to mixin, no tree is required (the string just gets mixed back in 
> again, so it might as well stay as a string).
> (2) The "separate string and tuple" structure means the expression can 
> be optimised without needing to move any of the values around; this 
> creates less garbage for the compiler to optimise away.
> (3) Some patterns are probably easier to find with a string, than a tree.
> (4) It's more natural for D coders to think in terms of arrays, than 
> lists. (and there is much more language support).
> 
> (5) A string representation can easily express things like c=a+3*(a+b);
> where 'a' appears twice. (At present, there's no way to generate it, but 
> theoretically the compiler could provide it many cases).
> 
> So -- if trees were available, I'm not sure if I'd use them, or not.

They're really handy.  I've found some uses for them already:

/*
ZeroOrMoreExpr
	= generate.ZeroOrMoreExpr()
	::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] [Binding:binding ]
		["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
*/
bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout char[] tokens,inout char[] err){
	return Rule!(
		generate.ZeroOrMoreExpr,
		And!(
			Literal!("["),
			Literal!("{"),
			Bind!(Expression,"expr"),
			Literal!("}"),
			Literal!("]"),
			Optional!(
				Bind!(RangeSetExpr,"ranges")
			),			
			Optional!(
				Bind!(Binding,"binding")
			),
			Optional!(
				And!(
					Literal!("*"),
					Bind!(SubExpressionTerm,"delim")
				)
			),
			Bind!(SubExpressionTerm,"term")
		)
	)(value,bindings,tokens,err);		
}

There's a *lot* going on in this sample.  Basically what you see here is a chunk of the as-of-yet-experimental 
compile-time Enki parser.  This piece parses the Zero-or-more expression part of the EBNF variant that Enki supports.

The templates evaluate to CTFE's that in turn make up the parser when it's executed.  So there's several layers of 
compile-time evaluation going on here.  Also, what's not obvious is that "char[] tokens" is actually an *array of 
strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the 
appropriate number of chars.  The Bind!() expressions also map parsed out data to a key/value set which is stored in a 
similar fashion (char[] bindings).

The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit 
for parser composition; the only limit now is placed on the length of any given rule.  I haven't found a way to unify 
the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something 
very similar to the above by providing a CT and RT version of the template library.  At a minimum I plan on having this 
consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

-- 
- EricAnderton at yahoo
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,
> Particular generators that spark my interest tend to have to do with x87,
> SSE extensions, and x86-64.  Most compilers to date don't properly use
> these functionalities.  Having them integrated into D Agner Fog-optimally
> would hugely replace alot of C++ gaming, video, rendering and graphics
> engine code.
And its dawning on me that some text processing can take advantage of doing
64-bit/128-bit chunks at a time

- Paul
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Pragma wrote:
> Don Clugston wrote:
>> Pragma wrote:
>>> Bruno Medeiros wrote:
>>>> Pragma wrote:
>>>>> Don Clugston wrote:
>>>>>> I have been trying to come up with a convincing use case for the 
>>>>>> new mixins (and for metaprogramming in general). My best effort to 
>>>>>> date can be found at:
>>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>>>
>>>>>>
>>>>>> It generates near-optimal x87 asm code for BLAS1-style basic 
>>>>>> vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>>>
>>>>>> Compile with -version=BladeDebug to see the asm code which is 
>>>>>> generated.
>>>>>>
>>>>>> Typical usage:
>>>>>>
>>>>>> void main()
>>>>>> {
>>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>>     real d = dot(r, p+r+r);
>>>>>>     ireal e = dot(r, z);
>>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>>     d = dot(r, p+r+r);
>>>>>> }
>>>>>>
>>>>>> Notice that mixed-length operations (real[] + float[] - double[]) 
>>>>>> are supported.
>>>>>>
>>>>>> Like the C++ Blitz++ library, expression templates are used to 
>>>>>> convert vector expressions into efficient element-wise operations. 
>>>>>> Unlike that library, however, there is no reliance on the 
>>>>>> compiler's optimiser. Instead, the expression template is 
>>>>>> manipulated as text, converted into postfix, and then passed to a 
>>>>>> simple CTFE compile-time assembler, which creates highly efficient 
>>>>>> asm code which is used as a mixin.
>>>>>> To understand the later parts of the code, you need some knowledge 
>>>>>> of x87 assembler. In fact, you probably need to have read Agner 
>>>>>> Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>>
>>>>>> Some observations:
>>>>>> * I was amazed at how simple the expression template code is (it 
>>>>>> is somewhat cluttered by the code to check for real/imaginary type 
>>>>>> mismatch errors).
>>>>>> * I've often read that the x87 floating-point stack is notoriously 
>>>>>> difficult for compilers to write code for, but it works quite well 
>>>>>> in this case.
>>>>>> * The major workarounds are:
>>>>>> - inability to use a tuple element directly from asm code (bug 
>>>>>> #1028);
>>>>>> - inability to define operators for built-in arrays (hence the use 
>>>>>> of 'Vec' wrappers).
>>>>>> - inability to index through a tuple in a CTFE function (solved by 
>>>>>> converting types into a string).
>>>>>> * There have been mutterings about how unhygenic/dangerous the new 
>>>>>> mixins are. In this case, the mixin forms the _entire_ body of the 
>>>>>> function. This is an interesting situation which I think a 
>>>>>> language purist will find more palatable.
>>>>>>
>>>>>> Enjoy.
>>>>>
>>>>> This is a work of art Don - it's practically a compiler extension.  
>>>>> Nice job. :)
>>>>>
>>>>> For others that are interested in how this actually gets the job 
>>>>> done, I found this in your documentation:
>>>>>
>>>>> * THEORY:
>>>>> * Expression templates are used to create an expression string of 
>>>>> the form "(a+b*c)+d"
>>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>>> * This string is converted to postfix. The postfix string is 
>>>>> converted to
>>>>> * a string containing x87 asm, which is then mixed into a function 
>>>>> which accepts the tuple.
>>>>>
>>>>
>>>> Hum, a minor question, is a string representation of the expressions 
>>>> better (easier to use, manipulate, etc.) than a tree representation?
>>>>
>>>
>>> I know Don wrote this lib, but I think I can answer this.
>>>
>>> In short: no.  But it's really an unintentionally loaded question: 
>>> there are some rather severe limitations as to what kind of data 
>>> structures you can create at compile time.  You're basically limited 
>>> to tuples and strings, each of which have drawbacks of their own.
>>>
>>> You can create a tree using tuples, but then you run into problems 
>>> with over-running the limitations on identifier length built into the 
>>> compiler. 
>> Actually I don't know how to make a tree nicely with tuples.
> 
> Here's an example (for grins):
> 
> import std.stdio;
> 
> template Node(char[] Data,Nodes...){
>     alias Data data;
>     alias Nodes nodes;
> }
>
> // example "tree" structure
> alias Node!("1",
>     Node!("1"),
>     Node!("2",
>         Node!("1"),
>         Node!("2")
>     )
> ) dataset;

The problem I was referring to, is: how to store both values, and 
functions/operators, inside the tree. It seems to get messy very quickly.

>> So -- if trees were available, I'm not sure if I'd use them, or not.
> 
> They're really handy.  I've found some uses for them already:

I meant for this application. There's no doubt they're indispensable in 
other contexts.

> 
> /*
> ZeroOrMoreExpr
>     = generate.ZeroOrMoreExpr()
>     ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] 
> [Binding:binding ]
>         ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
> */
> bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout 
> char[] tokens,inout char[] err){
>     return Rule!(
>         generate.ZeroOrMoreExpr,
>         And!(
>             Literal!("["),
>             Literal!("{"),
>             Bind!(Expression,"expr"),
>             Literal!("}"),
>             Literal!("]"),
>             Optional!(
>                 Bind!(RangeSetExpr,"ranges")
>             ),           
>             Optional!(
>                 Bind!(Binding,"binding")
>             ),
>             Optional!(
>                 And!(
>                     Literal!("*"),
>                     Bind!(SubExpressionTerm,"delim")
>                 )
>             ),
>             Bind!(SubExpressionTerm,"term")
>         )
>     )(value,bindings,tokens,err);       
> }
> 
> There's a *lot* going on in this sample.  Basically what you see here is 
> a chunk of the as-of-yet-experimental compile-time Enki parser.  This 
> piece parses the Zero-or-more expression part of the EBNF variant that 
> Enki supports.
> 
> The templates evaluate to CTFE's that in turn make up the parser when 
> it's executed.  So there's several layers of compile-time evaluation 
> going on here.  Also, what's not obvious is that "char[] tokens" is 
> actually an *array of strings* that is stored in a single char[] array; 
> each string's length data is encoded as a size_t mapped onto the 
> appropriate number of chars.  The Bind!() expressions also map parsed 
> out data to a key/value set which is stored in a similar fashion (char[] 
> bindings).

Seriously cool! Seems like you're generating a tree of nested mixins?
Anyway, I suspect this will really benefit from any compiler 
improvements, eg CTFE support for AAs or nested functions. And obviously 
AST macros.

> The goals here were to side-step the limitations of D's identifier 
> name-length while providing a user-readable toolkit for parser 
> composition; the only limit now is placed on the length of any given 
> rule.  I haven't found a way to unify the compile-time and run-time 
> parser generation yet (one backend with two targets), but I anticipate 
> using something very similar to the above by providing a CT and RT 
> version of the template library.  At a minimum I plan on having this 
> consume a .bnf file at compile-time, in order to create a run-time 
> parser via mixin.

I really think the new mixins + CTFE was a breakthrough. Publish some of 
this stuff, and I think we'll see an exodus of many Boost developers into D.
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Don Clugston wrote:
> I really think the new mixins + CTFE was a breakthrough. Publish some of 
> this stuff, and I think we'll see an exodus of many Boost developers 
> into D.

You guys gotta publish!
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Pragma schrieb:
> Don Clugston wrote:
>> Pragma wrote:
>>> Bruno Medeiros wrote:
>>>> Pragma wrote:
>>>>> Don Clugston wrote:
>>>>>> I have been trying to come up with a convincing use case for the 
>>>>>> new mixins (and for metaprogramming in general). My best effort to 
>>>>>> date can be found at:
>>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>>>
>>>>>>
>>>>>> It generates near-optimal x87 asm code for BLAS1-style basic 
>>>>>> vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>>>
>>>>>> Compile with -version=BladeDebug to see the asm code which is 
>>>>>> generated.
>>>>>>
>>>>>> Typical usage:
>>>>>>
>>>>>> void main()
>>>>>> {
>>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>>     real d = dot(r, p+r+r);
>>>>>>     ireal e = dot(r, z);
>>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>>     d = dot(r, p+r+r);
>>>>>> }
>>>>>>
>>>>>> Notice that mixed-length operations (real[] + float[] - double[]) 
>>>>>> are supported.
>>>>>>
>>>>>> Like the C++ Blitz++ library, expression templates are used to 
>>>>>> convert vector expressions into efficient element-wise operations. 
>>>>>> Unlike that library, however, there is no reliance on the 
>>>>>> compiler's optimiser. Instead, the expression template is 
>>>>>> manipulated as text, converted into postfix, and then passed to a 
>>>>>> simple CTFE compile-time assembler, which creates highly efficient 
>>>>>> asm code which is used as a mixin.
>>>>>> To understand the later parts of the code, you need some knowledge 
>>>>>> of x87 assembler. In fact, you probably need to have read Agner 
>>>>>> Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>>
>>>>>> Some observations:
>>>>>> * I was amazed at how simple the expression template code is (it 
>>>>>> is somewhat cluttered by the code to check for real/imaginary type 
>>>>>> mismatch errors).
>>>>>> * I've often read that the x87 floating-point stack is notoriously 
>>>>>> difficult for compilers to write code for, but it works quite well 
>>>>>> in this case.
>>>>>> * The major workarounds are:
>>>>>> - inability to use a tuple element directly from asm code (bug 
>>>>>> #1028);
>>>>>> - inability to define operators for built-in arrays (hence the use 
>>>>>> of 'Vec' wrappers).
>>>>>> - inability to index through a tuple in a CTFE function (solved by 
>>>>>> converting types into a string).
>>>>>> * There have been mutterings about how unhygenic/dangerous the new 
>>>>>> mixins are. In this case, the mixin forms the _entire_ body of the 
>>>>>> function. This is an interesting situation which I think a 
>>>>>> language purist will find more palatable.
>>>>>>
>>>>>> Enjoy.
>>>>>
>>>>> This is a work of art Don - it's practically a compiler extension.  
>>>>> Nice job. :)
>>>>>
>>>>> For others that are interested in how this actually gets the job 
>>>>> done, I found this in your documentation:
>>>>>
>>>>> * THEORY:
>>>>> * Expression templates are used to create an expression string of 
>>>>> the form "(a+b*c)+d"
>>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>>> * This string is converted to postfix. The postfix string is 
>>>>> converted to
>>>>> * a string containing x87 asm, which is then mixed into a function 
>>>>> which accepts the tuple.
>>>>>
>>>>
>>>> Hum, a minor question, is a string representation of the expressions 
>>>> better (easier to use, manipulate, etc.) than a tree representation?
>>>>
>>>
>>> I know Don wrote this lib, but I think I can answer this.
>>>
>>> In short: no.  But it's really an unintentionally loaded question: 
>>> there are some rather severe limitations as to what kind of data 
>>> structures you can create at compile time.  You're basically limited 
>>> to tuples and strings, each of which have drawbacks of their own.
>>>
>>> You can create a tree using tuples, but then you run into problems 
>>> with over-running the limitations on identifier length built into the 
>>> compiler. 
>> Actually I don't know how to make a tree nicely with tuples.
> 
> Here's an example (for grins):
> 
> import std.stdio;
> 
> template Node(char[] Data,Nodes...){
>     alias Data data;
>     alias Nodes nodes;
> }
> 
> template PrintData(char[] parent){
>     const char[] PrintData = "";
> }
> 
> template PrintData(char[] parent,alias Node){
>     static if(parent == ""){
>         const char[] PrintData = Node.data ~ "\n" ~
>             PrintData!(Node.data,Node.nodes);
>     }
>     else{
>         const char[] PrintData = parent ~ "." ~ Node.data ~ "\n" ~
>             PrintData!(parent ~ "." ~ Node.data,Node.nodes);
>     }
> }
> 
> template PrintData(char[] parent,alias Node,V...){
>     const char[] PrintData = PrintData!(parent,Node) ~ 
> PrintData!(parent,V);
> }
> 
> // example "tree" structure
> alias Node!("1",
>     Node!("1"),
>     Node!("2",
>         Node!("1"),
>         Node!("2")
>     )
> ) dataset;
> 
> void main(){
>     writefln("%s",PrintData!("",dataset));
> }
> 
>>
>>  Strings are no where near as flexible out of the box, but
>>> pose no storage limitations, which gives a slight edge to 
>>> string-based representations of data.  With some clever coding, 
>>> strings can be made to store just about any structure imaginable 
>>> (even if it does make for some ugly code).
>>
>> There's a couple of minor things, though: (1) in the case of 
>> generating D code to mixin, no tree is required (the string just gets 
>> mixed back in again, so it might as well stay as a string).
>> (2) The "separate string and tuple" structure means the expression can 
>> be optimised without needing to move any of the values around; this 
>> creates less garbage for the compiler to optimise away.
>> (3) Some patterns are probably easier to find with a string, than a tree.
>> (4) It's more natural for D coders to think in terms of arrays, than 
>> lists. (and there is much more language support).
>>
>> (5) A string representation can easily express things like c=a+3*(a+b);
>> where 'a' appears twice. (At present, there's no way to generate it, 
>> but theoretically the compiler could provide it many cases).
>>
>> So -- if trees were available, I'm not sure if I'd use them, or not.
> 
> They're really handy.  I've found some uses for them already:
> 
> /*
> ZeroOrMoreExpr
>     = generate.ZeroOrMoreExpr()
>     ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] 
> [Binding:binding ]
>         ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
> */
> bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout 
> char[] tokens,inout char[] err){
>     return Rule!(
>         generate.ZeroOrMoreExpr,
>         And!(
>             Literal!("["),
>             Literal!("{"),
>             Bind!(Expression,"expr"),
>             Literal!("}"),
>             Literal!("]"),
>             Optional!(
>                 Bind!(RangeSetExpr,"ranges")
>             ),           
>             Optional!(
>                 Bind!(Binding,"binding")
>             ),
>             Optional!(
>                 And!(
>                     Literal!("*"),
>                     Bind!(SubExpressionTerm,"delim")
>                 )
>             ),
>             Bind!(SubExpressionTerm,"term")
>         )
>     )(value,bindings,tokens,err);       
> }
> 
> There's a *lot* going on in this sample.  Basically what you see here is 
> a chunk of the as-of-yet-experimental compile-time Enki parser.  This 
> piece parses the Zero-or-more expression part of the EBNF variant that 
> Enki supports.
> 
> The templates evaluate to CTFE's that in turn make up the parser when 
> it's executed.  So there's several layers of compile-time evaluation 
> going on here.  Also, what's not obvious is that "char[] tokens" is 
> actually an *array of strings* that is stored in a single char[] array; 
> each string's length data is encoded as a size_t mapped onto the 
> appropriate number of chars.  The Bind!() expressions also map parsed 
> out data to a key/value set which is stored in a similar fashion (char[] 
> bindings).
> 
> The goals here were to side-step the limitations of D's identifier 
> name-length while providing a user-readable toolkit for parser 
> composition; the only limit now is placed on the length of any given 
> rule.  I haven't found a way to unify the compile-time and run-time 
> parser generation yet (one backend with two targets), but I anticipate 
> using something very similar to the above by providing a CT and RT 
> version of the template library.  At a minimum I plan on having this 
> consume a .bnf file at compile-time, in order to create a run-time 
> parser via mixin.
> 

Hey pragma,

really cool, I've come up with a similar structure while experimenting
with a PEG parser in D (see attachment)
after I've read this article series on
Codeproject:

http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
http://www.codeproject.com/cpp/crafting_interpreter_p3.asp

The template system of D does an awesome job in keeping
templated PEG grammars readable.

BTW: If you turn Enki into a PEG style parser I definitely throw
my attempts into the dustbin :-)
Greets

KlausO
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Walter Bright wrote:
> Don Clugston wrote:
>> I really think the new mixins + CTFE was a breakthrough. Publish some 
>> of this stuff, and I think we'll see an exodus of many Boost 
>> developers into D.
> 
> You guys gotta publish!

I've begun a draft. A historical question for you --

On this page,
http://www.artima.com/cppsource/top_cpp_software.html
Scott Meyers says that g++ was the first compiler to generate native 
code. But I thought Zortech was older than g++. Is that correct?
April 10, 2007
Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm
Walter Bright wrote:
> Don Clugston wrote:
>> I really think the new mixins + CTFE was a breakthrough. Publish some 
>> of this stuff, and I think we'll see an exodus of many Boost 
>> developers into D.
> 
> You guys gotta publish!

Baby steps Walter, baby steps... ;)

-- 
- EricAnderton at yahoo
1 2 3 4 5
Top | Discussion index | About this forum | D home