BLADE 0.2Alpha: Vector operations with mixins, expression templates, (page 4) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » BLADE 0.2Alpha: Vector operations with mixins, expression templates, (page 4)

April 09, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,

Posted by Dave
in reply to Dan

Dave

Posted in reply to Dan

Dan wrote:
> Pragma Wrote:
>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.  Strings are no where near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).
> 
> Well said.  Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.
> 
> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64.  Most compilers to date don't properly use these functionalities.  Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code.
> 
> *bigsmile

Hmmm - I wonder if a lot of what Don's been doing is serving as a "proof of concept" for the D 2.0 compiler, to be written in D of course <g>

(It just seems like a lot of the new D features fit so well with compiler development. Especially for things like "proving constness", etc.)

- Dave

April 09, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,

Posted by Pragma
in reply to Dan

Pragma

Posted in reply to Dan

Dan wrote:
> Pragma Wrote:
>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.  Strings are no where near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).
> 
> Well said.  Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.

Thanks.  But evaluating the AST will only take things so far.  There is a huge gulf of functionality that stands between where we are now (at compile time) and what D would be like as a full-on scripting language (not that I'm advocating that it be pushed that far).  Much of what sits in the middle (namely structs, multi-dimensional arrays and maps) can go a long way towards *analyzing* that tree for code generation and more.

> 
> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64.  Most compilers to date don't properly use these functionalities.  Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code.
> 

Anyway, I'm grinning too.  I can't wait to see what comes up next.  So far, converting Enki to a compile-time generator is proving to be a huge challenge, but workable.

-- 
- EricAnderton at yahoo

April 09, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Don Clugston
in reply to Pragma

Don Clugston

Posted in reply to Pragma

Pragma wrote:
> Bruno Medeiros wrote:
>> Pragma wrote:
>>> Don Clugston wrote:
>>>> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at:
>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>
>>>>
>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>
>>>> Compile with -version=BladeDebug to see the asm code which is generated.
>>>>
>>>> Typical usage:
>>>>
>>>> void main()
>>>> {
>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>     real d = dot(r, p+r+r);
>>>>     ireal e = dot(r, z);
>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>     d = dot(r, p+r+r);
>>>> }
>>>>
>>>> Notice that mixed-length operations (real[] + float[] - double[]) are supported.
>>>>
>>>> Like the C++ Blitz++ library, expression templates are used to convert vector expressions into efficient element-wise operations. Unlike that library, however, there is no reliance on the compiler's optimiser. Instead, the expression template is manipulated as text, converted into postfix, and then passed to a simple CTFE compile-time assembler, which creates highly efficient asm code which is used as a mixin.
>>>> To understand the later parts of the code, you need some knowledge of x87 assembler. In fact, you probably need to have read Agner Fog's superb Pentium optimisation manual (www.agner.org).
>>>>
>>>> Some observations:
>>>> * I was amazed at how simple the expression template code is (it is somewhat cluttered by the code to check for real/imaginary type mismatch errors).
>>>> * I've often read that the x87 floating-point stack is notoriously difficult for compilers to write code for, but it works quite well in this case.
>>>> * The major workarounds are:
>>>> - inability to use a tuple element directly from asm code (bug #1028);
>>>> - inability to define operators for built-in arrays (hence the use of 'Vec' wrappers).
>>>> - inability to index through a tuple in a CTFE function (solved by converting types into a string).
>>>> * There have been mutterings about how unhygenic/dangerous the new mixins are. In this case, the mixin forms the _entire_ body of the function. This is an interesting situation which I think a language purist will find more palatable.
>>>>
>>>> Enjoy.
>>>
>>> This is a work of art Don - it's practically a compiler extension.  Nice job. :)
>>>
>>> For others that are interested in how this actually gets the job done, I found this in your documentation:
>>>
>>> * THEORY:
>>> * Expression templates are used to create an expression string of the form "(a+b*c)+d"
>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>> * This string is converted to postfix. The postfix string is converted to
>>> * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
>>>
>>
>> Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
>>
> 
> I know Don wrote this lib, but I think I can answer this.
> 
> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
> 
> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler. 
Actually I don't know how to make a tree nicely with tuples.

 Strings are no where near as flexible out of the box, but
> pose no storage limitations, which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).

There's a couple of minor things, though: (1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string, than a tree.
(4) It's more natural for D coders to think in terms of arrays, than lists. (and there is much more language support).

(5) A string representation can easily express things like c=a+3*(a+b);
where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it many cases).

So -- if trees were available, I'm not sure if I'd use them, or not.

April 09, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Pragma
in reply to Don Clugston

Pragma

Posted in reply to Don Clugston

Don Clugston wrote:
> Pragma wrote:
>> Bruno Medeiros wrote:
>>> Pragma wrote:
>>>> Don Clugston wrote:
>>>>> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at:
>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>>
>>>>>
>>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>>
>>>>> Compile with -version=BladeDebug to see the asm code which is generated.
>>>>>
>>>>> Typical usage:
>>>>>
>>>>> void main()
>>>>> {
>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>     real d = dot(r, p+r+r);
>>>>>     ireal e = dot(r, z);
>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>     d = dot(r, p+r+r);
>>>>> }
>>>>>
>>>>> Notice that mixed-length operations (real[] + float[] - double[]) are supported.
>>>>>
>>>>> Like the C++ Blitz++ library, expression templates are used to convert vector expressions into efficient element-wise operations. Unlike that library, however, there is no reliance on the compiler's optimiser. Instead, the expression template is manipulated as text, converted into postfix, and then passed to a simple CTFE compile-time assembler, which creates highly efficient asm code which is used as a mixin.
>>>>> To understand the later parts of the code, you need some knowledge of x87 assembler. In fact, you probably need to have read Agner Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>
>>>>> Some observations:
>>>>> * I was amazed at how simple the expression template code is (it is somewhat cluttered by the code to check for real/imaginary type mismatch errors).
>>>>> * I've often read that the x87 floating-point stack is notoriously difficult for compilers to write code for, but it works quite well in this case.
>>>>> * The major workarounds are:
>>>>> - inability to use a tuple element directly from asm code (bug #1028);
>>>>> - inability to define operators for built-in arrays (hence the use of 'Vec' wrappers).
>>>>> - inability to index through a tuple in a CTFE function (solved by converting types into a string).
>>>>> * There have been mutterings about how unhygenic/dangerous the new mixins are. In this case, the mixin forms the _entire_ body of the function. This is an interesting situation which I think a language purist will find more palatable.
>>>>>
>>>>> Enjoy.
>>>>
>>>> This is a work of art Don - it's practically a compiler extension.  Nice job. :)
>>>>
>>>> For others that are interested in how this actually gets the job done, I found this in your documentation:
>>>>
>>>> * THEORY:
>>>> * Expression templates are used to create an expression string of the form "(a+b*c)+d"
>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>> * This string is converted to postfix. The postfix string is converted to
>>>> * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
>>>>
>>>
>>> Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
>>>
>>
>> I know Don wrote this lib, but I think I can answer this.
>>
>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
>>
>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler. 
> Actually I don't know how to make a tree nicely with tuples.

Here's an example (for grins):

import std.stdio;

template Node(char[] Data,Nodes...){
	alias Data data;
	alias Nodes nodes;
}

template PrintData(char[] parent){
	const char[] PrintData = "";
}

template PrintData(char[] parent,alias Node){
	static if(parent == ""){
		const char[] PrintData = Node.data ~ "\n" ~
			PrintData!(Node.data,Node.nodes);
	}
	else{
		const char[] PrintData = parent ~ "." ~ Node.data ~ "\n" ~
			PrintData!(parent ~ "." ~ Node.data,Node.nodes);
	}
}

template PrintData(char[] parent,alias Node,V...){
	const char[] PrintData = PrintData!(parent,Node) ~ PrintData!(parent,V);
}

// example "tree" structure
alias Node!("1",
	Node!("1"),
	Node!("2",
		Node!("1"),
		Node!("2")
	)
) dataset;

void main(){
	writefln("%s",PrintData!("",dataset));
}

> 
>  Strings are no where near as flexible out of the box, but
>> pose no storage limitations, which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).
> 
> There's a couple of minor things, though: (1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
> (2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
> (3) Some patterns are probably easier to find with a string, than a tree.
> (4) It's more natural for D coders to think in terms of arrays, than lists. (and there is much more language support).
> 
> (5) A string representation can easily express things like c=a+3*(a+b);
> where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it many cases).
> 
> So -- if trees were available, I'm not sure if I'd use them, or not.

They're really handy.  I've found some uses for them already:

/*
ZeroOrMoreExpr
	= generate.ZeroOrMoreExpr()
	::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] [Binding:binding ]
		["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
*/
bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout char[] tokens,inout char[] err){
	return Rule!(
		generate.ZeroOrMoreExpr,
		And!(
			Literal!("["),
			Literal!("{"),
			Bind!(Expression,"expr"),
			Literal!("}"),
			Literal!("]"),
			Optional!(
				Bind!(RangeSetExpr,"ranges")
			),			
			Optional!(
				Bind!(Binding,"binding")
			),
			Optional!(
				And!(
					Literal!("*"),
					Bind!(SubExpressionTerm,"delim")
				)
			),
			Bind!(SubExpressionTerm,"term")
		)
	)(value,bindings,tokens,err);		
}

There's a *lot* going on in this sample.  Basically what you see here is a chunk of the as-of-yet-experimental compile-time Enki parser.  This piece parses the Zero-or-more expression part of the EBNF variant that Enki supports.

The templates evaluate to CTFE's that in turn make up the parser when it's executed.  So there's several layers of compile-time evaluation going on here.  Also, what's not obvious is that "char[] tokens" is actually an *array of strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the appropriate number of chars.  The Bind!() expressions also map parsed out data to a key/value set which is stored in a similar fashion (char[] bindings).

The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition; the only limit now is placed on the length of any given rule.  I haven't found a way to unify the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something very similar to the above by providing a CT and RT version of the template library.  At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

-- 
- EricAnderton at yahoo

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates,

Posted by Paul Findlay
in reply to Dan

Paul Findlay

Posted in reply to Dan

> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64.  Most compilers to date don't properly use these functionalities.  Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code.
And its dawning on me that some text processing can take advantage of doing 64-bit/128-bit chunks at a time

 - Paul

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Don Clugston
in reply to Pragma

Don Clugston

Posted in reply to Pragma

Pragma wrote:
> Don Clugston wrote:
>> Pragma wrote:
>>> Bruno Medeiros wrote:
>>>> Pragma wrote:
>>>>> Don Clugston wrote:
>>>>>> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at:
>>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 
>>>>>>
>>>>>>
>>>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>>>
>>>>>> Compile with -version=BladeDebug to see the asm code which is generated.
>>>>>>
>>>>>> Typical usage:
>>>>>>
>>>>>> void main()
>>>>>> {
>>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>>     real d = dot(r, p+r+r);
>>>>>>     ireal e = dot(r, z);
>>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>>     d = dot(r, p+r+r);
>>>>>> }
>>>>>>
>>>>>> Notice that mixed-length operations (real[] + float[] - double[]) are supported.
>>>>>>
>>>>>> Like the C++ Blitz++ library, expression templates are used to convert vector expressions into efficient element-wise operations. Unlike that library, however, there is no reliance on the compiler's optimiser. Instead, the expression template is manipulated as text, converted into postfix, and then passed to a simple CTFE compile-time assembler, which creates highly efficient asm code which is used as a mixin.
>>>>>> To understand the later parts of the code, you need some knowledge of x87 assembler. In fact, you probably need to have read Agner Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>>
>>>>>> Some observations:
>>>>>> * I was amazed at how simple the expression template code is (it is somewhat cluttered by the code to check for real/imaginary type mismatch errors).
>>>>>> * I've often read that the x87 floating-point stack is notoriously difficult for compilers to write code for, but it works quite well in this case.
>>>>>> * The major workarounds are:
>>>>>> - inability to use a tuple element directly from asm code (bug #1028);
>>>>>> - inability to define operators for built-in arrays (hence the use of 'Vec' wrappers).
>>>>>> - inability to index through a tuple in a CTFE function (solved by converting types into a string).
>>>>>> * There have been mutterings about how unhygenic/dangerous the new mixins are. In this case, the mixin forms the _entire_ body of the function. This is an interesting situation which I think a language purist will find more palatable.
>>>>>>
>>>>>> Enjoy.
>>>>>
>>>>> This is a work of art Don - it's practically a compiler extension.  Nice job. :)
>>>>>
>>>>> For others that are interested in how this actually gets the job done, I found this in your documentation:
>>>>>
>>>>> * THEORY:
>>>>> * Expression templates are used to create an expression string of the form "(a+b*c)+d"
>>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>>> * This string is converted to postfix. The postfix string is converted to
>>>>> * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
>>>>>
>>>>
>>>> Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
>>>>
>>>
>>> I know Don wrote this lib, but I think I can answer this.
>>>
>>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
>>>
>>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler. 
>> Actually I don't know how to make a tree nicely with tuples.
> 
> Here's an example (for grins):
> 
> import std.stdio;
> 
> template Node(char[] Data,Nodes...){
>     alias Data data;
>     alias Nodes nodes;
> }
>
> // example "tree" structure
> alias Node!("1",
>     Node!("1"),
>     Node!("2",
>         Node!("1"),
>         Node!("2")
>     )
> ) dataset;

The problem I was referring to, is: how to store both values, and functions/operators, inside the tree. It seems to get messy very quickly.

>> So -- if trees were available, I'm not sure if I'd use them, or not.
> 
> They're really handy.  I've found some uses for them already:

I meant for this application. There's no doubt they're indispensable in other contexts.

> 
> /*
> ZeroOrMoreExpr
>     = generate.ZeroOrMoreExpr()
>     ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] [Binding:binding ]
>         ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
> */
> bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout char[] tokens,inout char[] err){
>     return Rule!(
>         generate.ZeroOrMoreExpr,
>         And!(
>             Literal!("["),
>             Literal!("{"),
>             Bind!(Expression,"expr"),
>             Literal!("}"),
>             Literal!("]"),
>             Optional!(
>                 Bind!(RangeSetExpr,"ranges")
>             ),                       Optional!(
>                 Bind!(Binding,"binding")
>             ),
>             Optional!(
>                 And!(
>                     Literal!("*"),
>                     Bind!(SubExpressionTerm,"delim")
>                 )
>             ),
>             Bind!(SubExpressionTerm,"term")
>         )
>     )(value,bindings,tokens,err);       }
> 
> There's a *lot* going on in this sample.  Basically what you see here is a chunk of the as-of-yet-experimental compile-time Enki parser.  This piece parses the Zero-or-more expression part of the EBNF variant that Enki supports.
> 
> The templates evaluate to CTFE's that in turn make up the parser when it's executed.  So there's several layers of compile-time evaluation going on here.  Also, what's not obvious is that "char[] tokens" is actually an *array of strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the appropriate number of chars.  The Bind!() expressions also map parsed out data to a key/value set which is stored in a similar fashion (char[] bindings).

Seriously cool! Seems like you're generating a tree of nested mixins?
Anyway, I suspect this will really benefit from any compiler improvements, eg CTFE support for AAs or nested functions. And obviously AST macros.

> The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition; the only limit now is placed on the length of any given rule.  I haven't found a way to unify the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something very similar to the above by providing a CT and RT version of the template library.  At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Walter Bright
in reply to Don Clugston

Walter Bright

Posted in reply to Don Clugston

Don Clugston wrote:
> I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.

You guys gotta publish!

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by KlausO
in reply to Pragma

KlausO

Posted in reply to Pragma

Attachments:

dpegtest.d

Pragma schrieb:
> Don Clugston wrote:
>> Pragma wrote:
>>> Bruno Medeiros wrote:
>>>> Pragma wrote:
>>>>> Don Clugston wrote:
>>>>>> I have been trying to come up with a convincing use case for the
>>>>>> new mixins (and for metaprogramming in general). My best effort to
>>>>>> date can be found at:
>>>>>> http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d
>>>>>>
>>>>>>
>>>>>> It generates near-optimal x87 asm code for BLAS1-style basic vector operations. 32, 64 and 80 bit vectors are all supported.
>>>>>>
>>>>>> Compile with -version=BladeDebug to see the asm code which is generated.
>>>>>>
>>>>>> Typical usage:
>>>>>>
>>>>>> void main()
>>>>>> {
>>>>>>     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
>>>>>>     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
>>>>>>     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
>>>>>>     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
>>>>>>     real d = dot(r, p+r+r);
>>>>>>     ireal e = dot(r, z);
>>>>>>     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
>>>>>>     d = dot(r, p+r+r);
>>>>>> }
>>>>>>
>>>>>> Notice that mixed-length operations (real[] + float[] - double[]) are supported.
>>>>>>
>>>>>> Like the C++ Blitz++ library, expression templates are used to
>>>>>> convert vector expressions into efficient element-wise operations.
>>>>>> Unlike that library, however, there is no reliance on the
>>>>>> compiler's optimiser. Instead, the expression template is
>>>>>> manipulated as text, converted into postfix, and then passed to a
>>>>>> simple CTFE compile-time assembler, which creates highly efficient
>>>>>> asm code which is used as a mixin.
>>>>>> To understand the later parts of the code, you need some knowledge
>>>>>> of x87 assembler. In fact, you probably need to have read Agner
>>>>>> Fog's superb Pentium optimisation manual (www.agner.org).
>>>>>>
>>>>>> Some observations:
>>>>>> * I was amazed at how simple the expression template code is (it
>>>>>> is somewhat cluttered by the code to check for real/imaginary type
>>>>>> mismatch errors).
>>>>>> * I've often read that the x87 floating-point stack is notoriously
>>>>>> difficult for compilers to write code for, but it works quite well
>>>>>> in this case.
>>>>>> * The major workarounds are:
>>>>>> - inability to use a tuple element directly from asm code (bug
>>>>>> #1028);
>>>>>> - inability to define operators for built-in arrays (hence the use
>>>>>> of 'Vec' wrappers).
>>>>>> - inability to index through a tuple in a CTFE function (solved by
>>>>>> converting types into a string).
>>>>>> * There have been mutterings about how unhygenic/dangerous the new
>>>>>> mixins are. In this case, the mixin forms the _entire_ body of the
>>>>>> function. This is an interesting situation which I think a
>>>>>> language purist will find more palatable.
>>>>>>
>>>>>> Enjoy.
>>>>>
>>>>> This is a work of art Don - it's practically a compiler extension. Nice job. :)
>>>>>
>>>>> For others that are interested in how this actually gets the job done, I found this in your documentation:
>>>>>
>>>>> * THEORY:
>>>>> * Expression templates are used to create an expression string of
>>>>> the form "(a+b*c)+d"
>>>>> * and a tuple, the entries of which correspond to a, b, c, d, ...
>>>>> * This string is converted to postfix. The postfix string is
>>>>> converted to
>>>>> * a string containing x87 asm, which is then mixed into a function
>>>>> which accepts the tuple.
>>>>>
>>>>
>>>> Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
>>>>
>>>
>>> I know Don wrote this lib, but I think I can answer this.
>>>
>>> In short: no.  But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time.  You're basically limited to tuples and strings, each of which have drawbacks of their own.
>>>
>>> You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.
>> Actually I don't know how to make a tree nicely with tuples.
> 
> Here's an example (for grins):
> 
> import std.stdio;
> 
> template Node(char[] Data,Nodes...){
>     alias Data data;
>     alias Nodes nodes;
> }
> 
> template PrintData(char[] parent){
>     const char[] PrintData = "";
> }
> 
> template PrintData(char[] parent,alias Node){
>     static if(parent == ""){
>         const char[] PrintData = Node.data ~ "\n" ~
>             PrintData!(Node.data,Node.nodes);
>     }
>     else{
>         const char[] PrintData = parent ~ "." ~ Node.data ~ "\n" ~
>             PrintData!(parent ~ "." ~ Node.data,Node.nodes);
>     }
> }
> 
> template PrintData(char[] parent,alias Node,V...){
>     const char[] PrintData = PrintData!(parent,Node) ~
> PrintData!(parent,V);
> }
> 
> // example "tree" structure
> alias Node!("1",
>     Node!("1"),
>     Node!("2",
>         Node!("1"),
>         Node!("2")
>     )
> ) dataset;
> 
> void main(){
>     writefln("%s",PrintData!("",dataset));
> }
> 
>>
>>  Strings are no where near as flexible out of the box, but
>>> pose no storage limitations, which gives a slight edge to string-based representations of data.  With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).
>>
>> There's a couple of minor things, though: (1) in the case of
>> generating D code to mixin, no tree is required (the string just gets
>> mixed back in again, so it might as well stay as a string).
>> (2) The "separate string and tuple" structure means the expression can
>> be optimised without needing to move any of the values around; this
>> creates less garbage for the compiler to optimise away.
>> (3) Some patterns are probably easier to find with a string, than a tree.
>> (4) It's more natural for D coders to think in terms of arrays, than
>> lists. (and there is much more language support).
>>
>> (5) A string representation can easily express things like c=a+3*(a+b);
>> where 'a' appears twice. (At present, there's no way to generate it,
>> but theoretically the compiler could provide it many cases).
>>
>> So -- if trees were available, I'm not sure if I'd use them, or not.
> 
> They're really handy.  I've found some uses for them already:
> 
> /*
> ZeroOrMoreExpr
>     = generate.ZeroOrMoreExpr()
>     ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ]
> [Binding:binding ]
>         ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
> */
> bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout
> char[] tokens,inout char[] err){
>     return Rule!(
>         generate.ZeroOrMoreExpr,
>         And!(
>             Literal!("["),
>             Literal!("{"),
>             Bind!(Expression,"expr"),
>             Literal!("}"),
>             Literal!("]"),
>             Optional!(
>                 Bind!(RangeSetExpr,"ranges")
>             ),
>             Optional!(
>                 Bind!(Binding,"binding")
>             ),
>             Optional!(
>                 And!(
>                     Literal!("*"),
>                     Bind!(SubExpressionTerm,"delim")
>                 )
>             ),
>             Bind!(SubExpressionTerm,"term")
>         )
>     )(value,bindings,tokens,err);
> }
> 
> There's a *lot* going on in this sample.  Basically what you see here is a chunk of the as-of-yet-experimental compile-time Enki parser.  This piece parses the Zero-or-more expression part of the EBNF variant that Enki supports.
> 
> The templates evaluate to CTFE's that in turn make up the parser when it's executed.  So there's several layers of compile-time evaluation going on here.  Also, what's not obvious is that "char[] tokens" is actually an *array of strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the appropriate number of chars.  The Bind!() expressions also map parsed out data to a key/value set which is stored in a similar fashion (char[] bindings).
> 
> The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition; the only limit now is placed on the length of any given rule.  I haven't found a way to unify the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something very similar to the above by providing a CT and RT version of the template library.  At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.
> 

Hey pragma,

really cool, I've come up with a similar structure while experimenting
with a PEG parser in D (see attachment)
after I've read this article series on
Codeproject:

http://www.codeproject.com/cpp/crafting_interpreter_p1.asp http://www.codeproject.com/cpp/crafting_interpreter_p2.asp http://www.codeproject.com/cpp/crafting_interpreter_p3.asp

The template system of D does an awesome job in keeping templated PEG grammars readable.

BTW: If you turn Enki into a PEG style parser I definitely throw
my attempts into the dustbin :-)
Greets

KlausO

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Don Clugston
in reply to Walter Bright

Don Clugston

Posted in reply to Walter Bright

Walter Bright wrote:
> Don Clugston wrote:
>> I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.
> 
> You guys gotta publish!

I've begun a draft. A historical question for you --

On this page,
http://www.artima.com/cppsource/top_cpp_software.html
Scott Meyers says that g++ was the first compiler to generate native code. But I thought Zortech was older than g++. Is that correct?

April 10, 2007

Re: BLADE 0.2Alpha: Vector operations with mixins, expression templates, and asm

Posted by Pragma
in reply to Walter Bright

Pragma

Posted in reply to Walter Bright

Walter Bright wrote:
> Don Clugston wrote:
>> I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.
> 
> You guys gotta publish!

Baby steps Walter, baby steps... ;)

-- 
- EricAnderton at yahoo

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation