February 09, 2007 Re: compile-time regex redux
Posted in reply to Walter Bright | Walter Bright wrote:
> Walter Bright wrote:
>> kris wrote:
>>> Surely some of the other long-term concerns, such as solid debugging support, simmering code/dataseg bloat, lib support for templates, etc., etc., should deserve full attention instead? Surely that is a more successful approach to getting D adopted in the marketplace?
>>
>> Those are all extremely important, too.
>
> I wish to add that if you look at the changelog, the bread and butter issues (see the list of bugs fixed) get a solid share of attention.
Personally I think you've got a good balance going. A lot of bug fixes and a few cool features to keep people's interest in D is a good way to go. I appreciate that.
You've gotta really love working on D to have kept up this solid pace for so long.
-Joel
February 09, 2007 Re: compile-time regex redux
Posted in reply to janderson | janderson wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Bill Baxter wrote:
>>
>> Templates already do that, albeit with a slightly odd syntax. But stay tuned, Walter is eyeing $ as the prefix to denote compile-time variables, and sure enough, compile-time functions will then emerge naturally :o).
>>
>>
>> Andrei
>
> While it's good that Walter is considering compile-time variables, I don't see why you need a $ symbol inside a template. Of course, if you're going to use them outside, then you do. I think template code could look almost the same as normal code, which would make it much more writable/readable and reusable.
I think the same. We need to convince Walter :o).
Andrei
February 09, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Sean Kelly | First of all, I totally agree with Sean.
Using the compiler for preprocessing is not a step in the right direction. It gives only illusory benefits, while in fact being a source of many problems.
Let's face it, the new mixin stuff is a handy tool for generating few-line snippets that the current template system cannot handle. Some would argue that it's already too much like the C preprocessor, but IMO, it's fine.
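For reference, a minimal sketch of that kind of snippet generation in present-day D (makeProperty and Point are invented names, not from any library):

// Hypothetical generator: builds a private field plus getter/setter
// for a given type and name. CTFE evaluates makeProperty at compile
// time; mixin() splices the resulting string into the class body.
string makeProperty(string type, string name)
{
    return type ~ " _" ~ name ~ ";\n"
         ~ type ~ " " ~ name ~ "() { return _" ~ name ~ "; }\n"
         ~ "void " ~ name ~ "(" ~ type ~ " v) { _" ~ name ~ " = v; }\n";
}

class Point
{
    mixin(makeProperty("int", "x"));
    mixin(makeProperty("int", "y"));
}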
But going further down that path and extending the compiler into a general-purpose pluggable text processor won't make the language any more powerful. Not to mention that a standard-compliant D compiler was meant to be simple...
When the compiler is used for the processing of a DSL, it simply masks a simple step that an external tool would do. It's not much of a problem to run an external script or program that will read the DSL and output D code, while perhaps also doing other stuff, connecting with databases or making coffee. But when this is moved to the compiler, security problems arise, code becomes more cryptic, and suddenly the D code generated from the DSL cannot be simply accessed: it is produced by the 'compiler extension' and fed further into compilation. A standalone tool will produce a .d module, which can be further verified and processed by other tools - such as one that generates reflection data - and when something breaks, one can step into the generated source, review it, and spot errors in it more easily. Sean also mentioned that the DSL processor will probably need some diagnostic output, and simple pragma(msg) and static assert simply won't cut it.
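For reference, those two diagnostic channels look like this in practice - a minimal sketch, with ParseQuery and runQuery being hypothetical names:

// Hypothetical DSL-processing template, showing the only diagnostics
// available at compile time: pragma(msg) for notes and
// static assert for hard errors with a message.
template ParseQuery(string src)
{
    static assert(src.length > 0, "ParseQuery: empty DSL input");
    pragma(msg, "ParseQuery: processing: " ~ src);
    // The 'generated code' is just a string handed to mixin();
    // runQuery is an assumed runtime helper, not a real API.
    enum ParseQuery = `auto rows = runQuery("` ~ src ~ `");`;
}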
Therefore, I'd like to see a case when a compile-time DSL processing is going to be really useful in the real world, as to provoke further complication of the compiler and its usage patterns.
The other aspect I observed in the discussions following the new dmd release is that folks are suggesting writing full language parsers, D preprocessors and various sorts of operations on complex languages, notably extended-D parsing... This may sound weird coming from me, but that's clearly abuse. What these people really want is a way to extend the language's syntax, pretty much as Nemerle does it. And frankly, if D is going to be a more powerful language, built-in text preprocessing won't cut it. Full-fledged macro support and syntax extension mechanics are something that we should look at.
--
Tomasz Stachowiak
Sean Kelly wrote:
> What I've done in the past is manage the entire schema, stored procedures and all, in a modeling system like ErWin. From there I'll dump the lot to a series of scripts which are then applied to the DB. In this case, the DSL would be the intermediate query files, though parsing the complete SQL query syntax (since the files include transactions, etc.) sounds sub-optimal. I suppose a peripheral data format would perhaps be more appropriate for generating code based on the static representation of a DB schema. UML perhaps?
>
> My only concern here is that the process seems confusing and unwieldy: manage the schema in one tool, dump the data description in a meta-language to a file, and then have template code in the application parse that file during compilation to generate code. Each of these translation points creates a potential for failure, and the process and code risk being incomprehensible and unmanageable for new employees.
>
> Since you've established that the schema for large systems changes only rarely and that the changes are a careful and deliberate process, is it truly the best approach to attempt to automate code changes in this way? I would think that a well-designed application or interface library could be modified manually in concert with the schema changes to produce the same result, and with a verifiable audit trail to boot.
>
> Alternately, if the process were truly to be automated, it seems preferable to generate D code directly from the schema management application or via a standalone tool operating on the intermediate data rather than in preprocessor code during compilation. This approach would give much more informative error messages, and the process could be easily tracked and debugged.
>
> Please note that I'm not criticizing in-language DSL parsing as a general idea so much as questioning whether this is truly the best example for the usefulness of such a feature.
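For illustration, the standalone route Sean describes can be as small as this - a toy generator in present-day D, with the schema file format and file names invented for the example; it emits an ordinary .d module that can be audited, versioned, and fed to other tools:

// Toy standalone generator. Each input line is assumed to be
// "TableName column:type column:type ...". Output is a plain .d
// module with one struct per table.
import std.stdio, std.array;

void main()
{
    auto output = File("schema_gen.d", "w");
    output.writeln("// GENERATED from schema.txt - do not edit by hand");
    foreach (line; File("schema.txt").byLine)
    {
        auto parts = line.idup.split();   // whitespace-separated fields
        if (parts.length < 2) continue;   // skip blank/malformed lines
        output.writefln("struct %s {", parts[0]);
        foreach (col; parts[1 .. $])
        {
            auto nt = col.split(":");     // nt[0] = name, nt[1] = type
            output.writefln("    %s %s;", nt[1], nt[0]);
        }
        output.writeln("}");
    }
}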
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Tom S | Tom S wrote:
> When the compiler is used for the processing of a DSL, it simply masks a simple step that an external tool would do. It's not much of a problem to run an external script or program that will read the DSL and output D code, while perhaps also doing other stuff, connecting with databases or making coffee.

This is a misrepresentation. Code generation with external tools has been done forever; it is never easy (unless all you need is a table of logarithms), and it always incurs a non-amortized cost of parsing the DSL _plus_ the host language. Look at lex and yacc, the prototypical examples. They aren't small or simple, nor perfectly integrated with the host language. And their DSL is extremely well understood. That's why there's no proliferation of lex&yacc-like tools for other DSLs (I seem to recall there was an embedded SQL that got lost in the noise) simply because the code generator would basically have to rewrite a significant part of the compiler to do anything interesting. Even lex and yacc are often dropped in favor of Xpressive and Spirit, which, for all their odd syntax, are 100% integrated with the host language, which allows writing fully expressive code without fear that the tool won't understand this or won't recognize that. People have gone to amazing lengths to stay within the language, and guess why - because within the language you're immersed in the environment that your DSL lives in. Reducing the whole issue to the mythical external code generator that does it all and makes coffee is simplistic. Proxy/stub generators for remote procedure calls were always an absolute pain to deal with; now compilers do it automatically, because they can. Understanding that that door can, and should, be opened to the programmer is an essential step in appreciating the power of metacode.

> But when this is moved to the compiler, security problems arise, code becomes more cryptic, and suddenly the D code generated from the DSL cannot be simply accessed: it is produced by the 'compiler extension' and fed further into compilation. A standalone tool will produce a .d module, which can be further verified and processed by other tools - such as one that generates reflection data - and when something breaks, one can step into the generated source, review it, and spot errors in it more easily. Sean also mentioned that the DSL processor will probably need some diagnostic output, and simple pragma(msg) and static assert simply won't cut it.

I don't see this as a strong argument. Tools can get better, no question about that. But their current defects should be judged with the potential power in mind. Heck, nobody would have bought the first lightbulb or the first automobile. They sucked.

> Therefore, I'd like to see a case when a compile-time DSL processing is going to be really useful in the real world, as to provoke further complication of the compiler and its usage patterns.

I can't parse this sentence. Did you mean "as opposed to provoking" instead of "as to provoke"?

> The other aspect I observed in the discussions following the new dmd release is that folks are suggesting writing full language parsers, D preprocessors and various sorts of operations on complex languages, notably extended-D parsing... This may sound weird coming from me, but that's clearly abuse. What these people really want is a way to extend the language's syntax, pretty much as Nemerle does it. And frankly, if D is going to be a more powerful language, built-in text preprocessing won't cut it. Full-fledged macro support and syntax extension mechanics are something that we should look at.

This is a misunderstanding. The syntax is not to be extended. It stays fixed, and that is arguably a good thing. The semantics become more flexible. For example, they will make it easy to write a matrix operation:

A = (I - B) * C + D

and generate highly performant code from it. (There are many reasons for which that's way harder than it looks.)

I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.

Andrei
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Andrei Alexandrescu (See Website For Email) | Andrei Alexandrescu (See Website For Email) wrote:
> Tom S wrote:
>> Therefore, I'd like to see a case when a compile-time DSL processing is going to be really useful in the real world, as to provoke further complication of the compiler and its usage patterns.
>
> I can't parse this sentence. Did you mean "as opposed to provoking" instead of "as to provoke"?

I think he just meant he wants to see some real world examples that justify the additional complexity that will be added to the compiler and language.

>> The other aspect I observed in the discussions following the new dmd release is that folks are suggesting writing full language parsers, D preprocessors and various sorts of operations on complex languages, notably extended-D parsing... This may sound weird coming from me, but that's clearly abuse. What these people really want is a way to extend the language's syntax, pretty much as Nemerle does it. And frankly, if D is going to be a more powerful language, built-in text preprocessing won't cut it. Full-fledged macro support and syntax extension mechanics are something that we should look at.
>
> This is a misunderstanding. The syntax is not to be extended. It stays fixed, and that is arguably a good thing. The semantics become more flexible. For example, they will make it easy to write a matrix operation:
>
> A = (I - B) * C + D
>
> and generate highly performant code from it. (There are many reasons for which that's way harder than it looks.)

This is one thing I haven't really understood in the discussion. How do the current proposals help that case? From what I'm getting you're going to have to write every statement like the above as something like:

mixin {
ProcessMatrixExpr!( "A = (I - B) * C + D;" );
}

How do you get from this mixin/string processing stuff to an API that I might actually be willing to use for common operations?

> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.

I think that's exactly what Tom's getting at. He's asking for examples of how this would make life better for him and others. I think, given your background, you take it for granted that metaprogramming is the future. But D attracts folks from all walks of life, because it promises to be a kinder, gentler C++. So some people here aren't even sure why D needs templates at all. Fortran doesn't have 'em, after all. And Java just barely does.

Anyway, I think it would help get everyone on board if some specific and useful examples were given of how this solves real problems (and no, I don't really consider black and white holes as solving real problems; I couldn't even find any non-astronomical uses of the terms in a Google search).

For instance, it would be nice to see some more concrete discussion about
* the "rails" case.
* the X = A*B + C matrix/vector expressions case.
* the case of generating bindings to scripting languages / ORB stubs
* the Spirit/parser generator case

So here's my take on vector expressions since that's the only one I know anything about.

*Problem statement*:
Make the expression A=(I-B)*C+D efficient, where the variables are large vectors (I'll leave out matrices for now).

*Why it's hard*:
The difficulty is that (ignoring SSE instructions etc.) the most efficient way to compute that is to do all operations component-wise. So instead of computing I-B then multiplying by C, you compute
A[i] = (I[i]-B[i])*C[i]+D[i];
for each i. This eliminates the need to allocate large intermediate vectors.

*Existing solutions*:
Expression templates in C++, e.g. the Blitz++ library. Instead of making opSub in I-B return a new Vector object, you make opSub return an ExpressionTemplate object. This is a little template struct that contains a reference to I and to B, and knows how to subtract the two in a component-wise manner. The types of I and B are template parameters, LeftT and RightT. Its interface also allows it to be treated just like a Vector. You can add a Vector to it, subtract a Vector from it, etc.

Now we go and try to multiply that result times C. The result of that is a new MultExpressionTemplate with two parameters, the LeftT being our previous SubExpressionTemplate!(Vector,Vector) and the RightT being Vector.

Proceeding on in this way eventually the result of the math is of type:

AddExpressionTemplate!(
MultExpressionTemplate!(
SubExpressionTemplate!(Vector,Vector),
Vector),
Vector)

And you can see that we basically have a parse tree expressed as nested templates. The final trick is that a Vector.opAssign that takes an ExpressionTemplate is provided and that method calls a method of the expression template to finally trigger the calculation, like expr.eval(this). eval() has the top-level loop over the components of the answer.

*Why Existing solutions are insufficient*
For all that effort to actually be useful, the compiler has to be pretty aggressive about inlining everything so that in the end all the temporary template structs and function calls go away and you're just left with one eval call. It can be tricky to get the code into just the right configuration so that the compiler will do the inlining. And even then results will depend on the quality of the compiler. My attempts using MSVC++ several years ago always came out to be slower than the naive version, but with about 10x the amount of code. The code is also pretty difficult to follow because of all those different types of expression templates that have to be created.

If you include matrices things are even trickier because there are special cases like "A*B+a*C" that can be computed efficiently by a single optimized routine. You'd like to recognize such cases and turn them into single calls to that fast routine. You might also want to recognize "A*B+C" as a special case of that with a==1.

*How it could be improved*:
??? this is what I'd like to see explained better.

--bb
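For reference, here is roughly how the technique Bill describes transposes to present-day D - a minimal sketch with invented names, showing only subtraction (what he calls opSub is spelled opBinary!"-" in current D):

// One node of an expression-template tree: SubExpr records its
// operands and subtracts component-wise on demand, so no
// intermediate vector is ever allocated.
struct SubExpr(L, R)
{
    L left;
    R right;
    double opIndex(size_t i) { return left[i] - right[i]; }
}

struct Vector
{
    double[] data;

    double opIndex(size_t i) { return data[i]; }

    // I - B builds a lazy SubExpr node instead of a temporary Vector.
    SubExpr!(Vector, Vector) opBinary(string op : "-")(Vector rhs)
    {
        return SubExpr!(Vector, Vector)(this, rhs);
    }

    // Assigning any expression runs one fused component-wise loop.
    void opAssign(E)(E expr) if (!is(E == Vector))
    {
        foreach (i, ref a; data)
            a = expr[i];
    }
}

Adding Mul and Add nodes the same way makes A = (I - B) * C + D compile down to a single loop - provided the compiler inlines every opIndex call, which is exactly the fragility noted above.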
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Andrei Alexandrescu (See Website For Email) | Andrei Alexandrescu (See Website For Email) wrote:
> Tom S wrote:
>> When the compiler is used for the processing of a DSL, it simply masks a simple step that an external tool would do. It's not much of a problem to run an external script or program that will read the DSL and output D code, while perhaps also doing other stuff, connecting with databases or making coffee.
>
> This is a misrepresentation. Code generation with external tools has been done forever; it is never easy (unless all you need is a table of logarithms), and it always incurs a non-amortized cost of parsing the DSL _plus_ the host language. Look at lex and yacc, the prototypical examples. They aren't small or simple, nor perfectly integrated with the host language. And their DSL is extremely well understood. That's why there's no proliferation of lex&yacc-like tools for other DSLs (I seem to recall there was an embedded SQL that got lost in the noise) simply because the code generator would basically have to rewrite a significant part of the compiler to do anything interesting.

Are you stating that D will address all these concerns? Without any detrimental side-effects?

[snip]

> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.

If that's the case, then perhaps it's due to a lack of solid & practical examples for people to examine? There have been at least two requests recently for an example of how this could help DeRailed in a truly practical sense, yet both of those requests appear to have been ignored thus far.

I suspect that such practical examples would help everyone understand since, as you suggest, there appears to be "differences" in perspective? Since Walter brought RoR up, and you apparently endorsed his point, perhaps one of you might enlighten us via those relevant examples?

There's a request in the original post on "The DeRailed Challenge" for just such an example ... don't feel overtly obliged; but it might do something to offset the misunderstanding you believe is prevalent.

- Kris
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Bill Baxter | Bill Baxter wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Tom S wrote:
>
>>> Therefore, I'd like to see a case when a compile-time DSL processing is going to be really useful in the real world, as to provoke further complication of the compiler and its usage patterns.
>>
>> I can't parse this sentence. Did you mean "as opposed to provoking" instead of "as to provoke"?
>
> I think he just meant he wants to see some real world examples that justify the additional complexity that will be added to the compiler and language.
>
>>> The other aspect I observed in the discussions following the new dmd release is that folks are suggesting writing full language parsers, D preprocessors and various sorts of operations on complex languages, notably extended-D parsing... This may sound weird coming from me, but that's clearly abuse. What these people really want is a way to extend the language's syntax, pretty much as Nemerle does it. And frankly, if D is going to be a more powerful language, built-in text preprocessing won't cut it. Full-fledged macro support and syntax extension mechanics are something that we should look at.
>>
>> This is a misunderstanding. The syntax is not to be extended. It stays fixed, and that is arguably a good thing. The semantics become more flexible. For example, they will make it easy to write a matrix operation:
>>
>> A = (I - B) * C + D
>>
>> and generate highly performant code from it. (There are many reasons for which that's way harder than it looks.)
>
> This is one thing I haven't really understood in the discussion. How do the current proposals help that case? From what I'm getting you're going to have to write every statement like the above as something like:
>
> mixin {
> ProcessMatrixExpr!( "A = (I - B) * C + D;" );
> }
>
> How do you get from this mixin/string processing stuff to an API that I might actually be willing to use for common operations?
>
>
>> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.
>
> I think that's exactly what Tom's getting at. He's asking for examples of how this would make life better for him and others. I think, given your background, you take it for granted that metaprogramming is the future. But D attracts folks from all walks of life, because it promises to be a kinder, gentler C++. So some people here aren't even sure why D needs templates at all. Fortran doesn't have 'em, after all. And Java just barely does.
>
> Anyway, I think it would help get everyone on board if some specific and useful examples were given of how this solves real problems (and no, I don't really consider black and white holes as solving real problems; I couldn't even find any non-astronomical uses of the terms in a Google search).
>
> For instance, it would be nice to see some more concrete discussion about
> * the "rails" case.
> * the X = A*B + C matrix/vector expressions case.
> * the case of generating bindings to scripting languages / ORB stubs
> * the Spirit/parser generator case
>
> So here's my take on vector expressions since that's the only one I know anything about.
>
> *Problem statement*:
> Make the expression A=(I-B)*C+D efficient, where the variables are large vectors (I'll leave out matrices for now).
>
> *Why it's hard*:
> The difficulty is that (ignoring SSE instructions etc.) the most efficient way to compute that is to do all operations component-wise. So instead of computing I-B then multiplying by C, you compute
> A[i] = (I[i]-B[i])*C[i]+D[i];
> for each i. This eliminates the need to allocate large intermediate vectors.
>
> *Existing solutions*:
> Expression templates in C++, e.g. the Blitz++ library. Instead of making opSub in I-B return a new Vector object, you make opSub return an ExpressionTemplate object. This is a little template struct that contains a reference to I and to B, and knows how to subtract the two in a component-wise manner. The types of I and B are template parameters, LeftT and RightT. Its interface also allows it to be treated just like a Vector. You can add a Vector to it, subtract a Vector from it, etc.
>
> Now we go and try to multiply that result times C. The result of that is a new MultExpressionTemplate with two parameters, the LeftT being our previous SubExpressionTemplate!(Vector,Vector) and the RightT being Vector.
>
> Proceeding on in this way eventually the result of the math is of type:
>
> AddExpressionTemplate!(
> MultExpressionTemplate!(
> SubExpressionTemplate!(Vector,Vector),
> Vector),
> Vector)
>
> And you can see that we basically have a parse tree expressed as nested templates. The final trick is that a Vector.opAssign that takes an ExpressionTemplate is provided and that method calls a method of the expression template to finally trigger the calculation, like expr.eval(this). eval() has the top-level loop over the components of the answer.
>
> *Why Existing solutions are insufficient*
> For all that effort to actually be useful, the compiler has to be pretty aggressive about inlining everything so that in the end all the temporary template structs and function calls go away and you're just left with one eval call. It can be tricky to get the code into just the right configuration so that the compiler will do the inlining. And even then results will depend on the quality of the compiler. My attempts using MSVC++ several years ago always came out to be slower than the naive version, but with about 10x the amount of code. The code is also pretty difficult to follow because of all those different types of expression templates that have to be created.
>
> If you include matrices things are even trickier because there are special cases like "A*B+a*C" that can be computed efficiently by a single optimized routine. You'd like to recognize such cases and turn them into single calls to that fast routine. You might also want to recognize "A*B+C" as a special case of that with a==1.
>
> *How it could be improved*:
> ??? this is what I'd like to see explained better.
Ok, here's my high-level stab at how the new stuff could help.
Instead of returning "expression templates" where the parse tree is represented as a hierarchical type, just make opSub (I-B) return an ExpressionWrapper, which is just a light wrapper around the string "I-B". Then, when the compiler gets to "ExpressionWrapper * C", have that just return another ExpressionWrapper that modifies the original string to be
"(" ~ origstring ~ ")*C"
In the end you still have an opAssign overload that calls eval() on the expression, but now the expression has a (fully parenthesized) string representation of the expression: "((I-B)*C)+D". And eval just has to use the new compile-time string parsing tricks to decide how best to evaluate that expression. It's a pattern matching task, and thankfully now you have the whole pattern in one place, as opposed to the ExpressionTemplates, which have a hard time really getting a look at the whole picture.
--bb
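A minimal sketch of how such a wrapper might look in present-day D (the names and the exact parenthesization scheme are invented): carrying the text as a template parameter keeps it a compile-time constant that an eval routine could pattern-match:

// The expression text rides along as a template parameter, so it
// stays a compile-time constant. Leaves are spelled out by hand,
// since an overloaded operator cannot recover a variable's name.
struct Expr(string text)
{
    enum repr = text;

    auto opBinary(string op, string rhs)(Expr!rhs)
    {
        return Expr!("(" ~ text ~ " " ~ op ~ " " ~ rhs ~ ")")();
    }
}

enum I = Expr!"I"();
enum B = Expr!"B"();
enum C = Expr!"C"();
enum D = Expr!"D"();

// The whole pattern is visible as one flat string:
static assert(((I - B) * C + D).repr == "(((I - B) * C) + D)");

An opAssign on the vector type would then receive the Expr and could static-if on known patterns such as "A*B+C" before falling back to a generic component-wise loop.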
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to kris | kris wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Tom S wrote:
>>> When the compiler is used for the processing of a DSL, it simply masks a simple step that an external tool would do. It's not much of a problem to run an external script or program that will read the DSL and output D code, while perhaps also doing other stuff, connecting with databases or making coffee.
>>
>> This is a misrepresentation. Code generation with external tools has been done forever; it is never easy (unless all you need is a table of logarithms), and it always incurs a non-amortized cost of parsing the DSL _plus_ the host language. Look at lex and yacc, the prototypical examples. They aren't small or simple, nor perfectly integrated with the host language. And their DSL is extremely well understood. That's why there's no proliferation of lex&yacc-like tools for other DSLs (I seem to recall there was an embedded SQL that got lost in the noise) simply because the code generator would basically have to rewrite a significant part of the compiler to do anything interesting.
>
> Are you stating that D will address all these concerns? Without any detrimental side-effects?

It will naturally address the concerns exactly by not relying on an external generator. This was my point all along.

>> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.
>
> If that's the case, then perhaps it's due to a lack of solid & practical examples for people to examine? There have been at least two requests recently for an example of how this could help DeRailed in a truly practical sense, yet both of those requests appear to have been ignored thus far.
>
> I suspect that such practical examples would help everyone understand since, as you suggest, there appears to be "differences" in perspective? Since Walter brought RoR up, and you apparently endorsed his point, perhaps one of you might enlighten us via those relevant examples?
>
> There's a request in the original post on "The DeRailed Challenge" for just such an example ... don't feel overtly obliged; but it might do something to offset the misunderstanding you believe is prevalent.

I saw the request. My problem is that I don't know much about DeRailed, and that I don't have time to invest in it. My current understanding is that DeRailed's approach is basically dynamic, a domain that metaprogramming can help somewhat, but not a lot. The simplest example is to define variant types (probably they are already there) using templates. Also possibly there are some code generation aspects (e.g. for various platforms), which I understand RoR does a lot, that could be solved using metacode. On the other hand, there are many good examples coming from C++ (e.g. most of Boost and Loki) offering good experimental evidence that metacode can help a whole lot. I've tried to add a couple of ad-hoc examples in posts, but I understand they can't qualify because they don't directly and obviously help DeRailed.

Andrei
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Bill Baxter | Bill Baxter wrote:
> Bill Baxter wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Tom S wrote:
>>
>>>> Therefore, I'd like to see a case when a compile-time DSL processing is going to be really useful in the real world, as to provoke further complication of the compiler and its usage patterns.
>>>
>>> I can't parse this sentence. Did you mean "as opposed to provoking" instead of "as to provoke"?
>>
>> I think he just meant he wants to see some real world examples that justify the additional complexity that will be added to the compiler and language.
>>
>>>> The other aspect I observed in the discussions following the new dmd release is that folks are suggesting writing full language parsers, D preprocessors and various sorts of operations on complex languages, notably extended-D parsing... This may sound weird coming from me, but that's clearly abuse. What these people really want is a way to extend the language's syntax, pretty much as Nemerle does it. And frankly, if D is going to be a more powerful language, built-in text preprocessing won't cut it. Full-fledged macro support and syntax extension mechanics are something that we should look at.
>>>
>>> This is a misunderstanding. The syntax is not to be extended. It stays fixed, and that is arguably a good thing. The semantics become more flexible. For example, they will make it easy to write a matrix operation:
>>>
>>> A = (I - B) * C + D
>>>
>>> and generate highly performant code from it. (There are many reasons for which that's way harder than it looks.)
>>
>> This is one thing I haven't really understood in the discussion. How do the current proposals help that case? From what I'm getting you're going to have to write every statement like the above as something like:
>>
>> mixin {
>> ProcessMatrixExpr!( "A = (I - B) * C + D;" );
>> }
>>
>> How do you get from this mixin/string processing stuff to an API that I might actually be willing to use for common operations?
>>
>>
>>> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.
>>
>> I think that's exactly what Tom's getting at. He's asking for examples of how this would make life better for him and others. I think, given your background, you take it for granted that metaprogramming is the future. But D attracts folks from all walks of life, because it promises to be a kinder, gentler C++. So some people here aren't even sure why D needs templates at all. Fortran doesn't have 'em, after all. And Java just barely does.
>>
>> Anyway, I think it would help get everyone on board if some specific and useful examples were given of how this solves real problems (and no, I don't really consider black and white holes as solving real problems; I couldn't even find any non-astronomical uses of the terms in a Google search).
>>
>> For instance, it would be nice to see some more concrete discussion about
>> * the "rails" case.
>> * the X = A*B + C matrix/vector expressions case.
>> * the case of generating bindings to scripting languages / ORB stubs
>> * the Spirit/parser generator case
>>
>> So here's my take on vector expressions since that's the only one I know anything about.
>>
>> *Problem statement*:
>> Make the expression A=(I-B)*C+D efficient, where the variables are large vectors (I'll leave out matrices for now).
>>
>> *Why it's hard*:
>> The difficulty is that (ignoring SSE instructions etc.) the most efficient way to compute that is to do all operations component-wise. So instead of computing I-B then multiplying by C, you compute
>> A[i] = (I[i]-B[i])*C[i]+D[i];
>> for each i. This eliminates the need to allocate large intermediate vectors.
>>
>> *Existing solutions*:
>> Expression templates in C++, e.g. the Blitz++ library. Instead of making opSub in I-B return a new Vector object, you make opSub return an ExpressionTemplate object. This is a little template struct that contains a reference to I and to B, and knows how to subtract the two in a component-wise manner. The types of I and B are template parameters, LeftT and RightT. Its interface also allows it to be treated just like a Vector. You can add a Vector to it, subtract a Vector from it, etc.
>>
>> Now we go and try to multiply that result times C. The result of that is a new MultExpressionTemplate with two parameters, the LeftT being our previous SubExpressionTemplate!(Vector,Vector) and the RightT being Vector.
>>
>> Proceeding on in this way eventually the result of the math is of type:
>>
>> AddExpressionTemplate!(
>> MultExpressionTemplate!(
>> SubExpressionTemplate!(Vector,Vector),
>> Vector),
>> Vector)
>>
>> And you can see that we basically have a parse tree expressed as nested templates. The final trick is that a Vector.opAssign that takes an ExpressionTemplate is provided and that method calls a method of the expression template to finally trigger the calculation, like expr.eval(this). eval() has the top-level loop over the components of the answer.
>>
>> *Why Existing solutions are insufficient*
>> For all that effort to actually be useful, the compiler has to be pretty aggressive about inlining everything so that in the end all the temporary template structs and function calls go away and you're just left with one eval call. It can be tricky to get the code into just the right configuration so that the compiler will do the inlining. And even then results will depend on the quality of the compiler. My attempts using MSVC++ several years ago always came out to be slower than the naive version, but with about 10x the amount of code. The code is also pretty difficult to follow because of all those different types of expression templates that have to be created.
>>
>> If you include matrices things are even trickier because there are special cases like "A*B+a*C" that can be computed efficiently by a single optimized routine. You'd like to recognize such cases and turn them into single calls to that fast routine. You might also want to recognize "A*B+C" as a special case of that with a==1.
>>
>> *How it could be improved*:
>> ??? this is what I'd like to see explained better.
>
> Ok, here's my high-level stab at how the new stuff could help.
> Instead of returning "expression templates" where the parse tree is represented as a hierarchical type, just make opSub (I-B) return an ExpressionWrapper, which is just a light wrapper around the string "I-B". Then, when the compiler gets to "ExpressionWrapper * C", have that just return another ExpressionWrapper that modifies the original string to be
> "(" ~ origstring ~ ")*C"
> In the end you still have an opAssign overload that calls eval() on the expression, but now the expression has a (fully parenthesized) string representation of the expression: "((I-B)*C)+D". And eval just has to use the new compile-time string parsing tricks to decide how best to evaluate that expression. It's a pattern matching task, and thankfully now you have the whole pattern in one place, as opposed to the ExpressionTemplates, which have a hard time really getting a look at the whole picture.
You're on the right track!!! I was writing an answer, but I need to leave so I have to interrupt it.
Thanks for lending a hand :o).
Andrei
February 10, 2007 Re: DeRailed DSL (was Re: compile-time regex redux)
Posted in reply to Andrei Alexandrescu (See Website For Email) | Andrei Alexandrescu (See Website For Email) wrote:
>>> I think there is a lot of apprehension and misunderstanding surrounding what metacode is able and supposed to do or simplify. Please, let's focus on understanding _before_ forming an opinion.
>>
>> If that's the case, then perhaps it's due to a lack of solid & practical examples for people to examine? There have been at least two requests recently for an example of how this could help DeRailed in a truly practical sense, yet both of those requests appear to have been ignored thus far.
>>
>> I suspect that such practical examples would help everyone understand since, as you suggest, there appears to be "differences" in perspective? Since Walter brought RoR up, and you apparently endorsed his point, perhaps one of you might enlighten us via those relevant examples?
>>
>> There's a request in the original post on "The DeRailed Challenge" for just such an example ... don't feel overtly obliged; but it might do something to offset the misunderstanding you believe is prevalent.
>
> I saw the request. My problem is that I don't know much about DeRailed, and that I don't have time to invest in it.

You wrote (in the past):
====
I think things would be better if we had better libraries and some success stories.
====

We have better libraries now. And we're /trying/ to build a particular success story. Is that not enough reason to illustrate some practical examples, busy though you are? I mean, /we're/ putting in a lot of effort, regardless of how busy our personal lives may be. All we're asking for are some practical and relevant examples as to why advanced DSL support in D will assist us so much.

> My current understanding is that DeRailed's approach is basically dynamic, a domain that metaprogramming can help somewhat, but not a lot. The simplest example is to define variant types (probably they are already there) using templates. Also possibly there are some code generation aspects (e.g. for various platforms), which I understand RoR does a lot, that could be solved using metacode.

I read "Possibly there ..." and "Could be solved ..." Forgive me, Andrei, but that really does not assist in comprehending what it is that you say we fail to understand. Surely you would agree? Variant types can easily be handled by templates today, but without an example it's hard to tell if that's what you were referring to.

On Feb 7th, you wrote (in reference to the DSL discourse):
====
Walter gave another good case study: Ruby on Rails. The success of Ruby on Rails has a lot to do with its ability to express abstractions that were a complete mess to deal with in concreteland.
====

On Feb 7th, Walter wrote (emphasis added):
====
Good question. The simple answer is look what Ruby on Rails did for Ruby. Ruby's a good language, but the killer app for it was RoR. RoR is what drove adoption of Ruby through the roof. /Enabling ways for sophisticated DSLs to interoperate with D will enable such applications/
====

You see that subtle but important reference to RoR and DSL in the above? What we're asking for is that one of you explain just exactly what is /meant/ by that. DeRailed faces many of the same issues RoR did, so some relevant examples pertaining to the RoR claim may well assist DeRailed.

Do we have to get down on our knees and beg?

~ Kris