Thread overview
Nice Inline Syntax for DSLs
Feb 16, 2007
Russell Lewis
Feb 16, 2007
Russell Lewis
Feb 17, 2007
Knud Soerensen
Feb 17, 2007
Tyler Knott
Feb 19, 2007
Leandro Lucarella
Feb 20, 2007
Kristian Kilpi
Feb 20, 2007
Knud Soerensen
Feb 20, 2007
Tyler Knott
Feb 17, 2007
Russell Lewis
February 16, 2007
Originally posted in digitalmars.d.announce.  I reposted here when I realized my mistake.

We have been talking about using string imports and code mixins and their applicability for domain-specific languages.  But the current design requires either that we wrap the whole sub-language as a string, or we store it in another file so that we can use string imports.  But what if we allowed there to be some simple syntax which allowed us to say, "Here at the top is D code; below is the DSL."  How about something like:

    import my_dsl_compiler;
    mixin(MyDSLCompiler!(import_rest_of_this_file));
    FIRST_LINE_OF_MY_DSL

or

    import my_dsl_compiler;
    int main(char[][] argv) {
        // this line resolves to a lot of variable declarations
        // and functions, including a my_dsl_main()
        mixin MyDSLCompiler!(import_rest_of_this_file);
        return my_dsl_main(argv);
    }
    FIRST_LINE_OF_MY_DSL

The idea is that whenever the compiler hits a line containing some special keyword (in the above example, import_rest_of_this_file), it keeps parsing to the end of the current declaration, treating it as D code.  Everything after that is assumed to be string data, which is imported into the D code above.

Think of it like a shebang line in a script, which documents how the rest of the code is to be handled.

Russ
February 16, 2007
grrr.....

I swear, I only hit "send" once.  Sorry for the spam, guys.
February 17, 2007
On Fri, 16 Feb 2007 16:35:46 -0700, Russell Lewis wrote:

> say, "Here at the top is D code; below is the DSL."  How about something like:

Well, we already have asm as a dsl.
Why not use a similar syntax like:

dslname
{
...
}

We just need a way to tell the compiler which parser to use for dslname.
February 17, 2007
Knud Soerensen wrote:
> On Fri, 16 Feb 2007 16:35:46 -0700, Russell Lewis wrote:
> 
> Well, we already have asm as a dsl.
> Why not use a similar syntax like:
> 
> dslname {
> ...
> }
> 
> We just need a way to tell the compiler which parser to use for dslname.

I was thinking of a similar syntax as well, except it would look like this:

# dsl(compilerFunc[, ...])
# {
#     DSL GOES HERE
# }

With three restrictions:
1. All curly braces within the DSL must be balanced.
2. compilerFunc must be evaluable at compile time.
3. compilerFunc must use the signature char[] function(char[][, ...]).

The code between the braces would be passed to compilerFunc as the first argument.  Any arguments to the dsl statement after the compiler function would be passed as the remaining arguments to the compiler function (like how opApply overloading works), and the return value of the function would be implicitly mixed in where the dsl declaration occurs.  (This of course depends on Walter fixing compile-time evaluation of functions for const initialization, which should be done in the next release of DMD.)
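
A rough sketch of how this could look with string mixins and compile-time function evaluation as they exist in current D compilers.  The helper bracesBalanced and the body of compilerFunc are made up for illustration, and string stands in for the char[] of the signature above.  The proposed dsl(...) { } statement itself does not exist, so its expansion is written out by hand.

```d
import std.stdio;

// Hypothetical helper for restriction 1: curly braces must be balanced.
bool bracesBalanced(string code)
{
    int depth = 0;
    foreach (c; code)
    {
        if (c == '{') ++depth;
        else if (c == '}' && --depth < 0) return false;
    }
    return depth == 0;
}

// Hypothetical compiler function (restriction 3): DSL text first, extra
// arguments after it, D source code as the return value.
// This toy version turns "name = value;" into a prefixed constant.
string compilerFunc(string code, string prefix)
{
    assert(bracesBalanced(code), "unbalanced braces in DSL block");
    return "enum " ~ prefix ~ code;
}

// What  dsl(compilerFunc, "dsl_") { answer = 42; }  might expand to.
// The call is evaluated at compile time (restriction 2).
mixin(compilerFunc("answer = 42;", "dsl_"));

void main()
{
    writeln(dsl_answer);
}
```

Because the function runs at compile time, a failing assert in compilerFunc shows up as a compile error at the point of the mixin rather than as a runtime failure.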
February 17, 2007
Russell Lewis wrote:
> say, "Here at the top is D code; below is the DSL."  How about something like:

Funny that you wrote that: That's called HEREDOC syntax and it's implemented in several scripting languages (Perl and PHP, for example):

htmlentities(<<<EOF
<p>This is HEREDOC syntax.
EOF
);

The EOF is chosen by the programmer (it could just as well be any other word) and serves as the terminator.
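
For comparison, later versions of D (D2) accept a delimited string literal that works much like a heredoc.  A minimal sketch, in which the constant name html is only an illustration:

```d
import std.stdio;

// Delimited ("heredoc") string: the text runs until a line that starts
// with the chosen identifier.  Quotes and braces need no escaping.
enum html = q"EOF
<p>This is "HEREDOC" syntax with an unbalanced { brace.
EOF";

void main()
{
    write(html);
}
```

The newline after the opening identifier is not part of the string, but the one before the closing identifier is.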
February 17, 2007
Knud Soerensen wrote:
> Well, we already have asm as a dsl.
> Why not use a similar syntax like:
> 
> dslname {
> ...
> }
> 
> We just need a way to tell the compiler which parser to use for dslname.

Sure, you can already do that... but with double quotes.  The reason I find double quotes, or brackets, unsatisfying is that those tokens then become unusable inside the DSL.  I'm looking for a format where the DSL can use any grammar whatsoever, without any need for escape sequences or workarounds.
February 19, 2007
Tyler Knott wrote:
> Knud Soerensen wrote:
>> On Fri, 16 Feb 2007 16:35:46 -0700, Russell Lewis wrote:
>>
>> Well, we already have asm as a dsl.
>> Why not use a similar syntax like:
>>
>> dslname {
>> ...
>> }
>>
>> We just need a way to tell the compiler which parser to use for dslname.
> 
> I was thinking of a similar syntax as well, except it would look like this:
> 
> # dsl(compilerFunc[, ...])
> # {
> #     DSL GOES HERE
> # }
> 
> With three restrictions:
> 1. All curly braces within the DSL must be balanced.
> 2. compilerFunc must be evaluable at compile time.
> 3. compilerFunc must use the signature char[] function(char[][, ...]).

Maybe
# dsl(compilerFunc[, ...]):
# DSL GOES HERE TO THE EOF

That way you don't impose any restrictions on the DSL syntax (it removes restriction 1). The only drawback I see is that it has to be some kind of special keyword for which only the ':' block syntax should be used, and after which no other block can appear before the end of the file.

-- 
Leandro Lucarella
Integratech S.A.
4571-5252
February 20, 2007
On Sat, 17 Feb 2007 03:08:47 +0200, Tyler Knott <tywebmail@mailcity.com> wrote:
> Knud Soerensen wrote:
>> On Fri, 16 Feb 2007 16:35:46 -0700, Russell Lewis wrote:
>>  Well, we already have asm as a dsl.
>> Why not use a similar syntax like:
>>  dslname {
>> ...
>> }
>>  We just need a way to tell the compiler which parser to use for dslname.
>
> I was thinking of a similar syntax as well, except it would look like this:
>
> # dsl(compilerFunc[, ...])
> # {
> #     DSL GOES HERE
> # }
>
> With three restrictions:
> 1. All curly braces within the DSL must be balanced.
> 2. compilerFunc must be evaluable at compile time.
> 3. compilerFunc must use the signature char[] function(char[][, ...]).
>
> The code between the braces would be passed to compilerFunc as the first argument.  Any arguments to the dsl statement after the compiler function would be passed as the remaining arguments to the compiler function (like how opApply overloading works), and the return value of the function would be implicitly mixed in where the dsl declaration occurs.  (This of course depends on Walter fixing compile-time evaluation of functions for const initialization, which should be done in the next release of DMD.)


I proposed earlier an alternative way to mark string literals that allows nesting etc. (http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=48307).

That is, I suggested that some special marks could be used to mark a part of a source file as a string literal, just like /* */ marks a part of code as a comment. I used the @{ }@ marks, but other (unambiguous) ones could of course be used as well (for instance, </ /> was proposed).

The benefit of the @{ }@ syntax is that you can get rid of restriction #1, and the syntax can be used anywhere strings can be used.


The 'dsl() {...}' syntax is nice, and that could be combined with the @{ }@ syntax; that is, when you need to use unbalanced curly braces with it. For example:

  dsl(MyDSL) @{
    a = 10;
    ...
  }@

->

  mixin(MyDSL!( "
    a = 10;
    ...
  "));


If my 'extension' (added later) to the @{ }@ syntax (http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=49042) were possible, then DSLs would nicely become a part of the language, just like asm is. The $ character would cause a constant, an alias, a (constant) function call, etc. to be evaluated and inserted into the string. For example:

  int getVal() {return 200;}

  dsl(MyDSL) {
    a = $getVal();
    ...
  }

->

  mixin(MyDSL!( "
    a = 200;
    ...
  "));


So, in short, all the following cases:

  dsl(X) {...}
  dsl(X) @{...}@
  dsl(X) "..."

would be an alternate way to write:

  mixin(X!("..."));

(Optional arguments for X are left out for simplicity.)
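
For reference, the mixin(X!("...")) form can be tried out directly in current D.  The following is only a sketch under assumptions: the body of MyDSL is a toy invented for illustration, and since the $ substitution is a proposal, the compile-time value is spliced into the string by hand with std.conv.to.

```d
import std.conv : to;
import std.stdio;

int getVal() { return 200; }

// Hypothetical DSL compiler written as a template, matching the
// mixin(X!("...")) form: the result is a string of D source.
// This toy version turns "name = value;" into a constant declaration.
template MyDSL(string code)
{
    enum MyDSL = "enum " ~ code;
}

// dsl(MyDSL) { a = 10; }  written out by hand:
mixin(MyDSL!("a = 10;"));

// dsl(MyDSL) { b = $getVal(); }  with the $ substitution done manually:
// getVal() is evaluated at compile time and its result put into the text.
mixin(MyDSL!("b = " ~ to!string(getVal()) ~ ";"));

void main()
{
    writeln(a, " ", b); // prints "10 200"
}
```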
February 20, 2007
On Fri, 16 Feb 2007 19:08:47 -0600, Tyler Knott wrote:

> Knud Soerensen wrote:
>> On Fri, 16 Feb 2007 16:35:46 -0700, Russell Lewis wrote:
>> 
>> Well, we already have asm as a dsl.
>> Why not use a similar syntax like:
>> 
>> dslname
>> {
>> ...
>> }
>> 
>> We just need a way to tell the compiler which parser to use for dslname.
> 
> I was thinking of a similar syntax as well, except it would look like this:
> 
> # dsl(compilerFunc[, ...])
> # {
> #     DSL GOES HERE
> # }
> 
No, I meant something like:

compilerFunc draw[, ...] { ...   // parser code for the draw dsl. };

and then it is used like

draw
{
moveto x,y;
circle(2.0);
...
}

There should be no need to repeat "compilerFunc[, ...]"
every time the DSL is needed.
February 20, 2007
Knud Soerensen wrote:
> On Fri, 16 Feb 2007 19:08:47 -0600, Tyler Knott wrote:
> 
> No, I meant something like.
> 
> compilerFunc draw[, ...] { ...   // parser code for the draw dsl.   };
> 
> and then it is used like
> 
> draw
> {
> moveto x,y;
> circle(2.0);
> ...
> }
> 
> There should be no need to repeat "compilerFunc[, ...]"
> every time the DSL is needed.

I don't think you're quite understanding what I meant.  For your draw example, it'd look like this:

char[] draw(char[] rawcode)
{
	//process rawcode and return D code
}

dsl(draw)
{
	move x,y;
	circle(2.0);
	...
}

which translates to:

mixin(draw(`
	move x,y;
	circle(2.0);
	...
`));

The only real differences between our syntaxes are that I require the "dsl" keyword every time this syntax is used, and that the compiler functions are just standard functions (with the two restrictions mentioned in my post).
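
The translated form can be made to compile in current D.  A minimal sketch follows, assuming a made-up body for draw (it only rewrites "cmd a,b;" into "cmd(a,b);") and two hypothetical primitives, move and circle, that simply print what they are asked to do; string is used in place of char[].

```d
import std.stdio;

// Hypothetical drawing primitives that the generated code calls.
void move(double x, double y) { writefln("move to %s,%s", x, y); }
void circle(double r) { writefln("circle with radius %s", r); }

// Compile-time function: DSL text in, D source out.
// Rewrites "cmd a,b;" to "cmd(a,b);" and leaves "cmd(...);" as it is.
string draw(string rawcode)
{
    string result;
    string stmt;
    foreach (c; rawcode)
    {
        if (c == '\n' || c == '\t' || (c == ' ' && stmt.length == 0))
            continue;                       // skip layout and leading blanks
        if (c != ';') { stmt ~= c; continue; }

        // End of a statement: find the first blank, if there is one.
        size_t sp = stmt.length;
        foreach (i, ch; stmt)
            if (ch == ' ') { sp = i; break; }
        if (sp < stmt.length)               // "move 1,2" becomes "move(1,2)"
            stmt = stmt[0 .. sp] ~ "(" ~ stmt[sp + 1 .. $] ~ ")";

        result ~= stmt ~ ";\n";
        stmt = "";
    }
    return result;
}

void main()
{
    // What  dsl(draw) { move 1,2; circle(2.0); }  would translate to.
    mixin(draw(`
        move 1,2;
        circle(2.0);
    `));
}
```

Since draw is an ordinary function, it can also be called at run time, which makes it easy to print and inspect the generated D code while developing the DSL.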