February 09, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> janderson wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Hasan Aljudy wrote:
>>> [snip]
>>>> The trick may lie in letting the compiler recognize these kinds of functions. A simple solution might be to come up with a new attribute; let's call it "meta":
>>>> # meta int add( int x, int y ) { return x + y; }
>>>> This attribute will assist the compiler in recognizing that this function can be computed at compile time if it's given constant arguments.
>>>
>>> This is much in keeping with my idea on how metaprogramming should be done, with the little semantic nit that "meta" should be "dual" as add has dual functionality.
>>>
>>> The attribute is not even needed if meta-code is flagged as such. My thoughts currently gravitate around the idea of using mixin as an escape into compile-time world. Anything that's printed to standard output in compile-time world becomes code, e.g. (using your function):
>>>
>>> mixin
>>> {
>>>   writefln("int x = ", add(10, 20), ";");
>>> }
>>>
>>> is entirely equivalent to:
>>>
>>> int x = 30;
>>>
>>> No annotation is needed on add because mixin clarifies that it's called during compilation. The compiler will complain if add cannot be used dually.
>>>
>>> Using mixin and write as separation devices makes it very clear what is to be done when; otherwise, it quickly becomes confusing what code is meant to actually get evaluated eagerly, and what code is to actually be "output" for compilation.
>>>
>>> The not-so-nice thing is that we get to manipulate numbers, strings, and arrays, not trees like in LISP.
>>>
>>>
>>> Andrei
>>
>> Although I like this idea, I fear it will not work for anything hidden in a library (if you are using something that is moved to a .lib, your code will stop working).  Maybe that's ok.  Actually, now that I think about it, that would be safer, because even if the D compiler could put "ok" signatures in a library, someone could create fake signatures.
> 
> Good point. This is reasonable. To execute code during compilation it's reasonable to expect transparency. C++ commercial vendors did not really suffer financial loss due to this requirement, and at any rate, OSS is on the rise :o).
> 
>> It could get confusing if you don't know which functions will work and which won't.  Perhaps the compiler could help there (spit out a list of functions you can use or something).
> 
> Yes, that's a documentation issue. The nice thing is that a plethora of really useful functions (e.g. string manipulation) can be written in D's subset that can be interpreted. Pretty much the entire string library will be meta-executable. No more need for the metastrings lib!
> 
>> I think the compiler would compile these commands on demand and cache them so it doesn't need to do them every time.  That would help a lot. It could even cache results so it only needs to compute them once.
>>
>> Anyway, I think there is so much possibility with compile-time coding.
> 
> Me too. Once compile-time interpretation (and mutation) makes it in, I think there's no fear of efficiency loss anymore. Speed will be comparable to that of any interpreted code, and there are entire communities that don't have a problem with that.
> 
> The question is, what is the subset of D that can be interpreted? I'm thinking:
> 
> * Basic data types (they will be stored as a dynamically-typed variant anyway), except pointers to functions and delegates.
> 
> * Arrays and hashes
> 
> * Basic expressions (except 'is', 'delete' et al.)
> 
> * if, for, foreach, while, do, switch, continue, break, return
> 
> I'm constructing this list thinking what it takes to write basic data manipulation functions. What did I forget?
> 
> 
> Andrei

Simple answer: the intersection of a simple interpreted language and D.  What can D do that a scripted language cannot?  Take that out.

Approaching it from another angle, imagine I have a parse tree and I'm writing code to interpret it... what do I want to leave out?

- inline asm
- exceptions
- synchronization and volatile and related
- calls to C (though this might prompt some rewriting of Phobos to avoid
  calling things like strlen() if they aren't builtins/intrinsics).

What I see as a bigger snag is processing order.  The whole D universe expects to be able to forward or backward reference anything regardless of how it is defined.

Let's say I have these two functions:

char[] foo(char[] a, char[] b) metaok;
char[] bar(char[] a, char[] b, char[] c) metaok;

The metaok keyword means "this can be run at compile time."

Suppose further that bar(a,b,c) calls foo(a,b) as a base case, and that both are generated via the "meta" mechanism.  No mutual recursion or anything though.

What order do you compile them in?  Obviously, foo then bar -- bar needs foo to be runnable at compile time.

I love the idea of making all of std.string compile-time-able, but the D compiler sees the code as a big array of unrelated functions right now (or it can if it chooses to).

With this stuff, the compiler is thrown back into the shell or C world where every function definition and execution is an event in a history, and order is everything.

The alternative I prefer is to collect all function definitions, sort out the dependency order in the compiler using graphs or whatnot, and work in that order.  To be consistent with D philosophy as I see it, any cycle detected in this graph should be flagged as an error, like an ambiguous overload.

That should be okay, but when compiling across modules it might get somewhat harder...?
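
A minimal sketch of that ordering step, written in present-day D. The `calls` table (function name to the names it calls) and the name `evaluationOrder` are made up for illustration; a real compiler would build the graph from its parse tree. It is a depth-first topological sort that reports a cycle as an error:

```d
import std.algorithm.sorting : sort;

enum Mark { unvisited, inProgress, done }

/// Returns the function names so that every callee comes before its callers.
/// `calls` is a hypothetical table: function name -> names it calls.
/// Throws an Exception if the call graph contains a cycle.
string[] evaluationOrder(string[][string] calls)
{
    Mark[string] marks;
    string[] order;

    void visit(string name)
    {
        immutable state = marks.get(name, Mark.unvisited);
        if (state == Mark.done)
            return;
        if (state == Mark.inProgress)
            throw new Exception("cyclic compile-time dependency at " ~ name);

        marks[name] = Mark.inProgress;
        if (auto callees = name in calls)
        {
            foreach (callee; *callees)
                visit(callee);
        }
        marks[name] = Mark.done;
        order ~= name; // appended only after everything it calls
    }

    // Sort the keys so the result does not depend on hash order.
    auto names = calls.keys;
    sort(names);
    foreach (name; names)
        visit(name);
    return order;
}

unittest
{
    string[][string] calls;
    calls["bar"] = ["foo"]; // bar(a,b,c) calls foo(a,b)
    calls["foo"] = null;    // foo calls nothing
    assert(evaluationOrder(calls) == ["foo", "bar"]);

    calls["foo"] = ["bar"]; // now foo and bar call each other
    bool cycleReported = false;
    try
    {
        evaluationOrder(calls);
    }
    catch (Exception e)
    {
        cycleReported = true;
    }
    assert(cycleReported);
}
```

Across modules the same sort would need the call tables of every imported module merged first, which is where it presumably gets harder.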

Kevin
February 09, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> janderson wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Hasan Aljudy wrote:
>>> [snip]
>>>> The trick may lie in letting the compiler recognize these kinds of functions. A simple solution might be to come up with a new attribute; let's call it "meta":
>>>> # meta int add( int x, int y ) { return x + y; }
>>>> This attribute will assist the compiler in recognizing that this function can be computed at compile time if it's given constant arguments.
>>>
>>> This is much in keeping with my idea on how metaprogramming should be done, with the little semantic nit that "meta" should be "dual" as add has dual functionality.
>>>
>>> The attribute is not even needed if meta-code is flagged as such. My thoughts currently gravitate around the idea of using mixin as an escape into compile-time world. Anything that's printed to standard output in compile-time world becomes code, e.g. (using your function):
>>>
>>> mixin
>>> {
>>>   writefln("int x = ", add(10, 20), ";");
>>> }
>>>
>>> is entirely equivalent to:
>>>
>>> int x = 30;
>>>
>>> No annotation is needed on add because mixin clarifies that it's called during compilation. The compiler will complain if add cannot be used dually.
>>>
>>> Using mixin and write as separation devices makes it very clear what is to be done when; otherwise, it quickly becomes confusing what code is meant to actually get evaluated eagerly, and what code is to actually be "output" for compilation.
>>>
>>> The not-so-nice thing is that we get to manipulate numbers, strings, and arrays, not trees like in LISP.
>>>
>>>
>>> Andrei
>>
>> Although I like this idea, I fear it will not work for anything hidden in a library (if you are using something that is moved to a .lib, your code will stop working).  Maybe that's ok.  Actually, now that I think about it, that would be safer, because even if the D compiler could put "ok" signatures in a library, someone could create fake signatures.
> 
> Good point. This is reasonable. To execute code during compilation it's reasonable to expect transparency. C++ commercial vendors did not really suffer financial loss due to this requirement, and at any rate, OSS is on the rise :o).
> 
>> It could get confusing if you don't know which functions will work and which won't.  Perhaps the compiler could help there (spit out a list of functions you can use or something).
> 
> Yes, that's a documentation issue. The nice thing is that a plethora of really useful functions (e.g. string manipulation) can be written in D's subset that can be interpreted. Pretty much the entire string library will be meta-executable. No more need for the metastrings lib!
> 
>> I think the compiler would compile these commands on demand and cache them so it doesn't need to do them every time.  That would help a lot. It could even cache results so it only needs to compute them once.
>>
>> Anyway, I think there is so much possibility with compile-time coding.
> 
> Me too. Once compile-time interpretation (and mutation) makes it in, I think there's no fear of efficiency loss anymore. Speed will be comparable to that of any interpreted code, and there are entire communities that don't have a problem with that.
> 
> The question is, what is the subset of D that can be interpreted? I'm thinking:
> 
> * Basic data types (they will be stored as a dynamically-typed variant anyway), except pointers to functions and delegates.

Agreed. Pointers are a likely area for abuse.  Maybe they could be added at a later time once we understand how this goes in practice.

> 
> * Arrays and hashes
> 
> * Basic expressions (except 'is', 'delete' et al.)
> 
> * if, for, foreach, while, do, switch, continue, break, return
> 
> I'm constructing this list thinking what it takes to write basic data manipulation functions. What did I forget?
> 
> 
> Andrei

This is a good start.

* Imports and mixins as well, so we can read in stuff (safely) at compile time and do self-reflection and such.

* Classes + structs would be a nice addition, although they wouldn't be needed in the first version.  Classes would mean you'd need a GC and would add more complications.  I think once a foundation is in, they could be added after more discussion.  OO is a powerful abstraction concept, so it would be useful in simplifying the compile-time code.

-Joel
February 09, 2007
Kevin Bealer wrote:
> Let's say I have these two functions:
> 
> char[] foo(char[] a, char[] b) metaok;
> char[] bar(char[] a, char[] b, char[] c) metaok;
> 
> The metaok keyword means "this can be run at compile time."
> 
> Suppose further that bar(a,b,c) calls foo(a,b) as a base case, and that both are generated via the "meta" mechanism.  No mutual recursion or anything though.
> 
> What order do you compile them in?  Obviously, foo then bar -- bar needs foo to be runnable at compile time.
> 

> Kevin

I think that is a good case for labeling which functions are compile-time safe, although I think it is solvable.

Here's another case for problems:

void

mixin
{
"char [] foo1" ~ foo3 ~ "{...}"
};

void

mixin
{
"foo2" ~ foo1()  ~ "{...}"
};


How is the compiler going to figure that one out?  Perhaps there should be some more rules to help with these cases?  Maybe it's possible to compile it?  Maybe something to do with namespacing?  I'm not sure.  I think it would be possible.

I think you shouldn't be able to access, at compile time, functions defined in a mixin.  But I'm sure there are many other cases like this to figure out.

-Joel
February 09, 2007
Reiner Pope wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> The question is, what is the subset of D that can be interpreted? I'm thinking:
>>
>> * Basic data types (they will be stored as a dynamically-typed variant anyway), except pointers to functions and delegates.
> What's wrong with function pointers and delegates? For starters, you need to support them for user-written foreach's, and (as long as they are written in an alias-free form) they are const-foldable.
> 
>> I'm constructing this list thinking what it takes to write basic data manipulation functions. What did I forget?
> What about structs? They are a plain old data type, and they don't have virtual functions. Is there something tricky with them that I'm missing?

I guess I'm just trying to simplify Walter's life :o).

Andrei
February 09, 2007
On Fri, 09 Feb 2007 08:42:37 +0300, Andrei Alexandrescu (See Website For Email) <SeeWebsiteForEmail@erdani.org> wrote:

> The attribute is not even needed if meta-code is flagged as such. My thoughts currently gravitate around the idea of using mixin as an escape into compile-time world. Anything that's printed to standard output in compile-time world becomes code, e.g. (using your function):
>
> mixin
> {
>    writefln("int x = ", add(10, 20), ";");
> }
>
> is entirely equivalent to:
>
> int x = 30;
>
> No annotation is needed on add because mixin clarifies that it's called during compilation. The compiler will complain if add cannot be used dually.
>
> Using mixin and write as separation devices makes it very clear what is to be done when; otherwise, it quickly becomes confusing what code is meant to actually get evaluated eagerly, and what code is to actually be "output" for compilation.

This is a good idea!
But I think there is no need to interpret 'pure D' code at compile time. If the compiler sees such a mixin expression, it can create a temporary D program that has the content of the mixin expression and then execute it in the background with standard output redirected (with the help of rdmd).

This approach can be used even with the current string mixin expression. For example, if the compiler sees a construct:

mixin( SomeDSLProcessor( `some DSL code` ) );

and knows that SomeDSLProcessor is an ordinary D function, then the compiler can create a simple temporary program:

import <module in which SomeDSLProcessor is defined>;

void main() {
  writefln( SomeDSLProcessor( `some DSL code` ) );
}

then run it with rdmd and use its output as the argument of the mixin expression in the original program.
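
A rough sketch of the text-generation half of that idea, in present-day D. The helper name `makeDriverSource` and the module name `dsl` are hypothetical, and nothing here is an actual compiler hook; it only shows how the temporary program could be assembled from the module name and the mixin argument. It emits `write` rather than `writefln`, on the assumption that a `%` in the generated code should not be read as a format specifier:

```d
/// Hypothetical helper (not part of any compiler): builds the source of the
/// throwaway program for one `mixin(...)` argument.
/// `moduleName` is the module that defines the function being called;
/// `call` is the expression that appeared inside `mixin( ... )`.
string makeDriverSource(string moduleName, string call)
{
    return "import std.stdio;\n"
         ~ "import " ~ moduleName ~ ";\n"
         ~ "\n"
         ~ "void main()\n"
         ~ "{\n"
         ~ "    write(" ~ call ~ ");\n"
         ~ "}\n";
}

unittest
{
    auto src = makeDriverSource("dsl", "SomeDSLProcessor(`some DSL code`)");
    assert(src == "import std.stdio;\nimport dsl;\n\nvoid main()\n{\n"
        ~ "    write(SomeDSLProcessor(`some DSL code`));\n}\n");
}
```

The generated program sees only what it imports, so the function it calls has to live in an importable module.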

-- 
Regards,
Yauheni Akhotnikau
February 09, 2007
Yauheni Akhotnikau wrote:
> On Fri, 09 Feb 2007 08:42:37 +0300, Andrei Alexandrescu (See Website For Email) <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> The attribute is not even needed if meta-code is flagged as such. My thoughts currently gravitate around the idea of using mixin as an escape into compile-time world. Anything that's printed to standard output in compile-time world becomes code, e.g. (using your function):
>>
>> mixin
>> {
>>    writefln("int x = ", add(10, 20), ";");
>> }
>>
>> is entirely equivalent to:
>>
>> int x = 30;
>>
>> No annotation is needed on add because mixin clarifies that it's called during compilation. The compiler will complain if add cannot be used dually.
>>
>> Using mixin and write as separation devices makes it very clear what is to be done when; otherwise, it quickly becomes confusing what code is meant to actually get evaluated eagerly, and what code is to actually be "output" for compilation.
> 
> This is a good idea!
> But I think there is no need to interpret 'pure D' code at compile time. If the compiler sees such a mixin expression, it can create a temporary D program that has the content of the mixin expression and then execute it in the background with standard output redirected (with the help of rdmd).

I've said it before, this is useless. Metacode must have access to the program's symbols.

Andrei
February 09, 2007

janderson wrote:
> Kevin Bealer wrote:
>> Let's say I have these two functions:
>>
>> char[] foo(char[] a, char[] b) metaok;
>> char[] bar(char[] a, char[] b, char[] c) metaok;
>>
>> The metaok keyword means "this can be run at compile time."
>>
>> Suppose further that bar(a,b,c) calls foo(a,b) as a base case, and that both are generated via the "meta" mechanism.  No mutual recursion or anything though.
>>
>> What order do you compile them in?  Obviously, foo then bar -- bar needs foo to be runnable at compile time.
>>
> 
>> Kevin
> 
> I think that is a good case for labeling which functions are compile-time safe, although I think it is solvable.
> 
> Here's another case for problems:
> 
> void
> 
> mixin
> {
> "char [] foo1" ~ foo3 ~ "{...}"
> };
> 
> void
> 
> mixin
> {
> "foo2" ~ foo1()  ~ "{...}"
> };

That one might work by luck, but that's not the intended usage of compile-time execution so I think it'd be safe to define that as an error; i.e. specify that mixed-in code is not compile-time executable.

> 
> 
> How is the compiler going to figure that one out?  Perhaps there should be some more rules to help with these cases?  Maybe it's possible to compile it?  Maybe something to do with namespacing?  I'm not sure.  I think it would be possible.
> 
> I think you shouldn't be able to access, at compile time, functions defined in a mixin.  But I'm sure there are many other cases like this to figure out.
> 
> -Joel
February 09, 2007
>> But I think there is no need to interpret 'pure D' code at compile time. If the compiler sees such a mixin expression, it can create a temporary D program that has the content of the mixin expression and then execute it in the background with standard output redirected (with the help of rdmd).
>
> I've said it before, this is useless. Metacode must have access to the program's symbols.
>
> Andrei

Can you provide some examples that show what that is needed for? (Maybe I am missing something, but I haven't seen any reasonably realistic examples yet.)

-- 
Regards,
Yauheni Akhotnikau
February 09, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> janderson wrote:
> Me too. Once compile-time interpretation (and mutation) makes it in, I think there's no fear of efficiency loss anymore. Speed will be comparable to that of any interpreted code, and there are entire communities that don't have a problem with that.
> 
> The question is, what is the subset of D that can be interpreted? I'm thinking:
> 
> * Basic data types (they will be stored as a dynamically-typed variant anyway), except pointers to functions and delegates.
> 
> * Arrays and hashes
> 
> * Basic expressions (except 'is', 'delete' et al.)
> 
> * if, for, foreach, while, do, switch, continue, break, return
> 
> I'm constructing this list thinking what it takes to write basic data manipulation functions. What did I forget?
> 
> 
> Andrei

Looking at this from a template perspective: if we don't have classes/structs, how much would the template system need to be changed to work like this?  I mean, as I said before, "static if" -> "if", plus the other operations and compile-time data types (arrays/AAs).  It's very close to what we want.  Not quite as reusable, but the template syntax would be improved.

One issue I have with templates is they cause compilation to slow down and bloat the executable code.  Maybe a template with no arguments should be compiled as a normal function without inlining.  That would solve most of those problems.

-Joel
February 09, 2007
Yauheni Akhotnikau wrote:
>>> But I think there is no need to interpret 'pure D' code at compile time. If the compiler sees such a mixin expression, it can create a temporary D program that has the content of the mixin expression and then execute it in the background with standard output redirected (with the help of rdmd).
>>
>> I've said it before, this is useless. Metacode must have access to the program's symbols.
>>
>> Andrei
> 
> Can you provide some examples that show what that is needed for? (Maybe I am missing something, but I haven't seen any reasonably realistic examples yet.)

In a previous post I described the white hole and black hole classes. Starting from an interface, build two degenerate implementations of it. This is not possible with the naive approach of spawning execution of separate programs.
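
To make the target concrete, here is a hand-written sketch of what such a pair could look like for one interface. The earlier post is not quoted here, so the reading is an assumption: "black hole" is taken to mean methods that do nothing and return the default value, and "white hole" methods that always fail. The interface `Shape` and both class names are invented for illustration:

```d
interface Shape
{
    int sides();
    string name();
}

/// Assumed "black hole": every method does nothing and returns the default.
class BlackHoleShape : Shape
{
    int sides() { return int.init; }
    string name() { return string.init; }
}

/// Assumed "white hole": every method fails when called.
class WhiteHoleShape : Shape
{
    int sides() { throw new Exception("sides is not implemented"); }
    string name() { throw new Exception("name is not implemented"); }
}

unittest
{
    Shape quiet = new BlackHoleShape;
    assert(quiet.sides() == 0);
    assert(quiet.name().length == 0);

    Shape loud = new WhiteHoleShape;
    bool failed = false;
    try
    {
        loud.sides();
    }
    catch (Exception e)
    {
        failed = true;
    }
    assert(failed);
}
```

Generating these two classes automatically means walking the member list of Shape, which a separately spawned program cannot see unless the whole interface is shipped to it.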


Andrei