Thread overview
Another way to do CTFE
Jun 17, 2014
Ary Borenszweig
Jun 17, 2014
Dicebot
Jun 17, 2014
Nick Sabalausky
Jun 17, 2014
Dmitry Olshansky
Jun 17, 2014
Tofu Ninja
Jun 17, 2014
Araq
Jun 18, 2014
Dicebot
June 17, 2014
CTFE is really nice but has its limitations: you can't do anything you want, and since it's interpreted it requires an interpreter and it's generally slow. Nimrod does the same thing, and now they are implementing a VM to run the interpreted code faster. Is this really the way to go?

In our language we are thinking about allowing code generation at compile time but in a different way. The idea is, at compile time, to compile and execute another program that would generate the code that would be mixed into the current program. This program could receive the execution context as arguments, along with any AST nodes that are passed to the program.

So right now in D you can do ctRegex:

auto ctr = ctRegex!(`^.*/([^/]+)/?$`);

I don't know what the syntax could be, but the idea is to have a file ct_regex.d. This file would receive the string as an argument and must generate the code that would be mixed in the program. Since this program is a program (compiled and executed), it has no limits on what it can do. Then you would do something like this:

mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));

The compiler could be smart and cache the executable so that anytime it has to expand it it just needs to invoke it (skip the compile phase).
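
To make the caching idea concrete, here is a minimal sketch of what such a driver might do. It is written in Python purely for illustration. It is not D and not an existing compiler feature. Only the name `compile_time_execute` comes from the proposal above. The cache layout, the source hashing, the stand-in "compile" step (a plain file copy), and the toy generator are all assumptions.

```python
import hashlib
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def build_generator(source: Path, cache_dir: Path) -> tuple[Path, bool]:
    """Return (artifact, was_built); the build only happens on a cache miss."""
    # Key the cache on the generator's source so edits trigger a rebuild.
    digest = hashlib.sha256(source.read_bytes()).hexdigest()[:16]
    artifact = cache_dir / f"{source.stem}-{digest}.py"
    if artifact.exists():
        return artifact, False
    # Stand-in for the real compile step (assumption): a compiler would
    # build the generator here. Copying keeps the sketch runnable.
    shutil.copyfile(source, artifact)
    return artifact, True


def compile_time_execute(source: Path, arg: str, cache_dir: Path) -> tuple[str, bool]:
    """Run the (possibly cached) generator; return its stdout and was_built."""
    artifact, was_built = build_generator(source, cache_dir)
    result = subprocess.run(
        [sys.executable, str(artifact), arg],
        capture_output=True, text=True, check=True,
    )
    return result.stdout, was_built


def demo() -> list[tuple[str, bool]]:
    """Expand the same hypothetical generator twice with different arguments."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        cache = work / "cache"
        cache.mkdir()
        # Hypothetical generator: prints one line of D-like source.
        gen = work / "ct_regex.py"
        gen.write_text(
            "import sys\n"
            "print('enum pattern = `' + sys.argv[1] + '`;')\n"
        )
        return [
            compile_time_execute(gen, "^a+$", cache),
            compile_time_execute(gen, "^b+$", cache),
        ]


if __name__ == "__main__":
    for code, was_built in demo():
        print("built" if was_built else "cached", "->", code, end="")
```

Only the first expansion pays for the build; the second just invokes the cached artifact. A real implementation would also have to decide what else belongs in the cache key, for example compiler version and flags.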

What do you think?

I know, I know. The first answer I'll get is: "Oh, no! But that way I could download a program, compile it and suddenly all my files are gone". My reply is: if you downloaded and compiled that program, weren't you going to execute it afterwards? At that point the program could do something harmful, so what's the difference? Either way, you must check the source code to make sure nothing fishy is happening there.

Just as a reference that something like this is possible, in our language you can already do this:

build_date = {{ system("date").stringify }}
puts build_date

That generates a program that has the build date embedded in it. We can also get the git hash of a repo and stick it into the executable without an additional Makefile or some build process. "system" is our first step towards doing these compile-time things. The next thing would be to do:

ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}
June 17, 2014
Heh: http://forum.dlang.org/post/lnhtiq$qqn$1@digitalmars.com
June 17, 2014
17-Jun-2014 23:41, Ary Borenszweig wrote:
> CTFE is really nice but has its limitations: you can't do anything you
> want, and since it's interpreted it requires an interpreter and it's
> generally slow. Nimrod does the same thing, and now they are
> implementing a VM to run the interpreted code faster. Is this really the
> way to go?
>
> In our language we are thinking about allowing code generation at
> compile time but in a different way. The idea is, at compile time, to
> compile and execute another program that would generate the code that
> would be mixed into the current program. This program could receive the
> execution context as arguments, along with any AST nodes that are passed
> to the program.
>
> So right now in D you can do ctRegex:
>
> auto ctr = ctRegex!(`^.*/([^/]+)/?$`);
>
> I don't know what the syntax could be, but the idea is to have a file
> ct_regex.d. This file would receive the string as an argument and must
> generate the code that would be mixed in the program. Since this program
> is a program (compiled and executed), it has no limits on what it can
> do. Then you would do something like this:
>
> mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));
>
> The compiler could be smart and cache the executable so that anytime it
> has to expand it it just needs to invoke it (skip the compile phase).
>
> What do you think?

As long as it isn't limited to just calling external tools, but also covers plugins and services, I agree. See the link posted by Dicebot.

I believe this is the more practical and useful approach for heavy meta-programming.
My reasons pro:
a) Not everything can be done in the CTFE environment, e.g. please go ahead and compile an HLSL shader for me.
b) Performance of standalone optimized code and definitive boundaries for caching of results.

Some points against:
a) It can't be as deeply integrated into the compiler. Passing arbitrary D types won't work, for instance, unless it shares type info with the compiler. The same limitation applies to meta-data.
b) We haven't seen a proper interpreter for CTFE yet at all, so we are unable to truly assess its performance.

Overall, I think it's a much more practical (yet hackish) way that can easily be done in the near future.

>
> I know, I know. The first answer I'll get is: "Oh, no! But that way I
> could download a program, compile it and suddenly all my files are
> gone". My reply is: If you downloaded and compiled that program, weren't
> you going to execute it afterwards? At that point the program could do
> something harmful, so what's the difference? Either way, you must check
> the source code to make sure nothing fishy is happening there.

Well, there are ways to constrain plugins,
even in system languages like D.

>
> Just as a reference that something like this is possible, in our
> language you can already do this:
>
> build_date = {{ system("date").stringify }}
> puts build_date
>
> That generates a program that has the build date embedded in it. We can
> also get the git hash of a repo and stick it into the executable without
> an additional Makefile or some build process. "system" is our first step
> towards doing these compile-time things. The next thing would be to do:
>
> ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}


-- 
Dmitry Olshansky
June 17, 2014
On Tuesday, 17 June 2014 at 19:41:59 UTC, Ary Borenszweig wrote:
> CTFE is really nice but has its limitations: you can't do anything you want, and since it's interpreted it requires an interpreter and it's generally slow. Nimrod does the same thing, and now they are implementing a VM to run the interpreted code faster. Is this really the way to go?
>
> In our language we are thinking about allowing code generation at compile time but in a different way. The idea is, at compile time, to compile and execute another program that would generate the code that would be mixed into the current program. This program could receive the execution context as arguments, along with any AST nodes that are passed to the program.
>
> So right now in D you can do ctRegex:
>
> auto ctr = ctRegex!(`^.*/([^/]+)/?$`);
>
> I don't know what the syntax could be, but the idea is to have a file ct_regex.d. This file would receive the string as an argument and must generate the code that would be mixed in the program. Since this program is a program (compiled and executed), it has no limits on what it can do. Then you would do something like this:
>
> mixin(compile_time_execute("ct_regex.d", `^.*/([^/]+)/?$`));
>
> The compiler could be smart and cache the executable so that anytime it has to expand it it just needs to invoke it (skip the compile phase).
>
> What do you think?
>
> I know, I know. The first answer I'll get is: "Oh, no! But that way I could download a program, compile it and suddenly all my files are gone". My reply is: if you downloaded and compiled that program, weren't you going to execute it afterwards? At that point the program could do something harmful, so what's the difference? Either way, you must check the source code to make sure nothing fishy is happening there.
>
> Just as a reference that something like this is possible, in our language you can already do this:
>
> build_date = {{ system("date").stringify }}
> puts build_date
>
> That generates a program that has the build date embedded in it. We can also get the git hash of a repo and stick it into the executable without an additional Makefile or some build process. "system" is our first step towards doing these compile-time things. The next thing would be to do:
>
> ct_regex = {{ run("ct_regex", "^.*/([^/]+)/?$") }}

I had a similar idea a while ago. The only difference was that
instead of compiling and running some D file at compile time,
mine would simply run a pre-compiled executable at compile time
and return its output as a string (similar to string imports). I
described use cases similar to the ones you mention, such as
replacing extremely slow CTFE, but it didn't seem to catch on.
Everyone screamed that it was a security risk and it died there.

-tofu
June 17, 2014
On 6/17/2014 3:55 PM, Dicebot wrote:
> Heh: http://forum.dlang.org/post/lnhtiq$qqn$1@digitalmars.com

Yea, Nemerle's approach addresses that, although it comes with other tradeoffs. In Nemerle, you compile your macros to a dll, then you pass that dll to the compiler when compiling any code that uses the macros.

It has various pros/cons versus D's approach, but I think it's at least something worth being aware of.

June 17, 2014
On Tuesday, 17 June 2014 at 19:41:59 UTC, Ary Borenszweig wrote:
> CTFE is really nice but has its limitations: you can't do anything you want, and since it's interpreted it requires an interpreter and it's generally slow. Nimrod does the same thing, and now they are implementing a VM to run the interpreted code faster. Is this really the way to go?
>

For your information the new VM shipped with 0.9.4 and runs
Nimrod code faster at compile-time than Python runs code at
run-time in the tests that I did with it. :-) That said, it
turned out to be much harder to implement than I thought and I
wouldn't do it again.

> ...
> The compiler could be smart and cache the executable so that anytime it has to expand it it just needs to invoke it (skip the compile phase).
>
> What do you think?
>

It is a *very* good idea and this is exactly the way I would do
it now. However, you usually only trade one set of problems for
another. (For instance, giving Nimrod an 'eval' module is now
quite easy to do...)
June 18, 2014
On the actual topic.

Do I think it is a practical approach with benefits over the existing situation? Definitely yes.

Do I think it is the right design with a more idealized infrastructure? No. As Dmitry has mentioned, it has the huge flaw of not being able to use template alias and type arguments, which effectively takes reflection out of the question.

Do I think including it in the language, as opposed to the build system, is the deal breaker here? Not sure, but unlikely. It improves mental context locality, which is not important until this becomes a much more casual tool. And by the time that happens I'd like another design to be encouraged anyway.