Is it possible to "cache" results of compile-time executions between compiles?
January 24, 2017
Context:
I am currently writing a small library that compiles sql strings at compile-time and generates query objects.

Something like this:

unittest
{
    mixin Sql!(q{
       select feed.url, feed.title
       from users
       join user_feeds as feed
       on users.id = feed.user
       where users.id = {user}
    }, DBInterface) UserFeeds;

    UserFeeds.Query query;
    query.user = 1; //some user

    auto con   = //Open db connection
    auto feeds = con.execute(query);
    foreach(f; feeds)
    {
        writef("Feed: %s\n\tTitle: %s\n", f.url, f.title);
    }
}

The parsing step does a fair amount of work: it validates the SQL query, makes sure that the tables "users" and "user_feeds" exist in DBInterface, and creates a query type that takes "user" as input and a result type containing url and title with the appropriate types.
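A minimal sketch of the "query having user as input" part, to make the idea concrete. This is not the library's actual parser: placeholders and queryStruct are hypothetical helpers, and every parameter is assumed to be an int (the real library would look the type up in DBInterface).

```d
// Collect the names written as {name} in a query string. Runs under CTFE.
string[] placeholders(string sql)
{
    string[] names;
    size_t i = 0;
    while (i < sql.length)
    {
        if (sql[i] == '{')
        {
            size_t j = i + 1;
            while (j < sql.length && sql[j] != '}') ++j;
            assert(j < sql.length, "unterminated placeholder");
            names ~= sql[i + 1 .. j];
            i = j; // continue after the closing brace
        }
        ++i;
    }
    return names;
}

// Build the source of a struct with one field per placeholder.
// Assumption: all parameters are ints.
string queryStruct(string[] names)
{
    string s = "struct Query {";
    foreach (n; names) s ~= " int " ~ n ~ ";";
    return s ~ " }";
}

enum params = placeholders("select feed.url from users where users.id = {user}");
static assert(params == ["user"]);

mixin(queryStruct(params));

void main()
{
    Query q;
    q.user = 1;
    assert(q.user == 1);
}
```

All of the work in placeholders and queryStruct happens on every compile, which is where the build time goes once the queries get bigger.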

Now, the compile times for parsing a small number of queries are marginal. However, as the queries become more numerous and complex, compile times start to become a problem.

Currently I "solve"(tm) long compile times by compiling the DBInterface and the queries into a separate lib and linking to that lib from the main application.

Having a dedicated db lib fixes compile times for the main application. But I am wondering if it's possible to work at a finer level of granularity: somehow "cache" the queries between compiles, without resorting to compiling each query into its own lib/object file.

Does D have any facilities that could make this possible?



January 24, 2017
On Tuesday, 24 January 2017 at 11:19:58 UTC, TheFlyingFiddle wrote:
> Does D have any facilities that could make this possible?

It seems that there is a feature I was unaware of/forgot called Import Expressions.

unittest
{
   enum s = import("myfile");
}

Is there something similar to this for outputting files at compile-time? Or do I need to keep the source around and write it at run-time?
January 24, 2017
On Tuesday, 24 January 2017 at 12:14:05 UTC, TheFlyingFiddle wrote:
> unittest
> {
>    enum s = import("myfile");
> }
>
> Is there something similar to this for outputting files at compile-time?

no. this is by design, so it won't be fixed. sorry. you may use build script that will create the code first, and you can dump `pragma(msg, …);` to file, but that's all.
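A small sketch of the two pieces mentioned here, assuming a file called "myfile" may or may not exist in a directory passed to dmd with the -J switch (for example -J.). The names generated and answer are made up for the example.

```d
// Source text produced at compile time (normally by some CTFE function).
enum generated = "int answer() { return 42; }";

// Printed by the compiler during compilation, not by the program.
// A build script can capture this output and write it to a file.
pragma(msg, generated);

// Reading only works if "myfile" is found under a -J directory.
// __traits(compiles, ...) lets the build continue when it is not.
static if (__traits(compiles, import("myfile")))
    enum s = import("myfile");
else
    enum s = "";

mixin(generated);

void main()
{
    assert(answer() == 42);
}
```

So reading is covered by import("...") and writing has to go through the build script.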
January 24, 2017
On Tuesday, 24 January 2017 at 12:19:33 UTC, ketmar wrote:
> On Tuesday, 24 January 2017 at 12:14:05 UTC, TheFlyingFiddle wrote:
>> unittest
>> {
>>    enum s = import("myfile");
>> }
>>
>> Is there something similar to this for outputting files at compile-time?
>
> no. this is by design, so it won't be fixed. sorry. you may use build script that will create the code first, and you can dump `pragma(msg, …);` to file, but that's all.

Yeah, I guess allowing the compiler to produce arbitrary files would be problematic from a security standpoint.

Thanks for the pragma idea! Wrapping the build in a script is a satisfactory solution for me.
January 24, 2017
On Tuesday, 24 January 2017 at 12:19:33 UTC, ketmar wrote:
> On Tuesday, 24 January 2017 at 12:14:05 UTC, TheFlyingFiddle wrote:
>> unittest
>> {
>>    enum s = import("myfile");
>> }
>>
>> Is there something similar to this for outputting files at compile-time?
>
> no. this is by design, so it won't be fixed. sorry. you may use build script that will create the code first, and you can dump `pragma(msg, …);` to file, but that's all.

Thanks again. Wrapping dmd with a script worked wonders!

Now I'm able to do this:
From the old: (a)
unittest
{
   mixin Sql!(...);
   mixin Sql!(...);
   ...
   mixin Sql!(...);
}

To the new:  (b)
unittest
{
   mixin Cache!(Sql, ...);
   mixin Cache!(Sql, ...);
   ...
   mixin Cache!(Sql, ...);
}

For (a) the build times are always in the 10-30s range.
For (b) the build times are in the 10-30s range the first time and sub-second after that.
Each query just adds a few ms to the build time now!

Additionally, even if dmd crashes with an out-of-memory error (which still happens with the current CTFE engine), most of the queries will already have been built and dmd can be restarted. After the restart, the built queries are loaded from the cache and dmd can finish working on the leftovers.

Everything turned out soooo much better than expected :)

January 24, 2017
On Tuesday, 24 January 2017 at 16:41:13 UTC, TheFlyingFiddle wrote:
> Everything turned out soooo much better than expected :)
Added bonus is that mixin output can be viewed in the generated files :D
January 24, 2017
On Tuesday, 24 January 2017 at 16:49:03 UTC, TheFlyingFiddle wrote:
> On Tuesday, 24 January 2017 at 16:41:13 UTC, TheFlyingFiddle wrote:
>> Everything turned out soooo much better than expected :)
> Added bonus is that mixin output can be viewed in the generated files :D

Could you post your solution?

I suggest we get a real caching module like above that has the extra feature of hashing the mixin strings.

This way the caching mechanism can validate whether the mixin strings have changed. Put the hash in a comment in the output file, which is then used to test whether the input string has the same hash. If it does, simply use the output file; otherwise, regenerate.

Adds some overhead but keeps things consistent.


(Since I'm not sure what Cache!() is, I'm assuming it doesn't do this)
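Something along these lines is what I mean, as a rough sketch. It runs in the build script rather than in the compiler, the "// source-hash: " comment format is made up, and headerFor, withHashHeader and isFresh are hypothetical names.

```d
import std.algorithm.searching : startsWith;
import std.conv : to;

// Hypothetical marker written as the first line of every cached file.
enum hashPrefix = "// source-hash: ";

// First line of a cached file for the given input string.
string headerFor(string input)
{
    return hashPrefix ~ hashOf(input).to!string ~ "\n";
}

// Contents to store: the hash comment followed by the generated code.
string withHashHeader(string input, string generated)
{
    return headerFor(input) ~ generated;
}

// True if the cached contents were generated from this exact input.
bool isFresh(string input, string cached)
{
    return cached.startsWith(headerFor(input));
}

void main()
{
    auto cached = withHashHeader("select 1", "struct Query {}");
    assert(isFresh("select 1", cached));   // same input: reuse the file
    assert(!isFresh("select 2", cached));  // changed input: regenerate
}
```

Since the first line is a D comment, the cached file can still be mixed in as-is.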
January 24, 2017
On Tuesday, 24 January 2017 at 21:36:50 UTC, Profile Anaysis wrote:
>...

Maybe with all this talk of the new CTFE engine being developed, a similar mechanism can be used optionally? This could help with debugging also.

In debug mode, the CTFE mixins are written to disk with a hash, if they are not a string themselves. (This could be done for all CTFE results, I suppose, but I'm not sure about performance and consistency.)

Then debuggers can use the outputted CTFE results for proper analysis, line breaking, etc...

January 24, 2017
On Tuesday, 24 January 2017 at 21:36:50 UTC, Profile Anaysis wrote:
> On Tuesday, 24 January 2017 at 16:49:03 UTC, TheFlyingFiddle wrote:
>> On Tuesday, 24 January 2017 at 16:41:13 UTC, TheFlyingFiddle wrote:
>>> Everything turned out soooo much better than expected :)
>> Added bonus is that mixin output can be viewed in the generated files :D
>
> Could you post your solution?
>
> I suggest we get a real caching module like above that has the extra feature of hashing the mixin strings.
>
> This way the caching mechanism can validate whether the mixin strings have changed. Put the hash in a comment in the output file, which is then used to test whether the input string has the same hash. If it does, simply use the output file; otherwise, regenerate.
>
> Adds some overhead but keeps things consistent.
>
>
> (Since I'm not sure what Cache!() is, I'm assuming it doesn't do this)

This is the solution I threw together:
// module mixin_cache.d
mixin template Cache(alias GenSource, string from, string path)
{
    import core.internal.hash;
    import std.conv : to;

    //Hash to keep track of changes
    enum h = hashOf(from);
    enum p = path ~ h.to!string ~ ".txt";

    //Check if the file exists, else suppress errors.
    //The -J flag needs to be set on dmd, else
    //this always fails
    static if(__traits(compiles, import(p)))
    {
        //Tell the wrapper that we loaded the file p
        //_importing is a magic string
        pragma(msg, "_importing");
        pragma(msg, p);
        mixin(import(p));
    }
    else
    {
        //We don't have a cached file so generate it
        private enum src = GenSource!(from);
        static if(__traits(compiles, () { mixin(src); }))
        {
            //_exported_start_ tells the wrapper to begin
            //outputting the generated source into file p
            pragma(msg, "_exported_start_");
            pragma(msg, p);
            pragma(msg, src);
            pragma(msg, "_exported_end_");
        }

        mixin(src);
    }
}

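To make this work I wrap dmd in a D script like this:
(ignoring some details, as what I've got is not really tested yet)

// dmd_with_mixin_cache.d
import std.algorithm.searching : startsWith;
import std.process : pipeProcess, Redirect;
import std.stdio : KeepTerminator;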

void main(string[] args)
{
   auto dmdargs = ... //Fix args etc.
   auto dmd = pipeProcess(dmdargs, Redirect.stderr);

   foreach(line; dmd.stderr.byLine(KeepTerminator.yes))
   {
       if(line.startsWith("_exported_start_")) {
          //Parse file and store source in a file
          //Keep going until _exported_end_
       } else if(line.startsWith("_importing")) {
          //A user imported a file. (don't delete it!)
       } else {
         //Other output from dmd like errors / other pragma(msg, ...)
       }
   }

   //Files not imported / exported could be stale
   //delete them. Unless we got a compile error from dmd
   //Then don't delete anything.
}
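The line handling inside that loop could look roughly like this. It is a sketch under a few assumptions: the marker strings are the ones used in the Cache template above, each marker sits on its own line, and the generated source never contains a line that is exactly "_exported_end_". Parsed and parseDmdOutput are names invented for this example, and writing or deleting the files is left out.

```d
import std.string : chomp;

struct Parsed
{
    string[string] exported; // path -> generated source to write out
    string[] imported;       // cached files that were used this build
    string[] other;          // everything else dmd printed
}

// Works on any range of lines, e.g. dmd.stderr.byLine(KeepTerminator.yes).
Parsed parseDmdOutput(R)(R lines)
{
    enum State { normal, exportPath, exportBody, importPath }
    Parsed result;
    auto state = State.normal;
    string path;

    foreach (raw; lines)
    {
        // byLine reuses its buffer, so copy before storing.
        string line = raw.idup.chomp;
        final switch (state)
        {
        case State.normal:
            if (line == "_exported_start_") state = State.exportPath;
            else if (line == "_importing") state = State.importPath;
            else result.other ~= line;
            break;
        case State.exportPath:
            path = line;
            result.exported[path] = "";
            state = State.exportBody;
            break;
        case State.exportBody:
            if (line == "_exported_end_") state = State.normal;
            else result.exported[path] ~= line ~ "\n";
            break;
        case State.importPath:
            result.imported ~= line;
            state = State.normal;
            break;
        }
    }
    return result;
}

void main()
{
    auto p = parseDmdOutput([
        "_exported_start_", "app123.txt", "struct Q {}", "_exported_end_",
        "_importing", "app456.txt",
        "Error: something else",
    ]);
    assert(p.exported["app123.txt"] == "struct Q {}\n");
    assert(p.imported == ["app456.txt"]);
    assert(p.other == ["Error: something else"]);
}
```

After the loop, the wrapper would write each entry of exported to disk and treat any cached file that is in neither exported nor imported as stale.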

The Cache template and the small wrapper around dmd were all that was needed.

Usage is something like this:

template Generate(string source)
{
    string Generate()
    {
       //Do some complex ctfe here
       //can't wait for the new ctfe engine!
       foreach(i; 0 .. 100_000)
       { }
       return source;
    }
}

mixin Cache!(Generate, "Some interesting DSL or similar", __MODULE__);

//The path argument (__MODULE__ here) is not really needed but can be nice when browsing the
//mixin code to see where it came from. (approximately anyway)

This is it. If you want I can post a full solution sometime later this week but I want to clean up what I have first.


January 24, 2017
On Tuesday, 24 January 2017 at 21:41:12 UTC, Profile Anaysis wrote:
> On Tuesday, 24 January 2017 at 21:36:50 UTC, Profile Anaysis wrote:
>>...
>
> Maybe with all this talk of the new CTFE engine being developed, a similar mechanism can be used optionally? This could help with debugging also.
>
> In debug mode, the CTFE mixins are written to disk with a hash, if they are not a string themselves. (This could be done for all CTFE results, I suppose, but I'm not sure about performance and consistency.)
>
> Then debuggers can use the outputted CTFE results for proper analysis, line breaking, etc...

Would be nice to have something like this in dmd. Would be even better if it could work on templates as well. No more stepping through functions filled with static if :).

I think it would be possible to make something like this work too:

template Generator(T...)
{
   string Generator()
   {
      //Complex ctfe with T... that generates strings and nothing else.
   }
}

If it's possible to quickly detect changes to all T arguments, one could cache things of this form too.

For example:

mixin Cache!("some_file_name", Generator, size_t, int, 2,
                               "hello", MyStruct(3),
                               MyType);

The problem would be detecting changes in the arguments. As long as one is able to get a unique hash from each input element it should work fine, I think. I guess it would be required to reflect over the members of the structs/classes to look up attributes and such. If the generation stage is time consuming this might be worth it... But it's not gonna be "almost free" like for DSLs. Changes to basic types should be simple/trivial to detect, however. (That does not include functions/delegates; I don't see how those could work without writing mixin code targeted at caching.)
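For the simple cases, a key could be built from the argument list roughly like this. It is only a sketch: argsKey is a hypothetical helper, it needs a compiler with static foreach, it describes values only as far as .stringof renders them, and it does not notice when the definition of a struct or class changes (that would need the member reflection mentioned above).

```d
// Build a textual key from a mixed list of types and values.
template argsKey(Args...)
{
    string argsKey()
    {
        string key;
        static foreach (i; 0 .. Args.length)
        {
            static if (is(Args[i]))
                // A type: use its mangled name.
                key ~= "T:" ~ Args[i].mangleof ~ ";";
            else
                // A value: mangled type name plus its textual form.
                key ~= "V:" ~ typeof(Args[i]).mangleof ~ "="
                     ~ Args[i].stringof ~ ";";
        }
        return key;
    }
}

// Different values or different types give different keys.
static assert(argsKey!(size_t, 2)() != argsKey!(size_t, 3)());
static assert(argsKey!(int, "hello")() != argsKey!(long, "hello")());
static assert(argsKey!(int, 2)() == argsKey!(int, 2)());

void main()
{
    // The key can be hashed the same way as the DSL string in Cache.
    enum h = hashOf(argsKey!(size_t, int, 2, "hello")());
    static assert(h == hashOf(argsKey!(size_t, int, 2, "hello")()));
}
```

The resulting string could then be used in place of the from argument when building the cache file name.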