January 10, 2013
On Wednesday, 9 January 2013 at 12:29:07 UTC, Philippe Sigaud wrote:
> On Wed, Jan 9, 2013 at 12:05 PM, Timon Gehr <timon.gehr@gmx.ch> wrote:
>> Yes, it is in D. Nothing is released yet. It needs to be polished a little
>> so that there are no known embarrassing shortcomings anymore.
>
> What? That's FOSS, release early, release often!
>

I want to release it at a time when I am ready to cope with reactions and bug reports.

>
>
>> Also, it obviously needs a repl. :-)
>
> Obviously.
>

Actually, at this point, the main challenge is getting the interactive code editing to work.

> And direct programmatic access to the lexer and the parser. I'm coding
> a macro system for D right now, as an over-layer above DMD (like rdmd)
> and having to create the AST by myself to transform them according to
> the user-defined macros is a pain.
>
>> CTFE is basically done (as a portable byte code interpreter, but other
>> strategies, such as JIT, could be easily plugged).
>
> Great!
>
>> This is a snippet of my regression test suite:
>>
>> auto dynRangePrimes(){
>>     DynRange!int impl(int start)=>
>>
>> dynRange(cons(start,delay(()=>filter!(a=>a%start)(impl(start+1)))));
>>     return impl(2);
>> }
>>
>> static assert(array(take(dynRangePrimes(), 20)) ==
>> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>
> Lazy recursive range, right? As the usual Haskell definition for a prime generator. Nice!
>

It uses the usual range templates + dynRange, which wraps a range using delegates in order to hide some static type information and allow recursion.

>> I also need to decide on a licence.
>
> Unless you've reason not to, I'd advise using the same as Phobos: Boost.
>

I'll need to give it some thought, my knowledge about licensing is still rather limited.

>> I assume that the alpha will be out in late spring. (I am busy until early spring.)
>
> You can count me in to test it (I gather to prefer to code this as you see fit, right now).

Thanks!

> You could also present it at the D conf.
>

I'd like to, but I probably cannot make it.

>
> In any cases, congratulations for what seems to be a elegantly done D implementation.

Thanks! (Well, it makes extensive use of string mixins to simulate a coroutine-based system with low overhead.)
January 10, 2013
On Wednesday, 9 January 2013 at 15:45:15 UTC, Tim Krimm wrote:
> O
>> Yes, it is in D. Nothing is released yet. It needs to be polished a little so that there are no known embarrassing shortcomings anymore.
>> (eg. the parser cannot parse extern(...) declarations in alias declarations yet, and I need to finish making a minor tweak to how template instances are analyzed in order to get circular dependency detection to work reliably.
>
> How are you implementing the code generating back-end for the compiler?
>

Currently, not at all. It's just a front-end.

> In your design, I would recommend keeping the front-end separate from the back-end so that maybe the front-end can be connected to the LDC or GDC back-ends.

Yes, the goal is to have the back-end be fully separate. It should also work with no back end at all, eg. for highlighting in code editors.
January 10, 2013
On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
> On Wed, Jan 09, 2013 at 12:05:55PM +0100, Timon Gehr wrote:
>> On 01/08/2013 10:06 PM, Philippe Sigaud wrote:
> [...]
>> >Also, Timon Gehr spoke of his own front-end (assumed to be in D) in the
>> >past, but did not provide any link to it.
>> >
>> 
>> Yes, it is in D. Nothing is released yet. It needs to be polished a
>> little so that there are no known embarrassing shortcomings anymore.
> [...]
>> CTFE is basically done (as a portable byte code interpreter, ...
>> 
>> static assert(array(take(dynRangePrimes(), 20)) ==
>> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>> 
>> )
>
> Whoa!!! Now you really caught my interest!
>
> Can it handle CTFE-computed mixins that declare and reference variables?
> If it can, I will finally be able to write a library AA implementation
> that allows static AA literals (no runtime init/allocation needed)!
>
>

Probably yes, but I am not sure what you mean. Can you provide an example?

January 10, 2013
On Thu, Jan 10, 2013 at 04:51:35AM +0100, Timon Gehr wrote:
> On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
[...]
> >Whoa!!! Now you really caught my interest!
> >
> >Can it handle CTFE-computed mixins that declare and reference variables?  If it can, I will finally be able to write a library AA implementation that allows static AA literals (no runtime init/allocation needed)!
> 
> Probably yes, but I am not sure what you mean. Can you provide an example?

The basic idea behind static AA literals is to use CTFE to compute the hashes of the keys, and therefore which slot(s) they will fall in, at compile time. Armed with this information, we can create an array of slots at compile-time that contains the AA entries by declaring each slot as a static variable and using the slot assignment information to initialize the hash table (array of pointers to slots) to point to these slots.

Now, there are some challenges to be overcome here. For example, the slots need to be declared as local variables, and therefore need unique names (the hash table array needs to be able to refer to the addresses of these variables), so the only way I can think of to do this at compile-time is to have the CTFE code generate mixins to declare those variables. So here's the tricky part. Suppose you have this CTFE code:

	// Not the real code, just a rough sketch to give the idea
	string aaComputeAASlots(K,V)(K[V] aaLiteral) {
		// Create slots
		string slotVars;
		foreach (key, value; aaLiteral) {
			size_t slotNum = hashOf(key) % numKeys;
			string slotName = "aaSlot" ~ slotNum;
			slotVars ~= "immutable Slot " ~ slotName ~
				" = Slot(" ~ key ~ ", " ~ value ~ ");";
		}

		string mainTable = q{
			immutable Slot*[] aaTable = {
				&aaSlot0,
				&aaSlot1,
				... // generate these programmatically
			};
		};

		return slotVars ~ mainTable;
	}

	// Instantiate AA here, at compile time.
	mixin(aaComputeAASlots(myLiteral));

The mixin string would look something like this:

	"immutable Slot aaSlot0 = Slot(..., ...);
	 immutable Slot aaSlot1 = Slot(..., ...);
	 immutable Slot*[] aaTable = {
	 	&aaSlot0, &aaSlot1, ...
	 }"

The question is, will the compile be able to resolve &aaSlot0, &aaSlot1, etc., at compile-time?

Obviously, this code works if you copy-n-paste it into a .d source file. It also works if I write it as a string literal, then use mixin(). However, on DMD, if the string is returned by a CTFE function that builds it using ~, for some reason all the pointers in aaTable are null. Strangely enough, if I assign the returned string to a string variable and print it out at runtime, the correct string is produced.

So I concluded that somehow, DMD was unable to correctly interpret &aaSlot0 inside a mixin string at compile-time.


T

-- 
It's bad luck to be superstitious. -- YHL
January 10, 2013
On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> ...
>
> So I concluded that somehow, DMD was unable to correctly interpret
> &aaSlot0 inside a mixin string at compile-time.
>
>
> T

If so, then that is a bug.
January 10, 2013
On 2013-01-10 04:49, Timon Gehr wrote:

> Yes, the goal is to have the back-end be fully separate. It should also
> work with no back end at all, eg. for highlighting in code editors.

That's always nice to hear.

-- 
/Jacob Carlborg
January 10, 2013
On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> On Thu, Jan 10, 2013 at 04:51:35AM +0100, Timon Gehr wrote:
>> On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
> [...]
>> >Whoa!!! Now you really caught my interest!
>> >
>> >Can it handle CTFE-computed mixins that declare and reference
>> >variables?  If it can, I will finally be able to write a library AA
>> >implementation that allows static AA literals (no runtime
>> >init/allocation needed)!
>> 
>> Probably yes, but I am not sure what you mean. Can you provide an
>> example?
>
> The basic idea behind static AA literals is to use CTFE to compute the
> hashes of the keys, and therefore which slot(s) they will fall in, at
> compile time. Armed with this information, we can create an array of
> slots at compile-time that contains the AA entries by declaring each
> slot as a static variable and using the slot assignment information to
> initialize the hash table (array of pointers to slots) to point to these
> slots.
>
> Now, there are some challenges to be overcome here. For example, the
> slots need to be declared as local variables, and therefore need unique
> names (the hash table array needs to be able to refer to the addresses
> of these variables), so the only way I can think of to do this at
> compile-time is to have the CTFE code generate mixins to declare those
> variables. So here's the tricky part. Suppose you have this CTFE code:
>
> 	// Not the real code, just a rough sketch to give the idea
> 	string aaComputeAASlots(K,V)(K[V] aaLiteral) {
> 		// Create slots
> 		string slotVars;
> 		foreach (key, value; aaLiteral) {
> 			size_t slotNum = hashOf(key) % numKeys;
> 			string slotName = "aaSlot" ~ slotNum;
> 			slotVars ~= "immutable Slot " ~ slotName ~
> 				" = Slot(" ~ key ~ ", " ~ value ~ ");";
> 		}
>
> 		string mainTable = q{
> 			immutable Slot*[] aaTable = {
> 				&aaSlot0,
> 				&aaSlot1,
> 				... // generate these programmatically
> 			};
> 		};
>
> 		return slotVars ~ mainTable;
> 	}
>
> 	// Instantiate AA here, at compile time.
> 	mixin(aaComputeAASlots(myLiteral));
>
> The mixin string would look something like this:
>
> 	"immutable Slot aaSlot0 = Slot(..., ...);
> 	 immutable Slot aaSlot1 = Slot(..., ...);
> 	 immutable Slot*[] aaTable = {
> 	 	&aaSlot0, &aaSlot1, ...
> 	 }"
>
> The question is, will the compile be able to resolve &aaSlot0, &aaSlot1,
> etc., at compile-time?
>
> Obviously, this code works if you copy-n-paste it into a .d source file.
> It also works if I write it as a string literal, then use mixin().
> However, on DMD, if the string is returned by a CTFE function that
> builds it using ~, for some reason all the pointers in aaTable are null.
> Strangely enough, if I assign the returned string to a string variable
> and print it out at runtime, the correct string is produced.
>
> So I concluded that somehow, DMD was unable to correctly interpret
> &aaSlot0 inside a mixin string at compile-time.
>
>
> T

January 10, 2013
 Eep, seems I hit send...

On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> The basic idea behind static AA literals is to use CTFE to compute the hashes of the keys, and therefore which slot(s) they will fall in, at compile time. Armed with this information, we can create an array of slots at compile-time that contains the AA entries by declaring each slot as a static variable and using the slot assignment information to initialize the hash table (array of pointers to slots) to point to these slots.

 Hmmm somehow that doesn't seem like a good idea; I mean it will work....

 Alternative is to sorta have a pair of static arrays, then either use a balanced tree, or a modulus to best hold (and separate) the values.

 Modulus based: It could be like...

  //T obviously replace with appropriate type
  ref T AA_lookup(name publicName)(uint hash) {
    //offset for part1
    //could be external to..
    immutable static int offsets[];  //no need for pointers, right?
    immutable static T values[]; //contains actual item data

    enum mod; //divider for table.
    uint result = hash % mod;

    //or branch to return empty? But can't be ref then...
    assert(offsets[result] != -1);

    return values[offsets[result]];
  }

 Then your CTFE increases the mod in it's calculations until every element can fit only once, and make the offsets based on them. Default values would be -1 (range error if called), course if it's mod is rather large and wastes a lot of space then perhaps it falls back on the tree structure (in those cases, > 3x of array size)


 Tree: I don't mind a minor performance hit, log(n) is a very small hit in my mind. In that case a statically created tree is rather easy. AA's created ahead of time are likely rather small (<1 Million elements). So a tree structure of:

 struct Node(T) {
   //pointers don't seem like they are needed;
   //Can even be ushorts or ubytes if small enough.
   uint left, right;
   T* value;
 }

 You get the idea.
January 10, 2013
On Thu, Jan 10, 2013 at 08:38:08PM +0100, Era Scarecrow wrote:
>  Eep, seems I hit send...
> 
> On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> >The basic idea behind static AA literals is to use CTFE to compute the hashes of the keys, and therefore which slot(s) they will fall in, at compile time. Armed with this information, we can create an array of slots at compile-time that contains the AA entries by declaring each slot as a static variable and using the slot assignment information to initialize the hash table (array of pointers to slots) to point to these slots.
> 
>  Hmmm somehow that doesn't seem like a good idea; I mean it will
> work....
> 
>  Alternative is to sorta have a pair of static arrays, then either
> use a balanced tree, or a modulus to best hold (and separate) the
> values.
[...]

The issue here is that I wanted to avoid the complication of having a separate set of functions to lookup the AA literal vs. a runtime AA (and trust me, things get *very* hairy if you go that route -- it requires non-trivial compiler hacks to make things work with two distinct AA types), so whatever is generated at compile-time must be the same structure as runtime AA's so that the same AA methods will work on both.

If this is unimportant, then I would've done it another way.


T

-- 
MACINTOSH: Most Applications Crash, If Not, The Operating System Hangs
January 10, 2013
09-Jan-2013 15:05, Timon Gehr пишет:
> On 01/08/2013 10:06 PM, Philippe Sigaud wrote:
>> ...
>>
>> Isn't SDC also in D? (Bernard Helyer and friends)
>> https://github.com/bhelyer/SDC
>>
>>
>> Also, Timon Gehr spoke of his own front-end (assumed to be in D) in the
>> past, but did not provide any link to it.
>>
>
> Yes, it is in D. Nothing is released yet. It needs to be polished a
> little so that there are no known embarrassing shortcomings anymore.
> (eg. the parser cannot parse extern(...) declarations in alias
> declarations yet, and I need to finish making a minor tweak to how
> template instances are analyzed in order to get circular dependency
> detection to work reliably. Furthermore, examples like the following are
> currently rejected, while I want it to work:

[snip]

> CTFE is basically done (as a portable byte code interpreter, but other
> strategies, such as JIT, could be easily plugged). This is a snippet of
> my regression test suite:
>
> auto dynRangePrimes(){
>      DynRange!int impl(int start)=>
>
> dynRange(cons(start,delay(()=>filter!(a=>a%start)(impl(start+1)))));
>      return impl(2);
> }
>
> static assert(array(take(dynRangePrimes(), 20)) ==
> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>
> )

Cool! I'd love to take even the very preliminary peek at the speed profile of this CTFE engine.

If you are interested I'd love to test a small (the code though contains a lot of static data) CTFE benchmark that is the bottleneck in the compilation of current ctRegex.

See the attachment here:
http://d.puremagic.com/issues/show_bug.cgi?id=7442

would be nice to see some rough numbers for this vs DMD.

> I also need to decide on a licence. I assume that the alpha will be out
> in late spring. (I am busy until early spring.)
>
Eagerly waiting to play with it.

-- 
Dmitry Olshansky