View mode: basic / threaded / horizontal-split · Log in · Help
January 10, 2013
Re: Official DMD compiler written in D
On Wednesday, 9 January 2013 at 12:29:07 UTC, Philippe Sigaud 
wrote:
> On Wed, Jan 9, 2013 at 12:05 PM, Timon Gehr <timon.gehr@gmx.ch> 
> wrote:
>> Yes, it is in D. Nothing is released yet. It needs to be 
>> polished a little
>> so that there are no known embarrassing shortcomings anymore.
>
> What? That's FOSS, release early, release often!
>

I want to release it at a time when I am ready to cope with 
reactions and bug reports.

>
>
>> Also, it obviously needs a repl. :-)
>
> Obviously.
>

Actually, at this point, the main challenge is getting the 
interactive code editing to work.

> And direct programmatic access to the lexer and the parser. I'm 
> coding
> a macro system for D right now, as an over-layer above DMD 
> (like rdmd)
> and having to create the AST by myself to transform them 
> according to
> the user-defined macros is a pain.
>
>> CTFE is basically done (as a portable byte code interpreter, 
>> but other
>> strategies, such as JIT, could be easily plugged).
>
> Great!
>
>> This is a snippet of my regression test suite:
>>
>> auto dynRangePrimes(){
>>     DynRange!int impl(int start)=>
>>
>> dynRange(cons(start,delay(()=>filter!(a=>a%start)(impl(start+1)))));
>>     return impl(2);
>> }
>>
>> static assert(array(take(dynRangePrimes(), 20)) ==
>> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>
> Lazy recursive range, right? As the usual Haskell definition 
> for a prime generator. Nice!
>

It uses the usual range templates + dynRange, which wraps a range 
using delegates in order to hide some static type information and 
allow recursion.

>> I also need to decide on a licence.
>
> Unless you've reason not to, I'd advise using the same as 
> Phobos: Boost.
>

I'll need to give it some thought, my knowledge about licensing 
is still rather limited.

>> I assume that the alpha will be out in late spring. (I am busy 
>> until early spring.)
>
> You can count me in to test it (I gather to prefer to code this 
> as you see fit, right now).

Thanks!

> You could also present it at the D conf.
>

I'd like to, but I probably cannot make it.

>
> In any cases, congratulations for what seems to be a elegantly 
> done D implementation.

Thanks! (Well, it makes extensive use of string mixins to 
simulate a coroutine-based system with low overhead.)
January 10, 2013
Re: Official DMD compiler written in D
On Wednesday, 9 January 2013 at 15:45:15 UTC, Tim Krimm wrote:
> O
>> Yes, it is in D. Nothing is released yet. It needs to be 
>> polished a little so that there are no known embarrassing 
>> shortcomings anymore.
>> (eg. the parser cannot parse extern(...) declarations in alias 
>> declarations yet, and I need to finish making a minor tweak to 
>> how template instances are analyzed in order to get circular 
>> dependency detection to work reliably.
>
> How are you implementing the code generating back-end for the 
> compiler?
>

Currently, not at all. It's just a front-end.

> In your design, I would recommend keeping the front-end 
> separate from the back-end so that maybe the front-end can be 
> connected to the LDC or GDC back-ends.

Yes, the goal is to have the back-end be fully separate. It 
should also work with no back end at all, eg. for highlighting in 
code editors.
January 10, 2013
Re: Official DMD compiler written in D
On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
> On Wed, Jan 09, 2013 at 12:05:55PM +0100, Timon Gehr wrote:
>> On 01/08/2013 10:06 PM, Philippe Sigaud wrote:
> [...]
>> >Also, Timon Gehr spoke of his own front-end (assumed to be in 
>> >D) in the
>> >past, but did not provide any link to it.
>> >
>> 
>> Yes, it is in D. Nothing is released yet. It needs to be 
>> polished a
>> little so that there are no known embarrassing shortcomings 
>> anymore.
> [...]
>> CTFE is basically done (as a portable byte code interpreter, 
>> ...
>> 
>> static assert(array(take(dynRangePrimes(), 20)) ==
>> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>> 
>> )
>
> Whoa!!! Now you really caught my interest!
>
> Can it handle CTFE-computed mixins that declare and reference 
> variables?
> If it can, I will finally be able to write a library AA 
> implementation
> that allows static AA literals (no runtime init/allocation 
> needed)!
>
>

Probably yes, but I am not sure what you mean. Can you provide an 
example?
January 10, 2013
Re: Official DMD compiler written in D
On Thu, Jan 10, 2013 at 04:51:35AM +0100, Timon Gehr wrote:
> On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
[...]
> >Whoa!!! Now you really caught my interest!
> >
> >Can it handle CTFE-computed mixins that declare and reference
> >variables?  If it can, I will finally be able to write a library AA
> >implementation that allows static AA literals (no runtime
> >init/allocation needed)!
> 
> Probably yes, but I am not sure what you mean. Can you provide an
> example?

The basic idea behind static AA literals is to use CTFE to compute the
hashes of the keys, and therefore which slot(s) they will fall in, at
compile time. Armed with this information, we can create an array of
slots at compile-time that contains the AA entries by declaring each
slot as a static variable and using the slot assignment information to
initialize the hash table (array of pointers to slots) to point to these
slots.

Now, there are some challenges to be overcome here. For example, the
slots need to be declared as local variables, and therefore need unique
names (the hash table array needs to be able to refer to the addresses
of these variables), so the only way I can think of to do this at
compile-time is to have the CTFE code generate mixins to declare those
variables. So here's the tricky part. Suppose you have this CTFE code:

	// Not the real code, just a rough sketch to give the idea
	string aaComputeAASlots(K,V)(K[V] aaLiteral) {
		// Create slots
		string slotVars;
		foreach (key, value; aaLiteral) {
			size_t slotNum = hashOf(key) % numKeys;
			string slotName = "aaSlot" ~ slotNum;
			slotVars ~= "immutable Slot " ~ slotName ~
				" = Slot(" ~ key ~ ", " ~ value ~ ");";
		}

		string mainTable = q{
			immutable Slot*[] aaTable = {
				&aaSlot0,
				&aaSlot1,
				... // generate these programmatically
			};
		};

		return slotVars ~ mainTable;
	}

	// Instantiate AA here, at compile time.
	mixin(aaComputeAASlots(myLiteral));

The mixin string would look something like this:

	"immutable Slot aaSlot0 = Slot(..., ...);
	 immutable Slot aaSlot1 = Slot(..., ...);
	 immutable Slot*[] aaTable = {
	 	&aaSlot0, &aaSlot1, ...
	 }"

The question is, will the compile be able to resolve &aaSlot0, &aaSlot1,
etc., at compile-time?

Obviously, this code works if you copy-n-paste it into a .d source file.
It also works if I write it as a string literal, then use mixin().
However, on DMD, if the string is returned by a CTFE function that
builds it using ~, for some reason all the pointers in aaTable are null.
Strangely enough, if I assign the returned string to a string variable
and print it out at runtime, the correct string is produced.

So I concluded that somehow, DMD was unable to correctly interpret
&aaSlot0 inside a mixin string at compile-time.


T

-- 
It's bad luck to be superstitious. -- YHL
January 10, 2013
Re: Official DMD compiler written in D
On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> ...
>
> So I concluded that somehow, DMD was unable to correctly 
> interpret
> &aaSlot0 inside a mixin string at compile-time.
>
>
> T

If so, then that is a bug.
January 10, 2013
Re: Official DMD compiler written in D
On 2013-01-10 04:49, Timon Gehr wrote:

> Yes, the goal is to have the back-end be fully separate. It should also
> work with no back end at all, eg. for highlighting in code editors.

That's always nice to hear.

-- 
/Jacob Carlborg
January 10, 2013
Re: Official DMD compiler written in D
On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> On Thu, Jan 10, 2013 at 04:51:35AM +0100, Timon Gehr wrote:
>> On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
> [...]
>> >Whoa!!! Now you really caught my interest!
>> >
>> >Can it handle CTFE-computed mixins that declare and reference
>> >variables?  If it can, I will finally be able to write a 
>> >library AA
>> >implementation that allows static AA literals (no runtime
>> >init/allocation needed)!
>> 
>> Probably yes, but I am not sure what you mean. Can you provide 
>> an
>> example?
>
> The basic idea behind static AA literals is to use CTFE to 
> compute the
> hashes of the keys, and therefore which slot(s) they will fall 
> in, at
> compile time. Armed with this information, we can create an 
> array of
> slots at compile-time that contains the AA entries by declaring 
> each
> slot as a static variable and using the slot assignment 
> information to
> initialize the hash table (array of pointers to slots) to point 
> to these
> slots.
>
> Now, there are some challenges to be overcome here. For 
> example, the
> slots need to be declared as local variables, and therefore 
> need unique
> names (the hash table array needs to be able to refer to the 
> addresses
> of these variables), so the only way I can think of to do this 
> at
> compile-time is to have the CTFE code generate mixins to 
> declare those
> variables. So here's the tricky part. Suppose you have this 
> CTFE code:
>
> 	// Not the real code, just a rough sketch to give the idea
> 	string aaComputeAASlots(K,V)(K[V] aaLiteral) {
> 		// Create slots
> 		string slotVars;
> 		foreach (key, value; aaLiteral) {
> 			size_t slotNum = hashOf(key) % numKeys;
> 			string slotName = "aaSlot" ~ slotNum;
> 			slotVars ~= "immutable Slot " ~ slotName ~
> 				" = Slot(" ~ key ~ ", " ~ value ~ ");";
> 		}
>
> 		string mainTable = q{
> 			immutable Slot*[] aaTable = {
> 				&aaSlot0,
> 				&aaSlot1,
> 				... // generate these programmatically
> 			};
> 		};
>
> 		return slotVars ~ mainTable;
> 	}
>
> 	// Instantiate AA here, at compile time.
> 	mixin(aaComputeAASlots(myLiteral));
>
> The mixin string would look something like this:
>
> 	"immutable Slot aaSlot0 = Slot(..., ...);
> 	 immutable Slot aaSlot1 = Slot(..., ...);
> 	 immutable Slot*[] aaTable = {
> 	 	&aaSlot0, &aaSlot1, ...
> 	 }"
>
> The question is, will the compile be able to resolve &aaSlot0, 
> &aaSlot1,
> etc., at compile-time?
>
> Obviously, this code works if you copy-n-paste it into a .d 
> source file.
> It also works if I write it as a string literal, then use 
> mixin().
> However, on DMD, if the string is returned by a CTFE function 
> that
> builds it using ~, for some reason all the pointers in aaTable 
> are null.
> Strangely enough, if I assign the returned string to a string 
> variable
> and print it out at runtime, the correct string is produced.
>
> So I concluded that somehow, DMD was unable to correctly 
> interpret
> &aaSlot0 inside a mixin string at compile-time.
>
>
> T
January 10, 2013
Re: Official DMD compiler written in D
Eep, seems I hit send...

On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> The basic idea behind static AA literals is to use CTFE to 
> compute the hashes of the keys, and therefore which slot(s) 
> they will fall in, at compile time. Armed with this 
> information, we can create an array of slots at compile-time 
> that contains the AA entries by declaring each slot as a static 
> variable and using the slot assignment information to 
> initialize the hash table (array of pointers to slots) to point 
> to these slots.

 Hmmm somehow that doesn't seem like a good idea; I mean it will 
work....

 Alternative is to sorta have a pair of static arrays, then 
either use a balanced tree, or a modulus to best hold (and 
separate) the values.

 Modulus based: It could be like...

  //T obviously replace with appropriate type
  ref T AA_lookup(name publicName)(uint hash) {
    //offset for part1
    //could be external to..
    immutable static int offsets[];  //no need for pointers, 
right?
    immutable static T values[]; //contains actual item data

    enum mod; //divider for table.
    uint result = hash % mod;

    //or branch to return empty? But can't be ref then...
    assert(offsets[result] != -1);

    return values[offsets[result]];
  }

 Then your CTFE increases the mod in it's calculations until 
every element can fit only once, and make the offsets based on 
them. Default values would be -1 (range error if called), course 
if it's mod is rather large and wastes a lot of space then 
perhaps it falls back on the tree structure (in those cases, > 3x 
of array size)


 Tree: I don't mind a minor performance hit, log(n) is a very 
small hit in my mind. In that case a statically created tree is 
rather easy. AA's created ahead of time are likely rather small 
(<1 Million elements). So a tree structure of:

 struct Node(T) {
   //pointers don't seem like they are needed;
   //Can even be ushorts or ubytes if small enough.
   uint left, right;
   T* value;
 }

 You get the idea.
January 10, 2013
Re: Official DMD compiler written in D
On Thu, Jan 10, 2013 at 08:38:08PM +0100, Era Scarecrow wrote:
>  Eep, seems I hit send...
> 
> On Thursday, 10 January 2013 at 05:47:01 UTC, H. S. Teoh wrote:
> >The basic idea behind static AA literals is to use CTFE to compute
> >the hashes of the keys, and therefore which slot(s) they will fall
> >in, at compile time. Armed with this information, we can create an
> >array of slots at compile-time that contains the AA entries by
> >declaring each slot as a static variable and using the slot
> >assignment information to initialize the hash table (array of
> >pointers to slots) to point to these slots.
> 
>  Hmmm somehow that doesn't seem like a good idea; I mean it will
> work....
> 
>  Alternative is to sorta have a pair of static arrays, then either
> use a balanced tree, or a modulus to best hold (and separate) the
> values.
[...]

The issue here is that I wanted to avoid the complication of having a
separate set of functions to lookup the AA literal vs. a runtime AA (and
trust me, things get *very* hairy if you go that route -- it requires
non-trivial compiler hacks to make things work with two distinct AA
types), so whatever is generated at compile-time must be the same
structure as runtime AA's so that the same AA methods will work on both.

If this is unimportant, then I would've done it another way.


T

-- 
MACINTOSH: Most Applications Crash, If Not, The Operating System Hangs
January 10, 2013
Re: Official DMD compiler written in D
09-Jan-2013 15:05, Timon Gehr пишет:
> On 01/08/2013 10:06 PM, Philippe Sigaud wrote:
>> ...
>>
>> Isn't SDC also in D? (Bernard Helyer and friends)
>> https://github.com/bhelyer/SDC
>>
>>
>> Also, Timon Gehr spoke of his own front-end (assumed to be in D) in the
>> past, but did not provide any link to it.
>>
>
> Yes, it is in D. Nothing is released yet. It needs to be polished a
> little so that there are no known embarrassing shortcomings anymore.
> (eg. the parser cannot parse extern(...) declarations in alias
> declarations yet, and I need to finish making a minor tweak to how
> template instances are analyzed in order to get circular dependency
> detection to work reliably. Furthermore, examples like the following are
> currently rejected, while I want it to work:

[snip]

> CTFE is basically done (as a portable byte code interpreter, but other
> strategies, such as JIT, could be easily plugged). This is a snippet of
> my regression test suite:
>
> auto dynRangePrimes(){
>      DynRange!int impl(int start)=>
>
> dynRange(cons(start,delay(()=>filter!(a=>a%start)(impl(start+1)))));
>      return impl(2);
> }
>
> static assert(array(take(dynRangePrimes(), 20)) ==
> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>
> )

Cool! I'd love to take even the very preliminary peek at the speed 
profile of this CTFE engine.

If you are interested I'd love to test a small (the code though contains 
a lot of static data) CTFE benchmark that is the bottleneck in the 
compilation of current ctRegex.

See the attachment here:
http://d.puremagic.com/issues/show_bug.cgi?id=7442

would be nice to see some rough numbers for this vs DMD.

> I also need to decide on a licence. I assume that the alpha will be out
> in late spring. (I am busy until early spring.)
>
Eagerly waiting to play with it.

-- 
Dmitry Olshansky
1 2 3 4 5
Top | Discussion index | About this forum | D home