Official DMD compiler written in D (page 4) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Official DMD compiler written in D (page 4)

January 10, 2013

Re: Official DMD compiler written in D

Posted by Dmitry Olshansky
in reply to H. S. Teoh

Dmitry Olshansky

Posted in reply to H. S. Teoh

10-Jan-2013 09:45, H. S. Teoh пишет:
> On Thu, Jan 10, 2013 at 04:51:35AM +0100, Timon Gehr wrote:
>> On Wednesday, 9 January 2013 at 18:45:56 UTC, H. S. Teoh wrote:
> [...]
>>> Whoa!!! Now you really caught my interest!
>>>
>>> Can it handle CTFE-computed mixins that declare and reference
>>> variables?  If it can, I will finally be able to write a library AA
>>> implementation that allows static AA literals (no runtime
>>> init/allocation needed)!
>>
>> Probably yes, but I am not sure what you mean. Can you provide an
>> example?
>
> The basic idea behind static AA literals is to use CTFE to compute the
> hashes of the keys, and therefore which slot(s) they will fall in, at
> compile time.

Why not try to make an AA literal a universal entity for all kinds of user-defined associative arrays? Then the actual construction of AA is done via constructor taking some type like AALiteral!(Key, Value) which is a sequence of Key, Value pairs. It could be even presorted.

> Armed with this information, we can create an array of
> slots at compile-time that contains the AA entries by declaring each
> slot as a static variable and using the slot assignment information to
> initialize the hash table (array of pointers to slots) to point to these
> slots.
>
> Now, there are some challenges to be overcome here. For example, the
> slots need to be declared as local variables, and therefore need unique
> names (the hash table array needs to be able to refer to the addresses
> of these variables), so the only way I can think of to do this at
> compile-time is to have the CTFE code generate mixins to declare those
> variables. So here's the tricky part. Suppose you have this CTFE code:
>
> 	// Not the real code, just a rough sketch to give the idea
> 	string aaComputeAASlots(K,V)(K[V] aaLiteral) {
> 		// Create slots
> 		string slotVars;
> 		foreach (key, value; aaLiteral) {
> 			size_t slotNum = hashOf(key) % numKeys;
> 			string slotName = "aaSlot" ~ slotNum;
> 			slotVars ~= "immutable Slot " ~ slotName ~
> 				" = Slot(" ~ key ~ ", " ~ value ~ ");";
> 		}
>
> 		string mainTable = q{
> 			immutable Slot*[] aaTable = {
> 				&aaSlot0,
> 				&aaSlot1,
> 				... // generate these programmatically
> 			};
> 		};
>
> 		return slotVars ~ mainTable;
> 	}
>
> 	// Instantiate AA here, at compile time.
> 	mixin(aaComputeAASlots(myLiteral));
>
> The mixin string would look something like this:
>
> 	"immutable Slot aaSlot0 = Slot(..., ...);
> 	 immutable Slot aaSlot1 = Slot(..., ...);
> 	 immutable Slot*[] aaTable = {
> 	 	&aaSlot0, &aaSlot1, ...
> 	 }"
>
> The question is, will the compile be able to resolve &aaSlot0, &aaSlot1,
> etc., at compile-time?
>
> Obviously, this code works if you copy-n-paste it into a .d source file.
> It also works if I write it as a string literal, then use mixin().
> However, on DMD, if the string is returned by a CTFE function that
> builds it using ~, for some reason all the pointers in aaTable are null.
> Strangely enough, if I assign the returned string to a string variable
> and print it out at runtime, the correct string is produced.
>

Seems like a bug.

> So I concluded that somehow, DMD was unable to correctly interpret
> &aaSlot0 inside a mixin string at compile-time.


-- 
Dmitry Olshansky

January 10, 2013

Re: Official DMD compiler written in D

Posted by Era Scarecrow
in reply to Dmitry Olshansky

Era Scarecrow

Posted in reply to Dmitry Olshansky

On Thursday, 10 January 2013 at 19:57:29 UTC, Dmitry Olshansky wrote:
> Why not try to make an AA literal a universal entity for all kinds of user-defined associative arrays? Then the actual construction of AA is done via constructor taking some type like AALiteral!(Key, Value) which is a sequence of Key, Value pairs. It could be even presorted.

 Only problem I would see with that, is trying to give something to a non-template function.

auto x = AALiteral!(string, string)([]);

void func(string[string] something);

func(x); // x is not type string[string], is type xxx!string,string

 Course that brings up problem(s) with my own idea thrown out there.

 Seems like the problem is just determining storage type, at CTFE it calls one type of constructor while the non-CTFE calls another; So long as the structure is the same then passing it around wouldn't matter.

January 10, 2013

Re: Official DMD compiler written in D

Posted by Era Scarecrow
in reply to Era Scarecrow

Era Scarecrow

Posted in reply to Era Scarecrow

On Thursday, 10 January 2013 at 20:48:12 UTC, Era Scarecrow wrote:

> func(x); // x is not type string[string], is type xxx!string,string

 Thinking about it, if the AA is just syntactical sugar for a struct (say defined in core.memory) which handles that then the problem goes away.

January 10, 2013

Re: Official DMD compiler written in D

Posted by Dmitry Olshansky
in reply to Era Scarecrow

Dmitry Olshansky

Posted in reply to Era Scarecrow

11-Jan-2013 00:49, Era Scarecrow пишет:
> On Thursday, 10 January 2013 at 20:48:12 UTC, Era Scarecrow wrote:
>
>> func(x); // x is not type string[string], is type xxx!string,string
>
>   Thinking about it, if the AA is just syntactical sugar for a struct
> (say defined in core.memory) which handles that then the problem goes away.

+ user defined implicit conversion so that AALiteral(K,V) --*implicitly*!--> AA!(K,V)


The other use case for user-defined implicit conversion were already outlined before so I hope it will make its way in one day.

-- 
Dmitry Olshansky

January 10, 2013

Re: Official DMD compiler written in D

Posted by Era Scarecrow
in reply to Dmitry Olshansky

Era Scarecrow

Posted in reply to Dmitry Olshansky

On Thursday, 10 January 2013 at 22:02:17 UTC, Dmitry Olshansky wrote:
> + user defined implicit conversion so that AALiteral(K,V) --*implicitly*!--> AA!(K,V)
>
>
> The other use case for user-defined implicit conversion were already outlined before so I hope it will make its way in one day.

 I thought I heard talk of possibly moving AA's to a library implementation rather than a built in feature (Still 'built in' technically).

January 10, 2013

Re: Official DMD compiler written in D

Posted by H. S. Teoh
in reply to Era Scarecrow

H. S. Teoh

Posted in reply to Era Scarecrow

On Thu, Jan 10, 2013 at 11:18:07PM +0100, Era Scarecrow wrote:
> On Thursday, 10 January 2013 at 22:02:17 UTC, Dmitry Olshansky wrote:
> >+ user defined implicit conversion so that AALiteral(K,V)
> >--*implicitly*!--> AA!(K,V)
> >
> >
> >The other use case for user-defined implicit conversion were already outlined before so I hope it will make its way in one day.
> 
>  I thought I heard talk of possibly moving AA's to a library
> implementation rather than a built in feature (Still 'built in'
> technically).

That's my plan, and Andrei's hope. But it's not going to be easy. The current AA implementation is split between object_.d and aaA.d in druntime, with bits scattered throughout the compiler. Cleaning all of that up is going to be a big job.

T

-- 
No! I'm not in denial!

January 11, 2013

Re: Official DMD compiler written in D

Posted by Rob T
in reply to Era Scarecrow

Rob T

Posted in reply to Era Scarecrow

On Thursday, 10 January 2013 at 22:18:08 UTC, Era Scarecrow wrote:
> On Thursday, 10 January 2013 at 22:02:17 UTC, Dmitry Olshansky wrote:
>> + user defined implicit conversion so that AALiteral(K,V) --*implicitly*!--> AA!(K,V)
>>
>>
>> The other use case for user-defined implicit conversion were already outlined before so I hope it will make its way in one day.
>
>  I thought I heard talk of possibly moving AA's to a library implementation rather than a built in feature (Still 'built in' technically).

I was wondering the same thing: Why do AA's have to be built directly into the language rather than implemented within the standard library?

AA's may be nice to have, but are they really all that fundamental? For example, everyone uses strings, but I suspect not everyone will use AA's.

It could however be that some fundamental constructs in the D language make use of the built in AA's (tuples?), so if that is the case I can understand why they would be built in, but I do not know if this is the case or not.

On a side note, I'm wondering if this may be a way that can help decide what should or should not be made a built in component of the language, i.e, If the language itself can make good use of the feature internally, then it's a good candidate for being built in, but if not, then it's a good candidate for being left out and instead implemented in the std lib or elsewhere.

--rt

January 11, 2013

Re: Official DMD compiler written in D

Posted by Andrei Alexandrescu
in reply to Rob T

Andrei Alexandrescu

Posted in reply to Rob T

On 1/10/13 4:21 PM, Rob T wrote:
> I was wondering the same thing: Why do AA's have to be built directly
> into the language rather than implemented within the standard library?
>
> AA's may be nice to have, but are they really all that fundamental? For
> example, everyone uses strings, but I suspect not everyone will use AA's.
>
> It could however be that some fundamental constructs in the D language
> make use of the built in AA's (tuples?), so if that is the case I can
> understand why they would be built in, but I do not know if this is the
> case or not.

The only distinguishing feature is literal syntax. Everything else is a mistake in language design or an insufficiency in the state of the art.

> On a side note, I'm wondering if this may be a way that can help decide
> what should or should not be made a built in component of the language,
> i.e, If the language itself can make good use of the feature internally,
> then it's a good candidate for being built in, but if not, then it's a
> good candidate for being left out and instead implemented in the std lib
> or elsewhere.

Long discussion. Very long.


Andrei

January 11, 2013

Re: Official DMD compiler written in D

Posted by Timon Gehr
in reply to Dmitry Olshansky

Timon Gehr

Posted in reply to Dmitry Olshansky

On 01/10/2013 08:52 PM, Dmitry Olshansky wrote:
> 09-Jan-2013 15:05, Timon Gehr пишет:
>> On 01/08/2013 10:06 PM, Philippe Sigaud wrote:
>>> ...
>>>
>>> Isn't SDC also in D? (Bernard Helyer and friends)
>>> https://github.com/bhelyer/SDC
>>>
>>>
>>> Also, Timon Gehr spoke of his own front-end (assumed to be in D) in the
>>> past, but did not provide any link to it.
>>>
>>
>> Yes, it is in D. Nothing is released yet. It needs to be polished a
>> little so that there are no known embarrassing shortcomings anymore.
>> (eg. the parser cannot parse extern(...) declarations in alias
>> declarations yet, and I need to finish making a minor tweak to how
>> template instances are analyzed in order to get circular dependency
>> detection to work reliably. Furthermore, examples like the following are
>> currently rejected, while I want it to work:
>
> [snip]
>
>> CTFE is basically done (as a portable byte code interpreter, but other
>> strategies, such as JIT, could be easily plugged). This is a snippet of
>> my regression test suite:
>>
>> auto dynRangePrimes(){
>>      DynRange!int impl(int start)=>
>>
>> dynRange(cons(start,delay(()=>filter!(a=>a%start)(impl(start+1)))));
>>      return impl(2);
>> }
>>
>> static assert(array(take(dynRangePrimes(), 20)) ==
>> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]);
>>
>> )
>
> Cool! I'd love to take even the very preliminary peek at the speed
> profile of this CTFE engine.
>
> If you are interested I'd love to test a small (the code though contains
> a lot of static data) CTFE benchmark that is the bottleneck in the
> compilation of current ctRegex.
>
> See the attachment here:
> http://d.puremagic.com/issues/show_bug.cgi?id=7442
>
> would be nice to see some rough numbers for this vs DMD.
>

I'll let you know as soon as the example runs. Currently this is blocked by the missing implementation for the following language features:

- import declarations
- UFCS
- Optional parens on function calls
- Struct literals
- Static struct data
- debug declarations
- Some of the built-in array operations
- Missing object.d (no string, size_t, hash_t alias.)
- (static) foreach
- __ctfe

I will prioritize those features. Except import declarations, they are mostly easy to implement, but I haven't gotten around to them yet.

For the meantime, maybe these quick measurements are somewhat useful:

int[] erathos(int x){
	bool[] p;
	for(int i=0;i<=x;i++) p~=true;
	for(int i=3;i*i<=x;i+=2){
		if(p[i]) for(int k=i*i;k<=x;k=k+i) p[k]=false;
	}
	int[] r;
	if(x>=2) r~=2;
	for(int i=3;i<=x;i+=2) if(p[i]) r~=i;
	return r;
}

pragma(msg, "erathos: ",erathos(40000).length);


The frontend (32-bit dmd build, without -inline, otherwise DMD ICEs):
$ time ./d erathos.d
erathos: 4203U

real	0m0.077s
user	0m0.076s
sys	0m0.000s


DMD 2.060 (64 bit):
$ time dmd -o- erathos.d
erathos: 4203u

real	0m2.594s
user	0m0.716s
sys	0m1.696s


...

pragma(msg, "erathos: ",erathos(400000)); // (that is one 0 more)

The frontend:
erathos: 33860U

real	0m0.662s
user	0m0.660s
sys	0m0.000s

DMD: brings down the machine

pragma(msg, "erathos: ",erathos(4000000)); // (yet another 0 more)

The frontend:
erathos: 283146U

real	0m6.867s
user	0m6.832s
sys	0m0.016s

// pragma(msg, "erathos: ",erathos(4000000));
void main(){
    import std.stdio;
    writeln(erathos(4000000).length);
}

$ dmd -O -release -inline -noboundscheck erathos.d && time ./erathos
dmd: module.c:829: void Module::semantic3(): Assertion `semanticstarted == 2' failed.

(I'll see if it also fails with DMD 2.061.)

$ dmd -O -release -noboundscheck erathos.d && time ./erathos
283146

real	0m0.144s
user	0m0.132s
sys	0m0.008s

So CTFE in the front end seems to be ~50 times slower than a optimized DMD build of the same code in this case. But note that it is powered by a simple-minded bytecode interpreter I hacked together mostly during two weekends. (the array append is the one from druntime) A lot more is possible. I guess it is already fast enough to power std.regex.

January 11, 2013

Re: Official DMD compiler written in D

Posted by deadalnix
in reply to H. S. Teoh

deadalnix

Posted in reply to H. S. Teoh

On Thursday, 10 January 2013 at 22:28:46 UTC, H. S. Teoh wrote:
> On Thu, Jan 10, 2013 at 11:18:07PM +0100, Era Scarecrow wrote:
>> On Thursday, 10 January 2013 at 22:02:17 UTC, Dmitry Olshansky
>> wrote:
>> >+ user defined implicit conversion so that AALiteral(K,V)
>> >--*implicitly*!--> AA!(K,V)
>> >
>> >
>> >The other use case for user-defined implicit conversion were
>> >already outlined before so I hope it will make its way in one day.
>> 
>>  I thought I heard talk of possibly moving AA's to a library
>> implementation rather than a built in feature (Still 'built in'
>> technically).
>
> That's my plan, and Andrei's hope. But it's not going to be easy. The
> current AA implementation is split between object_.d and aaA.d in
> druntime, with bits scattered throughout the compiler. Cleaning all of
> that up is going to be a big job.
>

You may want to look at this : http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array

Additionally, can you show me what you have so far ? I'd be interested in using it directly for SDC without using the existing one at all.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation