Jump to page: 1 24  
Page
Thread overview
Re: Replacing AA's in druntime
Mar 14, 2012
H. S. Teoh
Mar 14, 2012
H. S. Teoh
Mar 14, 2012
Daniel Murphy
Standalone AA implementation ready for review (Was: Re: Replacing AA's in druntime)
Mar 14, 2012
H. S. Teoh
Re: Standalone AA implementation ready for review (Was: Re: Replacing
Mar 15, 2012
bearophile
Re: Standalone AA implementation ready for review (Was: Re: Replacing
Mar 15, 2012
bearophile
Mar 15, 2012
James Miller
Re: Standalone AA implementation ready for review
Mar 15, 2012
bearophile
Mar 15, 2012
H. S. Teoh
Mar 15, 2012
H. S. Teoh
Mar 15, 2012
H. S. Teoh
Re: Standalone AA implementation ready for review
Mar 15, 2012
bearophile
Mar 15, 2012
foobar
Mar 15, 2012
Jacob Carlborg
Mar 15, 2012
Jacob Carlborg
Mar 15, 2012
Jacob Carlborg
Mar 15, 2012
foobar
Mar 16, 2012
Jonathan M Davis
Mar 16, 2012
Jacob Carlborg
Mar 15, 2012
Don Clugston
Mar 15, 2012
H. S. Teoh
Mar 15, 2012
Dmitry Olshansky
Mar 15, 2012
H. S. Teoh
Mar 15, 2012
H. S. Teoh
Mar 16, 2012
H. S. Teoh
Re: Standalone AA implementation ready for review (Was: Re:
Mar 16, 2012
bearophile
Mar 16, 2012
H. S. Teoh
March 14, 2012
On Tue, Mar 13, 2012 at 09:30:45PM -0500, Andrei Alexandrescu wrote:
> On 3/13/12 7:54 PM, H. S. Teoh wrote:
> >Hi all,
> >
> >My AA implementation is slowly inching closer to being ready to replace aaA.d.
> 
> Great! This will need compiler restructuring, and in fact offers the perfect opportunity for it. I suggest you to post your implementation here for review first, and assume only the minimal lowerings from the compiler.

Currently I have it as a struct that implements AA functionality without needing special support from the compiler (i.e., as a template library). The only thing that needs compiler support is AA literals and internal mapping between V[K] and AssociativeArray!(K,V) for nicer syntax and error messages.

I didn't do anything fancy with the implementation, just used the same algorithms as the current aaA.d, the idea being that fancy stuff can be added later once we have successfully extricated AA's from internal compiler dependencies.

The current code is already nicer in that it no longer needs the void* casts and typeinfo's now that key/value types are directly accessible. This will allow us to address some fundamental issues with aaA.d such as key/value types that require deep traversal of references (the current implementation of AA.toHash is broken partly because of this). Of course, performance issues also need to be addressed eventually, such as possible optimizations when key/value types have no external references, in which case some operations can use byte-level representations instead. But IMO this should be postponed till later; the main thing right now is to fix up the compiler. Once AA's are completely in object.d, optimizations and other fixes/improvements will be much easier.


> Here's something that I thinks we should have:
> 
> int[string] aa;
> char[] b;
> ...
> aa[b] = 42;
> 
> The implementation should be clever enough to work, and only duplicate the string if it wasn't already present.
[...]

Hmm, this is something nice to have. I guess get() and opIndex*() should
allow any type that is comparable with the key type? Now that the
implementation is in a template, this should be much easier to do. :)

But while we're at it, one thing that I'd like to see is that AA key types should be *implicitly* immutable. It makes no sense for AA keys to be mutable, and it's syntactically painful to keep typing stuff like string[immutable(int)[]]. Especially when mutable key types *never* make sense anyway. So why not translate string[int[]] into string[immutable(int)[]] implicitly? Thoughts?


T

-- 
INTEL = Only half of "intelligence".
March 14, 2012
On Wed, Mar 14, 2012 at 12:37:08PM +1100, Daniel Murphy wrote:
> Welcome to Hell. =D

Ahhhahahaha... sounds like I leapt into the deep end of the pool without knowing it. :-P


> Some of the things you can do with AAs are recognized by the compiler during semantic and turned into druntime calls, sometimes the constructs survive all the way to the glue layer (e2ir) and are turned into druntime calls there and sometimes the type of an expressions is magically rewritten to AssociativeArray and the methods are looked up normally.  (this one caused problems with literals)
> 
> The type needs to stay as V[K] _not_ AssociativeArray, so that error messages work properly.  Something needs to be done about literals too...  Don't forget template arg deduction!
> 
> There's a function AAGetSym (or something like that) that can be searched for to find where dmd emits druntime calls, but there might be other places it generates them.
[...]

Alright. So now I have to dig into dmd internals to do what I want. Maybe I should complete the template implementation first before tackling this stuff. Sounds nasty. :-P  (But then again, I *did* have to deal with over-engineered C++ in the past, which at one point required making an IPC call via 6 levels of abstraction, one of which involved fork(), fwrite() and fread(). I doubt dmd attains to that level of evil.)


T

-- 
Жил-был король когда-то, при нём блоха жила.
March 14, 2012
"H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.651.1331696880.4860.digitalmars-d@puremagic.com...
> On Wed, Mar 14, 2012 at 12:37:08PM +1100, Daniel Murphy wrote:
>> Welcome to Hell. =D
>
> Ahhhahahaha... sounds like I leapt into the deep end of the pool without knowing it. :-P
>
>

Yeah, the AA implementation is the deepest part I've found in dmd.

>> Some of the things you can do with AAs are recognized by the compiler during semantic and turned into druntime calls, sometimes the constructs survive all the way to the glue layer (e2ir) and are turned into druntime calls there and sometimes the type of an expressions is magically rewritten to AssociativeArray and the methods are looked up normally.  (this one caused problems with literals)
>>
>> The type needs to stay as V[K] _not_ AssociativeArray, so that error messages work properly.  Something needs to be done about literals too...  Don't forget template arg deduction!
>>
>> There's a function AAGetSym (or something like that) that can be searched for to find where dmd emits druntime calls, but there might be other places it generates them.
> [...]
>
> Alright. So now I have to dig into dmd internals to do what I want. Maybe I should complete the template implementation first before tackling this stuff. Sounds nasty. :-P  (But then again, I *did* have to deal with over-engineered C++ in the past, which at one point required making an IPC call via 6 levels of abstraction, one of which involved fork(), fwrite() and fread(). I doubt dmd attains to that level of evil.)
>
>
> T

Probably a good idea to finish it first.  It might be possible to just recognize AssociativeArray when printing error messages, and let the existing struct template code handle everything else.  Ideally syntax and error messages/stringof/pragma msg should be the only places where the compiler is actually aware AAs exist.  But this might not be that easy...


March 14, 2012
On Tue, Mar 13, 2012 at 09:30:45PM -0500, Andrei Alexandrescu wrote:
> On 3/13/12 7:54 PM, H. S. Teoh wrote:
> >Hi all,
> >
> >My AA implementation is slowly inching closer to being ready to replace aaA.d.
> 
> Great! This will need compiler restructuring, and in fact offers the perfect opportunity for it. I suggest you to post your implementation here for review first, and assume only the minimal lowerings from the compiler.

Alright, I've finished the basic functionality of my AA implementation. I still haven't solved that problem with using suffixless string literals to index X[dstring], so you'll have to write aa["abc"d] instead of just aa["abc"]. But I thought I should post my code now so that people can take a look at it:

	https://github.com/quickfur/New-AA-implementation/blob/master/newAA.d

Currently, this code is still standalone, not integrated with druntime yet.  Since that will require compiler changes and is potentially a very big change, I've decided to polish up the standalone version as much as possible before attempting druntime integration.

There are still some outstanding issues:

- As mentioned above, wstring or dstring keys (in fact any array keys
  with literals that doesn't default to the key type) need an explicit
  suffix when indexing with literals. I haven't figured out how to make
  this work yet.  At the very least, wstring and dstring needs to
  support suffixless key literals; it would be nice if a general
  solution could be found for all array-typed keys.

- Declaring an AA with non-const array keys will cause reams and reams
  of compile errors. I'm not *too* worried about this at the moment
  since it doesn't make sense to have non-const AA keys anyway. I'm also
  seriously considering forcing *all* AA keys to be immutable, in which
  case this will become a non-issue.

- Some code (most notably that table of primes that I hoisted from
  aaA.d) may need to be refactored to prevent excessive code bloat when
  the AA template is instantiated.

- Array literals don't work yet (requires compiler support).

- X[Y] syntax doesn't work yet (requires druntime integration & compiler
  support); you have to explicitly declare AssociativeArray!(Y,X).

- The use of typeid.getHash. This is probably OK, except that it forces
  a lot of AA methods not to be pure nothrow @safe. It would also be
  nice if there was a uniform syntax for getting the hash of a type
  without going through typeid (may be more efficient?).

- I haven't stress-tested AA performance yet. (Any volunteers? ;-)) Some
  improvement in this area will probably be needed, though we may not
  need to address that until this is ready for druntime integration.

- (There's also some temporary development-only code, syntactic sugar
  aliases, that shouldn't end up in object_.d -- those will be removed
  when I'm ready for druntime integration. There's also some code that
  needs a bit of cleanup.)

In spite of its flaws, this implementation already addresses quite a few AA-related issues in the bugtracker, as verified by a bunch of unittests (search for "Issue" in the code).

Comments? Flames? ;-) Pull requests?? :-D


T

-- 
It is the quality rather than the quantity that matters. -- Lucius Annaeus Seneca
March 15, 2012
H. S. Teoh:

> - As mentioned above, wstring or dstring keys (in fact any array keys
>   with literals that doesn't default to the key type) need an explicit
>   suffix when indexing with literals. I haven't figured out how to make
>   this work yet.  At the very least, wstring and dstring needs to
>   support suffixless key literals; it would be nice if a general
>   solution could be found for all array-typed keys.

This might be a problem worth fixing generally, if possible, so it's solved for other library-defined collections too.


> I'm also
>   seriously considering forcing *all* AA keys to be immutable, in which
>   case this will become a non-issue.

I am for not mutable keys.


> - Some code (most notably that table of primes that I hoisted from
>   aaA.d) may need to be refactored to prevent excessive code bloat when
>   the AA template is instantiated.

Ah, another usage for my "static static" (or for the @templated()). I'll count how many times they come out useful :-)


> It would also be
>   nice if there was a uniform syntax for getting the hash of a type
>   without going through typeid (may be more efficient?).

I agree!


> - I haven't stress-tested AA performance yet. (Any volunteers? ;-))

I have several benchmarks, but you can start quickly adapting this one, comparing the performance of the built-in AAs with your ones:
http://rosettacode.org/wiki/Sokoban#D

Bye,
bearophile
March 15, 2012
H. S. Teoh:

> - I haven't stress-tested AA performance yet. (Any volunteers? ;-))

I have just seen that the performance of the Sokoban solver is unchanged.


> In spite of its flaws, this implementation already addresses quite a few AA-related issues in the bugtracker,

How efficiently is this AA working with keys like:
AssociativeArray!(immutable char[60], bool) aa;
Built-in AAs become snail-slow with such keys.

Bye,
bearophile
March 15, 2012
On 15 March 2012 13:47, bearophile <bearophileHUGS@lycos.com> wrote:
> H. S. Teoh:
>
>> - I haven't stress-tested AA performance yet. (Any volunteers? ;-))
>
> I have just seen that the performance of the Sokoban solver is unchanged.
>
>
>> In spite of its flaws, this implementation already addresses quite a few AA-related issues in the bugtracker,
>
> How efficiently is this AA working with keys like:
> AssociativeArray!(immutable char[60], bool) aa;
> Built-in AAs become snail-slow with such keys.
>
> Bye,
> bearophile

Hmm, I might code up a stress-test suite for this if I get time/bored. Any tips on where to start? Obviously you want things like lots of entries, but is it important to test big Keys, particular types of keys, that sort of thing.

I'll also probably do separate speed and memory benchmarks.

--
James Miller
March 15, 2012
James Miller:

> Any tips on where to start?

You have a lot of search space to cover, so before doing a systematic search that requires a lot of time I suggest a quick random sampling of the search space. If there are significantly wide performance inefficiencies you will hit them in less than one hour of tests. Such approach is sometimes missed even by professional engineers :-)

I have already given a link to a game solver that uses associative arrays a lot.

Then, this is my currently "best short solution" of one of the Shootout benchmarks that stresses only hashing. The input data comes from the Fasta benchmark (just use the Python or C++ Fasta code if you want the test data):
http://codepad.org/ZSlrcuad

I have few other benchmarks.

Bye,
bearophile
March 15, 2012
On Wed, Mar 14, 2012 at 08:47:23PM -0400, bearophile wrote:
> H. S. Teoh:
> 
> > - I haven't stress-tested AA performance yet. (Any volunteers? ;-))
> 
> I have just seen that the performance of the Sokoban solver is unchanged.
> 
> 
> > In spite of its flaws, this implementation already addresses quite a few AA-related issues in the bugtracker,
> 
> How efficiently is this AA working with keys like:
> AssociativeArray!(immutable char[60], bool) aa;
> Built-in AAs become snail-slow with such keys.
[...]

Well, I didn't invent any clever new algorithms or anything like that, I more-or-less followed the same implementation as aaA.d.  So I don't think there will be any great performance gains. I'm more concerned about drastic performances *losses* caused by code that I didn't realize would cause a problem, perhaps an overlooked inefficiency that crept in when I was translating the AA code into template-based code.


T

-- 
Being able to learn is a great learning; being able to unlearn is a greater learning.
March 15, 2012
On 3/14/12 6:16 PM, H. S. Teoh wrote:
> - Declaring an AA with non-const array keys will cause reams and reams
>    of compile errors. I'm not *too* worried about this at the moment
>    since it doesn't make sense to have non-const AA keys anyway. I'm also
>    seriously considering forcing *all* AA keys to be immutable, in which
>    case this will become a non-issue.

I think the built-in associative array must allow non-constant keys. As long as there's no memory safety issue, the AA should work with types that the user doesn't change for comparison purposes but can otherwise modify.

A practical matter is that if we introduce this restriction we'll break a ton of code.


Andrei
« First   ‹ Prev
1 2 3 4