July 31, 2013
Ary Borenszweig:

> However in Ruby 1.9 they are ordered.
>
> You only need to store in each bucket entry a pointer to the next bucket. Then traversing the hash is super fast (just follow the links) while preserving insertion order. Accessing by key still uses the buckets and entries (if that's how AAs are implemented in D).

If it's a singly linked list how do you remove a key from the associative array keeping the list coherent? How do you know where is the precedent to remove an item from the linked list?
Using the 'deleted' tag on AA items is enough to skip the deleted items when you scan the singly linked list, but when you re-use that space I think you mess things up. One way to solve it is to never re-use locations, but this is bad if you want to add and remove many items.


> I don't think adding a pointer to each bucket entry hurts performance that much, and having traversal by insertion order is pretty common (in my opinion).

Python has such data structure in the standard library, and thanks to dynamic typing it's transparently compatible with associative arrays, so when you need it you have it.
I have often needed such data structure, but it's not always needed, I need it probably less than 5% of the times.
Python uses associative arrays all the time, they must be as fast as possible, so it's a good idea to not add that feature to the standard associative arrays. This is true for the built-in associative arrays too, despite they should be designed for flexibility first and despite D doesn't use associative arrays for basic language purposes, as Python does (symbol lookup for variables!).

I have added a small enhancement request to Bugzilla:
http://d.puremagic.com/issues/show_bug.cgi?id=10733

Bye,
bearophile
July 31, 2013
On 7/31/13 12:56 PM, H. S. Teoh wrote:
> On Wed, Jul 31, 2013 at 12:51:25PM -0300, Ary Borenszweig wrote:
>> On 7/31/13 12:05 PM, bearophile wrote:
>>> Daniel Kozak:
>>>
>>>> is there a way for AA to behave same as PHP?
>>>
>>> D associative arrays are based on hashing, so they do not keep the
>>> order of the items. This is true in Python too. The Python standard
>>> library has a hash-based ordered dict that keeps the insertion order
>>> of the keys. There is no such data structure in Phobos.
>>
>> However in Ruby 1.9 they are ordered.
>>
>> You only need to store in each bucket entry a pointer to the next
>> bucket. Then traversing the hash is super fast (just follow the
>> links) while preserving insertion order. Accessing by key still uses
>> the buckets and entries (if that's how AAs are implemented in D).
>>
>> I don't think adding a pointer to each bucket entry hurts
>> performance that much, and having traversal by insertion order is
>> pretty common (in my opinion).
> [...]
>
> It does. What do you do when you delete entries?
>
>
> T
>

Sorry, I meant doubly-linked list. But yeah, two pointers seem now too much overhead.
August 22, 2013
On Wednesday, 31 July 2013 at 14:55:55 UTC, Dicebot wrote:
> On Wednesday, 31 July 2013 at 14:43:21 UTC, Daniel Kozak wrote:
>> is there a way for AA to behave same as PHP?
>
> I doubt it. This snippet suggests that AA's in PHP are not simply AA's and do additionally track insertion order (or use some similar trick). Data in associative arrays is organized in a way that allows fast key lookup and iterating in an original order requires tracking extra state. Contrary to PHP, D does care about performance so I see no way this can happen automatically.

I implemented my own OrderedAA type, which is able to preserve insert order and is a little faster and take less memory(30%) then the builtin one :-).

But definitly there will be some cases where my own implementation doesn't beat the builtin one.
August 22, 2013
On Thursday, 22 August 2013 at 07:59:05 UTC, Daniel Kozak wrote:
> On Wednesday, 31 July 2013 at 14:55:55 UTC, Dicebot wrote:
>> On Wednesday, 31 July 2013 at 14:43:21 UTC, Daniel Kozak wrote:
>>> is there a way for AA to behave same as PHP?
>>
>> I doubt it. This snippet suggests that AA's in PHP are not simply AA's and do additionally track insertion order (or use some similar trick). Data in associative arrays is organized in a way that allows fast key lookup and iterating in an original order requires tracking extra state. Contrary to PHP, D does care about performance so I see no way this can happen automatically.
>
> I implemented my own OrderedAA type, which is able to preserve insert order and is a little faster and take less memory(30%) then the builtin one :-).
>
> But definitly there will be some cases where my own implementation doesn't beat the builtin one.

Nice to hear. Perhaps you could share the implementation with us? Even if its not perfect, I'm curious to know how you went at it.
August 22, 2013
On Thursday, 22 August 2013 at 08:41:13 UTC, monarch_dodra wrote:
> On Thursday, 22 August 2013 at 07:59:05 UTC, Daniel Kozak wrote:
>> On Wednesday, 31 July 2013 at 14:55:55 UTC, Dicebot wrote:
>>> On Wednesday, 31 July 2013 at 14:43:21 UTC, Daniel Kozak wrote:
>>>> is there a way for AA to behave same as PHP?
>>>
>>> I doubt it. This snippet suggests that AA's in PHP are not simply AA's and do additionally track insertion order (or use some similar trick). Data in associative arrays is organized in a way that allows fast key lookup and iterating in an original order requires tracking extra state. Contrary to PHP, D does care about performance so I see no way this can happen automatically.
>>
>> I implemented my own OrderedAA type, which is able to preserve insert order and is a little faster and take less memory(30%) then the builtin one :-).
>>
>> But definitly there will be some cases where my own implementation doesn't beat the builtin one.
>
> Nice to hear. Perhaps you could share the implementation with us? Even if its not perfect, I'm curious to know how you went at it.

Yes, it should not be a problem. This evening I should have some more spare time. So I do some cleanups and push it somewhere.
August 23, 2013
When I do some cleanups, I find some bugs and after fixing them, my own OrderedAA implementation does not beat builtin hash anymore :(. But speed is still really near to builtin AA's.

On Thursday, 22 August 2013 at 07:59:05 UTC, Daniel Kozak wrote:
> On Wednesday, 31 July 2013 at 14:55:55 UTC, Dicebot wrote:
>> On Wednesday, 31 July 2013 at 14:43:21 UTC, Daniel Kozak wrote:
>>> is there a way for AA to behave same as PHP?
>>
>> I doubt it. This snippet suggests that AA's in PHP are not simply AA's and do additionally track insertion order (or use some similar trick). Data in associative arrays is organized in a way that allows fast key lookup and iterating in an original order requires tracking extra state. Contrary to PHP, D does care about performance so I see no way this can happen automatically.
>
> I implemented my own OrderedAA type, which is able to preserve insert order and is a little faster and take less memory(30%) then the builtin one :-).
>
> But definitly there will be some cases where my own implementation doesn't beat the builtin one.
August 23, 2013
https://github.com/Kozzi11/Trash/blob/master/util/orderedaa.d

On Thursday, 22 August 2013 at 08:41:13 UTC, monarch_dodra wrote:
> On Thursday, 22 August 2013 at 07:59:05 UTC, Daniel Kozak wrote:
>> On Wednesday, 31 July 2013 at 14:55:55 UTC, Dicebot wrote:
>>> On Wednesday, 31 July 2013 at 14:43:21 UTC, Daniel Kozak wrote:
>>>> is there a way for AA to behave same as PHP?
>>>
>>> I doubt it. This snippet suggests that AA's in PHP are not simply AA's and do additionally track insertion order (or use some similar trick). Data in associative arrays is organized in a way that allows fast key lookup and iterating in an original order requires tracking extra state. Contrary to PHP, D does care about performance so I see no way this can happen automatically.
>>
>> I implemented my own OrderedAA type, which is able to preserve insert order and is a little faster and take less memory(30%) then the builtin one :-).
>>
>> But definitly there will be some cases where my own implementation doesn't beat the builtin one.
>
> Nice to hear. Perhaps you could share the implementation with us? Even if its not perfect, I'm curious to know how you went at it.
August 23, 2013
On Fri, Aug 23, 2013 at 01:54:27PM +0200, Daniel Kozak wrote:
> https://github.com/Kozzi11/Trash/blob/master/util/orderedaa.d
[...]

Neat!

Is a doubly-linked list really necessary for the hash buckets? You could save on some memory & improve performance slightly if you use a singly-linked list for the buckets, but still keep the left/right pointers for retaining insertion order. It will also simplify your code a bit.


T

-- 
If Java had true garbage collection, most programs would delete themselves upon execution. -- Robert Sewell
August 23, 2013
H. S. Teoh:
> ...

On this subject, I added this request to Bugzilla:
http://d.puremagic.com/issues/show_bug.cgi?id=10733

Bye,
bearophile
August 23, 2013
> H. S. Teoh:
>> ...

Your answer was really too much short, so I have asked you  a question in that enhancement request :-)

Bye,
bearophile