August 15, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #10 from bearophile_hugs@eml.cc 2013-08-15 10:49:03 PDT ---
(In reply to comment #9)

> Actually, that is not undefined. AA's are designed such that inserting new elements does not invalidate pointers to existing elements.

I didn't know this. Is this stated somewhere in the D specs?


> This holds even in the event of a rehash,

Associative arrays have to grow when you keep adding key-value pairs, I presume this is done allocating a new larger hash (probably 2 or 4 times larger), and copying data in it. In such situation aren't the pointers to the items becoming invalid? Even if the doubling is done with a realloc, it can sometimes not be able to reallocate in place.

To test my theory I have written a small test program:


void main() {
    enum size_t N = 1_000_000;
    bool[immutable uint] aa;
    auto pointers = new void*[N];

    foreach (immutable uint i; 0 .. N) {
        aa[i] = true;
        pointers[i] = i in aa;
    }

    foreach (immutable uint i; 0 .. N)
        assert(pointers[i] == (i in aa));
}


It gives no errors, so I am not understanding something. But are D specs asserting this program will work in all D implementations?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
August 15, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #11 from hsteoh@quickfur.ath.cx 2013-08-15 12:03:49 PDT ---
(In reply to comment #10)
[...]
> Associative arrays have to grow when you keep adding key-value pairs, I presume this is done allocating a new larger hash (probably 2 or 4 times larger), and copying data in it. In such situation aren't the pointers to the items becoming invalid? Even if the doubling is done with a realloc, it can sometimes not be able to reallocate in place.

The reason it works, is because the hash table itself doesn't contain the actual key/value pairs; it just contains pointers to linked-lists of these key/value pairs. So the hash table can be modified however you like, but the key/value pairs stays in the same memory address.

This would work even if we used something other than linked-lists for the key/value pairs, e.g., trees, because the key/value pairs would just have some pointers to neighbouring nodes, and during AA rehash (or add/delete) all that happens is that some of these pointers get reassigned, but the node itself (containing the key/value pair) remains in the same memory address. This kind of implementation avoids copying/moving of keys and values, so I'd expect any good AA implementation to do something similar.

I'm pretty sure that it's generally expected that AA implementations should obey the principle that iterators (i.e. pointers to key/value) are not invalidated by add/delete, otherwise it would greatly reduce the usefulness of AA's. I'm not too sure about this also holding for rehash, but the current AA implementation does indeed preserve references across rehash as well (though it does break iteration order if you trigger a rehash in the middle of iterating over the AA -- but you won't get invalid pointers out of it).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
August 15, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=4475



--- Comment #12 from bearophile_hugs@eml.cc 2013-08-15 12:52:19 PDT ---
(In reply to comment #11)

> the hash table itself doesn't contain the
> actual key/value pairs; it just contains pointers to linked-lists of these
> key/value pairs. So the hash table can be modified however you like, but the
> key/value pairs stays in the same memory address.

I see. But that's just an implementation detail (you could design an AA that keeps small keys-value pairs in an array, plus a pointer to a chain for the collisions, this is how I have created associative arrays in C), D specs can't assert that implementation, so D code that relies on that implementation detail goes into the realm of undefined behavour.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
1 2
Next ›   Last »