AA with complex keytype? (page 2)

Stewart Gordon wrote: > Manfred Nowak Wrote: > >> Frits van Bommel wrote > <snip> >>> (do current AAs use unbalanced trees? I have no idea, I just know they use _some_ form of trees) >> They are unbalanced. Look at phobos/internal/aaA.d:278 and following lines. > > Balancing the trees on the fly would be inefficient. But rehash ought to balance them while it's at it. There are several well-known methods of efficiently keeping binary trees (reasonably) balanced. Off the top of my head there are AVL trees and red-black trees, but those definitely aren't the only ones. For implementing a hash-table bucket this is probably not necessary, though. There typically shouldn't be very much elements in a bucket.

February 10, 2007

Re: AA with complex keytype?

Posted by Bill Baxter
in reply to Frits van Bommel

Permalink

Bill Baxter

Posted in reply to Frits van Bommel

Permalink

Frits van Bommel wrote:
> Stewart Gordon wrote:
>> Manfred Nowak Wrote:
>>
>>> Frits van Bommel wrote
>> <snip>
>>>> (do current AAs use unbalanced trees?  I have no idea, I just know they use _some_ form of trees)
>>> They are unbalanced.  Look at phobos/internal/aaA.d:278 and following lines.
>>
>> Balancing the trees on the fly would be inefficient.  But rehash ought to balance them while it's at it.

> There are several well-known methods of efficiently keeping binary trees (reasonably) balanced. Off the top of my head there are AVL trees and red-black trees, but those definitely aren't the only ones.
> 
> For implementing a hash-table bucket this is probably not necessary, though. There typically shouldn't be very much elements in a bucket.

So much for O(N lg N) performance in the case of a really bad hash function, though.  You'll get O(N) if you happen to insert your values such that their hashes are strictly increasing or strictly decreasing. In theory anyway. ;-)  I wouldn't be surprised though if in practice the simplicity of leaving out balancing outweighs the theoretical advantage of balanced trees in 95% of the cases.

Someone who has too much time on their hands should take a look at what the code for Python dicts and Lua tables do.  Both of those are reputed to be heavily tweaked for super-duper performance.

--bb

"Bill Baxter" <dnewsgroup@billbaxter.com> wrote in message news:eqkmg1$hnb$1@digitaldaemon.com... > Someone who has too much time on their hands should take a look at what the code for Python dicts and Lua tables do. Both of those are reputed to be heavily tweaked for super-duper performance. Lua tables use (according to the source) "a mix of chained scatter table with Brent's variation." The insert method comment reads: /* ** inserts a new key into a hash table; first, check whether key's main ** position is free. If not, check whether colliding node is in its main ** position or not: if it is not, move colliding node to an empty place and ** put new key in its main position; otherwise (colliding node is in its main ** position), new key goes to an empty position. */ It's basically using a variation on separate chaining. Additionally, Lua 5 tables will attempt to keep integer keys in a true array rather than in the hash, which improves performance when using tables as arrays, but that's not really necessary in a language that has true arrays like D ;)

Frits van Bommel wrote > So the implementation needs to be careful about parameter order, That's right. The value stemming from the collision bucket has to be at a fixed parameter position on all calls of opCmp. But this can be easily verified by inserting and retrieving two elements under a worthless toHash. > Since the API doesn't specify this That's also right. But this also can be fixed easily. -manfred

Stewart Gordon wrote: > <snip> > > So if x != y, then both x < y and y < x? That wouldn't make sense at all. > >> It should be easy to implement this as the default opCmp(Object). > > OUAT, Object.opCmp was set up to compare the memory addresses, but this behaviour was removed to prepare for copying/compacting GC, and probably partly to eliminate the confusing behaviour it created. > > Stewart. > Actually, there are many cases where modular arithmetic to various bases is the desired behavior, and in such a system "both x < y and y < x" makes perfect sense. You can reach ANY number by repeatedly incrementing and also by repeatedly decrementing. OTOH, I'm not certain that these are worth building into a language. Ada thought so, but no other language appears to have followed their lead. (Possibly it was also done in Algol68 ... I never used it, but it contained all sorts of experimental things.)

Charles D Hixson wrote: > Stewart Gordon wrote: >> So if x != y, then both x < y and y < x? That wouldn't make sense at all. >> > > Actually, there are many cases where modular arithmetic to various bases is the desired behavior, and in such a system "both x < y and y < x" makes perfect sense. You can reach ANY number by repeatedly incrementing and also by repeatedly decrementing. Mathematically speaking, a group like ℤ₃ is unordered, precisely for the reason given. If we need to impose an ordering for array purposes (I don’t understand that bit, just (mis)repeating something I°ve seen said on the subject), then give the ordering lexicographic properties. (x < y) && (y < x) should not ever return true. --Joel

Charles D Hixson wrote: > Actually, there are many cases where modular arithmetic to various bases is the desired behavior, and in such a system "both x < y and y < x" makes perfect sense. You can reach ANY number by repeatedly incrementing and also by repeatedly decrementing. What's about this: a = 1+0j b = 0+1j how would you increment a to finnally get to b? how do you want to compare two complex numbers? this is mathematically impossible. Nicolai

Nicolai Waniek wrote > What's about this: > > a = 1+0j > b = 0+1j > > how would you increment a to finnally get to b? how do you want to compare two complex numbers? this is mathematically impossible. It depends on how one defines "increment". Because Charles only mentions one operation, "increment" can be defined to be the normal complex multiplication. Then "increment( a, b)= b" holds in your case above. If one wants to define an ordering for all points on the unit-circle around the origin Charles is also righteous to define: (a != b) => (a < b) & (b < a) Because you can travel on the circumference of the circle in each direction until "a != b" turns out to be false. -manfred

Joel C. Salomon wrote > Mathematically speaking, a group like â„¤â‚ƒ is unordered, precisely for the reason given. If we need to impose an ordering for array purposes (I donâ€™t understand that bit, just (mis)repeating something IÂ°ve seen said on the subject), then give the ordering lexicographic properties. It is faster to not impose an ordering on an unordered set. -manfred

> > Mathematically speaking, a group like â„¤â‚ƒ is unordered, precisely for the reason given. If we need to impose an ordering for array purposes (I donâ€™t understand that bit, just (mis)repeating something Iâ€™ve seen said on the subject), then give the ordering lexicographic properties. > > It is faster to not impose an ordering on an unordered set. Do you know where D insists that â€œsomeâ€ ordering be applied to a type, and why?

Forums