-
No function to create with desired capability
The initial size is default to 8 and increased of power of 2. There are two problems with this scheme. To populate to size of few thousands, there are too many memory relocations and rebuild the list. The other problem is wasted of unused slots; example a list of 2_100, the list will have 4_096 of slots -
Cascade of collision
If hash causes a collision, it will use the subsequent slot as an overflow. However, if a next item hash points to that subsequent slot and it is already occupied by that overflow item, hence the cascade collision -
Duplicate hash calculation
For a key that fits into register, size_t, there is no hash calculation but when it is added into the list, it will apply final MurmurHash2; for a key size bigger than size_t, it will do hash calculation with applied final MurmurHah2 and later added to the list, it will apply MurmurHash2 again. There are two problems with this scheme, the first one is obvious un-needed calculation; the other is if an integer key value fitted into list length/capability, applied MurmurHash2 potentially causes hash collision
Some suggestions
- Use prime number length/capability to avoid wasted of unused slots
- Add a way to initialize list with length/capability
- A way to avoid final MurmurHash2 or get rid of it completely and add function/delegate to customized hash calculation
- Use different scheme to handle collision such as plain array to hold key-value pair and hash list with integer index pointed to key-value pair. This scheme has two advantages; key-value pair will have integer index for collision and associated list will have order of items added when do iteration (possible use in json...); The dis-advantage is indirect access and it will take more work when doing deletion. However, deletion is rarely used and not a big deal for this use case
Happy codings