Thread overview
Hashtables from Associative Arrays
Sep 18, 2003
Benji Smith
Sep 18, 2003
Helmut Leitner
Sep 18, 2003
Antti Sykäri
Sep 18, 2003
Charles Sanders
September 18, 2003
During the coding of several different modules in the past few weeks, I've found myself wishing that I could use the associative array capabilities of D to create a simple hastable, without having to have anything in the array at all.

The code I've been writing has looked something like this:

bit[char[]] myHashTable;

But I'm not really using the bit values, just the hash values. So, when I add an entry to the hash, it looks like this:

myHashTable["red"] = true;
myHashTable["blue"] = true;

But I don't like the fact that I'm wasting a bunch of memory space in the process. In fact, I'm about to implement a data structure (a fulltext index for a pseudo-db application) that will have hundreds of thousands of entries, and I hate to think about tacking on an extra 32 bits to every single entry in the hashtable.

I'd rather do something crazy, like this:

null[char[]] myHashTable;

..or this...

empty[char[]] myHashTable;

..or even crazier, like this:

[char[]] myHashTable;

Using semantics like this, it should generate a compile-time error to try to use the assignment operator. So, this code:

myHashTable["red"] = mySomethingElse;

..would break the compilation. The only thing I'm interested in checking is whether a particular key value exists. So, I'd run checks like this:

if ("red" in myHashTable) { ... }

Of course, I don't know what syntax you'd be able to use to put a key _into_ the hashtable, but I could probably think of something.

Of course, I could just implement my own hashtable class that would allow this type of behavior, but then I wouldn't be able to use the same hashtable syntax that already exists for associative arrays in the language.

Wouldn anyone else find something like this useful?

--Benji Smith


September 18, 2003

Benji Smith wrote:
> During the coding of several different modules in the past few weeks, I've found myself wishing that I could use the associative array capabilities of D to create a simple hastable, without having to have anything in the array at all.
> 
> The code I've been writing has looked something like this:
> 
> bit[char[]] myHashTable;
> 
> But I'm not really using the bit values, just the hash values. So, when I add an entry to the hash, it looks like this:
> 
> myHashTable["red"] = true;
> myHashTable["blue"] = true;
> 
> But I don't like the fact that I'm wasting a bunch of memory space in the process. In fact, I'm about to implement a data structure (a fulltext index for a pseudo-db application) that will have hundreds of thousands of entries, and I hate to think about tacking on an extra 32 bits to every single entry in the hashtable.

It may not be very costly. It depends on implementation detail that only Walter knows.

Hashes are always costly, because you have a bucket table, you have nodes for the linked lists that start at the buckets and contain key references and values (or value references). It might be the there is place for a 32-bit value reference that is used for small values. It might be that the nodes have a flat space for references or values and because allocations are multiple of 16 this may hurt or may not hurt.

So it may cost you a lot or nothing to do

   myHashTable[word] = word;
   myHashTable[word] = bit;

You will have to measure it.

-- 
Helmut Leitner    leitner@hls.via.at
Graz, Austria   www.hls-software.com
September 18, 2003
> I'd rather do something crazy, like this:
> 
> null[char[]] myHashTable;

void[char[]] is probably what you're looking for (as void is a type, and
void f(int x) is a valid type as well). I don't think it works, though
(and cannot currently check)

-Antti
September 18, 2003
Im not sure why u just wouldnt use an array here ?  Sort it and use a binary search for  O(log n ^ 2 )


Charles


"Benji Smith" <dlanguage@xxagg.com> wrote in message news:bkchpe$2rb$1@digitaldaemon.com...
> During the coding of several different modules in the past few weeks, I've
found
> myself wishing that I could use the associative array capabilities of D to create a simple hastable, without having to have anything in the array at
all.
>
> The code I've been writing has looked something like this:
>
> bit[char[]] myHashTable;
>
> But I'm not really using the bit values, just the hash values. So, when I
add an
> entry to the hash, it looks like this:
>
> myHashTable["red"] = true;
> myHashTable["blue"] = true;
>
> But I don't like the fact that I'm wasting a bunch of memory space in the process. In fact, I'm about to implement a data structure (a fulltext
index for
> a pseudo-db application) that will have hundreds of thousands of entries,
and I
> hate to think about tacking on an extra 32 bits to every single entry in
the
> hashtable.
>
> I'd rather do something crazy, like this:
>
> null[char[]] myHashTable;
>
> ..or this...
>
> empty[char[]] myHashTable;
>
> ..or even crazier, like this:
>
> [char[]] myHashTable;
>
> Using semantics like this, it should generate a compile-time error to try
to use
> the assignment operator. So, this code:
>
> myHashTable["red"] = mySomethingElse;
>
> ..would break the compilation. The only thing I'm interested in checking
is
> whether a particular key value exists. So, I'd run checks like this:
>
> if ("red" in myHashTable) { ... }
>
> Of course, I don't know what syntax you'd be able to use to put a key
_into_ the
> hashtable, but I could probably think of something.
>
> Of course, I could just implement my own hashtable class that would allow
this
> type of behavior, but then I wouldn't be able to use the same hashtable
syntax
> that already exists for associative arrays in the language.
>
> Wouldn anyone else find something like this useful?
>
> --Benji Smith
>
>