Thread overview | ||||||||
---|---|---|---|---|---|---|---|---|
|
April 17, 2004 assoc array performance | ||||
---|---|---|---|---|
| ||||
well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: int[char[]] X; // an assoc array of ints indexed by strings void*[] Y; Y.length = 1000000; // allocate array of void* X = cast(int[char[]])Y; // voila, a pre-allocated assoc array Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) -Ben |
April 18, 2004 Re: assoc array performance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | Thanks Ben! I've been tearing my hair out trying to figure a way to preallocate AA's. Was just about to post to the "What's needed for 1.0" thread regarding this. Would be nice/appropriate if there was a cleaner 'syntax' that the example below <g> BTW, did you find a quick way to delete all entries? - Kris "Ben Hinkle" <bhinkle4@juno.com> wrote in message news:c5sgbp$2p8j$1@digitaldaemon.com... > well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. > > The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: > > int[char[]] X; // an assoc array of ints indexed by strings > void*[] Y; > Y.length = 1000000; // allocate array of void* > X = cast(int[char[]])Y; // voila, a pre-allocated assoc array > > Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. > > I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) > > -Ben > |
April 18, 2004 Re: assoc array performance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kris |
> BTW, did you find a quick way to delete all entries?
either
void*[] Y;
Y.length = 0; // not really needed since Y has 0 length by default
X = cast(int[char[]])Y;
to dump the node pointer array entirely. If you want to preserve the array (for instance you had preallocated the pointer array) and just clear out all the pointers try
void*[] Y = cast(void*[])X;
Y[] = null; // set all the node pointers to null
X = cast(int[char[]])Y;
Isolating code like that would be nice since it depends on implementation details of assoc arrays.
-Ben
|
April 18, 2004 Re: assoc array performance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | Cheers! Really appreciate that. Let's hope Walter will add some properties for doing this ... "Ben Hinkle" <bhinkle4@juno.com> wrote in message news:c5sr12$7q4$1@digitaldaemon.com... > > > BTW, did you find a quick way to delete all entries? > > either > > void*[] Y; > Y.length = 0; // not really needed since Y has 0 length by default > X = cast(int[char[]])Y; > > to dump the node pointer array entirely. If you want to preserve the array (for instance you had preallocated the pointer array) and just clear out all the pointers try > > void*[] Y = cast(void*[])X; > Y[] = null; // set all the node pointers to null > X = cast(int[char[]])Y; > > Isolating code like that would be nice since it depends on implementation details of assoc arrays. > > -Ben > > > |
April 18, 2004 Re: assoc array performance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | Isn't this dangerous? You're relying on the semantics of how an Assoc array is implemented by the compiler. I would have thought the X.length = 100000 would work, but I'm obviously missing a subtle point. Regards, Scott "Ben Hinkle" <bhinkle4@juno.com> wrote in message news:c5sgbp$2p8j$1@digitaldaemon.com... > well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. > > The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: > > int[char[]] X; // an assoc array of ints indexed by strings > void*[] Y; > Y.length = 1000000; // allocate array of void* > X = cast(int[char[]])Y; // voila, a pre-allocated assoc array > > Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. > > I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) > > -Ben > |
April 18, 2004 Re: assoc array performance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Scott Egan | "Scott Egan" <scotte@tpg.com.aux> wrote in message news:c5tels$175v$1@digitaldaemon.com... > Isn't this dangerous? You're relying on the semantics of how an Assoc array > is implemented by the compiler. Sure it is, Scott. But at least this is a workaround, until Walter is "encouraged" by enough of us to add such properties to AAs <g> |
Copyright © 1999-2021 by the D Language Foundation