February 08, 2005 hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
hash lookup make dummy hash entry
---
/* hash bug */
int main(char[][] args)
{
    char[][char[]] hash;
    printf("%d\n",hash.length);
    hash["foo"] = "foo";
    printf("%d\n",hash.length);
    hash["bar"] = "bar";
    printf("%d\n",hash.length); // prints 2
    char[] val = hash["baz"]; // the lookup creates a new entry, as if ' hash["baz"] = null '
    printf("%d\n",hash.length); // prints 3
    return 0;
}
---
Both hash lookup and assignment generate a call to the "_aaGet" function, which sets hash[key]=0 on lookup.
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to bero | bero wrote:
> hash lookup make dummy hash entry
This is a well known bug / behaviour...
You currently need to use the "in" operator:
char[] val = "baz" in hash ? hash["baz"] : "";
It's on purpose, but nobody knows why ?
--anders
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Anders F Björklund | On Tue, 08 Feb 2005 19:43:28 +0100, Anders F Björklund <afb@algonet.se> wrote:
> bero wrote:
>
>> hash lookup make dummy hash entry
>
> This is a well known bug / behaviour...
>
> You currently need to use the "in" operator:
> char[] val = "baz" in hash ? hash["baz"] : "";
>
> It's on purpose, but nobody knows why ?
I believe this thread (if followed far enough) contains the reasoning:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124
If you have an AA of 'reference' types then you can return null on a non-existent entry.
If you have an AA of 'value' types i.e. int, long, struct then you can't return null, but you can return Type.init, however: how do you tell that apart from an existing entry with that same value?
So, either the AA creates an entry, or it throws an exception. Personally I think the latter is the correct default behaviour.
Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup eg.
if (x in aa) {
    y = aa[x];
    ..etc..
}
In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
bool contains(key_type key, out value_type value);
--returns existence of value, assigns 'value' if found
bool contains(key_type key, out index_type at);
--returns existence of value, assigns 'at' to location which can then be used to get/replace/add an item.
Regan
|
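For illustration, the first `contains` signature proposed above can be sketched in present-day D. This is not Ben's original code, which the thread does not show; the template form, the free-function call style and the `main` are assumptions made for the example. It relies on `in` returning a pointer, which is how current compilers behave. One caveat: D resets an `out` parameter to its `.init` value on entry, so on a miss `value` ends up as `V.init` rather than being left untouched.

```d
import std.stdio;

// Hypothetical helper modelled on the `contains` proposal; not library code.
// Returns true and copies the stored value when `key` is present.
// Note: D sets an `out` parameter to V.init on entry, so on a miss
// `value` holds V.init instead of keeping its previous contents.
bool contains(K, V)(V[K] aa, K key, out V value)
{
    if (auto p = key in aa) // single lookup; p is null when key is absent
    {
        value = *p;
        return true;
    }
    return false;
}

void main()
{
    int[string] aa;
    aa["zero"] = 0; // a stored value equal to int.init

    int v;
    writeln(contains(aa, "zero", v)); // key present
    writeln(contains(aa, "nope", v)); // key absent
    writeln(aa.length);               // the failed lookup added nothing
}
```

The boolean result is what separates a stored `0` from a missing key, which is the distinction the value-type argument above turns on.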
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | Which is why it was changed recently so you can do this:
y = x in aa;
if (y)
{
...
}
Which of course is a bit different, but is only a single lookup all the same.
-[Unknown]
> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup eg.
>
> if (x in aa) {
> y = aa[x];
> ..etc..
> }
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | Regan Heath wrote:
>>> hash lookup make dummy hash entry
>>
>> It's on purpose, but nobody knows why ?
>
> I believe this thread (if followed far enough) contains the reasoning:
> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124
>
> If you have an AA of 'reference' types then you can return null on a non-existent entry.
Perfectly alright.
> If you have an AA of 'value' types i.e. int, long, struct then you can't return null, but you can return Type.init, however: how do you tell that apart from an existing entry with that same value?
By using the "in" operator, in case you care about the difference ?
In silly languages, like Java's JDBC interface for instance, they
have "helper" functions like isNull or wrap primitives in objects.
(both of which could still be emulated in D with some extra code)
> So, either the AA creates an entry, or it throws an exception.
> Personally I think the latter is the correct default behaviour.
It would be perfectly acceptable for the AA to:
1) *not* create something upon reading
2) *not* throw an annoying exception
But to just return Type.init and be done with it ?
This enables the hashes to work as unlimited arrays,
in that they are all inited to the default value but
do not throw ArrayBoundsError (since there is none)
> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup eg.
>
> if (x in aa) {
> y = aa[x];
> ..etc..
> }
Walter changed the language, so that "in" now returns a *pointer*
(there are still a few bugs in the new implementation, but anyway)
This means you can now do: if ((p = x in aa) != null) { y = *p; }
which avoids the double-lookup, at the expense of readability...
Just don't try this with function delegates, and a few other AA types.
(there you still need the double lookup like above, at least for now)
> In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
Shouldn't be needed, with the new "in" implementation (when debugged)
Never mind re-introducing pointers, and mixing booleans and pointers.
But the horribly confusing "create entries when reading" should die!
Then the only thing left to do is to avoid the "delete" keyword...
--anders
|
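As a rough sketch, the single-lookup idiom described above looks like this with a current D compiler (the 2005 dmd under discussion differed in places, so treat this as modern behaviour rather than what the posters had). The function name `describe` and the sample keys are made up for the example.

```d
import std.conv : to;
import std.stdio;

// Hypothetical function for illustration: reports a key using one lookup.
string describe(int[string] aa, string key)
{
    // `in` yields a pointer to the stored value, or null if the key is absent.
    if (auto p = key in aa)
        return "found: " ~ to!string(*p);
    return "missing";
}

void main()
{
    int[string] aa;
    aa["foo"] = 0;

    writeln(describe(aa, "foo"));
    writeln(describe(aa, "baz"));
    writeln(aa.length); // `in` never inserts, so there is still one entry
}
```

Declaring the pointer inside the `if` condition keeps it scoped to the branch where it is known to be non-null, which softens the readability cost mentioned above.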
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Unknown W. Brackets | On Tue, 08 Feb 2005 12:58:47 -0800, Unknown W. Brackets <unknown@simplemachines.org> wrote:
> Which is why it was changed recently so you can do this:
>
> y = x in aa;
> if (y)
> {
> ...
> }
>
> Which of course is a bit different, but is only a single lookup all the same.
True. This was a good idea IMO.
However, I'm still not convinced the behaviour of aa["key"] for a missing value is 'correct'.
It seems to cause bugs, at least for beginners unused to the behaviour, and may even cause the odd problem for the more experienced.
That said, I'm not convinced an exception is all that 'nice' either.
Regan
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Anders F Björklund | I'm afraid that I, apparently unlike you, use associative arrays with many values in them - including some that are Type.init.
To have it return Type.init and not create the entry seems much worse to me than just creating the darn thing (which mind you, I don't quite like either.)
-[Unknown]
> But to just return Type.init and be done with it ?
|
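The trade-off argued over here can be tried out in present-day D, where associative arrays have a built-in `get(key, defaultValue)` that returns the fallback without creating an entry. The sketch below is only an illustration of the ambiguity raised above, not a statement about the 2005 compiler; the keys and values are invented for the example.

```d
import std.stdio;

void main()
{
    int[string] aa;
    aa["zero"] = 0; // a real entry whose value equals int.init

    // Both calls yield 0, so the result alone cannot tell
    // "stored 0" apart from "no such key".
    writeln(aa.get("zero", 0)); // stored value
    writeln(aa.get("baz", 0));  // fallback, key absent

    writeln(aa.length);              // get does not insert anything
    writeln(("baz" in aa) !is null); // `in` is still needed to tell them apart
}
```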
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Anders F Björklund | On Tue, 08 Feb 2005 22:04:57 +0100, Anders F Björklund <afb@algonet.se> wrote:
> Regan Heath wrote:
>
>>>> hash lookup make dummy hash entry
>>>
>>> It's on purpose, but nobody knows why ?
>> I believe this thread (if followed far enough) contains the reasoning:
>> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124
>> If you have an AA of 'reference' types then you can return null on a non-existent entry.
>
> Perfectly alright.
Agreed.
>> If you have an AA of 'value' types i.e. int, long, struct then you can't return null, but you can return Type.init, however: how do you tell that apart from an existing entry with that same value?
>
> By using the "in" operator, in case you care about the difference ?
Forcing you to either use a double lookup, or a pointer... it just seems to me like there has to be a better solution, I mean these are basic types after all, my gut feeling is that it should be easy/simple and it's not.
> In silly languages, like Java's JDBC interface for instance, they
> have "helper" functions like isNull or wrap primitives in objects.
> (both of which could still be emulated in D with some extra code)
I dislike these wrappers in general.
>> So, either the AA creates an entry, or it throws an exception.
>> Personally I think the latter is the correct default behaviour.
>
> It would be perfectly acceptable for the AA to:
> 1) *not* create something upon reading
Agreed.
> 2) *not* throw an annoying exception
Agreed.
> But to just return Type.init and be done with it ?
This bugs me. I'd rather use a function:
bool contains(key_type key, out value_type value);
Which:
1. returned 'true' and assigned 'value' to the existing value.
2. returned 'false' and did nothing to 'value'.
This works the same way for both reference and value types.
I admit it doesn't work for structs, it's the whole "how do you copy a struct?" problem, however I'd argue that in this case it's better to use struct pointers, to avoid copying large blocks of memory.
It's not a perfect solution, but it works well, and identically for everything but structs, so it seems the best soln to me.
> This enables the hashes to work as unlimited arrays,
> in that they are all inited to the default value but
> do not throw ArrayBoundsError (since there is none)
I don't think they should ever throw 'ArrayBoundsError'.
If they throw anything it's 'NoSuchItemError' or something.
>> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup eg.
>> if (x in aa) {
>> y = aa[x];
>> ..etc..
>> }
>
> Walter changed the language, so that "in" now returns a *pointer*
> (there are still a few bugs in the new implementation, but anyway)
>
> This means you can now do: if ((p = x in aa) != null) { y = *p; }
> which avoids the double-lookup, at the expense of readability...
>
> Just don't try this with function delegates, and a few other AA types.
> (there you still need the double lookup like above, at least for now)
It's a good solution, but it's not perfect as you say. Readability suffers, and we're forced to use pointers again, something I'm keen to avoid if at all possible.
>> In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
>
> Shouldn't be needed, with the new "in" implementation (when debugged)
> Never mind re-introducing pointers, and mixing booleans and pointers.
I want to avoid re-introducing pointers, my solution does this by using an 'out' variable.
I don't understand "mixing booleans and pointers", are you referring to "if (pointer)" statements?
> But the horribly confusing "create entries when reading" should die!
Agreed.
> Then the only thing left to do is to avoid the "delete" keyword...
Which has some (commonly perceived) odd behaviour all of its own.
Regan
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | > In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
>
> bool contains(key_type key, out value_type value);
>
> --returns existence of value, assigns 'value' if found
>
> bool contains(key_type key, out index_type at);
>
> --returns existence of value, assigns 'at' to location which can then be used to get/replace/add an item.
I don't think those tricks work anymore. The implementation of AAs changed around dmd-107 or so and my code broke. I took all that stuff out of my libraries now that "in" returns a pointer.
|
February 08, 2005 Re: hash lookup make dummy hash entry | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | Regan Heath wrote:
> I'd rather use a function:
> bool contains(key_type key, out value_type value);
>
> Which:
> 1. returned 'true' and assigned 'value' to the existing value.
> 2. returned 'false' and did nothing to 'value'.
>
> This works the same way for both reference and value types.
Hashes are built-in types in D. Thus they don't need e.g. functions, or it would be the same as it is implemented in Java Collections ?
> I don't think they should ever throw 'ArrayBoundsError'.
> If they throw anything it's 'NoSuchItemError' or something.
Doesn't matter, since you still need the double lookup then with "in", to avoid having it throw up all over the place.
> It's a good solution, but it's not perfect as you say. Readability suffers, and we're forced to use pointers again, something I'm keen to avoid if at all possible.
It's also cheerfully mixing those pointers with a boolean "in"
>> Then the only thing left to do is to avoid the "delete" keyword...
>
> Which has some (commonly perceived) odd behaviour all of its own.
Well, compare: delete array[key], for different types of arrays...
(dynamic int[] arrays where key is an int, with the associative)
I suggested "out" myself. 1) it's a keyword 2) rhymes with "in"
(and it could return a pointer to the item that it just removed)
--anders
|
Copyright © 1999-2021 by the D Language Foundation