February 08, 2005
hash lookup make dummy hash entry

---
/* hash bug */
int main(char[][] args)
{
	char[][char[]] hash;

	printf("%d\n",hash.length);

	hash["foo"] = "foo";

	printf("%d\n",hash.length);

	hash["bar"] = "bar";

	printf("%d\n",hash.length); // prints 2

	char[] val = hash["baz"]; // the lookup itself creates a new entry, as if hash["baz"] = null

	printf("%d\n",hash.length); // prints 3
	return 0;
}
---

Both hash lookup and assignment generate a call to the "_aaGet" function, so a lookup also sets hash[key] = 0.
February 08, 2005
bero wrote:

> hash lookup make dummy hash entry

This is a well known bug / behaviour...

You currently need to use the "in" operator:
char[] val = "baz" in hash ? hash["baz"] : "";

It's on purpose, but nobody knows why ?

--anders
February 08, 2005
On Tue, 08 Feb 2005 19:43:28 +0100, Anders F Björklund <afb@algonet.se> wrote:
> bero wrote:
>
>> hash lookup make dummy hash entry
>
> This is a well known bug / behaviour...
>
> You currently need to use the "in" operator:
> char[] val = "baz" in hash ? hash["baz"] : "";
>
> It's on purpose, but nobody knows why ?

I believe this thread (if followed far enough) contains the reasoning:
  http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124

If you have an AA of 'reference' types then you can return null on a non-existent entry.

If you have an AA of 'value' types, i.e. int, long, struct, then you can't return null, but you can return Type.init. However, how do you tell that apart from an existing entry with that same value?

So, either the AA creates an entry, or it throws an exception.
Personally I think the latter is the correct default behaviour.

Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup, e.g.:

if (x in aa) {
  y = aa[x];
  ..etc..
}

In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.

bool contains(key_type key, out value_type value);

--returns existence of value, assigns 'value' if found

bool contains(key_type key, out index_type at);

--returns existence of value, assigns 'at' to the location, which can then be used to get/replace/add an item.
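
As a rough sketch of the first signature, assuming a present-day D compiler where "in" yields a pointer: 'contains' here is a hypothetical free function template, not anything in the language or Phobos, and the K/V parameter names are made up for illustration.

```d
// Hypothetical helper (an assumption for illustration, not a built-in).
// Returns true and copies the stored value into `value` if `key` exists.
// Returns false otherwise, without adding anything to the table.
bool contains(K, V)(V[K] aa, K key, out V value)
{
    auto p = key in aa;   // single lookup; null when the key is absent
    if (p is null)
        return false;     // `value` stays at V.init, as `out` resets it on entry
    value = *p;
    return true;
}

void main()
{
    int[string] counts;
    counts["foo"] = 0;    // a stored value that happens to equal int.init

    int v;
    assert(contains(counts, "foo", v) && v == 0);  // found, even though it is 0
    assert(!contains(counts, "baz", v));           // missing key
    assert(counts.length == 1);                    // the failed lookup added no entry
}
```

Note that an 'out' parameter is reset to its .init value on entry, so on a miss 'value' ends up as V.init rather than strictly untouched.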

Regan
February 08, 2005
Which is why it was changed recently so you can do this:

y = x in aa;
if (y)
{
   ...
}

Which of course is a bit different, but is only a single lookup all the same.
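
For what it's worth, a small sketch of that single-lookup form, assuming present-day D where "x in aa" yields a pointer to the stored value (null if the key is missing). 'bumpIfPresent' is a made-up name for illustration.

```d
// Hypothetical helper: increments the count stored under `key`,
// but only if the key is already present. Uses a single lookup.
bool bumpIfPresent(int[string] aa, string key)
{
    int* y = key in aa;   // pointer to the stored value, or null
    if (y)
    {
        *y += 1;          // writes through to the entry, no second lookup
        return true;
    }
    return false;
}

void main()
{
    int[string] aa;
    aa["foo"] = 42;

    assert(bumpIfPresent(aa, "foo"));
    assert(aa["foo"] == 43);

    assert(!bumpIfPresent(aa, "baz"));
    assert(aa.length == 1);   // testing with `in` did not create "baz"
}
```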

-[Unknown]


> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup, e.g.:
> 
> if (x in aa) {
>   y = aa[x];
>   ..etc..
> }
February 08, 2005
Regan Heath wrote:

>>> hash lookup make dummy hash entry
>>
>> It's on purpose, but nobody knows why ?
> 
> I believe this thread (if followed far enough) contains the reasoning:
>   http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124
> 
> If you have an AA of 'reference' types then you can return null on a non-existent entry.

Perfectly alright.

> If you have an AA of 'value' types, i.e. int, long, struct, then you can't return null, but you can return Type.init. However, how do you tell that apart from an existing entry with that same value?

By using the "in" operator, in case you care about the difference ?

In silly languages, like Java's JDBC interface for instance, they
have "helper" functions like isNull or wrap primitives in objects.
(both of which could still be emulated in D with some extra code)

> So, either the AA creates an entry, or it throws an exception.
> Personally I think the latter is the correct default behaviour.

It would be perfectly acceptable for the AA to:
1) *not* create something upon reading
2) *not* throw an annoying exception

But to just return Type.init and be done with it ?

This enables the hashes to work as unlimited arrays,
in that they are all inited to the default value but
do not throw ArrayBoundsError (since there is none)
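
A minimal sketch of that "just return Type.init" read, assuming present-day D where "in" yields a pointer. 'lookupOrInit' is a hypothetical helper name, not a built-in. It also shows the catch raised in the quoted text: a stored default value cannot be told apart from a missing key.

```d
// Hypothetical helper (an assumption for illustration): reads a value
// without ever inserting, falling back to V.init for a missing key.
V lookupOrInit(K, V)(V[K] aa, K key)
{
    auto p = key in aa;          // single lookup, never creates an entry
    return p ? *p : V.init;
}

void main()
{
    int[string] hits;
    hits["foo"] = 7;

    assert(lookupOrInit(hits, "foo") == 7);
    assert(lookupOrInit(hits, "baz") == 0);   // int.init for a missing key
    assert(hits.length == 1);                 // the read added no entry

    // The catch: a stored int.init looks exactly like a miss.
    hits["zero"] = 0;
    assert(lookupOrInit(hits, "zero") == lookupOrInit(hits, "baz"));
}
```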

> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup, e.g.:
> 
> if (x in aa) {
>   y = aa[x];
>   ..etc..
> }

Walter changed the language, so that "in" now returns a *pointer*
(there are still a few bugs in the new implementation, but anyway)

This means you can now do: if ((p = x in aa) != null) { y = *p; }
which avoids the double-lookup, at the expense of readability...

Just don't try this with function delegates, and a few other AA types.
(there you still need the double lookup like above, at least for now)

> In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.

Shouldn't be needed, with the new "in" implementation (when debugged)
Never mind re-introducing pointers, and mixing booleans and pointers.

But the horribly confusing "create entries when reading" should die!
Then the only thing left to do is to avoid the "delete" keyword...

--anders
February 08, 2005
On Tue, 08 Feb 2005 12:58:47 -0800, Unknown W. Brackets <unknown@simplemachines.org> wrote:
> Which is why it was changed recently so you can do this:
>
> y = x in aa;
> if (y)
> {
>     ...
> }
>
> Which of course is a bit different, but is only a single lookup all the same.

True. This was a good idea IMO.

However, I'm still not convinced the behaviour of aa["key"] for a missing value is 'correct'.

It seems to cause bugs, at least for beginners unused to the behaviour, and may even cause the odd problem for the more experienced.

That said, I'm not convinced an exception is all that 'nice' either.

Regan
February 08, 2005
I'm afraid that I, apparently unlike you, use associative arrays with many values in them - including some that are Type.init.

To have it return Type.init and not create the entry seems much worse to me than just creating the darn thing (which mind you, I don't quite like either.)

-[Unknown]


> But to just return Type.init and be done with it ?
February 08, 2005
On Tue, 08 Feb 2005 22:04:57 +0100, Anders F Björklund <afb@algonet.se> wrote:
> Regan Heath wrote:
>
>>>> hash lookup make dummy hash entry
>>>
>>> It's on purpose, but nobody knows why ?
>>
>> I believe this thread (if followed far enough) contains the reasoning:
>>   http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/124
>>
>> If you have an AA of 'reference' types then you can return null on a non-existent entry.
>
> Perfectly alright.

Agreed.

>> If you have an AA of 'value' types, i.e. int, long, struct, then you can't return null, but you can return Type.init. However, how do you tell that apart from an existing entry with that same value?
>
> By using the "in" operator, in case you care about the difference ?

That forces you to use either a double lookup or a pointer. It just seems to me there has to be a better solution; these are basic types after all, and my gut feeling is that it should be easy/simple, and it's not.

> In silly languages, like Java's JDBC interface for instance, they
> have "helper" functions like isNull or wrap primitives in objects.
> (both of which could still be emulated in D with some extra code)

I dislike these wrappers in general.

>> So, either the AA creates an entry, or it throws an exception.
>> Personally I think the latter is the correct default behaviour.
>
> It would be perfectly acceptable for the AA to:
> 1) *not* create something upon reading

Agreed.

> 2) *not* throw an annoying exception

Agreed.

> But to just return Type.init and be done with it ?

This bugs me.

I'd rather use a function:
bool contains(key_type key, out value_type value);

Which:
1. returns 'true' and assigns the existing value to 'value'.
2. returns 'false' and does nothing to 'value'.

This works the same way for both reference and value types.

I admit it doesn't work for structs; it's the whole "how do you copy a struct?" problem. However, I'd argue that in this case it's better to use struct pointers, to avoid copying large blocks of memory.

It's not a perfect solution, but it works well, and identically for everything but structs, so it seems the best solution to me.

> This enables the hashes to work as unlimited arrays,
> in that they are all inited to the default value but
> do not throw ArrayBoundsError (since there is none)

I don't think they should ever throw 'ArrayBoundsError'.
If they throw anything it's 'NoSuchItemError' or something.

>> Really the only reason not to use "if (x in aa)" all the time is that it in most cases causes a double lookup, e.g.:
>>
>> if (x in aa) {
>>   y = aa[x];
>>   ..etc..
>> }
>
> Walter changed the language, so that "in" now returns a *pointer*
> (there are still a few bugs in the new implementation, but anyway)
>
> This means you can now do: if ((p = x in aa) != null) { y = *p; }
> which avoids the double-lookup, at the expense of readability...
>
> Just don't try this with function delegates, and a few other AA types.
> (there you still need the double lookup like above, at least for now)

It's a good solution, but it's not perfect as you say. Readability suffers, and we're forced to use pointers again, something I'm keen to avoid if at all possible.

>> In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
>
> Shouldn't be needed, with the new "in" implementation (when debugged)
> Never mind re-introducing pointers, and mixing booleans and pointers.

I want to avoid re-introducing pointers; my solution does this by using an 'out' variable.
I don't understand "mixing booleans and pointers"; are you referring to "if (pointer)" statements?

> But the horribly confusing "create entries when reading" should die!

Agreed.

> Then the only thing left to do is to avoid the "delete" keyword...

Which has some (commonly perceived) odd behaviour all of its own.

Regan
February 08, 2005
> In the thread above I proposed some helper functions to avoid the double lookup and Ben even wrote some of them.
>
> bool contains(key_type key, out value_type value);
>
> --returns existence of value, assigns 'value' if found
>
> bool contains(key_type key, out index_type at);
>
> --returns existence of value, assigns 'at' to the location, which can then be used to get/replace/add an item.

I don't think those tricks work anymore. The implementation of AAs changed around dmd-107 or so and my code broke. I took all that stuff out of my libraries now that "in" returns a pointer.


February 08, 2005
Regan Heath wrote:

> I'd rather use a function:
> bool contains(key_type key, out value_type value);
> 
> Which:
> 1. returns 'true' and assigns the existing value to 'value'.
> 2. returns 'false' and does nothing to 'value'.
> 
> This works the same way for both reference and value types.

Hashes are built-in types in D, so they shouldn't need e.g. functions;
otherwise it would be the same as how it is done in Java Collections ?

> I don't think they should ever throw 'ArrayBoundsError'.
> If they throw anything it's 'NoSuchItemError' or something.

Doesn't matter, since you still need the double lookup then
with "in", to avoid having it throw up all over the place.

> It's a good solution, but it's not perfect as you say. Readability suffers, and we're forced to use pointers again, something I'm keen to avoid if at all possible.

It's also cheerfully mixing those pointers with a boolean "in"

>> Then the only thing left to do is to avoid the "delete" keyword...
> 
> Which has some (commonly perceived) odd behaviour all of its own.

Well, compare: delete array[key], for different types of arrays...
(dynamic int[] arrays where key is an int, with the associative)

I suggested "out" myself. 1) it's a keyword 2) rhymes with "in"
(and it could return a pointer to the item that it just removed)

--anders