Thread overview
Looking up keys in associative arrays
Aug 04, 2005
Uwe Salomon
Aug 04, 2005
Sean Kelly
Aug 04, 2005
Uwe Salomon
Aug 04, 2005
Sean Kelly
Aug 05, 2005
Regan Heath
Aug 04, 2005
Shammah Chancellor
Aug 05, 2005
Ben Hinkle
August 04, 2005
I know this subject was discussed about a week ago. At that time I was not sure whether it was important, but now I am. The current behaviour of the indexing operator[] for associative arrays is simply unacceptable, considering that, for a key that is not in the array:

* It throws an ArrayBoundsError in debug/normal mode.
* Simply segfaults in release mode!

This renders the [] operator almost useless, and that surely can't be intended. Currently I have to use this code to look up a value in an array of objects:

####
TextCodec* val;
if ((val = (name in m_availableCodecs)) is null)
  return null;
else
  return *val;
####

That is 5 lines for a simple lookup -- way too much, I think.

If I explicitly want to test for a key, I can use the "in" operator (or whatever replaces it in the next DMD version), but there really should be a way to simply look up a value and get the default if the key does not exist.

Ciao
uwe
August 04, 2005
In article <op.suzoz1f76yjbe6@sandmann.maerchenwald.net>, Uwe Salomon says...
>
>I know this subject was discussed just before a week or so. At that time i was not sure if it is that important or not, but now i am. The current behaviour of the indexing operator[] for associative arrays is simply unacceptable, considering that:
>
>* It throws an ArrayBoundsError in debug/normal mode.
>* Simply segfaults in release mode!
>
>This renders the [] operator almost useless, and that sure can't be intended. Currently i have to use this code to check for a value in an array of objects:
>
>####
>TextCodec* val;
>if ((val = (name in m_availableCodecs)) is null)
>   return null;
>else
>   return *val;
>####
>
>These are 5 lines for a simple lookup -- way too much, i think.
>
>If i explicitly want to test for a value, i can use the "in" operator or whatever it is replaced by in the next DMD version, but there really should be a possibility to simply lookup a value and get the default if the key does not exist.

This is how the subscript operator used to behave and people complained.  Or are you saying you want the default returned but not inserted if it doesn't exist?


Sean


August 04, 2005
>> If i explicitly want to test for a value, i can use the "in" operator or
>> whatever it is replaced by in the next DMD version, but there really
>> should be a possibility to simply lookup a value and get the default if
>> the key does not exist.
>
> This is how the subscript operator used to behave and people complained.  Or are
> you saying you want the default returned but not inserted if it doesn't exist?

???

I was talking about rvalue opIndex, not opIndexAssign. opIndexAssign inserts the value if the key doesn't exist, but opIndex should just return a default if the key doesn't exist, without inserting anything:

####
int[char[]] array;
array["1"] = 1;

printf("%i\n", array["1"]);
printf("%p\n", ("2" in array));

printf("%i\n", array["2"]);
printf("%p\n", ("2" in array));

printf("%i\n", (array["2"] = 5));
printf("%p\n", ("2" in array));
####

This should output:

####
1
(nil)
0
(nil)
5
0x8329349SomeAddress
####

At least, that is what I thought people wanted?
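
For readers on a current D compiler, the non-inserting lookup described above corresponds roughly to the built-in `get` property of associative arrays. The following is a minimal sketch, assuming D2 syntax (`string` keys, `writeln`) rather than the 2005 DMD under discussion; it mirrors the three steps of the example without claiming anything about how the runtime implements them.

```d
import std.stdio : writeln;

void main()
{
    int[string] array;
    array["1"] = 1;

    writeln(array["1"]);                // existing key: plain indexing works
    writeln(("2" in array) is null);    // "2" has not been inserted

    // Missing key: `get` hands back the supplied default...
    writeln(array.get("2", int.init));
    // ...and the lookup itself inserted nothing.
    writeln(("2" in array) is null);

    // Assignment is the operation that inserts the key.
    array["2"] = 5;
    writeln(array["2"]);
    writeln(("2" in array) !is null);
}
```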

Ciao
uwe
August 04, 2005
In article <op.suzr4hji6yjbe6@sandmann.maerchenwald.net>, Uwe Salomon says...
>
>>> If i explicitly want to test for a value, i can use the "in" operator or whatever it is replaced by in the next DMD version, but there really should be a possibility to simply lookup a value and get the default if the key does not exist.
>>
>> This is how the subscript operator used to behave and people
>> complained.  Or are
>> you saying you want the default returned but not inserted if it doesn't
>> exist?
>
>???
>
>I was talking about rvalue opIndex, not opIndexAssign. opIndexAssign inserts the value if it doesn't exist, but opIndex just returns a default if the key doesn't exist -- no inserting:
>
>####
>int[char[]] array;
>array["1"] = 1;
>
>printf("%i\n", array["1"]);
>printf("%p\n", ("2" in array));
>
>printf("%i\n", array["2"]);
>printf("%p\n", ("2" in array));
>
>printf("%i\n", (array["2"] = 5));
>printf("%p\n", ("2" in array));
>####
>
>This should output:
>
>####
>1
>(nil)
>0
>(nil)
>5
>0x8329349SomeAddress
>####
>
>At least that was it what i thought people wanted?

You're right.  And you're saying that right now it throws an ArrayBoundsError instead of returning the default value?  Wasn't that something Walter put in temporarily to help users find places in their code that may need to be fixed to adapt to the new behavior?  If so, perhaps it's time to remove it.


Sean


August 04, 2005
In article <op.suzoz1f76yjbe6@sandmann.maerchenwald.net>, Uwe Salomon says...
>
>I know this subject was discussed just before a week or so. At that time i was not sure if it is that important or not, but now i am. The current behaviour of the indexing operator[] for associative arrays is simply unacceptable, considering that:
>
>* It throws an ArrayBoundsError in debug/normal mode.
>* Simply segfaults in release mode!
>
>This renders the [] operator almost useless, and that sure can't be intended. Currently i have to use this code to check for a value in an array of objects:
>
>####
>TextCodec* val;
>if ((val = (name in m_availableCodecs)) is null)
>   return null;
>else
>   return *val;
>####
>
>These are 5 lines for a simple lookup -- way too much, i think.
>
>If i explicitly want to test for a value, i can use the "in" operator or whatever it is replaced by in the next DMD version, but there really should be a possibility to simply lookup a value and get the default if the key does not exist.
>
>Ciao
>uwe

You may use this code in the meantime. (I personally like the behavior of in, but I think there should be a contains method as well.)

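#import std.stdio;
#
#template HashUtils( T, KeyType ) {
#  // Despite its name, Contains returns the stored value,
#  // or T.init if the key is not in the hash.
#  T Contains( T[KeyType] hash, KeyType key )
#  {
#    T* val = (key in hash);   // single lookup
#    return (val is null) ? T.init : *val;
#  }
#}
#
#// The typedef gives foobars its own default initializer,
#// so a miss yields 100 rather than 0.
#typedef int foobars = 100;
#
#alias HashUtils!( foobars, char[] ).Contains Contains;
#
#int main(char[][] argv)
#{
#  foobars[char[]] hash;
#  hash["Hello!"] = 1;
#
#  writefln( hash.Contains("Hello!") );   // key present
#  writefln( hash.Contains("AckBar!") );  // key missing: foobars.init
#
#  return 0;
#}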


August 05, 2005
Sean Kelly wrote:

>> These are 5 lines for a simple lookup -- way too much, i think.
>>
>>If i explicitly want to test for a value, i can use the "in" operator or  whatever it is replaced by in the next DMD version, but there really  should be a possibility to simply lookup a value and get the default if  the key does not exist.
> 
> This is how the subscript operator used to behave and people complained.  Or are
> you saying you want the default returned but not inserted if it doesn't exist?

No,
it used to insert "blank" keys into the table on lookup,
which also sucked a lot - but in a different fashion...

The current behaviour *is* to throw an assert, or crash.
I would also prefer it returning a default value instead.

--anders

PS.
This is just one line, but it's still pretty clumsy: (and double-lookup)
  return (name in m_availableCodecs) ? m_availableCodecs[name] : null;
But at least it has worked similarly over time, as the syntax mutated?
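
The double lookup in the one-liner above can be avoided by keeping the pointer that `in` returns. The sketch below assumes a current D2 compiler; the `TextCodec` class body and the `findCodec` name are hypothetical stand-ins, since the thread only shows the lookup itself.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the TextCodec class mentioned in the thread.
class TextCodec
{
    string name;
    this(string name) { this.name = name; }
}

// Returns the codec registered under `name`, or null if there is none.
// `name in codecs` yields a pointer to the stored value (or null), so the
// table is searched only once.
TextCodec findCodec(TextCodec[string] codecs, string name)
{
    if (auto p = name in codecs)
        return *p;
    return null;
}

void main()
{
    TextCodec[string] codecs;
    codecs["utf-8"] = new TextCodec("utf-8");

    writeln(findCodec(codecs, "utf-8") !is null);   // key present
    writeln(findCodec(codecs, "latin-1") is null);  // key absent
}
```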
August 05, 2005
On Fri, 05 Aug 2005 13:17:55 +0200, Anders F Björklund <afb@algonet.se> wrote:
> Sean Kelly wrote:
>
>>> These are 5 lines for a simple lookup -- way too much, i think.
>>>
>>> If i explicitly want to test for a value, i can use the "in" operator or  whatever it is replaced by in the next DMD version, but there really  should be a possibility to simply lookup a value and get the default if  the key does not exist.
>>  This is how the subscript operator used to behave and people complained.  Or are
>> you saying you want the default returned but not inserted if it doesn't exist?
>
> No,
> it used to insert "blank" keys into the table on lookup,
> which also sucked a lot - but in a different fashion...
>
> The current behaviour *is* to throw an assert, or crash.
> I would also prefer it returning a default value instead.

The downside to the default value is that, for value types, the default value is also a valid value for an existing entry, so you cannot tell whether the key exists and get its value in one operation without resorting to pointers (which is nothing new really).

At least with the Exception we can tell, and it's the same for all types.

Of course there should be a way to 'ask' if it exists, and to get it if it does. You also might want to ask if it exists and insert if it doesn't. In both cases you want to avoid a double lookup.

My preferred solution to the former is 'contains' eg.

bool contains(Value[Key] aa, Key key, out Value value);
if (aa.contains("key", val)) { /* use val */ }

As for the latter.. I haven't a good idea.

With the current behaviour we can write a contains wrapper that uses a single lookup. I can't see myself using an AA without 'contains'.
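
A minimal sketch of such a single-lookup wrapper, assuming a current D2 compiler and a free function called through member-call syntax. The signature follows the `contains` idea above with the key passed explicitly; it is an illustration, not an existing library function.

```d
import std.stdio : writeln;

// Reports whether `key` is present and, if so, copies the stored value
// into `value`. The table is searched once, via `in`.
bool contains(Value, Key)(Value[Key] aa, Key key, out Value value)
{
    if (auto p = key in aa)
    {
        value = *p;
        return true;
    }
    return false;   // `value` keeps Value.init, as `out` parameters do
}

void main()
{
    int[string] aa = ["zero": 0];
    int val;

    // The stored value equals int.init, yet the return value still
    // tells an existing entry apart from a missing one.
    writeln(aa.contains("zero", val), " ", val);
    writeln(aa.contains("missing", val), " ", val);
}
```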

Regan
August 05, 2005
"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.suzoz1f76yjbe6@sandmann.maerchenwald.net...
>I know this subject was discussed just before a week or so. At that time i was not sure if it is that important or not, but now i am. The current behaviour of the indexing operator[] for associative arrays is simply unacceptable, considering that:
>
> * It throws an ArrayBoundsError in debug/normal mode.
> * Simply segfaults in release mode!
>
> This renders the [] operator almost useless, and that sure can't be intended. Currently i have to use this code to check for a value in an array of objects:
>
> ####
> TextCodec* val;
> if ((val = (name in m_availableCodecs)) is null)
>   return null;
> else
>   return *val;
> ####
>
> These are 5 lines for a simple lookup -- way too much, i think.
>
> If i explicitly want to test for a value, i can use the "in" operator or whatever it is replaced by in the next DMD version, but there really should be a possibility to simply lookup a value and get the default if the key does not exist.
>
> Ciao
> uwe

This is slightly OT but I changed the API for MinTL's aa containers to experiment with different options as a way to get feedback about what works and what doesn't. The API is

Value opIndex(Key key);
  Return item with given key. Returns the default missing value if not present.
void opIndexAssign(Value val, Key key);
  Assign a value to the given key.
Value missing
  Read/write property for the value to use on indexing a key not in the array. Defaults to Value.init.
Value* get(Key key, bit throwOnMiss = false);
  Return a pointer to the value stored at the key. If the key is not in the array then null is returned or, if throwOnMiss is true, an exception is thrown.
Value* put(Key key)
  Return a pointer to the value stored at the key and insert the key with value Value.init if the key is not in the array.
bool contains(Key key)
  Returns true if the array contains the key.
bool contains(Key key, out Value value)
  Returns true if the array contains the key and sets the out value if present.
void remove(Key key);
  Remove a key from array if present.
Value take(Key key);
  Remove a key from array if present and return value. Returns the default missing value if the key was not present.

That way, to get the old AA insert-on-lookup behavior (i.e. C++ map behavior) you call put(key), and to get the current AA behavior you call get(key,true). The 'in' operator is replaced with get(key). The implementations are all a few lines long and rely on one common lookup function to do all the real work, which ensures that none of the operations need a double lookup. I'm curious if people think there's too much overlap in functionality - for example contains(key) is get(key) !is null.
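
To make the shape of such an API concrete, here is a rough sketch built on top of a built-in associative array. It is not MinTL's code: the `Table` name is hypothetical, `bool` stands in for the `bit` flag, a current D2 compiler is assumed, and `put` takes the simple route of searching again after inserting, which a real implementation would avoid.

```d
import std.stdio : writeln;

// Hypothetical wrapper; every read goes through the one `find` helper.
struct Table(Key, Value)
{
    private Value[Key] data;
    Value missing = Value.init;   // returned when indexing an absent key

    private Value* find(Key key) { return key in data; }

    Value opIndex(Key key)
    {
        auto p = find(key);
        return p ? *p : missing;
    }

    void opIndexAssign(Value val, Key key) { data[key] = val; }

    // Pointer to the stored value; null or an exception on a miss.
    Value* get(Key key, bool throwOnMiss = false)
    {
        auto p = find(key);
        if (p is null && throwOnMiss)
            throw new Exception("key not found");
        return p;
    }

    // Insert-on-lookup: adds Value.init if the key is absent.
    Value* put(Key key)
    {
        if (auto p = find(key))
            return p;
        data[key] = Value.init;
        return find(key);
    }

    bool contains(Key key) { return find(key) !is null; }

    bool contains(Key key, out Value value)
    {
        auto p = find(key);
        if (p) value = *p;
        return p !is null;
    }

    void remove(Key key) { data.remove(key); }

    // Remove the key and return its value, or `missing` if it was absent.
    Value take(Key key)
    {
        auto p = find(key);
        if (p is null) return missing;
        auto v = *p;
        data.remove(key);
        return v;
    }
}

void main()
{
    Table!(string, int) t;
    t["one"] = 1;
    writeln(t["one"], " ", t["two"]);   // stored value, then `missing`
    writeln(t.contains("two"));         // indexing did not insert
    *t.put("two") += 2;                 // insert-on-lookup, then modify in place
    writeln(t.take("two"), " ", t.contains("two"));
}
```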


August 13, 2005
In article <opsu0879bt23k2f5@nrage.netwin.co.nz>, Regan Heath says... [snip]
>As for the latter.. I haven't a good idea.

Here is a proposal for retrieval. I think the names are short and sweet and do what they say:

aa.has(name) ==> returns true if the aa has the name
aa[name] ==> returns the value or throws an exception (regardless of -release mode or not)
aa.get(name, default) ==> returns the value for aa[name] or default if it does not exist
aa.getOrSet(name, value) ==> returns the value for aa[name] if it exists, or sets it to value and returns it still

Without explanations:
aa.has(name)
aa[name]
aa.get(name, default)
aa.getOrSet(name, value)

How does that sound?

Chuck
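
For comparison, the proposed names map fairly directly onto what a current D2 runtime offers: `get` exists as a built-in property, and `has` and `getOrSet` can be written as thin wrappers over `in` and `require`. The sketch below assumes a current compiler; the two wrapper functions are hypothetical and only illustrate the proposal.

```d
import std.stdio : writeln;

// has: true if the key is present.
bool has(V, K)(V[K] aa, K key)
{
    return (key in aa) !is null;
}

// getOrSet: the existing value, or store `value` under `key` and return it.
V getOrSet(V, K)(ref V[K] aa, K key, V value)
{
    return aa.require(key, value);
}

void main()
{
    int[string] aa = ["a": 1];

    writeln(aa.has("a"), " ", aa.has("b"));
    writeln(aa.get("b", 42));        // default returned, nothing inserted
    writeln(aa.has("b"));
    writeln(aa.getOrSet("b", 7));    // inserted by this call
    writeln(aa.getOrSet("b", 99));   // already present, so the stored value wins
}
```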