July 28, 2004
On Tue, 27 Jul 2004 22:40:02 -0700, Kris <someidiot@earthlink.dot.dot.dot.net> wrote:
> "Ben Hinkle"  wrote ...
>> Here's the dilemma: if you try to get a value that isn't in the hashmap
> then
>> what should be returned? No matter what you return the user can't tell if
>> the key is/was in the hashmap or not. So in some sense the right thing to
>> do is throw an exception when the user tries to evaluate aa["key"] and
>> "key" isn't in the map - this solution would be the most correct in terms
>> of principle of least astonishment since it is analogous to asking a
>> dynamic array for an index that is out of bounds (which throws an
>> exception). I'm half-tempted to serious propose that instead of some hack
>> about returning Type.init and maybe or maybe not adding it to the map.
> That
>> way both STL and Java users will get a surprise!
>>
>> eh, maybe I need another glass of wine. :-)
>
> Perhaps I've had too many recently ... 8~}
>
> How about returning a simple true/false from a lookup instead? One can use
> an 'out' argument for the "value" if there's one present, yes?

That is what I've been trying to suggest for the last 5 posts!

> Not the most
> elegant solution, or perhaps the most efficient; but it's a whole lot better
> than two lookups, or the dreaded (cue music ...) *mutating rValue* ...

I think it's a very elegant solution, it handles all value types, and can be placed in an if statement like so:

if (aa.get("regan",value)) {
}

> bool get (K key, out V value);

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 28, 2004
On Wed, 28 Jul 2004 10:17:57 -0400, Ben Hinkle <bhinkle@mathworks.com> wrote:
>> I think this is the best solution. In addition you can supply a method
> like
>>
>> if (aa.contains("key",value))
>>
>> which returns true/false and assigns value if found. So those who do not
>> want to catch an exception can call that one instead. Personally I think
>> I'd always want to use the contains method instead of aa["key"].
>>
>> If you dont like the name contains you can use anything else i.e. find,
>> get, lookup
>
> Seems reasonable to me - though the name "contains" doesn't imply anything
> about setting values

True, I think 'get' is the best name.

> so maybe something like "trySet" would be better.

I think we should avoid the word 'set' it implies to me we're going to add something to the aa, which we're not in this case.

> Also
> I could see a "tryGet" function that looks up the key and if present sets
> the output variable and returns true and otherwise returns false.

That is the function I am proposing.. it seems I have not been making myself very clear. :)

Regan

>> > That way both STL and Java users will get a surprise!
>>
>> At least it's an immediate surprise and not one you don't realise until
>> later, as it was for Kris with the current behaviour.
>
> very true. subtle bugs can be very frustrating.
>
>> > eh, maybe I need another glass of wine. :-)
>>
>> Don't we all..
>>
>> Regan
>>
>> --
>> Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
>
>



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 28, 2004
On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan@netwin.co.nz> wrote:

> On Tue, 27 Jul 2004 22:03:17 -0700, Kris <someidiot@earthlink.dot.dot.dot.net> wrote:
>> 1) Oh, I misread your initial post. Sorry. So, when you say "returns
>> true/false and assigns a value if found" I guess you mean this is a kind of
>> "replace" function instead?
>
> No, no, no.. I'm just not making myself clear methinks.
>
> Two different functions, one:
>
>    bool contains(key_type key, value_type value);
>
> looks for key, returns true/false (existance of the value) and assigns value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here:

bool contains(key_type key, out value_type value);

> The other:
>
>    bool contains(key_type key, index_type at);

This should be:

bool contains(key_type key, out index_type at);

> looks for key, returns true/false (existance of the value) and gives an index to the value which you can use to:
> a) retrieve the value (if it exists)
> b) replace the value with a new one (if it exists)
> c) insert a value (if it does not exist)
>
>> And returns whether the operation took place? If
>> so, then that's very useful. Still leaves the inverse scenario open though,
>> where you want to retrieve the value of something from the container without
>> mutating (egad!) at all: a very common case. With the current syntax you
>> have to check first, then do the retrieval: two lookups.
>
> Not using my idea above :)
>
>> 2) The problem with those "insertion location" style APIs is one of
>> atomicity ... if something throws an exception in-between, then all bets are
>> off. Definately works well for some environments though.
>
> I realise this, I don't think it should ever mutate on read/lookup operations.
>
> Regan
>
>> "Regan Heath"  wrote...
>>
>>> The only case I can think of is a large struct. In which case you should
>>> probably be storing pointers to them in the AA. eg.
>>>
>>> struct Foo {
>>>    ..lots..
>>> }
>>>
>>> Foo*[char[]] aa;
>>> Foo* value;
>>>
>>> if (aa.contains("regan",value)) ..
>>>
>>> If you're not storing pointers then it's going to be slow/expensive to
>>> simply add/assign one to the aa in the first place.
>>>
>>
>>
>
>
>



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 28, 2004
Yep - that's where the confusion was Regan. Without the 'out' declaration, I was reading your "assignment" phrase as assigning /into/ the AA instead. Sorry about that: I clearly had far too much wine at the time :-)

And yes; that is a much better way to do than the strategy adopted by AA.

- Kris


"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsbu95dvy5a2sq9@digitalmars.com...
> On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan@netwin.co.nz>
wrote:
>
> > On Tue, 27 Jul 2004 22:03:17 -0700, Kris <someidiot@earthlink.dot.dot.dot.net> wrote:
> >> 1) Oh, I misread your initial post. Sorry. So, when you say "returns
> >> true/false and assigns a value if found" I guess you mean this is a
> >> kind of
> >> "replace" function instead?
> >
> > No, no, no.. I'm just not making myself clear methinks.
> >
> > Two different functions, one:
> >
> >    bool contains(key_type key, value_type value);
> >
> > looks for key, returns true/false (existance of the value) and assigns
> > value on true.
>
> by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here:
>
> bool contains(key_type key, out value_type value);
>
> > The other:
> >
> >    bool contains(key_type key, index_type at);
>
> This should be:
>
> bool contains(key_type key, out index_type at);
>
> > looks for key, returns true/false (existance of the value) and gives an
> > index to the value which you can use to:
> > a) retrieve the value (if it exists)
> > b) replace the value with a new one (if it exists)
> > c) insert a value (if it does not exist)
> >
> >> And returns whether the operation took place? If
> >> so, then that's very useful. Still leaves the inverse scenario open
> >> though,
> >> where you want to retrieve the value of something from the container
> >> without
> >> mutating (egad!) at all: a very common case. With the current syntax
you
> >> have to check first, then do the retrieval: two lookups.
> >
> > Not using my idea above :)
> >
> >> 2) The problem with those "insertion location" style APIs is one of
> >> atomicity ... if something throws an exception in-between, then all
> >> bets are
> >> off. Definately works well for some environments though.
> >
> > I realise this, I don't think it should ever mutate on read/lookup operations.
> >
> > Regan
> >
> >> "Regan Heath"  wrote...
> >>
> >>> The only case I can think of is a large struct. In which case you
> >>> should
> >>> probably be storing pointers to them in the AA. eg.
> >>>
> >>> struct Foo {
> >>>    ..lots..
> >>> }
> >>>
> >>> Foo*[char[]] aa;
> >>> Foo* value;
> >>>
> >>> if (aa.contains("regan",value)) ..
> >>>
> >>> If you're not storing pointers then it's going to be slow/expensive to simply add/assign one to the aa in the first place.
> >>>
> >>
> >>
> >
> >
> >
>
>
>
> --
> Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/


July 29, 2004
I've added the following function to the collection of "array helper" functions in mintl/array.d:

/** Query if an associative array contains a given key and if
 * so set the output value and return true and otherwise return
 * false without adding the key to the map or setting the value
 * parameter. This mechanism is faster than making two lookups:
 * one to see if the key is in the map and another to get the
 * value. Compiler-dependent.
 * \param x the array to query
 * \param key the key to look up
 * \param value the output value if any
 * \return true if the key is in the map and false otherwise
 */
template contains(K,V) {
  bool contains(V[K] x, K key, inout V value) {
    ...
  }
}
unittest {
  double[int] c;
  c[100] = 1.1;
  c[300] = 2.2;
  c[-100] = 3.3;
  double v;
  assert( contains!(int,double)(c,-100,v) );
  assert( v == 3.3 );
  assert( !contains!(int,double)(c,200,v) );
  assert( !(200 in c) );
}

I haven't added the second version of "contains" since the index_type
doesn't exist yet.
-Ben

Regan Heath wrote:

> On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan@netwin.co.nz> wrote:
> 
>> On Tue, 27 Jul 2004 22:03:17 -0700, Kris <someidiot@earthlink.dot.dot.dot.net> wrote:
>>> 1) Oh, I misread your initial post. Sorry. So, when you say "returns
>>> true/false and assigns a value if found" I guess you mean this is a
>>> kind of
>>> "replace" function instead?
>>
>> No, no, no.. I'm just not making myself clear methinks.
>>
>> Two different functions, one:
>>
>>    bool contains(key_type key, value_type value);
>>
>> looks for key, returns true/false (existance of the value) and assigns
>> value on true.
> 
> by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here:
> 
> bool contains(key_type key, out value_type value);
> 
>> The other:
>>
>>    bool contains(key_type key, index_type at);
> 
> This should be:
> 
> bool contains(key_type key, out index_type at);
> 
>> looks for key, returns true/false (existance of the value) and gives an
>> index to the value which you can use to:
>> a) retrieve the value (if it exists)
>> b) replace the value with a new one (if it exists)
>> c) insert a value (if it does not exist)
>>
>>> And returns whether the operation took place? If
>>> so, then that's very useful. Still leaves the inverse scenario open
>>> though,
>>> where you want to retrieve the value of something from the container
>>> without
>>> mutating (egad!) at all: a very common case. With the current syntax you
>>> have to check first, then do the retrieval: two lookups.
>>
>> Not using my idea above :)
>>
>>> 2) The problem with those "insertion location" style APIs is one of
>>> atomicity ... if something throws an exception in-between, then all
>>> bets are
>>> off. Definately works well for some environments though.
>>
>> I realise this, I don't think it should ever mutate on read/lookup operations.
>>
>> Regan
>>
>>> "Regan Heath"  wrote...
>>>
>>>> The only case I can think of is a large struct. In which case you
>>>> should
>>>> probably be storing pointers to them in the AA. eg.
>>>>
>>>> struct Foo {
>>>>    ..lots..
>>>> }
>>>>
>>>> Foo*[char[]] aa;
>>>> Foo* value;
>>>>
>>>> if (aa.contains("regan",value)) ..
>>>>
>>>> If you're not storing pointers then it's going to be slow/expensive to simply add/assign one to the aa in the first place.
>>>>
>>>
>>>
>>
>>
>>
> 
> 
> 

July 29, 2004
What ever happened to using '==' to test if something exists? I don't know, I guess I should leave the intelligent things to the smart people. When I see = then I assume that you want something to equal that value and not test for it. I always read in books that you use '==' instead of  '=' to test for something. Perhaps you can overload the '==' to see if something exists.
July 29, 2004
Ben Hinkle wrote:

> eh, maybe I need another glass of wine. :-)

You don't, that's exactly what Python does when you try to access and unexistent key on a dict:
-------------------------
>>> d = {}
>>> d["akey"] = "avalue"
>>> print d["akey"]
avalue
>>> print d["nokey"]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'nokey'
-------------------------

So the usual idiom to access an array that could or not could have a key is:

if "akey" in d:
    print d["akey"]

A nice method of python dicts is setdefault where you can specify a default value so if the key is not in the dictionary the default value is returned:

----------------------------------
>>> print d.setdefault("akey", "defaultvalue")
avalue
>>> print d.setdefault("nokey", "defaultvalue")
defaultvalue

July 29, 2004
Basically, I've found the STL-way of doing things - which is what D's AAs apes - painful 75% of the time, and useful 25% of the time. I've no doubt the same will apply to D's AAs.

It's especially painful in STL, since find has to be tested against end(), and then subject to (*??).second. Hateful
stuff.

Basically, there's no way to have your cake and eat it, either in C++ or D, when overloading operators for non built-in types (and even built-in types, so it seems!). (In my book - out in 6 weeks, not plugged for a while - I show how overloading the subscript operators for array types can never get full semantic conformance with built-in arrays. <g>)

If we want to be pedantically correct, I'd suggest that we abandon the subscript operators for associative arrays, but the logical correlation of that is that we then abandon built-in AAs (which I would also think is probably the best thing). But D's not really about doing what's pedantically correct, and it's sometimes tempting to think that it doesn't even do things that are sensibly correct.

If/when I can ever get DTL compiled, I shall henceforth use the Map template, which *is* able to support the distinction between opIndexAssign (creates a new element, or updates an existing one) and opIndex (which throws a NotFoundException exception if the element does not exist). However, I'm still not entirely comfortable given the fact that something so syntactically similar has such different semantics.

:-(

Matthew


"Kris" <someidiot@earthlink.dot.dot.dot.net> wrote in message news:ce3tul$1h6o$1@digitaldaemon.com...
> So, I've been happily thinking this whole time that associative arrays (AA) in D are rather cool; being built into the language and so on ... then today I spent several hours tracking down what seemed to be a truly bizarre bug, which turned out to be caused by the way D associative arrays operate. Here's what happened:
>
> I want to check if there's an entry in the AA; use it if so, otherwise continue. The documentation say to use this construct for checking:
>
> class X
> {
>      MyClass [char[]] aa;
>
>     void test ()
>     {
>          if ("hello" in aa)
>              ....
>      }
> }
>
> and then (presumably) you have to subsequently retrieve the entry as
> follows:
>
>     void test ()
>     {
>          if ("hello" in aa)
>             {
>             MyClass x = aa ["hello"];
>             // do something with x ...
>             }
>      }
>
> Of course, you've just performed two full lookups instead of one: first you look to see if the entry exists, then you lookup again to retrieve it. Being the freak that I am, I don't want that overhead. So ... I proceeded with this instead:
>
>     void test ()
>     {
>          MyClass x = aa ["hello"];
>
>          if (x)
>             {
>             // do something with x ...
>             }
>      }
>
> Much better! Right?
>
> Well, apparently not. Now clearly I did not assign anything to the AA in the above code, but an entry is created regardless. That is, AA entries are created as a /side-effect/ of doing that lookup. Now, I've used many hash-table implementations before, but this behavior is truly a first. Hands-up please, all those who would have expected this ...
>
> This sneaky behavior really wasted a whole lot of effort (I can almost hear Walter laughing in the background :-)
>
> That aside, one is forced to do a double lookup in such cases. This is not good at all, and I simply cannot fathom what benefit there is to such an approach. In my particular application, the double-lookup (and, of course, double hash-computation) is entirely unacceptable. Lookup is the most common operation on a hash-table, by far. It has to be efficient, and this "side-effect" totally flies in the face of reason. If it were a Phobos module it would have been replaced long ago.
>
> For the moment I have to assume the behavior is "by design", so fair-warning to everyone.
>
> (Ben? Do you have a proper hash-table I can use please? If not, I'll do an FNV port; AA's be damned. Out! Damn spot! Out, I say!)
>
>
>


July 30, 2004
On Thu, 29 Jul 2004 04:32:58 -0500, Gold Dragon <dragonwing@dragonu.net> wrote:

> What ever happened to using '==' to test if something exists? I don't know, I guess I should leave the intelligent things to the smart people. When I see = then I assume that you want something to equal that value and not test for it. I always read in books that you use '==' instead of   '=' to test for something. Perhaps you can overload the '==' to see if something exists.

If you do that, how do you compare two AA's with each other?

I don't mean identity eg. '===' or 'is' checking they refer to the same AA, I mean compare the actual AA. I admit this is perhaps a rare requirement, but still.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 30, 2004
On Thu, 29 Jul 2004 05:07:16 -0400, Ben Hinkle <bhinkle4@juno.com> wrote:

> I've added the following function to the collection of "array helper"
> functions in mintl/array.d:
>
> /** Query if an associative array contains a given key and if
>  * so set the output value and return true and otherwise return
>  * false without adding the key to the map or setting the value
>  * parameter. This mechanism is faster than making two lookups:
>  * one to see if the key is in the map and another to get the
>  * value. Compiler-dependent.
>  * \param x the array to query
>  * \param key the key to look up
>  * \param value the output value if any
>  * \return true if the key is in the map and false otherwise
>  */
> template contains(K,V) {
>   bool contains(V[K] x, K key, inout V value) {
>     ...
>   }
> }
> unittest {
>   double[int] c;
>   c[100] = 1.1;
>   c[300] = 2.2;
>   c[-100] = 3.3;
>   double v;
>   assert( contains!(int,double)(c,-100,v) );
>   assert( v == 3.3 );
>   assert( !contains!(int,double)(c,200,v) );
>   assert( !(200 in c) );
> }

Perfect :)

> I haven't added the second version of "contains" since the index_type
> doesn't exist yet.

I'm not sure an index_type makes sense for an AA, it does for other container types, but for an AA isn't the key the index type, or maybe the hashed value of the key is the index type?

Regan

> -Ben
>
> Regan Heath wrote:
>
>> On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan@netwin.co.nz>
>> wrote:
>>
>>> On Tue, 27 Jul 2004 22:03:17 -0700, Kris
>>> <someidiot@earthlink.dot.dot.dot.net> wrote:
>>>> 1) Oh, I misread your initial post. Sorry. So, when you say "returns
>>>> true/false and assigns a value if found" I guess you mean this is a
>>>> kind of
>>>> "replace" function instead?
>>>
>>> No, no, no.. I'm just not making myself clear methinks.
>>>
>>> Two different functions, one:
>>>
>>>    bool contains(key_type key, value_type value);
>>>
>>> looks for key, returns true/false (existance of the value) and assigns
>>> value on true.
>>
>> by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the
>> out variable 'value', this is a 'get a value' function, not a 'set a
>> value' function. It never mutates the aa. My function definition is
>> missing the 'out' here:
>>
>> bool contains(key_type key, out value_type value);
>>
>>> The other:
>>>
>>>    bool contains(key_type key, index_type at);
>>
>> This should be:
>>
>> bool contains(key_type key, out index_type at);
>>
>>> looks for key, returns true/false (existance of the value) and gives an
>>> index to the value which you can use to:
>>> a) retrieve the value (if it exists)
>>> b) replace the value with a new one (if it exists)
>>> c) insert a value (if it does not exist)
>>>
>>>> And returns whether the operation took place? If
>>>> so, then that's very useful. Still leaves the inverse scenario open
>>>> though,
>>>> where you want to retrieve the value of something from the container
>>>> without
>>>> mutating (egad!) at all: a very common case. With the current syntax you
>>>> have to check first, then do the retrieval: two lookups.
>>>
>>> Not using my idea above :)
>>>
>>>> 2) The problem with those "insertion location" style APIs is one of
>>>> atomicity ... if something throws an exception in-between, then all
>>>> bets are
>>>> off. Definately works well for some environments though.
>>>
>>> I realise this, I don't think it should ever mutate on read/lookup
>>> operations.
>>>
>>> Regan
>>>
>>>> "Regan Heath"  wrote...
>>>>
>>>>> The only case I can think of is a large struct. In which case you
>>>>> should
>>>>> probably be storing pointers to them in the AA. eg.
>>>>>
>>>>> struct Foo {
>>>>>    ..lots..
>>>>> }
>>>>>
>>>>> Foo*[char[]] aa;
>>>>> Foo* value;
>>>>>
>>>>> if (aa.contains("regan",value)) ..
>>>>>
>>>>> If you're not storing pointers then it's going to be slow/expensive to
>>>>> simply add/assign one to the aa in the first place.
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/