March 08, 2005
Mike Parker wrote:

> Maybe I'm missing something, but the only way in D for a dynamic array to grow or shrink is to set the length property, correct? That means the last index is always known - the index (n + 1) always points to an area of memory beyond the end of the array.

You can also implicitly set .length, by appending stuff with ~=
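
A minimal sketch of both ways of growing an array, assuming a reasonably current D compiler (the variable names are only for illustration):

```d
void main()
{
    int[] a;
    a.length = 2;      // explicit resize; the new elements are int.init (0)
    a ~= 7;            // append; .length grows to 3 implicitly

    assert(a.length == 3);
    assert(a == [0, 0, 7]);
}
```

Either way the last valid index is still a.length - 1, so the bounds are always known.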

> I said in another post that the [] used by associative arrays causes people to view them in the same light as normal arrays. From this perspective, it's easy to draw the conclusion that aa["missing key"] is invalid and exceptional.

It's also a sense of how you view variables that have not been set...
Are they implicitly set to a known usable value, or are they off limits?


int i;

// what is the value of i ? Is it 0 (int.init), or is it
// TryingToUseAnUninitializedVariableException ?


int[] a;
a.length = 10;

a[0] = 1;
a[1] = 2;
a[2] = 3;

// what is the value of a[3] ? Is it 0 (int.init), or is it
// TryingToUseAnUninitializedVariableException ? the index
// is within the bounds, [0..9], so it's not OutOfBounds


int[int] aa;

aa[0] = 1;
aa[1] = 2;
aa[2] = 3;

// what is the value of aa[3] ? :) are the "bounds" of the hash
// the current keys [0,1,2], or every possible key (int) ?
// depending on the definition, it *might* be OutOfBounds...


Currently, all of these return 0 (that is: the .init value
for variables and arrays, and a zero-filled value for AAs)

It's just that the last one above, with the AA, also has a nasty
*side effect* of also adding the key that was used for lookup ?

And that is why it was reported as a bug in D. (several times)
Having it throw an exception would NOT be a good way to fix it.


Removing the side effect would be enough. I have not heard of
anyone actually *using* this side effect, to set the value...

>         // Not found, create new elem
>         //printf("create new one\n");
>         e = cast(aaA *) cast(void*) new byte[aaA.sizeof + keysize + valuesize];
>         memcpy(e + 1, pkey, keysize);
>         e.hash = key_hash;
>         *pe = e;

Note that it doesn't even use the .init value, just zeroes...
Which means that char's get 0x00 and not 0xFF, floats here
get 0.0f instead of float.nan, and so on... Which sucks (too).

My thoughts: (still)
Please make it use .init, and *avoid* creating new elements ?
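
What I am asking for can be sketched with the pointer that 'in' returns. This is only an illustration: lookupOrInit is a hypothetical helper name, not something in the language or the library, and the sketch assumes a compiler with implicit function template instantiation.

```d
import std.math : isNaN;

// Hypothetical helper: returns the stored value, or V.init when the key
// is absent, without ever inserting the key into the table.
V lookupOrInit(K, V)(V[K] aa, K key)
{
    V* p = key in aa;          // pointer to the value, or null if missing
    return p ? *p : V.init;
}

void main()
{
    float[string] prices;
    prices["tea"] = 1.5f;

    float missing = lookupOrInit(prices, "coffee");
    assert(isNaN(missing));        // float.init is NaN, not 0.0f
    assert(prices.length == 1);    // the lookup did not add "coffee"
    assert(lookupOrInit(prices, "tea") == 1.5f);
}
```

The float case shows why .init matters here: a zero-filled slot would hand back 0.0f instead.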

--anders
March 08, 2005
Aye. Mango avoids all this for the most part via a library-based HashMap instead. I remain one of the principal detractors of the built-in AA, for all kinds of reasons. The only thing going for the latter is the way in which it avoids the need for assignment casts (cos' the compiler knows the types already). That certainly has /some/ value, but it's not clear just how much.


Anders F Björklund wrote:
> bamb00 wrote:
> 
>>>> Javascript does it just that way, too.
>>>
>>>
>>> Does what ? Set keys on lookup ? No, it doesn't.
> 
> 
>> C++'s map does it for sure , why I have no idea, seems like a horrible idea
>> to me as well.  Just wanted to throw my vote in the 'No AA Writing on
>> lookup' camp.
> 
> 
> Throwing exceptions on missing keys is *equally* bad,
> in my opinion. It would also stop me from using hashes
> the way that I am used to, and I'd have to continue with
> the workaround that's currently needed due to setting.
> 
> I guess it boils down to whether you consider an
> empty array to be full of valid lookups, or not...
> To me, a dynamic array is full of .init values
> so then it makes sense that associative arrays
> should also be full of .init values as well.
> 
> 
> In the end, I'll just continue to write code like today.
> value = (key in hash) ? hash[key] : null;
> 
> It's also the only form that has survived for a while,
> even if it does do a double lookup in the hash table.
> (but actually using "key.init" instead of null above
> does not work, due to a horrible init-related bug)
> 
> --anders
March 08, 2005
On Tue, 08 Mar 2005 17:10:04 +0900, Mike Parker <aldacron71@yahoo.com> wrote:
> Matthew wrote:
>
>> Same here. And trying to get a value out of an associative container that does not exist is exactly that, exceptional. (Or it certainly should be.)
>
> I just can't agree with this. A missing key is simply a missing key. See below.

I'm with Matthew on this one. Assuming there is a method:
  bool contains(KeyType key, out ValueType value);

that returns true/false and sets value when found.

Then array["key"] makes an assumption (that the key exists), and if it's false, that is an exception.

>> Absolute nonsense.
>>
>> I'm bugging out now, because I fear we're so far apart that there's no point wasting our breaths. :-(
>
> Too bad, because I think this is something that needs to be discussed. Associative arrays are an integral part of the language that will be used frequently. And maybe the fact that you and I are so far apart indicates that other people are going to have extremely different views on the issue as well.
>
> The problem, as I see it, is twofold. First, is the [] operator. This causes people to view aa's in the same light as normal arrays. From that perspective, I can understand how aa["missing key"] can be construed as functionally equivalent to array[out_of_bounds_index].

This is not why I want [] to throw an exception.

> I see the former as being more like a substitute for Hashmap.get("missing key") in Java, which always returns null and which I have never heard anyone argue should be throwing an exception instead.

The two are similar in that they're both implementations of the same concept or idea, however the Java HashMap is limited in scope to objects, which is why it can return null. The same is not true for D's AA's.

> The second problem is that aas allow any sort of value to be stored. If it allowed only class objects, then we wouldn't need the pointer syntax which 'in' currently returns (which seems to be the bit that ignited the discussion in the first place). But that's nasty. I surely don't want to wrap my struct instances and integrals in a class just to put them in an aa.

Indeed.

> In my opinion, aas should function thusly:
>
> boolean contains = (key in aa);
> int* val = aa["key"]; // return null if missing
>
> I don't see a problem with using pointers as return values, as it eliminates the requirement that all aa values be objects, and pointers are a part of the language anyway.

That is one solution, however I think a much better solution is:
  bool contains(KeyType key, out ValueType value);

1. no pointers.
2. can get and/or check for a value in one operation.
3. reads well:

if (array.contains("bob",value)) {
}

given that, I believe [] should throw an exception.

> Perhaps I'm wrong viewing associative arrays in the same light as Java's Hashmaps. But in my mind it's a natural way to look at it. What else are they if not hashmaps?

I agree, they're the same in concept, just not in implementation.

Regan

March 08, 2005
On Tue, 08 Mar 2005 17:16:03 +0900, Mike Parker <aldacron71@yahoo.com> wrote:
> Ben Hinkle wrote:
>
>>> I disagree. An out of bounds array index is an exceptional case because the size of the array is, usually, a known quantity - i.e. all members of the set of numbers from 0...n are valid, and n *must* be known during the allocation of the array.
>>
>> Dynamic arrays grow and shrink all the time. Same thing with adding and removing
>> keys from an AA. The only difference (conceptually) is that dynamic arrays have
>> a continuous block of integer keys. But I don't see why that matters.
>
> Maybe I'm missing something, but the only way in D for a dynamic array to grow or shrink is to set the length property, correct? That means the last index is always known - the index (n + 1) always points to an area of memory beyond the end of the array.
>
> I said in another post that the [] used by associative arrays causes people to view them in the same light as normal arrays. From this perspective, it's easy to draw the conclusion that aa["missing key"] is invalid and exceptional. But if D had a hashmap class instead, would hashmap.get("missing key") still be viewed as an exceptional case?

Yes or No. :)

If we assume the hashmap class has (at least) these methods:

class hashmap(KeyType,ValueType) {
  //to get value
  ValueType get(KeyType key) {..}

  //to query existence and get value
  bool contains(KeyType key, out ValueType value) {..}

  //to query existence.
  bool contains(KeyType key) {..}
}

One could argue for 'get' throwing an exception, because there is a 'contains' method which you should use if you're not 100% certain the key exists.

One could argue for 'get' returning type.init, because there is a 'contains' method which can be used if it's important whether it exists or not.
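
To make the two arguments concrete, here is a rough sketch of both policies side by side. The class and the names getStrict, getOrInit and set are hypothetical, chosen only so the two behaviours can sit in one class; it assumes a current D compiler and Phobos.

```d
import std.exception : assertThrown;

// Hypothetical wrapper class, only to contrast the two policies for 'get'.
class HashMap(K, V)
{
    private V[K] table;

    void set(K key, V value) { table[key] = value; }

    // Policy 1: a missing key is an error.
    V getStrict(K key)
    {
        V* p = key in table;
        if (p is null)
            throw new Exception("key not found");
        return *p;
    }

    // Policy 2: a missing key yields V.init.
    V getOrInit(K key)
    {
        V* p = key in table;
        return p ? *p : V.init;
    }
}

void main()
{
    auto map = new HashMap!(string, int);
    map.set("bob", 1);

    assert(map.getStrict("bob") == 1);
    assert(map.getOrInit("alice") == 0);          // int.init
    assertThrown!Exception(map.getStrict("alice"));
}
```

Neither version inserts the key on lookup, which is the part everyone seems to agree on.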

Truth be told, I would be happy with either method, so long as I get my 'contains' method in the form:
  bool contains(KeyType key, out ValueType value) {..}

Regan

p.s. This is assuming nothing changes WRT char[].init stopping me from telling an "empty" string apart from an "undefined" string, as I think this is important.
March 08, 2005
kris wrote:

> Aye.
> Mango avoids all this for the most part via a library-based HashMap instead.

Mango uses a library-based String too, neatly avoiding the built-ins...
Cheater. ;-)

--anders

PS. Seriously, it seems to be working for Java so it can't be all bad.
March 08, 2005
On Tue, 08 Mar 2005 09:48:43 +0100, Anders F Björklund <afb@algonet.se> wrote:
> Currently, all of these return 0 (that is: the .init value
> for variables and arrays, and a zero-filled value for AAs)
>
> It's just that the last one above, with the AA, also has a nasty
> *side effect* of also adding the key that was used for lookup ?

I think it's nasty also.

> And that is why it was reported as a bug in D. (several times)
> Having it throw an exception would NOT be a good way to fix it.

I think throwing an exception is better than returning type.init.

I think it's better because it will highlight an invalid assumption immediately instead of propagating the bug further down the source to another location where you get unexpected (but consistent, thanks to type.init) behaviour.

But wait, you say, it's not an invalid assumption, I _want_ the value _or_ type.init, to which I answer, use 'contains':

aa.contains("bob",value);

value is an 'out' parameter, and will be set to type.init.

As long as 'contains' exists I will be happy.

bool contains(KeyType key, out ValueType value);
bool contains(KeyType key);

Regan
March 08, 2005
On Tue, 08 Mar 2005 08:21:39 +0100, Anders F Björklund <afb@algonet.se> wrote:
> bamb00 wrote:
>
>>>> Javascript does it just that way, too.
>>>
>>> Does what ? Set keys on lookup ? No, it doesn't.
>
>> C++'s map does it for sure , why I have no idea, seems like a horrible idea
>> to me as well.  Just wanted to throw my vote in the 'No AA Writing on
>> lookup' camp.
>
> Throwing exceptions on missing keys is *equally* bad,
> in my opinion. It would also stop me from using hashes
> the way that I am used to, and I'd have to continue with
> the workaround that's currently needed due to setting.
>
> I guess it boils down to whether you consider an
> empty array to be full of valid lookups, or not...
> To me, a dynamic array is full of .init values
> so then it makes sense that associative arrays
> should also be full of .init values as well.
>
>
> In the end, I'll just continue to write code like today.
> value = (key in hash) ? hash[key] : null;

Or you could simply write:
  contains(key,value);

as the 'out' param value will be set to type.init (which is null for arrays, objects, etc.)

> It's also the only form that has survived for a while,
> even if it does do a double lookup in the hash table.
> (but actually using "key.init" instead of null above
> does not work, due to a horrible init-related bug)

Has Walter fixed the "horrible init-related bug" yet?

Regan
March 08, 2005
Regan Heath wrote:
>> (but actually using "key.init" instead of null above
>> does not work, due to a horrible init-related bug)
> 
> Has Walter fixed the "horrible init-related bug" yet?

Maybe... :-) I couldn't get it to reproduce, but anyway it
was about key getting initialized when *reading* key.init ?

-anders
March 08, 2005
Regan Heath wrote:

> I think throwing an exception is better than returning type.init.

Maybe, but remember that you can write non-OOP code in D too... :-)

And I think such an Exception would be better off hidden in a
HashMap class library, instead of in the core language spec ?

> But wait, you say, it's not an invalid assumption, I _want_ the value _or_  type.init, to which I answer, use 'contains':
> 
> aa.contains("bob",value);
> 
> value is an 'out' parameter, and will be set to type.init.
> 
> As long as 'contains' exists I will be happy.
> 
> bool contains(KeyType key, out ValueType value);
> bool contains(KeyType key);

But that's how it works now ? Just using a pointer,
instead of: a "boolean" (bit) and an out reference.

bool contains(KeyType key)
{
  return cast(bit) (key in hash);
}

bool contains(KeyType key, out ValueType value)
{
  ValueType* pointer = key in hash;
  value = pointer ? *pointer : ValueType.init;
  return cast(bit) pointer;
}

private ValueType[KeyType] hash;

--anders
March 08, 2005
On Tue, 08 Mar 2005 10:33:06 +0100, Anders F Björklund <afb@algonet.se> wrote:
> Regan Heath wrote:
>
>> I think throwing an exception is better than returning type.init.
>
> Maybe, but remember that you can write non-OOP code in D too... :-)

Sure, however Exceptions are the proposed/recommended "D Error Handling Solution"
  http://www.digitalmars.com/d/errors.html

> And I think such an Exception would be better off hidden in a
> HashMap class library, instead of in the core language spec ?

Why? What's the difference between a hashmap class library and built in AA's?

>> But wait, you say, it's not an invalid assumption, I _want_ the value _or_  type.init, to which I answer, use 'contains':
>>
>> aa.contains("bob",value);
>>
>> value is an 'out' parameter, and will be set to type.init.
>>
>> As long as 'contains' exists I will be happy.
>>
>> bool contains(KeyType key, out ValueType value);
>> bool contains(KeyType key);
>
> But that's how it works now ? Just using a pointer,
> instead of: a "boolean" (bit) and an out reference.

I don't want to use pointers (I can, I just don't want to). I want 'contains' built into AA's. Failing that I want implicit template instantiation and the array method feature, so I can write a template to add contains to AA's myself.

In short I want to be able to say:

if (aa.contains("bob",value)) {
}
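
Something like the following is what I have in mind. It is a sketch only, and it assumes a compiler that has both features I asked for above (implicit template instantiation, and calling a free function with method syntax on an AA); 'contains' here is my own hypothetical template, not a built-in member.

```d
// Hypothetical free-function template; 'contains' is not a built-in AA member.
bool contains(K, V)(V[K] aa, K key, out V value)
{
    V* p = key in aa;      // null when the key is missing
    if (p is null)
        return false;      // 'out' has already reset value to V.init
    value = *p;
    return true;
}

void main()
{
    int[string] ages = ["bob": 42];
    int value;

    assert(ages.contains("bob", value) && value == 42);
    assert(!ages.contains("alice", value) && value == 0);
    assert(ages.length == 1);      // nothing was inserted by the lookups
}
```

The pointer is still there, but it is hidden inside the template, so the calling code never sees it.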

Regan