Jump to page: 1 24  
Page
Thread overview
Access violation with AA's
Oct 28, 2005
Kris
Oct 29, 2005
Walter Bright
Oct 29, 2005
Kris
Oct 30, 2005
Regan Heath
Oct 30, 2005
Regan Heath
Oct 31, 2005
Bruno Medeiros
Nov 05, 2005
Walter Bright
Nov 05, 2005
kris
Nov 05, 2005
Walter Bright
Nov 06, 2005
kris
Nov 06, 2005
Derek Parnell
Nov 06, 2005
kris
Nov 06, 2005
Derek Parnell
Nov 06, 2005
kris
Nov 06, 2005
Regan Heath
Nov 06, 2005
Sean Kelly
Nov 06, 2005
Derek Parnell
Nov 06, 2005
kris
Nov 07, 2005
kris
Nov 08, 2005
Kris
Nov 05, 2005
Walter Bright
Nov 06, 2005
Sean Kelly
Nov 06, 2005
Regan Heath
Nov 06, 2005
Derek Parnell
Nov 07, 2005
Bruno Medeiros
Nov 07, 2005
kris
Nov 07, 2005
Regan Heath
Nov 07, 2005
Ameer Armaly
Oct 29, 2005
Kris
Oct 31, 2005
David Medlock
Oct 31, 2005
Kris
Oct 30, 2005
Ben Hinkle
Oct 30, 2005
kris
October 28, 2005
class Foo
{
        private Record [char[]] map;

        static class Record
        {
                void write (Foo parent, void[] data) {}
        }

        synchronized void put (char[] key, void[] data)
        {
                /****** access violation here *****/
                Record  r = map [key];

                if (r is null)
                   {
                   r = new Record ();
                   map [key] =  r;
                   }
                r.write (this, data);
        }
}


void main()
{
        Foo f = new Foo;
        f.put ("foo", new void[10]);
}

# Error: Access Violation

Depending upon where (in the code body) the dereference of 'map' occurs, one gets either an access-violation or an OutOfBounds exception. Fragile.

However ~ this, again, casts a shadow upon the AA implementation. I mean, throwing an exception in this case is surely dubious (a missing entry in an AA is *not* exceptional; often it is the norm). Further compare these two implementation of the above code:

#1
        synchronized void put (char[] key, void[] data)
        {
                Record  r = map [key];

                if (r is null)
                   {
                   r = new Record ();
                   map [key] =  r;
                   }
                r.write (this, data);
        }

#2
        synchronized void put (char[] key, void[] data)
        {
                Record  *r = key in map;

                if (r is null)
                   {
                   Record rr = new Record();
                   map [key] =  rr;
                   r = &rr;
                   }
                (*r).write (this, data);
        }


#2 is the way one is apparently expected to code; forcing one into using pointers. Is this a good thing? For something as rudimentary as a HashMap? Alternatively, one could do this:

#3
        synchronized void put (char[] key, void[] data)
        {
                Record r;

                try {
                     r = map [key];
                     } catch (OutOfBoundsException e)
                                 {
                                  r = new Record ();
                                  map [key] =  r;
                                 }
                r.write (this, data);
        }

While some might argue this is "better than #1", it has the side effect of being rather slow; and is rather less clear than #1, in my view. Again, not something one needs for a rudimentary container. I think this particular change to AA's is just flat-out bogus (and, none of this would be necessary if AA's were implemented as a template library).

Finally, any code written for the "old" implementation of AA's is broken. Turns out that Mango has quite a few cases similar to #1; each one now broken. Wonderful!

For Mango, neither # 2 or #3 are attractive options; #1 is clearly (IMO) the simplest and most in line with what D used to represent. What is one supposed to do?



October 29, 2005
"Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
>  I think this particular change to AA's is just flat-out bogus

Nobody had a nice word to say about the original implementation.

Just write it as:

    if (!(key in map))
        map[key] = new Record;
    r = map[key];
    ...

No, it isn't as efficient as the old way. But, like I said, the old way was called lots of unkind things <g>.

It is possible that a future compiler may recognize the above as an idiom and rewrite it so the array lookup is done only once.


October 29, 2005
Hi, Walter ~

I haven't been at all shy in the past in terms of criticizing the multiple lookups required for AA usage. Nor for some of the other "deficiencies" (as I imagine you're referring to).

However, what you suggest below /minimally/ requires two lookups, and three on an insert <g>. Your changes just made the performance concern notable worse than it was, along with breaking the existing code-base. The compiler optimization you speak of does not exist, and really should not need to.

Can we step back a moment, please?

One issue here is the Access Violation (why this is posted in the bugs section rather than in the main forum), although this new behaviour of throwing an exception is, I think, highly questionable. How did that get by the wolves in the first place? <g>

For example, I quite often use an AA to identify things as 'special' ~ URL schemes for example. If the scheme is not in the AA then it ain't special. The missing case is /not/ exceptional; instead it is actually the norm; I certainly don't wish to be catching exceptions for the normal case (from either a semantic or performance perspective). Nor do I wish to use pointers for such usual, simplistic, cases.

OTOH, what you did with the 'in' keyword and pointers improved that aspect of it ~ if one wants to eliminate a potential double-lookup then one can use the pointer syntax. Good!

The problem here is that, at the same time, you changed the semantics of a simple lookup such that it now either requires pointer-syntax, the overhead of try/catch, or yet another lookup. I think that was a mistake, and am not too shy to say so :-)

With respect, I think this sets a rather poor precedent for D beginners ~ as a questionable example of exception usage, and the associated added complexity or overhead of the current AA lookup model (or, alternatively, the required use of pointers).

Let's not forget there's an access-violation here too. Just compile that example and run it.

Lastly, I /do/ actually have a kind word to say about the original implementation: other than the potential double-lookup, it was fast, and it was simple. I still think AA's could/should have been handled via templates when they came along, and could therefore have been treated as a library utility rather than being built into the compiler itself. Regardless, the usage model is now arguably slower and more complex than before ~ largely negating the effort of placing AA's within the compiler in the first place. IMO.

Regards;





"Walter Bright" <newshound@digitalmars.com> wrote in message news:djuj2d$2vvv$1@digitaldaemon.com...
>
> "Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
>>  I think this particular change to AA's is just flat-out bogus
>
> Nobody had a nice word to say about the original implementation.
>
> Just write it as:
>
>    if (!(key in map))
>        map[key] = new Record;
>    r = map[key];
>    ...
>
> No, it isn't as efficient as the old way. But, like I said, the old way
> was
> called lots of unkind things <g>.
>
> It is possible that a future compiler may recognize the above as an idiom and rewrite it so the array lookup is done only once.
>
> 


October 29, 2005
Ah. I just remembered that AA lookup's would insert an empty entry if one was not there already ... that behaviour has been exchanged for throwing an exception instead. Bleah.  As for idioms (mentioned below), in this case they can be optimized by the API rather than by the compiler.

I believe there is a way, within D, to elegantly resolve these ongoing issues ... if you'd be open to change, then I'd be happy to make some suggestions <g>

- Kris


"Walter Bright" <newshound@digitalmars.com> wrote in message news:djuj2d$2vvv$1@digitaldaemon.com...
>
> "Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
>>  I think this particular change to AA's is just flat-out bogus
>
> Nobody had a nice word to say about the original implementation.
>
> Just write it as:
>
>    if (!(key in map))
>        map[key] = new Record;
>    r = map[key];
>    ...
>
> No, it isn't as efficient as the old way. But, like I said, the old way
> was
> called lots of unkind things <g>.
>
> It is possible that a future compiler may recognize the above as an idiom and rewrite it so the array lookup is done only once.
>
> 


October 30, 2005
Just to add my 2c here as well.

I disliked the original AA and array behaviour because it inserted when you did a lookup.

I was a supporter of the "make it an exception" idea because to me the statement "val = aa["key"]; says get me the value for "key". However it seems that in practice, where we use AA's it's more typical that we're saying "have we got a value for "key" (as Kris mentioned). It is for this reason I have come to dislike the exception.

The trouble as I see it is that the the compiler cannot know in each case whether a missing item is exceptional and this is because we have no way of telling it. We've got one way to ask for an item "item = aa[index]" and that's it.

The solution IMO, something I have been an advocate of for some time now, is adding different ways to ask for items. The first and most relevant here is a "contains" method, eg.

bool contains(VALUE[KEY] a);
bool contains(VALUE[KEY] a, out VALUE v);

The first from essentiall does what "in" does. The second does what "in" does but assigns the value to 'v'.

Sure, I can and have written a template that uses 'in' to achieve these, but it seems that something this useful should be part of the default built-in array handling so that everyone has access to it. At the very least it should be part of the standard library.

Here is a list of things I can imagine a programmer wanting to do on lookup of an item, ideally all should be supported in the most efficient manner possible i.e. no double lookup/hash.

 - check for item (i.e. if ("a" in AA))
 - check for item
   - if exists, get value
   - if not exists, error
   - if not exists, add 'this one' (or .init value)

(feel free to add to this list)

My feeling is that we handle each of the above like so:

"check for item"
  if ("key" in AA) {}
  if (AA.contains("key")) {}

"check for item, if exists get value"
  if (AA.contains("key",val)) {}

"check for item, if not exists, error"
  val = AA["key"];
  val = AA.get("key");

So, contains is added (get is added), the rest remains as it is now.

The tricky one seems to be:

"check for item, if not exists, add 'this one' (or .init value)"

perhaps?

val = <value to insert>;
AA.getset("key",val);

so val would end up being the existing value, or the new inserted value. If you wanted to tell if there was an existing value you could keep a copy i.e.

nval = val = <value to insert>;
AA.getset("key",val);

if (nval is val) { //new value was inserted }
else { //we had an existing value }

A better name than "getset" can likely be found.

Regan
October 30, 2005
> My feeling is that we handle each of the above like so:
>
> "check for item"
>    if ("key" in AA) {}
>    if (AA.contains("key")) {}
>
> "check for item, if exists get value"
>    if (AA.contains("key",val)) {}
>
> "check for item, if not exists, error"
>    val = AA["key"];
>    val = AA.get("key");

If it really bothers people that "val = AA["key"];" throws an exception then perhaps it could return the .init value and the explicit "get" call could throw the exception. I don't think it matters too much as I can see myself using "contains" in most cases.

I dislike "val = AA["key"];" returning .init if there is no _other_ method (i.e. "contains") to get a value which can tell me whether the item did in fact exist or not. Example:

int[char[]] AA;
int v;

v = AA["test"];

the .init value for int is 0, so if v == 0 after this call we do not know whether it existed and was 0 or didn't exist at all.

Regan



Regan
October 30, 2005
"Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
> class Foo
> {
>        private Record [char[]] map;
>
>        static class Record
>        {
>                void write (Foo parent, void[] data) {}
>        }
>
>        synchronized void put (char[] key, void[] data)
>        {
>                /****** access violation here *****/
>                Record  r = map [key];
>
>                if (r is null)
>                   {
>                   r = new Record ();
>                   map [key] =  r;
>                   }
>                r.write (this, data);
>        }
> }
>
>
> void main()
> {
>        Foo f = new Foo;
>        f.put ("foo", new void[10]);
> }
>
> # Error: Access Violation

FWIW, MinTL HashAA returns a user-settable missing value on invalid keys. The default missing value is value.init. So you code above would have worked if map was a HashAA!(char[],Record). HashAA also has more methods that tweek things like "contains" and "take". It also supports sorting - by default elements are sorted by insertion order.

-Ben


October 30, 2005
That all sounds cool, Ben.

Perhaps the core problem with AAs is that they're just trying to do too much with the [] syntax? Perhaps if AAs supported a contains(key, inout value) property, as Regan noted a year or more ago, then it would be more palatable and useful?

On the other hand, templates are more than adequate for handling such things; as you've proved with MinTL. Removing AAs would simplify the compiler also (obviously). If the template syntax were deemed too complicated for new users, perhaps the compiler could provide some generic sugar for 'special' templates, instead of all the AA specific code? Perhaps some kind of alias (or variation thereupon) might be sufficiently sugary?

- Kris


Ben Hinkle wrote:
> "Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
> 
>>class Foo
>>{
>>       private Record [char[]] map;
>>
>>       static class Record
>>       {
>>               void write (Foo parent, void[] data) {}
>>       }
>>
>>       synchronized void put (char[] key, void[] data)
>>       {
>>               /****** access violation here *****/
>>               Record  r = map [key];
>>
>>               if (r is null)
>>                  {
>>                  r = new Record ();
>>                  map [key] =  r;
>>                  }
>>               r.write (this, data);
>>       }
>>}
>>
>>
>>void main()
>>{
>>       Foo f = new Foo;
>>       f.put ("foo", new void[10]);
>>}
>>
>># Error: Access Violation
> 
> 
> FWIW, MinTL HashAA returns a user-settable missing value on invalid keys. The default missing value is value.init. So you code above would have worked if map was a HashAA!(char[],Record). HashAA also has more methods that tweek things like "contains" and "take". It also supports sorting - by default elements are sorted by insertion order.
> 
> -Ben 
> 
> 
October 31, 2005
Regan Heath wrote:
> Just to add my 2c here as well.
> 
> I disliked the original AA and array behaviour because it inserted when  you did a lookup.
> 
> I was a supporter of the "make it an exception" idea because to me the  statement "val = aa["key"]; says get me the value for "key". However it  seems that in practice, where we use AA's it's more typical that we're  saying "have we got a value for "key" (as Kris mentioned). It is for this  reason I have come to dislike the exception.
> 
> The trouble as I see it is that the the compiler cannot know in each case  whether a missing item is exceptional and this is because we have no way  of telling it. We've got one way to ask for an item "item = aa[index]" and  that's it.
> 
> The solution IMO, something I have been an advocate of for some time now,  is adding different ways to ask for items. The first and most relevant  here is a "contains" method, eg.
> 
> bool contains(VALUE[KEY] a);
> bool contains(VALUE[KEY] a, out VALUE v);
> 
> The first from essentiall does what "in" does. The second does what "in"  does but assigns the value to 'v'.
> 
> Sure, I can and have written a template that uses 'in' to achieve these,  but it seems that something this useful should be part of the default  built-in array handling so that everyone has access to it. At the very  least it should be part of the standard library.
> 
> Here is a list of things I can imagine a programmer wanting to do on  lookup of an item, ideally all should be supported in the most efficient  manner possible i.e. no double lookup/hash.
> 
>  - check for item (i.e. if ("a" in AA))
>  - check for item
>    - if exists, get value
>    - if not exists, error
>    - if not exists, add 'this one' (or .init value)
> 
> (feel free to add to this list)
> 
> My feeling is that we handle each of the above like so:
> 
> "check for item"
>   if ("key" in AA) {}
>   if (AA.contains("key")) {}
> 
> "check for item, if exists get value"
>   if (AA.contains("key",val)) {}
> 
> "check for item, if not exists, error"
>   val = AA["key"];
>   val = AA.get("key");
> 
> So, contains is added (get is added), the rest remains as it is now.
> 
> The tricky one seems to be:
> 
> "check for item, if not exists, add 'this one' (or .init value)"
> 
> perhaps?
> 
> val = <value to insert>;
> AA.getset("key",val);
> 
> so val would end up being the existing value, or the new inserted value.  If you wanted to tell if there was an existing value you could keep a copy  i.e.
> 
> nval = val = <value to insert>;
> AA.getset("key",val);
> 
> if (nval is val) { //new value was inserted }
> else { //we had an existing value }
> 
> A better name than "getset" can likely be found.
> 
> Regan
I strongly agree with all of this. I just want to add that we should also have a method to set/add a new pair. Like:

  AA.set("key",val);  // or AA.add("key",val);

I find the current array usage syntax (and the previous one too, for that matter), quite strange and unnatural.


-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
October 31, 2005
Kris wrote:
> Ah. I just remembered that AA lookup's would insert an empty entry if one was not there already ... that behaviour has been exchanged for throwing an exception instead. Bleah.  As for idioms (mentioned below), in this case they can be optimized by the API rather than by the compiler.
> 
> I believe there is a way, within D, to elegantly resolve these ongoing issues ... if you'd be open to change, then I'd be happy to make some suggestions <g>
> 
> - Kris
> 
> 
> "Walter Bright" <newshound@digitalmars.com> wrote in message news:djuj2d$2vvv$1@digitaldaemon.com...
> 
>>"Kris" <fu@bar.com> wrote in message news:dju31o$2he6$1@digitaldaemon.com...
>>
>>> I think this particular change to AA's is just flat-out bogus
>>
>>Nobody had a nice word to say about the original implementation.
>>
>>Just write it as:
>>
>>   if (!(key in map))
>>       map[key] = new Record;
>>   r = map[key];
>>   ...
>>
>>No, it isn't as efficient as the old way. But, like I said, the old way was
>>called lots of unkind things <g>.
>>
>>It is possible that a future compiler may recognize the above as an idiom
>>and rewrite it so the array lookup is done only once.
>>
>>
> 

I will maintain my position that we should simply have a couple of builtins:

Previous thread:
http://www.digitalmars.com/d/archives/digitalmars/D/26554.html

The builtin methods would handle all cases pretty easily:

 bool get( in key, out value )
 bool insert( in key, in value, out oldvalue )


int[ char[] ] 	map;

int n;
if ( map.get( "Hello" , n ) ) { ..do something with n.. }
else { ... no value in n ..  }

if ( map.insert( "Hello", 200, n ) ) { .. previous value in n.. }
else { .. no previous value was in map.. }

Built in methods are overloadable whereas 'in' is not.

-DavidM
« First   ‹ Prev
1 2 3 4