Thread overview
AA strange behavior
Jun 12, 2006
marc michel
Jun 12, 2006
Oskar Linde
Jun 12, 2006
lanael
Jun 13, 2006
Carlos Santander
Jun 13, 2006
Georg Wrede
Jun 13, 2006
Oskar Linde
Jun 13, 2006
Sean Kelly
Jun 13, 2006
Oskar Linde
June 12, 2006
I found some strange behavior using AAs:

--------------------- >8 ----------------
import std.stdio;
import std.stream;

void main() {

    int[][char[]] aa;

    int i;
    char[] s;
    File f=new File("bla", FileMode.In );
    while ( ! f.eof ) {
        s= f.readLine();
        aa[s] ~= i++;
    }

    f.close;

    // no more luck with this :
    //      aa.rehash;

    foreach( char[] s, int[] i; aa ) {
        writefln( "%s  =>  %d", s, i );
    }

    writefln("\n------------------");

    // workaround :
    //  while ( aa.length > 0) {

    foreach( char[]s, int[] i; aa ) {
        aa.remove(s);
        writefln("\"%s\" => removed ",s);
    }

    // }

    writefln("\n------there's still : ----------");
    foreach( char[] s, int[] i; aa ) {
        writefln( "%s  =>  %d", s, i );
    }
    writefln("------END---------");
}
--------------------- >8 ----------------


with a "bla" file like this one, for example:


--------------------- >8 ----------------
apple
orange
pear
strawberry
cuncumber
lemon
salad
tomato
blackberry
orange
lemon
tomato
potatoe
root
--------------------- >8 ----------------


Result:

--------------------- >8 ----------------
C:\home\dev\d>aa
tomato  =>  [7,11]
strawberry  =>  [3]
blackberry  =>  [8]
orange  =>  [1,9]
potatoe  =>  [12]
root  =>  [13]
salad  =>  [6]
apple  =>  [0]
lemon  =>  [5,10]
cuncumber  =>  [4]
pear  =>  [2]

------------------
"tomato" => removed
"strawberry" => removed
"blackberry" => removed
"orange" => removed
"root" => removed
"salad" => removed
"apple" => removed
"lemon" => removed
"cuncumber" => removed
"pear" => removed

------there's still : ----------
potatoe  =>  [12]
------END---------

--------------------- >8 ----------------


Note: I also tried adding "aa.rehash" after filling aa, with no more luck.
The only workaround is to add a "while( aa.length > 0 )" surrounding the
foreach loop which does aa.remove().


Do I need holidays?
Has this been discussed many times already?




June 12, 2006
marc michel wrote:
> I found a strange behavior using AAs :

[snip]

> foreach( char[]s, int[] i; aa ) {
> aa.remove(s);
> writefln("\"%s\" => removed ",s);
> }

You may not remove AA elements inside a foreach loop over the same AA. Removal changes the table while it is being traversed, so the foreach iterator gets confused and can skip entries.

> The only workaround is to add a "while( aa.length > 0 )" surrounding the
> foreach loop which does aa.remove().

There are other ways. For instance:

foreach(key;aa.keys)
	aa.remove(key);

or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:

struct BB { void *[] buckets; size_t nodes; }

...

(cast(BB*)&aa).buckets = null;
(cast(BB*)&aa).nodes = 0;

(A wish would be for the above to be implemented as aa.clear())

Or if you just want to forget about your current aa instance:

aa = null;
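
The `aa.keys` loop above can be sketched as a complete program. This is only a minimal sketch, assuming a current D2 compiler (hence `string` keys rather than the `char[]` keys in the original post); `removeAll` is a hypothetical helper name, not something from Phobos:

```d
import std.stdio;

// Hypothetical helper: removes every entry by iterating over a snapshot
// of the keys instead of over the AA itself.
void removeAll(K, V)(V[K] aa)
{
    // aa.keys returns a newly allocated array, so removing entries
    // from aa does not disturb the array the loop is walking.
    foreach (key; aa.keys)
        aa.remove(key);
}

void main()
{
    int[][string] aa;
    int i;
    foreach (word; ["apple", "orange", "pear", "orange", "potatoe"])
        aa[word] ~= i++;

    removeAll(aa);
    writeln("entries left: ", aa.length);
}
```

The snapshot costs one extra array allocation, but the loop body is free to do per-entry work before each removal.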

/Oskar
June 12, 2006
> There are other ways. For instance:
>
> foreach(key;aa.keys)
> 	aa.remove(key);

Ah, yes, that's it. I remember now: the "keys" property!
I forgot about this one, thanks!
Now I'm sure I need holidays, because I also remember this question has already been asked :/


> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:

In fact, I have real code using this kind of AA in which I call some glDeleteTextures() functions... but thanks anyway!


June 13, 2006
Oskar Linde wrote:
> marc michel wrote:
>> I found a strange behavior using AAs :
> 
> [snip]
> 
>> foreach( char[]s, int[] i; aa ) {
>> aa.remove(s);
>> writefln("\"%s\" => removed ",s);
>> }
> 
> You may not delete aa elements within a foreach loop. The foreach iterator will be confused.
> 
>> The only workaround is to add a "while( aa.length > 0 )" surrounding the
>> foreach loop which does aa.remove().
> 
> There are other ways. For instance:
> 
> foreach(key;aa.keys)
>     aa.remove(key);
> 
> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:
> 
> struct BB { void *[] buckets; size_t nodes; }
> 
> ....
> 
> (cast(BB*)&aa).buckets = null;
> (cast(BB*)&aa).nodes = 0;
> 
> (A wish would be for the above to be implemented as aa.clear())
> 

I think this would be a useful addition to AAs. It has been proposed more than once; I don't know why Walter hasn't added it yet.

> Or if you just want to forget about your current aa instance:
> 
> aa = null;
> 
> /Oskar


-- 
Carlos Santander Bernal
June 13, 2006

Carlos Santander wrote:
> Oskar Linde wrote:
>> There are other ways. For instance:
>>
>> foreach(key;aa.keys)
>>     aa.remove(key);
>>
>> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:
>>
>> struct BB { void *[] buckets; size_t nodes; }
>>
>> ....
>>
>> (cast(BB*)&aa).buckets = null;
>> (cast(BB*)&aa).nodes = 0;
>>
>> (A wish would be for the above to be implemented as aa.clear())
>>
> I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
> 
>> Or if you just want to forget about your current aa instance:
>>
>> aa = null;

Hmm.

Of course reuse is good. Greenpeace Likes Reuse(tm)!

But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.

What's the real cost of creating a new hash, compared with emptying the old one?

((Besides, too much needless reuse only makes code harder to understand.))

---

FWIW, if reusing hashes really does turn out more efficient (or smarter and not more error prone), and becomes the Recommended Practice, then I, too, absolutely vote for aa.clear()!

And if not, we sure as heck should _not_ implement it!
June 13, 2006
Georg Wrede wrote:
> 
> 
> Carlos Santander wrote:
>> Oskar Linde wrote:
>>> There are other ways. For instance:
>>>
>>> foreach(key;aa.keys)
>>>     aa.remove(key);
>>>
>>> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:
>>>
>>> struct BB { void *[] buckets; size_t nodes; }
>>>
>>> ....
>>>
>>> (cast(BB*)&aa).buckets = null;
>>> (cast(BB*)&aa).nodes = 0;
>>>
>>> (A wish would be for the above to be implemented as aa.clear())
>>>
>> I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
>>
>>> Or if you just want to forget about your current aa instance:
>>>
>>> aa = null;
> 
> Hmm.
> 
> Of course reuse is good. Greenpeace Likes Reuse(tm)!
> 
> But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.

The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.
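
The difference can be sketched as follows. This assumes a D2 compiler recent enough to provide a built-in `clear` property on associative arrays, which did not exist when this thread was written; the function names are made up for illustration:

```d
import std.stdio;

// Lengths seen through both references after `first = null`.
size_t[2] lengthsAfterNull()
{
    int[int] first;
    first[1] = 1;             // the first insertion allocates the table
    int[int] second = first;  // second reference to the same table
    first = null;             // rebinds `first` only
    size_t[2] result = [first.length, second.length];
    return result;
}

// Lengths seen through both references after `first.clear()`.
size_t[2] lengthsAfterClear()
{
    int[int] first;
    first[1] = 1;
    int[int] second = first;
    first.clear();            // empties the table both references share
    size_t[2] result = [first.length, second.length];
    return result;
}

void main()
{
    writeln(lengthsAfterNull());   // second still holds its entry
    writeln(lengthsAfterClear());  // both references see an empty table
}
```

Note that the copy is taken only after the first insertion. An AA that has never had an entry is null, and copying a null AA does not produce two references to one table.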

> What's the real cost of creating a new hash, compared with emptying the old one?

Nothing significant.

/Oskar
June 13, 2006
Oskar Linde wrote:
> Georg Wrede wrote:
>>
>>
>> Carlos Santander wrote:
>>> Oskar Linde wrote:
>>>> There are other ways. For instance:
>>>>
>>>> foreach(key;aa.keys)
>>>>     aa.remove(key);
>>>>
>>>> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:
>>>>
>>>> struct BB { void *[] buckets; size_t nodes; }
>>>>
>>>> ....
>>>>
>>>> (cast(BB*)&aa).buckets = null;
>>>> (cast(BB*)&aa).nodes = 0;
>>>>
>>>> (A wish would be for the above to be implemented as aa.clear())
>>>>
>>> I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
>>>
>>>> Or if you just want to forget about your current aa instance:
>>>>
>>>> aa = null;
>>
>> Hmm.
>>
>> Of course reuse is good. Greenpeace Likes Reuse(tm)!
>>
>> But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.
> 
> The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.

What about "delete aa"?  Or was the goal to keep the buckets around and just toss the data?


Sean
June 13, 2006
Sean Kelly wrote:
> Oskar Linde wrote:
>> Georg Wrede wrote:
>>>
>>>
>>> Carlos Santander wrote:
>>>> Oskar Linde wrote:
>>>>> There are other ways. For instance:
>>>>>
>>>>> foreach(key;aa.keys)
>>>>>     aa.remove(key);
>>>>>
>>>>> or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:
>>>>>
>>>>> struct BB { void *[] buckets; size_t nodes; }
>>>>>
>>>>> ....
>>>>>
>>>>> (cast(BB*)&aa).buckets = null;
>>>>> (cast(BB*)&aa).nodes = 0;

I just realized this is wrong. I had accidentally been linking to an old version of Phobos. In its current incarnation, those two lines should be:

        (*(cast(BB**)&aa)).buckets = null;
        (*(cast(BB**)&aa)).nodes = 0;


>>>>>
>>>>> (A wish would be for the above to be implemented as aa.clear())
>>>>>
>>>> I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
>>>>
>>>>> Or if you just want to forget about your current aa instance:
>>>>>
>>>>> aa = null;
>>>
>>> Hmm.
>>>
>>> Of course reuse is good. Greenpeace Likes Reuse(tm)!
>>>
>>> But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.
>>
>> The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.
> 
> What about "delete aa"?  Or was the goal to keep the buckets around and just toss the data?

That could work as a syntax, but I think a .clear() method is clearer. The goal is not to keep the buckets around, just to make clearing work with multiple references to the same AA.

int[int] table1;
int[int] table2;
table1[1] = 1;
table2 = table1;
table1.clear();
assert(table2.length == 0);

/Oskar