September 30, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #21 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
it's a widely known trick. at least among us old farts.

--
September 30, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #22 from hsteoh@quickfur.ath.cx ---
PR updated.

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

Martin Nowak <code@dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code@dawg.eu

--- Comment #23 from Martin Nowak <code@dawg.eu> ---
I don't really agree with this patch.
Hash tables are unordered and optimized for random indexing by a key.
If you run a tight loop and constantly query front (abusing byKey), then of
course that's expensive: O(N^2/2). It's like taking a singly linked list and
removing its last element until it's empty.
Instead we should offer a nice, generic remove function that takes a
predicate and allows batch-removing entries.
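(The anti-pattern being described can be sketched as follows; this is an illustrative helper of my own, not code from the patch. Each `byKey.front` call re-scans the buckets from the start, so draining an N-entry AA this way costs O(N^2/2) overall.)

```d
// Hypothetical illustration of the expensive pattern: repeatedly
// querying byKey.front to drain an associative array one key at a time.
void drainSlowly(int[string] aa)
{
    while (aa.length)
    {
        auto k = aa.byKey.front; // O(N) scan for the first used bucket
        aa.remove(k);            // each removal leaves more empty
                                 // leading buckets for the next scan
    }
}
```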

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #24 from Martin Nowak <code@dawg.eu> ---
In other words, this optimizes the following code

    foreach (_; 0 .. 1_000_000)
        aa.byKey();

but pessimizes the following

    aa.remove(key);

It does NOT improve iterating byKey.
So unless there is a valid use case for calling aa.byKey.front in a loop, this
would only improve a pointless microbenchmark.
Batch deletion is much better done with a dedicated remove function.

    aa.remove((k, v) => pred(k));

Any volunteers for implementing the latter?
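(A minimal sketch of what such a predicate-based batch remove could look like; the name `removeAll` and its signature are my own invention, not druntime API. Matching keys are collected first so the AA is never mutated while it is being iterated.)

```d
// Remove every entry for which pred(key, value) is true, in one batch.
void removeAll(alias pred, K, V)(ref V[K] aa)
{
    K[] doomed;
    foreach (k, v; aa)
        if (pred(k, v))
            doomed ~= k;   // collect matching keys first
    foreach (k; doomed)
        aa.remove(k);      // then remove them after iteration is done
}
```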

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #25 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
and we can make it a lot less expensive without losing speed. and all this with only 4/8 bytes of overhead per AA (not per AA entry, but for the whole AA).

the only questionable thing with this patch is that it assumes the AA is mutable. but we can't have AA literals anyway, AFAIK.

and no, it doesn't really pessimize remove() if no one has used .byKey/.byValue on the AA, and even if those properties were used, the next rehashing will turn off the caching again. so i daresay the remove() slowdown is not noticeable at all.
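(A rough illustration of the trick being described — not the actual patch code, and the names here are invented: cache the index of the first used bucket so byKey/byValue need not re-scan leading empty buckets, and reset that cache whenever the table is restructured.)

```d
// Simplified model of an AA implementation with a cached first-bucket
// index; the cache is the single extra size_t (4/8 bytes per AA).
struct Impl
{
    void*[] buckets;   // simplified; real AA bucket entries differ
    size_t firstUsed;  // cached index of the first non-empty bucket

    size_t findFirst()
    {
        // resume from the cached position instead of scanning from 0
        while (firstUsed < buckets.length && buckets[firstUsed] is null)
            ++firstUsed;
        return firstUsed;
    }

    void rehash()
    {
        firstUsed = 0; // structural change invalidates the cache
        // ... redistribute entries into new buckets ...
    }
}
```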

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #26 from bearophile_hugs@eml.cc ---
(In reply to Martin Nowak from comment #24)

> So unless there is a valid use case to call aa.byKey.front in a loop this would only improve a pointless microbenchmark.

See here, it's not just a pointless microbenchmark: http://forum.dlang.org/thread/efmjlfpehqqfszcrxmwq@forum.dlang.org

As shown by others, there are ways to rewrite that code to avoid most of the very slow part, but popping the "first" item from an associative array is a common enough operation to be worth making efficient (I have written and seen plenty of Python code that does it).
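(The pattern in question, spelled out in D — a hypothetical helper of my own, analogous to Python's dict.popitem(): remove and return one arbitrary key/value pair. Without first-bucket caching, each byKey.front call here is a linear scan.)

```d
// Pop one arbitrary entry from the AA (order is irrelevant).
auto popItem(K, V)(ref V[K] aa)
{
    import std.typecons : tuple;
    auto k = aa.byKey.front; // any key will do
    auto v = aa[k];
    aa.remove(k);
    return tuple(k, v);
}
```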

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #27 from hsteoh@quickfur.ath.cx ---
@bearophile: perhaps that's an indication that what you want is actually an *ordered* map, not a hash map?

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #28 from Steven Schveighoffer <schveiguy@yahoo.com> ---
New, simpler pull request, which shouldn't affect the performance of aa.remove:

https://github.com/D-Programming-Language/druntime/pull/979

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #29 from bearophile_hugs@eml.cc ---
(In reply to hsteoh from comment #27)
> @bearophile: perhaps that's an indication that what you want is actually an *ordered* map, not a hash map?

Nope. If you take a look at the link, that code doesn't need an ordered container. The "pop" operation just needs to return one arbitrary item.

--
October 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #30 from Steven Schveighoffer <schveiguy@yahoo.com> ---
bearophile or ketmar, can you test the new PR? The timings I get on my system are not very different from the original.

--