Jump to page: 1 25  
Page
Thread overview
[Issue 13410] Performance problem with associative array byKey/byValue
Sep 01, 2014
Ketmar Dark
Sep 01, 2014
Ketmar Dark
Sep 01, 2014
Ketmar Dark
Sep 01, 2014
Ketmar Dark
Sep 01, 2014
Orvid King
Sep 01, 2014
Ketmar Dark
Sep 02, 2014
safety0ff.bugz
Sep 02, 2014
Ketmar Dark
Sep 02, 2014
Ketmar Dark
Sep 02, 2014
safety0ff.bugz
Sep 26, 2014
Ketmar Dark
Sep 26, 2014
Ketmar Dark
Sep 29, 2014
Ketmar Dark
Sep 30, 2014
Ketmar Dark
Oct 01, 2014
Martin Nowak
Oct 01, 2014
Martin Nowak
Oct 01, 2014
Ketmar Dark
Oct 01, 2014
Martin Nowak
Oct 02, 2014
Ketmar Dark
Oct 02, 2014
Ketmar Dark
Oct 10, 2014
Ketmar Dark
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #1 from bearophile_hugs@eml.cc ---
The exact problem was found by Daniel Kozak.

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

Ketmar Dark <ketmar@ketmar.no-ip.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ketmar@ketmar.no-ip.org

--- Comment #2 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
Created attachment 1407
  --> https://issues.dlang.org/attachment.cgi?id=1407&action=edit
simple 'first used bucket' caching for AAs

adding 'first used bucket' caching to aa halves execution time for the sample.

seems that we can't do better with current AAs and without hurting overall AA performance. aa.remove() hurts caching and there is no sense to recalculate cache on each remove.

p.s. please note that this patch is not heavily tested, it's just a quick 'hack-in'.

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #3 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
Created attachment 1408
  --> https://issues.dlang.org/attachment.cgi?id=1408&action=edit
simple 'first used bucket' caching for AAs with '_aaDelX' recaching

here is another patch which tries to revalidate cache in `_aaDelX()`. this
makes aa.remove() little slower but… but execution time for both samples are
nearly identical now.

and for bearophile's sample here (
http://forum.dlang.org/thread/efmjlfpehqqfszcrxmwq@forum.dlang.org ) we have
some small improvement against the previous patch too (~0.4 seconds).

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

Ketmar Dark <ketmar@ketmar.no-ip.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #1407|0                           |1
        is obsolete|                            |
   Attachment #1408|0                           |1
        is obsolete|                            |

--- Comment #4 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
Created attachment 1409
  --> https://issues.dlang.org/attachment.cgi?id=1409&action=edit
simple 'first used bucket' caching for AAs

aaaaaand we have a WINNER!

this final patch should not hurt 'normal' AA usage (it will not update 'first
used cache' on each operation) yet speds up this syntetic test (~3 seconds -->
!0.03 seconds) and code from the forum (~3.5 seconds --> 1.1 seconds).

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #5 from bearophile_hugs@eml.cc ---
(In reply to Ketmar Dark from comment #4)

and code from the forum (~3.5 seconds --> 1.1 seconds).

For me the C++ version of the code runs in about 0.20 seconds, so there's still a significant performance difference.

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #6 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to bearophile_hugs from comment #5)
> For me the C++ version of the code runs in about 0.20 seconds, so there's still a significant performance difference.
yes, AA still needs to search for the "first item" many times, caching just makes it faster in some cases. this can't be changed without complete rewrite of AA (and with performance hit for other use cases).

"get first item" will always be costly for AAs, that's the way AAs works. there are alot of 'buckets' for items, many of which are empty. to get *known* item from AA we can choose the necessary bucket with simple '%' operation, but to find "first available" item we must loop thru all buckets until we find the used one.

my patch does caching of "first used bucket" index (and tries to keep that in sync when AA contents changes), but sometimes i have to just "reset" cache, or get/remove operations will become unnecessarely slow.

besides, alot of adding/removing between 'byKey' calls will update cache for no reason at all, doing unnecessary calculations.

i tried to "amortise" that bad cases by not updating cache when there was no calls to byKey/byValue for the given AA, and by resetting cache on AA rehashing. this allows us to has almost no slowdowns when we using AAs without iterating and we can just call aa.rehash before adding/deleting big amount of data to turn off caching (besides, adding alot of data will eventually hit rehashing too).

i can't think out anything better for now. trying to speed up our use case will lead only to code bloating.

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #7 from Orvid King <blah38621@gmail.com> ---
It would be nice if this were done as a PR on github, as it would be much easier to review.

--
September 01, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #8 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
(In reply to Orvid King from comment #7)
> It would be nice if this were done as a PR on github, as it would be much easier to review.
sorry, each time i'm thinking about registering at github god kills one kitten.

--
September 02, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

--- Comment #9 from bearophile_hugs@eml.cc ---
(In reply to Ketmar Dark from comment #8)

> but to find "first available" item we must loop thru all buckets until we find the used one.

Right. (But using an unordered_map in the C++ code from the newsgroup the C++ code only gets a little more than twice slower, that seems about twice faster than the patched D code. I don't know the source of such performance difference.)


> sorry, each time i'm thinking about registering at github god kills one kitten.

Thank you for your work. Probably someone else will create a pull request with this.

--
September 02, 2014
https://issues.dlang.org/show_bug.cgi?id=13410

safety0ff.bugz <safety0ff.bugz@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |safety0ff.bugz@gmail.com

--- Comment #10 from safety0ff.bugz <safety0ff.bugz@gmail.com> ---
(In reply to bearophile_hugs from comment #9)
>
> Right. (But using an unordered_map in the C++ code from the newsgroup the C++ code only gets a little more than twice slower, that seems about twice faster than the patched D code. I don't know the source of such performance difference.)
> 

C++ unordered_map uses templates and can do all kinds of things the D AA's
currently can't.
It can inline/devirtualize the comparison functions, it can store the key and
value together, etc.

D's AAs essentially go through a C interface, and so this limits what it can do.

I don't think there's a great data structure for this task out of the box in D:
- Using std.container's red-black tree isn't ideal because there's no
equivalent to std::map's bracket operator (without doing multiple lookups.)
- You can't roll a bag (linked list) into the hash table because since key and
value are stored separately you cannot map a reference to a value to a
reference to the key (without storing a copy of the key inside the value and
then doing a lookup.)

As for ketmar's patch, I don't think we should introduce a slowdown in _aaDelX.
The first entry can be simply treated as a hint, meaning that the first bucket
is greater or equal to the hint.
I think _aaDelX should leave the cache value alone since _aaGetFirstEntry
checks that the bucket is non-null anyway.
Furthermore, I don't think the plus one indexing is necessary.

--
« First   ‹ Prev
1 2 3 4 5