Thread overview | |||||||
---|---|---|---|---|---|---|---|
|
December 10, 2008 [Issue 2504] New: Reserve for associative arrays | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=2504 Summary: Reserve for associative arrays Product: D Version: 2.021 Platform: PC OS/Version: Windows Status: NEW Severity: enhancement Priority: P2 Component: DMD AssignedTo: bugzilla@digitalmars.com ReportedBy: dsimcha@yahoo.com It appears that adding an element to an associative array always triggers a memory allocation. Especially in multithreaded code, this is inefficient. It would be nice if associative arrays had a .reserve(size_t n) property, which reserved enough space for n objects, and stored the capacity internally. The idea is that, until the reserve buffer is exhausted, no interaction of any kind with the GC would be needed to add an element to the AA. -- |
September 23, 2009 [Issue 2504] Reserve for associative arrays | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2504 ZY Zhou <rinick@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rinick@gmail.com --- Comment #1 from ZY Zhou <rinick@gmail.com> 2009-09-23 01:38:43 PDT --- (In reply to comment #0) > It appears that adding an element to an associative array always triggers a memory allocation. Especially in multithreaded code, this is inefficient. It would be nice if associative arrays had a .reserve(size_t n) property, which reserved enough space for n objects, and stored the capacity internally. The idea is that, until the reserve buffer is exhausted, no interaction of any kind with the GC would be needed to add an element to the AA. It's not all about memory int[int] a; foreach(i;0..20_000_000) a[i] = i; //takes 2 _minutes_ foreach(i;0..20_000_000) a[i] = i; //do it again, only 2 seconds 60 times slower. If it's memory allocation problem, the result should be similar to Dynamic array: int[] t; foreach(i;0..20_000_000) t ~= i; // 2 seconds foreach(i;0..20_000_000) t[i] = i; // 0.3 seconds I guess AA rehashs the whole array when capacity changes, and rehashing is much slower than memory allocation. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
December 01, 2009 [Issue 2504] Reserve for associative arrays | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2504 --- Comment #2 from David Simcha <dsimcha@yahoo.com> 2009-11-30 20:36:16 PST --- No, this is because, on the second run, the program has already reserved a bunch of memory from the OS, so the GC doesn't run as often. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
December 01, 2009 [Issue 2504] Reserve for associative arrays | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2504 --- Comment #3 from ZY Zhou <rinick@gmail.com> 2009-11-30 21:24:04 PST --- (In reply to comment #2) > No, this is because, on the second run, the program has already reserved a bunch of memory from the OS, so the GC doesn't run as often. try this: import std.stdio; void main(){ int[int] a; foreach(i;0..20_000_000){ a[i] = i; if((i&0xFFFF) == 0) writeln(i); } } You can see how slow it becomes when AA is large. I don't think memory allocation could take so much time. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
April 17, 2010 [Issue 2504] Reserve for associative arrays | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2504 Michael Rynn <y0uf00bar@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |y0uf00bar@gmail.com --- Comment #4 from Michael Rynn <y0uf00bar@gmail.com> 2010-04-17 04:50:13 PDT --- See enhancement 4098. I have experimented with a NodeHeap for the AA, that allocates Nodes in much bigger blocks, and this does speed by up a considerable amount the insertion and cleanup times. The NodeHeap idea is a single sized block dedicated heap manager created for each AA instance. No advantage at all for lookup times. The disadvantage is that Nodes released by a remove, are not returned to the GC pool until the entire block of Nodes has been removed. Of course when it does get freed, its in a big blocks at once. Also when allocating Nodes one after the other, I suppose a smart memory heap might be carving up much bigger blocks anyway, with various Pools for different object sizes. Having a manual AA.clear helps the GC as well. Even so, having a node heap and using for the first time, the system must get a new big bunch of memory, and maybe that will always take some time. Its not in the current 4098 set (which is a lot to swallow already), but I it could be an optional run time extra parameter to the HashMap/HashSet setup wrapper. It uses lots of really small nodes. The AA management object would then have a non-null heap manager object, to allocate and free blocks from a dedicated instance pool (Just like the Tango HashMap, which does exactly this). I have already proposed to add some more fields to the hash map management object, and one more non-default runtime option (only available from HashMap template wrapper) will hopefully not be a heap of trouble. If it was an option, would people use it wisely? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
Copyright © 1999-2021 by the D Language Foundation