Thread overview | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
July 03, 2012 popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
I'm not sure what's going on here. Using this code to solve a reddit programming challenge to return the Nth term of a Fibonacci-like sequence of arbitrary length: module main; import std.stdio, std.array, std.datetime; ulong f(int k, int n) { ulong[] nums = new ulong[k]; nums[$ - 1] = 1; ulong total = 1; foreach(i;k..n + 1) { nums ~= total % 10^^8; total += nums[$ - 1] - nums[$ - k - 1]; nums.popFront(); //The optional, trouble line } return nums[$ - 1]; } void main() { StopWatch sw; sw.start; f(100, 10_000).writeln(); f(3^^13, 5^^10).writeln; sw.peek.msecs.writeln("msecs"); } I assumed using popFront to keep the memory use in check would be the sensible thing to do but memory used goes from 170MB to 250MB when adding popFront. What's going on and how should I have built this data structure? |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | On Tuesday, 3 July 2012 at 02:20:23 UTC, ixid wrote:
> I'm not sure what's going on here. Using this code to solve a reddit programming challenge to return the Nth term of a Fibonacci-like sequence of arbitrary length:
>
> module main;
> import std.stdio, std.array, std.datetime;
>
> ulong f(int k, int n) {
> ulong[] nums = new ulong[k];
> nums[$ - 1] = 1;
> ulong total = 1;
> foreach(i;k..n + 1) {
> nums ~= total % 10^^8;
> total += nums[$ - 1] - nums[$ - k - 1];
> nums.popFront(); //The optional, trouble line
> }
> return nums[$ - 1];
> }
>
> void main() {
> StopWatch sw;
> sw.start;
> f(100, 10_000).writeln();
> f(3^^13, 5^^10).writeln;
> sw.peek.msecs.writeln("msecs");
> }
>
> I assumed using popFront to keep the memory use in check would be the sensible thing to do but memory used goes from 170MB to 250MB when adding popFront. What's going on and how should I have built this data structure?
Depending on the size, it definitely would happen anyways. Correct me if I'm wrong, but once you use popFront, you effectively modify the source to contain a slice of it's original data. If you want to extend that data, the slice can't be expanded safely since it doesn't own the whole memory.
Try this: (Tested, gives 80Mb and 400ms faster)
replace:ulong[] nums = new ulong[k];
with: ulong[] nums;
nums.reserve(k + n + 1);
nums.length = k;
|
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to Era Scarecrow | That fixed the memory behaviour and is faster. Not using popFront at all uses the same memory and is faster still. Given popFront is advancing a range does this mean the underlying array is not being deleted? How would one delete the information you don't need any more if so? |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | ixid: > Given popFront is advancing a range does this mean the underlying array is not being deleted? How would one delete the information you don't need any more if so? I think the GC can't deallocate part of an array. It's a single memory zone, and it has to go all at once. Maybe you need a fitter data structure, like a deque, like the Python collections.deque: http://docs.python.org/library/collections.html#collections.deque Or similar data structure of C++: http://www.cplusplus.com/reference/stl/deque/ Bye, bearophile |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | It's a pity that the D standard library seems to lack rather a lot of these things. Permutation functions are also an annoying one not to have. =/ |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | ixid: > It's a pity that the D standard library seems to lack rather a lot of these things. I've taken a better look at your code, for this program a deque wasn't so needed. This is quick: import std.stdio; ulong hFib(in uint k, in uint n) pure nothrow { auto nums = new ulong[k + n + 1]; nums[k - 1] = 1; ulong total = 1; foreach (i; k .. n + 1) { nums[i] = total % (10 ^^ 8); total += nums[i] - nums[i - k]; } return nums[n]; } void main() { hFib(100, 10_000).writeln(); hFib(3 ^^ 13, 5 ^^ 10).writeln(); } In C: #include <stdio.h> #include <stdlib.h> #include <stdint.h> uint64_t hFib(const int k, const int n) { uint64_t *nums = calloc(k + n + 1, sizeof(uint64_t)); if (nums == NULL) return 0; nums[k - 1] = 1; uint64_t total = 1; for (int i = k; i < n + 1; i++) { nums[i] = total % 100000000LLU; total += nums[i] - nums[i - k]; } return nums[n]; } int main() { printf("%llu\n", hFib(100, 10000)); printf("%llu\n", hFib(1594323, 9765625)); return 0; } DMD 32 bit: L5A: mov EDX,020h[ESP] mov EAX,01Ch[ESP] mov EBX,05F5E100h push ESI xor ECX,ECX push EDI call near ptr __ULDIV@ pop EDI pop ESI mov EDX,018h[ESP] mov EAX,ESI mov [ESI*8][EDX],EBX sub EAX,EDI mov 4[ESI*8][EDX],ECX sub EBX,[EAX*8][EDX] sbb ECX,4[EAX*8][EDX] add 01Ch[ESP],EBX adc 020h[ESP],ECX inc ESI cmp ESI,EBP jb L5A GCC 4.7.1: L6: movl %esi, (%esp) movl %edi, 4(%esp) movl $100000000, 8(%esp) movl $0, 12(%esp) call ___umoddi3 movl %eax, (%ebx,%ebp,8) movl %edx, 4(%ebx,%ebp,8) subl (%ebx), %eax sbbl 4(%ebx), %edx addl %eax, %esi adcl %edx, %edi addl $8, %ebx cmpl 24(%esp), %ebx jne L6 LLVM: .LBB0_2: # =>This Inner Loop Header: Depth=1 movl %esi, 4(%esp) movl %edi, (%esp) movl $0, 12(%esp) movl $100000000, 8(%esp) # imm = 0x5F5E100 calll __umoddi3 movl 24(%esp), %ecx # 4-byte Reload movl %eax, (%ecx,%ebp,8) movl %edx, 4(%ecx,%ebp,8) addl %edi, %eax adcl %esi, %edx subl -800(%ecx,%ebp,8), %eax sbbl -796(%ecx,%ebp,8), %edx addl $1, %ebp adcl $0, %ebx cmpl $10001, %ebp # imm = 0x2711 movl %eax, %edi movl %edx, %esi jne .LBB0_2 Run-time, D 0.75 seconds (dmd 2.060alpha 32 bit, -O - release -inline), C 0.40 seconds (gcc 4.7.1 32 bit, -Ofast -flto -S -std=c99). > Permutation functions are also an annoying one not to have. =/ Try this: http://rosettacode.org/wiki/Permutations#Faster_Lazy_Version Bye, bearophile |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Tuesday, July 03, 2012 15:29:51 bearophile wrote: > ixid: > > Given popFront is advancing a range does this mean the underlying array is not being deleted? How would one delete the information you don't need any more if so? > > I think the GC can't deallocate part of an array. It's a single memory zone, and it has to go all at once. http://dlang.org/d-array-article.html - Jonathan M Davis |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | Thank you, that was faster. I decided to try the obvious method of reusing the same array giving similar speed to your suggestion while using less memory: ulong f2(int k, int n) { auto nums = new ulong[k + 1]; nums[$ - 1] = 1; int iter = k; ulong total = 0; foreach(i;k..n + 1) { int iter_next = iter + 1 > k? 0 : iter + 1; total += nums[iter] - nums[iter_next]; nums[iter_next] = total % 10^^8; iter = iter_next; } return nums[iter]; } |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | ixid: > int iter_next = iter + 1 > k? 0 : iter + 1; This is becoming a "fixed size circular queue". But maybe a modulus is faster than a branch here. (It's better when k is always a power of two, you don't need a modulus. And even better if your size is a multiple of the page size). > nums[iter_next] = total % 10^^8; In such cases I suggest to add (), to not force the person that reads the code reader to remember the precedence of uncommon operators. Bye, bearophile |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | > This is becoming a "fixed size circular queue". But maybe a modulus is faster than a branch here. I have tried, and it's slower both in D and C. I don't know why. #include <stdio.h> #include <stdlib.h> #include <stdint.h> uint64_t hFib(const uint32_t k, const uint32_t n) { uint64_t *nums = calloc(k + 1, sizeof(uint64_t)); if (nums == NULL) exit(1); uint32_t iter = k; nums[k] = 1; uint64_t total = 1; for (uint32_t i = k; i < n + 1; i++) { const uint32_t iter_next = (iter + 1 > k) ? 0 : (iter + 1); total += nums[iter] - nums[iter_next]; nums[iter_next] = total % 100000000LLU; iter = iter_next; } return nums[iter]; } int main() { printf("%llu\n", hFib(100, 10000)); printf("%llu\n", hFib(1594323, 9765625)); return 0; } Runtime of this C version: about 240 ms. Bye, bearophile |
Copyright © 1999-2021 by the D Language Foundation