Closures and memory allocation
Thread overview (all posts Jun 22, 2019): Anonymouse, Cym13, Cym13, kinke
June 22, 2019
I'm looking into why my program does so many memory allocations. Profiling with kcachegrind shows _d_allocmemory being called every time a certain function is entered, and that function is entered a lot.

It's a function that receives concurrency messages, so it contains nested functions that close over local variables. Think receiveTimeout(0.seconds, &nested1, &nested2, &nested3, ...) with 13 pointers to nested functions passed.

When entering the following function, does it allocate:

1. 0 times, because while there are closures defined, none is ever called?
2. 2 times, because there are closures over two variables?
3. 20 times, because there are 20 unique closures?

void foo()
{
    int i;
    long l;

    void foo01(bool b_) { i += 1; }
    void foo02(string s) { i += 2; }
    void foo03(int n) { i += 3; }
    void foo04(float f) { i += 4; }
    void foo05(double d) { i += 5; }
    void foo06() { i += 6; }
    void foo07() { i += 7; }
    void foo08() { i += 8; }
    void foo09() { i += 9; }
    void foo10() { i += 10; }
    void foo11() { l += 11; }
    void foo12() { l += 12; }
    void foo13() { l += 13; }
    void foo14() { l += 14; }
    void foo15() { l += 15; }
    void foo16() { l += 16; }
    void foo17() { l += 17; }
    void foo18() { l += 18; }
    void foo19() { l += 19; }
    void foo20() { l += 20; }

    auto f01 = &foo01;
    auto f02 = &foo02;
    auto f03 = &foo03;
    auto f04 = &foo04;
    auto f05 = &foo05;
    auto f06 = &foo06;
    auto f07 = &foo07;
    auto f08 = &foo08;
    auto f09 = &foo09;
    auto f10 = &foo10;
    auto f11 = &foo11;
    auto f12 = &foo12;
    auto f13 = &foo13;
    auto f14 = &foo14;
    auto f15 = &foo15;
    auto f16 = &foo16;
    auto f17 = &foo17;
    auto f18 = &foo18;
    auto f19 = &foo19;
    auto f20 = &foo20;
}
June 22, 2019
On Saturday, 22 June 2019 at 16:52:07 UTC, Anonymouse wrote:
> I'm looking into why my thing does so many memory allocations. Profiling with kcachegrind shows _d_allocmemory being called upon entering a certain function, lots and lots of times.
>
> It's a function that receives concurrency messages, so it contains nested functions that close over local variables. Think receiveTimeout(0.seconds, &nested1, &nested2, &nested3, ...) with 13 pointers to nested functions passed.
>
> When entering the following function, does it allocate:
>
> 1. 0 times, because while there are closures defined, none is ever called?
> 2. 2 times, because there are closures over two variables?
> 3. 20 times, because there are 20 unique closures?
>

Clearly this is a good time for you to learn about the tools D offers to profile allocations. There is the -profile=gc DMD switch that you can use, but here there's something better: the GC has a few hooks that are directly inside druntime and therefore available to any D program.

Putting your above code in test.d (plus a main() that calls foo), you can then do:

$ dmd test.d
$ ./test --DRT-gcopt=profile:1
        Number of collections:  2
        Total GC prep time:  0 milliseconds
        Total mark time:  0 milliseconds
        Total sweep time:  0 milliseconds
        Max Pause Time:  0 milliseconds
        Grand total GC time:  0 milliseconds
GC summary:    1 MB,    2 GC    0 ms, Pauses    0 ms <    0 ms

And here is your answer: two allocations. More information about --DRT-gcopt is here: https://dlang.org/spec/garbage.html
June 22, 2019
On Saturday, 22 June 2019 at 19:26:13 UTC, Cym13 wrote:
> On Saturday, 22 June 2019 at 16:52:07 UTC, Anonymouse wrote:
>>[...]
>
> Clearly this is a good time for you to learn about the tools D offers to profile allocations. There is the -profile=gc DMD switch that you can use, but here there's something better: the GC has a few hooks that are directly inside druntime and therefore available to any D program.
>
> [...]

Oops, sorry, I went a bit too fast there: --DRT-gcopt=profile:1 gives you the number of collections, not allocations.
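
To see allocations rather than collections, one option is to build with -profile=gc, which writes a profilegc.log listing the allocation sites. Another is to read the GC's own counters around the call. What follows is only a minimal sketch: it assumes a druntime recent enough to provide `GC.stats().allocatedInCurrentThread` (2.087 or later, as far as I know), and `sink` and `bytesAllocatedBy` are made-up helper names, not anything from the original code. The counter is in bytes, so it tells you whether a call allocates, not how many separate allocations it made.

```d
import core.memory : GC;

// Hypothetical receiver: the parameter is NOT `scope`, so the compiler has to
// assume the delegate may outlive the caller's stack frame.
void sink(void delegate() dg) {}

void foo()
{
    int i;
    long l;

    void foo1() { i += 1; }
    void foo2() { l += 1; }

    // Taking the addresses and letting them escape is what requires a closure.
    sink(&foo1);
    sink(&foo2);
}

// Bytes the GC handed out on this thread while calling fn `calls` times.
ulong bytesAllocatedBy(void function() fn, size_t calls)
{
    immutable before = GC.stats().allocatedInCurrentThread;
    foreach (_; 0 .. calls)
        fn();
    return GC.stats().allocatedInCurrentThread - before;
}

unittest
{
    // With DMD and no optimizer tricks, every call to foo allocates a closure.
    assert(bytesAllocatedBy(&foo, 100) > 0);
}
```

This can be tried with something like `dmd -unittest -main -run test.d`. An optimizing compiler may be able to move the closure to the stack, in which case the counter would not change.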
June 22, 2019
On Saturday, 22 June 2019 at 16:52:07 UTC, Anonymouse wrote:
> When entering the following function, does it allocate:
>
> 1. 0 times, because while there are closures defined, none is ever called?
> 2. 2 times, because there are closures over two variables?
> 3. 20 times, because there are 20 unique closures?

4. One time: a single closure is allocated for all variables referenced in nested functions. This is easy to check by inspecting the generated asm (https://run.dlang.io/is/qO3JxO):

void receive(void delegate() a, void delegate() b) {}

void foo()
{
    int i;
    long l;

    void foo1() { i += 1; }
    void foo2() { l += 1; }

    receive(&foo1, &foo2);
}

When adding `scope` to the `receive()` parameters, the closure is allocated on the stack instead of the GC heap.
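
One way to have the compiler confirm this is to mark everything `@nogc`. The following is a sketch under my own assumptions, not the code behind the link above: `receive` here is a hypothetical stand-in (it is not std.concurrency's `receive`), the delegates are actually called, and `foo` returns a value only so the result can be checked.

```d
// Hypothetical receiver. `scope` promises the delegates are not stored
// anywhere that outlives the call.
void receive(scope void delegate() @nogc a, scope void delegate() @nogc b) @nogc
{
    a();
    b();
}

long foo() @nogc
{
    int i;
    long l;

    void foo1() @nogc { i += 1; }
    void foo2() @nogc { l += 1; }

    // The delegates cannot escape, so i and l can stay in foo's stack frame
    // and no GC closure is needed. That is why foo is accepted as @nogc.
    receive(&foo1, &foo2);
    return i + l;
}

unittest
{
    assert(foo() == 2); // each delegate ran once: i == 1, l == 1
}
```

If `scope` is dropped from the parameters of `receive`, `foo` needs a GC-allocated closure again and the compiler should reject it as `@nogc`.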