Thread overview
Avoid gratuitous closure allocations
Sep 20, 2019
Ali Çehreli
Sep 20, 2019
Andrea Fontana
Sep 20, 2019
H. S. Teoh
Sep 21, 2019
Jonathan M Davis
September 20, 2019
tl;dr Instead of returning an object that uses local state, return an object that uses member variables.

We've discovered one such allocation inside std.format.sformat today during our local meetup[1], started fixing it, and discovered that it has already been fixed by ag0aep6g just 19 days ago after a forum discussion[2]. Awesome! :)

Although such closures are usually small, any GC allocation can have a big impact on program performance, especially in multi-threaded programs, due to D's current stop-the-world GC scheme. It is so easy to fall into this pessimization that I've used one in a recent forum post[3] myself.

For example, here is a range function that mimics std.range.enumerate with the help of a Voldemort type[4]:

import std.stdio;
import std.range;
import std.typecons;

auto enumerated(R)(R range) {
  size_t i;

  struct Range {
    auto empty()    { return range.empty; }
    auto front()    { return tuple(i, range.front); }
    auto popFront() { range.popFront(); ++i; }
  }

  // The returned object requires a closure because it uses
  // 'range' and 'i' from the local context. OUCH OUCH OUCH!
  return Range();
}

void main() {
  // Aside: Tuple expansion special "feature" of D (as 'i' and 'e' here)
  foreach (i, e; iota(42, 50).enumerated) {
    writefln!"%s: %s"(i, e);
  }
}

If you compile the program with the -profile=gc command line switch (I used dmd) and run it, you will see a log file created in the current directory: profilegc.log. That file will point to 1 GC allocation of 32 bytes inside your source code. That allocation may seem trivial, but if you used enumerated() multiple times, e.g. in an inner loop, you would be "stopping the world" many times unnecessarily.

The solution is trivial in this case:

1) Move all local state to the struct; i.e. define a 'range' member variable inside the struct and make 'i' a member variable of the struct.

Note: Copying the 'range' parameter to a member variable and consuming that member variable instead may behave differently for some range types but it's off-topic for this discussion. :)

2) Although there would be no closure allocations after step 1, define the struct as 'static' to guarantee that it will stay that way even after changing the code in the future.
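
As a small sketch of what 'static' buys in step 2 (the function names usesLocalState and usesMemberState are made up for this illustration): a non-static struct that touches a local variable carries a hidden context pointer, which __traits(isNested) reports, while a 'static' struct does not and cannot reach the locals of the enclosing function at all.

```d
// Hypothetical example: the struct reads 'i' from the enclosing function,
// so it needs a context pointer (and a closure when it is returned).
auto usesLocalState() {
  int i = 42;

  struct S {
    int get() { return i; }   // 'i' is local state of usesLocalState()
  }

  return S();
}

// Hypothetical example: all state lives in the struct itself.
auto usesMemberState() {
  static struct S {
    int i = 42;               // member variable, not local state
    int get() { return i; }   // refers to the member
    // Referring to a local variable of usesMemberState() here would be
    // a compile-time error because the struct is 'static'.
  }

  return S();
}

// The first type has a hidden context pointer; the second does not.
static assert( __traits(isNested, typeof(usesLocalState())));
static assert(!__traits(isNested, typeof(usesMemberState())));

void main() {
  assert(usesLocalState().get() == 42);
  assert(usesMemberState().get() == 42);
}
```

Both versions return the same value; the difference is only where the state lives.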

The new function is the following:

auto enumerated(R)(R range) {
  static struct Range {    // 2) Defined as 'static'
    R range;               // 1a) Member variable
    size_t i;              // 1b) Member variable

    auto empty()    { return range.empty; }
    auto front()    { return tuple(i, range.front); }
    auto popFront() { range.popFront(); ++i; }
  }

  return Range(range);     // 1c) Pass the parameter to the object
}

Now the unnecessary closure allocation is gone.
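
Besides checking profilegc.log again, one possible way to have the compiler confirm this is the @nogc attribute. The following is only a sketch: sumOfIndexes is a made-up helper for this illustration, and it assumes that everything the loop calls (iota, tuple, and the static-struct version of enumerated()) is inferred as @nogc.

```d
import std.range;
import std.typecons;

// Same static-struct version of enumerated() as above.
auto enumerated(R)(R range) {
  static struct Range {
    R range;
    size_t i;

    auto empty()    { return range.empty; }
    auto front()    { return tuple(i, range.front); }
    auto popFront() { range.popFront(); ++i; }
  }

  return Range(range);
}

// Hypothetical helper: @nogc makes the compiler reject any GC
// allocation in this function, including a closure allocation
// made by a function it calls.
size_t sumOfIndexes() @nogc {
  size_t total;

  foreach (i, e; iota(42, 50).enumerated) {
    total += i;   // indexes 0 through 7
  }

  return total;
}

void main() {
  assert(sumOfIndexes() == 28);   // 0 + 1 + ... + 7
}
```

With the closure-based version of enumerated(), the same @nogc function should fail to compile, which makes the attribute a cheap guard against the allocation creeping back in.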

Ali

[1] https://www.meetup.com/D-Lang-Silicon-Valley/
[2] https://forum.dlang.org/post/tdgiytvqpyxevjtqgbao@forum.dlang.org
[3] https://forum.dlang.org/post/qlovst$okj$1@digitalmars.com
[4] https://wiki.dlang.org/Voldemort_types
September 20, 2019
On Friday, 20 September 2019 at 11:21:22 UTC, Ali Çehreli wrote:
> tl;dr Instead of returning an object that uses local state, return an object that uses member variables.

Really good to know tips!
September 20, 2019
On Fri, Sep 20, 2019 at 01:03:29PM +0000, Andrea Fontana via Digitalmars-d-learn wrote:
> On Friday, 20 September 2019 at 11:21:22 UTC, Ali Çehreli wrote:
> > tl;dr Instead of returning an object that uses local state, return an object that uses member variables.
> 
> Really good to know tips!

In fact, most (all?) of Phobos code is written this way.


T

-- 
Questions are the beginning of intelligence, but the fear of God is the beginning of wisdom.
September 20, 2019
On Friday, September 20, 2019 5:21:22 AM MDT Ali Çehreli via Digitalmars-d-learn wrote:
> tl;dr Instead of returning an object that uses local state, return an object that uses member variables.

The other issue this helps with is problems related to having multiple contexts. IIRC, without it, some predicates don't work due to the compiler complaining about there being more than one context. However, I pretty much always use static structs for ranges, so I haven't run into the issue recently enough to recall the exact details. In general though, I'd say that it's best practice to use static structs in functions and avoid needing local context. Sometimes it makes sense to rely on the local context, but in general, giving your struct access to the local scope in the function is going to result in closures being allocated when they could have easily been avoided.

- Jonathan M Davis