Really easy optimization with std.experimental.allocator

September 18, 2016

Posted by Ryan

I've been learning about allocators in D. They are much easier to use than in C++, and this little program shows a really easy optimization: just use the IAllocator interface with the GC.
-------------------------------------------------------------------
import std.datetime;
import std.experimental.allocator;
import std.experimental.allocator.mallocator;
import std.stdio;

enum sizes = [10, 100, 1_000, 10_000];

// A function that creates several arrays on the GC heap with new and assigns
// a single value so the allocation isn't optimized away.
void GCNew()
{
  byte[] b1;

  foreach(sz; sizes)
  {
    b1 = new byte[](sz);
    b1[1] = 10;
    assert(b1[1] == 10);
    assert(b1[0] == 0);
  }
}

// Same as the function above, but allocating through an IAllocator and
// disposing of each array explicitly
void IAlloc(IAllocator alloc = theAllocator)
{
  byte[] b1;

  foreach(sz; sizes)
  {
    b1 = alloc.makeArray!byte(sz);
    b1[1] = 10;
    assert(b1[1] == 10);
    assert(b1[0] == 0);
    alloc.dispose(b1);
  }
}

void main()
{
  enum iterations = 1_000;

  writefln("        GCNew: %d", benchmark!GCNew(iterations)[0].usecs);

  writefln("    IAlloc GC: %d", benchmark!IAlloc(iterations)[0].usecs);

  writefln("IAlloc malloc: %d", benchmark!(
    ()
    {
      IAlloc(allocatorObject(Mallocator.instance));
    })
  (iterations)[0].usecs);
}

-------------------------------------------------------------------
Results on my iMac with rdmd -release -boundscheck=off -O allocTest.d (total microseconds for 1,000 iterations):
        GCNew: 7467
    IAlloc GC: 1657
IAlloc malloc: 1575

I think it works because each time you call dispose it tells the GC to mark that memory as available, without the GC needing to do a collection sweep. This could be a really useful tip in the allocators section, as I see converting to IAllocator with the GC as the first step in testing optimizations with allocators.
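To make that guess concrete, here is a minimal sketch of the allocate/deallocate pair that I believe dispose ends up calling when the allocator is GC-backed. The helper name allocateAndFree is made up for illustration, and the claim that deallocate releases the block immediately (rather than waiting for a collection) is my reading of GCAllocator, not something the benchmark proves.

```d
import std.experimental.allocator.gc_allocator : GCAllocator;
import std.stdio : writeln;

// Hypothetical helper (not part of the benchmark above): take n bytes from
// the GC heap and hand them straight back without waiting for a collection.
bool allocateAndFree(size_t n)
{
    // allocate returns an untyped void[] block of the requested size.
    void[] block = GCAllocator.instance.allocate(n);
    if (block.length != n)
        return false;

    // deallocate releases the block explicitly; as far as I can tell this
    // is the call that dispose forwards to for a GC-backed allocator.
    return GCAllocator.instance.deallocate(block);
}

void main()
{
    writeln("freed explicitly: ", allocateAndFree(1_000));
}
```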

On Sunday, 18 September 2016 at 01:44:10 UTC, Ryan wrote:
> I think it works because each time you call dispose it tells the GC to mark that memory as available, without the GC needing to do a collection sweep. This could be a really useful tip in the allocators section, as I see converting to IAllocator with the GC as the first step in testing optimizations with allocators.

It's a bit more than that: because you dispose and allocate the same amount of memory, you effectively reuse the same memory block in the GC pools over and over again. It is hardly surprising that this is much faster than "honest" allocation of a lot of memory across different pools (the CPU cache access pattern alone will make a huge difference).
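A quick way to check the block-reuse claim is to compare addresses across a dispose/allocate pair. This is only a sketch: the helper name reusesBlock is invented for illustration, it assumes theAllocator is still the default GC-backed one, and the GC makes no promise that the second allocation lands at the same address, so treat the result as an observation rather than a guarantee.

```d
import std.experimental.allocator;
import std.stdio : writeln;

// Hypothetical helper: allocate n bytes, dispose of them, allocate n bytes
// again, and report whether the second array starts at the same address.
bool reusesBlock(size_t n)
{
    auto first = theAllocator.makeArray!byte(n);
    auto firstAddr = first.ptr;   // remember the address before disposing
    theAllocator.dispose(first);

    auto second = theAllocator.makeArray!byte(n);
    bool reused = second.ptr is firstAddr;   // compare addresses only
    theAllocator.dispose(second);
    return reused;
}

void main()
{
    // Same sizes as the benchmark; the answer may differ per size and run.
    foreach (sz; [10, 100, 1_000, 10_000])
        writeln(sz, " bytes, block reused: ", reusesBlock(sz));
}
```

If the addresses match for every size, each benchmark iteration is touching the same few blocks, which fits the cache-friendliness argument above.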
