Taming the optimizer

Jun 14, 2018

Mike Franklin

Jun 14, 2018

Johan Engelen

Jun 15, 2018

David Nadlinger

June 14, 2018

Posted by Mike Franklin

I'm trying to run benchmarks on my memcpy implementation (https://forum.dlang.org/post/trenuawrekkbewjudmsy@forum.dlang.org) using LDC with optimizations enabled (e.g. `ldc2 -O3 memcpyd.d`). In my first implementation, the optimizer stripped out most of the code I was trying to measure.

Using the information at https://stackoverflow.com/questions/40122141/preventing-compiler-optimizations-while-benchmarking, I've created this:

void use(void* p)
{
    version(LDC)
    {
        import ldc.llvmasm;
        __asm("", "r", p);
    }
}

void clobber()
{
    version(LDC)
    {
        import ldc.llvmasm;
        __asm("","~{memory}");
    }
}

import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;

// `f` is the function I wish to benchmark.  It's an
// implementation of memcpy in D.
Duration benchmark(T, alias f)(const T* src, T* dst)
{
    enum iterations = 10_000_000;
    Duration result;
    auto sw = StopWatch(AutoStart.yes);

    sw.reset();
    foreach (_; 0 .. iterations)
    {
        f(src, dst);
        use(dst);
        clobber();
    }
    result = sw.peek();

    return result;
}

This seems to work, but I don't know that I've implemented it properly; especially the `use` function.  How would you write this to achieve a real-world optimized measurement?  What's the equivalent of...

static void escape(void *p) {
  asm volatile("" : : "g"(p) : "memory");
}

... in LDC inline assembly?

Thanks,
Mike

On Thursday, 14 June 2018 at 03:39:39 UTC, Mike Franklin wrote:
> Using the information at https://stackoverflow.com/questions/40122141/preventing-compiler-optimizations-while-benchmarking, I've created this:

Have you read this too? https://llvm.org/docs/LangRef.html#inline-assembler-expressions

> This seems to work, but I don't know that I've implemented it properly; especially the `use` function. How would you write this to achieve a real-world optimized measurement? What's the equivalent of...
>
> static void escape(void *p) {
>   asm volatile("" : : "g"(p) : "memory");
> }

Your `use` function may be correct, I'm not 100% sure.
The `escape` function you ask for is clobbering _all_ memory (not only the memory accessible through `p`), so that then becomes:

void escape(void* p)
{
    import ldc.llvmasm;
    __asm("", "r,~{memory}", p); // added the memory clobber here
}

-Johan

Hi Mike,

On 14 Jun 2018, at 4:39, Mike Franklin via digitalmars-d-ldc wrote:
> What's the equivalent of...
>
> static void escape(void *p) {
>   asm volatile("" : : "g"(p) : "memory");
> }
>
> ... in LDC inline assembly?

As you probably found out already, the LLVM flavour inline assembly is somewhat sparsely documented. However, Clang supports GCC-style inline assembly, so if you have a piece of (GC)C code that does what you want, you can use its LLVM IR output as a guide.

For example, this is the relevant code generated for `escape`, obtained using `clang -emit-llvm -S` (on Apple clang-900.0.39.2):

call void asm sideeffect "", "imr,~{memory},~{dirflag},~{fpsr},~{flags}"(i8* %0)

As the clobber string is generated programmatically, it might not always be as concise as possible, though, and sometimes there might be more than one supported syntax for the same concept (like `g` and `imr`).

If you wanted to come up with a well-researched proposal for druntime intrinsics to do these things (e.g. how would escape/use interact with CSE on strongly pure functions), this would be a very valuable contribution to the state of benchmarking in D.

- David
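For anyone who wants to try the approach David describes, here is a minimal C sketch that could be compiled with `clang -emit-llvm -S` to inspect the constraint string Clang generates. It is only an illustration under stated assumptions: `escape` is the function quoted in the thread, while `bench_memcpy` is a hypothetical helper that uses the C library's `memcpy` as a stand-in for the function under test and the coarse `clock()` timer instead of a proper stopwatch.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* The function from the thread: an empty asm statement that takes `p` as an
 * input and declares a "memory" clobber, so the compiler must assume the
 * pointed-to data is read and that any memory may have changed. */
void escape(void *p) {
    __asm__ volatile("" : : "g"(p) : "memory");
}

/* Hypothetical helper (not from the thread): copies `n` bytes `iterations`
 * times and returns the elapsed CPU time in seconds. memcpy stands in for
 * the implementation being benchmarked. */
double bench_memcpy(void *dst, const void *src, size_t n, long iterations) {
    clock_t start = clock();
    for (long i = 0; i < iterations; ++i) {
        memcpy(dst, src, n);
        escape(dst); /* keep the copy from being optimized away */
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Compiling this file with `clang -O3 -emit-llvm -S` should show an `asm sideeffect` call for `escape` similar to the one quoted above, although the exact constraint string may differ by compiler version and target. That string can then serve as a guide for the second argument of `__asm` in LDC.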
