Thread overview
how to benchmark pure functions?
Oct 27, 2022: ab
Oct 27, 2022: Imperatorn
Oct 27, 2022: H. S. Teoh
Oct 27, 2022: Dennis
Oct 29, 2022: max haughton
Oct 28, 2022: ab
Oct 28, 2022: Imperatorn
Oct 29, 2022: Siarhei Siamashka
October 27, 2022

Hi,

when trying to compare different implementations of a pure function in optimized builds using benchmark from std.datetime.stopwatch, I get times equal to zero. I suppose this is because the functions are not executed, as they have no side effects.

The same happens with the example from the documentation:
https://dlang.org/library/std/datetime/stopwatch/benchmark.html

How can I prevent the compiler from removing the code I want to measure? Is there some utility in the standard library or pragma that I should use?

Thanks

AB

October 27, 2022

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> Hi,
>
> when trying to compare different implementations of the optimized builds of a pure function using benchmark from std.datetime.stopwatch, I get times equal to zero, I suppose because the functions are not executed as they do not have side effects.
>
> The same happens with the example from the documentation:
> https://dlang.org/library/std/datetime/stopwatch/benchmark.html
>
> How can I prevent the compiler from removing the code I want to measure? Is there some utility in the standard library or pragma that I should use?
>
> Thanks
>
> AB

Sorry, I don't understand what you're saying.

The examples work for me. Can you provide an exact code example which does not work as expected for you?

October 27, 2022
On Thu, Oct 27, 2022 at 06:20:10PM +0000, Imperatorn via Digitalmars-d-learn wrote:
> On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> > Hi,
> > 
> > when trying to compare different implementations of the optimized builds of a pure function using benchmark from std.datetime.stopwatch, I get times equal to zero, I suppose because the functions are not executed as they do not have side effects.
> > 
> > The same happens with the example from the documentation: https://dlang.org/library/std/datetime/stopwatch/benchmark.html
> > 
> > How can I prevent the compiler from removing the code I want to measure?  Is there some utility in the standard library or pragma that I should use?
[...]

To prevent the optimizer from eliding the function completely, you need to do something with the return value.  Usually, this means you combine the return value into some accumulating variable, e.g., if it's an int function, have a running int accumulator that you add to:

	int funcToBeMeasured(...) pure { ... }

	int accum;
	auto results = benchmark!({
		// Don't just call funcToBeMeasured and ignore the value
		// here, otherwise the optimizer may delete the call
		// completely.
		accum += funcToBeMeasured(...);
	})(1_000); // benchmark needs a run count; 1_000 is an arbitrary choice

Then, at the end of the benchmark, do something with the accumulated value, such as printing it to stdout, so that the optimizer doesn't notice that the value is unused and decide to kill all previous assignments to it. Something like `writeln(accum);` at the end should do the trick.
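For reference, the accumulator approach described above could look roughly like the following complete program. This is only a sketch: `sumOfSquares`, the `calls` counter, and the run count are made up for illustration and are not from the thread. The argument is varied on each call because an optimizer may otherwise fold or hoist a call with a constant argument, and even so a sufficiently clever optimizer may still simplify the work.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Hypothetical pure function standing in for the code under test.
int sumOfSquares(int n) pure nothrow @nogc @safe
{
    int total = 0;
    foreach (i; 0 .. n)
        total += i * i;
    return total;
}

void main()
{
    long accum; // running accumulator: every call's result feeds into it
    int calls;  // used only to vary the argument between calls

    void run()
    {
        // The result is combined into accum instead of being discarded.
        accum += sumOfSquares(1_000 + (calls++ & 7));
    }

    // 10_000 runs is an arbitrary choice.
    auto results = benchmark!(run)(10_000);

    writeln("elapsed: ", results[0]);
    // Printing the accumulator makes the result observable, so the
    // assignments to it cannot be treated as dead code.
    writeln("accum: ", accum);
}
```

Printing `accum` after the timing has been taken keeps the I/O cost out of the measurement.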


T

October 27, 2022

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> How can I prevent the compiler from removing the code I want to measure?

With many C compilers, you can use volatile assembly blocks for that. With LDC -O3, a regular assembly block also does the trick currently:

void main()
{
    import std.datetime.stopwatch : benchmark;
    import std.stdio : writeln;

    void f0() {}
    void f1()
    {
        foreach(i; 0..4_000_000)
        {
            // nothing, loop gets optimized out
        }
    }
    void f2()
    {
        foreach(i; 0..4_000_000)
        {
            // defeat optimizations
            asm @safe pure nothrow @nogc {}
        }
    }
    auto r = benchmark!(f0, f1, f2)(1);
    writeln(r[0]); // 4 μs
    writeln(r[1]); // 4 μs
    writeln(r[2]); // 1 ms
}
October 28, 2022

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> [...]

Thanks to H. S. Teoh and Dennis for the suggestions; they both work. I like the empty asm block a bit more because it is less invasive, but it only works with LDC.

@Imperatorn see Dennis's code for an example. std.datetime.stopwatch.benchmark works, but at high optimization levels (-O2, -O3) the loop can be removed and the time brought down to 0 hnsecs. E.g. try "ldc2 -O3 -run dennis.d".

AB

October 28, 2022

On Friday, 28 October 2022 at 09:48:14 UTC, ab wrote:
> On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> > [...]
>
> Thanks to H. S. Teoh and Dennis for the suggestions; they both work. I like the empty asm block a bit more because it is less invasive, but it only works with LDC.
>
> @Imperatorn see Dennis's code for an example. std.datetime.stopwatch.benchmark works, but at high optimization levels (-O2, -O3) the loop can be removed and the time brought down to 0 hnsecs. E.g. try "ldc2 -O3 -run dennis.d".
>
> AB

Yeah, I didn't read carefully enough, sorry 🌷

October 29, 2022

On Friday, 28 October 2022 at 09:48:14 UTC, ab wrote:
> Thanks to H. S. Teoh and Dennis for the suggestions; they both work. I like the empty asm block a bit more because it is less invasive, but it only works with LDC.

I used the volatileLoad/volatileStore functions to ensure that the compiler cannot optimize the measured code away (for example, by hoisting repeated calculations out of the loop or even performing them at compile time), and the RDTSC/RDTSCP instructions via inline assembly for the measurements: https://gist.github.com/ssvb/5c926ed9bc755900fdaac3b71a0f7cfd

The goal was to have a very fast way to check (with no measurable overhead) whether reasonable optimization options had been supplied to the compiler.
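The following is a minimal sketch of the volatileLoad/volatileStore idea, not the code from the gist. It assumes a druntime that provides `core.volatile`, uses `StopWatch` in place of RDTSC for timing, and `mix` is a hypothetical pure function invented for the example.

```d
import core.volatile : volatileLoad, volatileStore;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical pure function standing in for the code under test.
uint mix(uint x) pure nothrow @nogc @safe
{
    x ^= x >> 16;
    x *= 0x45d9f3b;
    x ^= x >> 16;
    return x;
}

void main()
{
    uint input = 12_345; // read via volatileLoad, so not treated as a known constant
    uint output;         // written via volatileStore, so the result counts as used

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. 1_000_000) // iteration count is arbitrary
    {
        // A volatile read must be performed on every iteration, which
        // keeps the call from being hoisted out of the loop or folded.
        immutable x = volatileLoad(&input);
        // A volatile write must be kept, so the call cannot be dropped.
        volatileStore(&output, mix(x));
    }
    sw.stop();

    writeln(sw.peek);
}
```

The volatile accesses themselves cost a little on each iteration, so this suits comparing implementations against each other more than measuring absolute cost.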

October 29, 2022

On Thursday, 27 October 2022 at 18:41:36 UTC, Dennis wrote:
> On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
> > How can I prevent the compiler from removing the code I want to measure?
>
> With many C compilers, you can use volatile assembly blocks for that. With LDC -O3, a regular assembly block also does the trick currently:

> [...]

FYI, I recommend a volatile data dependency rather than injecting volatile ASM into the code, i.e. don't modify the pure function; instead, make sure its result is actually used in the eyes of the compiler.
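One way to read this advice is to leave the pure function untouched and pass its result to a small helper that performs a volatile write. The sketch below assumes a druntime with `core.volatile`; `consume` and `popcountSlow` are hypothetical names invented here, not library functions.

```d
import core.volatile : volatileStore;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Hypothetical helper: the volatile write makes the value count as used.
void consume(uint value)
{
    static uint sink; // thread-local storage that is never read back
    volatileStore(&sink, value);
}

// Hypothetical pure function under test; note that it is not modified.
uint popcountSlow(uint x) pure nothrow @nogc @safe
{
    uint bits;
    while (x)
    {
        bits += x & 1;
        x >>= 1;
    }
    return bits;
}

void main()
{
    uint seed = 0xDEADBEEF;

    void run()
    {
        consume(popcountSlow(seed));
        // Vary the input between runs (simple linear congruential step).
        seed = seed * 1_664_525 + 1_013_904_223;
    }

    auto r = benchmark!(run)(100_000); // run count is arbitrary
    writeln(r[0]);
}
```

Keeping the sink in a helper means the benchmarked functions stay pure and free of compiler-specific constructs.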