Thread overview
cost of calling class function
Feb 22, 2017
Dušan Pavkov
Feb 23, 2017
Seb
Feb 23, 2017
Jeremy DeHaan
Feb 23, 2017
Jonathan M Davis
Feb 24, 2017
Chris Wright
Feb 23, 2017
Johan Engelen
Feb 23, 2017
Johan Engelen
Feb 23, 2017
Patrick Schluter
Feb 23, 2017
Johan Engelen
February 22, 2017
Hello,

I tried to measure how much faster a simple task would be in D than in C#. I ported some simple code from C# to D almost 1:1, and the C# code was faster. After eliminating possible causes one by one, I arrived at an example that shows where the problem is: if the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

Here is the code:

import std.stdio;
import std.conv;
import std.datetime;

public float getTotal(string s, int add)
{
	float result = add;
	for (int j = 0; j < s.length; j++)
	{
		result += s[j];
	}
	return result;
}

class A
{
	public float getTotal(string s, int add)
	{
		float result = add;
		for (int j = 0; j < s.length; j++)
		{
			result += s[j];
		}
		return result;
	}
}

void main(string[] args)
{
	StopWatch sw;
	sw.start();

	int n = args.length == 2 ? to!int(args[1]) : 100000;

	string inputA = "qwertyuiopasdfghjklzxcvbnm0123456789";
	double total = 0;
	for (int i = 0; i < n; i++)
	{
		for (int ii = 0; ii < inputA.length; ii++)
		{
			total += getTotal(inputA, i);
		}
	}
	sw.stop();
	writeln("direct call: ");
	writeln("total: ", total);
	writeln("elapsed: ", sw.peek().msecs, " [ms]");
	writeln();

	total = 0;
	auto a = new A();
	sw.reset();
	sw.start();
	for (int i = 0; i < n; i++)
	{
		for (int ii = 0; ii < inputA.length; ii++)
		{
			total += a.getTotal(inputA, i);
		}
	}
	sw.stop();
	writeln("func in class call: ", total);
	writeln("total: ", total);
	writeln("elapsed: ", sw.peek().msecs, " [ms]");
}


Here are the build configuration and execution times:

C:\projects\D\benchmarks\reduced problem>dub run --config=application --arch=x86_64 --build=release-nobounds --compiler=ldc2
Performing "release-nobounds" build using ldc2 for x86_64.
benchmark1 ~master: target for configuration "application" is up to date.
To force a rebuild of up-to-date targets, run again with --force.
Running .\benchmark1.exe
direct call:
total: 1.92137e+11
elapsed: 4 [ms]

func in class call: 1.92137e+11
total: 1.92137e+11
elapsed: 138 [ms]

Thanks in advance.
February 23, 2017
On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
> Hello,
>
> I tried to measure how much faster a simple task would be in D than in C#. I ported some simple code from C# to D almost 1:1, and the C# code was faster. After eliminating possible causes one by one, I arrived at an example that shows where the problem is: if the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

I think I can offer a couple of pointers to one likely cause: the method isn't final, so it is called virtually, and virtual calls are slower than direct ones:

https://dlang.org/spec/function.html#virtual-functions
http://forum.dlang.org/post/mailman.840.1332033836.4860.digitalmars-d@puremagic.com
https://issues.dlang.org/show_bug.cgi?id=11616
http://wiki.dlang.org/DIP51

AFAICT though it was approved, the switch to final by default has never happened.
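
To illustrate what "virtual by default" means here, a minimal sketch follows. The class and method names (Base, Derived, name) are hypothetical and not from the original post; the point is only that a non-final class method is dispatched on the object's runtime type.

```d
import std.stdio;

// Hypothetical classes, for illustration only.
class Base
{
    // Not marked final, so this method is virtual and can be overridden.
    string name() { return "Base"; }
}

class Derived : Base
{
    override string name() { return "Derived"; }
}

void main()
{
    Base b = new Derived();
    // The static type is Base, but the call is resolved at run time
    // and reaches Derived.name.
    writeln(b.name());
}
```

This runtime lookup is the indirection that a direct call to a free function does not pay for.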
February 23, 2017
On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
> AFAICT though it was approved, the switch to final by default has never happened.

I believe Andrei made an executive decision to shut down final by default.
February 22, 2017
On Thursday, February 23, 2017 02:17:02 Jeremy DeHaan via Digitalmars-d wrote:
> On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
> > AFAICT though it was approved, the switch to final by default has never happened.
>
> I believe Andrei made an executive decision to shut down final by default.

Yeah, the change that introduced the virtual keyword, as the first step toward making class member functions non-virtual by default, was actually committed, and then Andrei found out about it and insisted that it be reverted. So it was reverted, and we're never going to get non-virtual by default.

- Jonathan M Davis

February 23, 2017
On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
> 
> If the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

Interesting test case, thanks :-)
Adding "final" to the class method nullifies the speed difference.
Somehow, LDC does not devirtualize the call in your test case. Without the for loops, the call is nicely devirtualized, so there is no performance difference.
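
As a sketch of the "final" change mentioned above: class A mirrors the class from the original post with final added, while class B is a hypothetical copy without final, included only for contrast. The __traits(isVirtualMethod, ...) checks show how the compiler classifies each method; they say nothing about timings.

```d
import std.stdio;

class A
{
    // final: the method cannot be overridden, so the call can be
    // resolved at compile time instead of through the vtable.
    final float getTotal(string s, int add)
    {
        float result = add;
        foreach (c; s)
        {
            result += c;
        }
        return result;
    }
}

// Hypothetical contrast class: same body, but virtual by default.
class B
{
    float getTotal(string s, int add)
    {
        float result = add;
        foreach (c; s)
        {
            result += c;
        }
        return result;
    }
}

void main()
{
    // Compile-time checks of how each method is classified.
    static assert(!__traits(isVirtualMethod, A.getTotal));
    static assert(__traits(isVirtualMethod, B.getTotal));

    auto a = new A();
    writeln(a.getTotal("abc", 1));
}
```

If the class is never meant to be subclassed, declaring the whole class as final class A is another way to opt out of virtual dispatch.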

-Johan

February 23, 2017
On Thursday, 23 February 2017 at 16:25:34 UTC, Johan Engelen wrote:
> On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
>> 
>> If the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.
>
> Interesting test case, thanks :-)
> Adding "final" to the class method nullifies the speed difference.
> Somehow, LDC does not devirtualize the call in your test case. Without the for loops, the call is nicely devirtualized, so there is no performance difference.

We're in good company: both clang and gcc also fail to devirtualize the call when the loop count is too large (with a loop count of 4 the indirect calls are gone; at 160 they are back).

Btw, with PGO, the performance is 4 ms (direct call) vs 6 ms (virtual call). Pathological, but still.

I am submitting a DConf talk on optimization and the cost of D idioms. This gave me some new ideas to present, thanks :)

-Johan
February 23, 2017
On Thursday, 23 February 2017 at 17:02:55 UTC, Johan Engelen wrote:
> On Thursday, 23 February 2017 at 16:25:34 UTC, Johan Engelen wrote:
>> [...]
>
> We're in good company: both clang and gcc also fail to devirtualize the call when the loop count is too large (with a loop count of 4 the indirect calls are gone; at 160 they are back).
>
> Btw, with PGO, the performance is 4 ms (direct call) vs 6 ms (virtual call). Pathological, but still.
>
> I am submitting a DConf talk on optimization and the cost of D idioms. This gave me some new ideas to present, thanks :)
>
> -Johan

Does marking the method as pure change anything?
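
For reference, a sketch of what that would look like, assuming the same method body as in the original post. In D the attribute is spelled pure, without the @. Purity restricts access to mutable global state; it does not by itself make the method non-virtual, which the static assert below checks. Whether the optimizer then treats the call differently is a separate question that this sketch does not answer.

```d
import std.stdio;

class A
{
    // pure: the method may not read or write mutable global state.
    // It is still virtual, because it is not marked final.
    float getTotal(string s, int add) pure
    {
        float result = add;
        foreach (c; s)
        {
            result += c;
        }
        return result;
    }
}

void main()
{
    // The method is still classified as virtual despite being pure.
    static assert(__traits(isVirtualMethod, A.getTotal));

    auto a = new A();
    writeln(a.getTotal("abc", 1));
}
```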
February 23, 2017
On Thursday, 23 February 2017 at 19:57:18 UTC, Patrick Schluter wrote:
>
> Does marking the method as pure change anything?

Here is the link to play with it yourself :-)

https://godbolt.org/g/se4dCZ


February 24, 2017
On Wed, 22 Feb 2017 18:31:37 -0800, Jonathan M Davis via Digitalmars-d wrote:

> On Thursday, February 23, 2017 02:17:02 Jeremy DeHaan via Digitalmars-d wrote:
>> On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
>> > AFAICT though it was approved, the switch to final by default has never happened.
>>
>> I believe Andrei made an executive decision to shut down final by default.
> 
> Yeah, the change that introduced the virtual keyword, as the first step toward making class member functions non-virtual by default, was actually committed, and then Andrei found out about it and insisted that it be reverted. So it was reverted, and we're never going to get non-virtual by default.
> 
> - Jonathan M Davis

It's an interesting debate, but there's not a ton of reason to prefer one over the other design-wise. It can be considered for D3, but for D2, the ship has sailed.