View mode: basic / threaded / horizontal-split · Log in · Help
April 13, 2005
speed of polymorphic calls
Hi,

   	I was comparing the speed of polymorphic calls in D versus other 
languages (SmartEiffel, for example) and noticed that D's numbers do not 
shine.

   	Having a Smalltalk background, I used the favorite little benchmark 
of Smalltalkers: fibonacci (tests polymorphic sends and integer 
arithmetic). The sources for SmartEiffel and D are at the end.

   	SmartEiffel 2.1b3 computes fib(12) in a loop of 1,000,000 times in 
3.04 seconds.

   	D computes fib(12) in a loop of 1,000,000 times in 7 seconds.

   	(Java JRE 1.4.1_01 does it in 3.73 seconds thanks to its JIT)

   	Is there any hope that the generated D code will be 
improved/optimized ? Maybe D's OO runtime could incorporate Polymorphic 
Inline Caches etc ? ( http://c2.com/cgi/wiki?PolymorphicInlineCaches )

   	Note: if you switch to a pure functional approach (get rid of the 
class around fibr etc) the number comes down to 2 seconds.

   	Cheers,

marcio

---------------------- cut here --------------------------------
class FIBOO -- eiffel wants all uppercase letters for class name

creation
   make

feature
	make is
		local
			n, count, fib_value, i: INTEGER;
			start_time, end_time : MICROSECOND_TIME
		do
			print( "Fibonacci (recursive, OO) in SmartEiffel%N" )
			n := argument(1).to_integer
			count := argument(2).to_integer

			start_time.update
			from i := 0 until i = count loop
	 		   fib_value := fibr (n)
			   i := i + 1
			end
			end_time.update
			print( "%N " + count.to_string + " times:  fibOO(" + 
n.to_string + ")= " + fib_value.to_string)
			print( "%N time (s)= " + (start_time.elapsed_seconds 
(end_time)).to_string)
		end -- make

	fibr (n : INTEGER) : INTEGER is
	    do
	      if (n <= 2) then
	         Result := 1
	      else
	         Result := fibr (n-1) + fibr (n-2)
	      end
	    end -- fibr

end -- class FIBOO
---------------------- cut here --------------------------------

import std.c.stdio;
import std.string;
import std.date;

/* 
 Compute fibonacci recursively, in an OO-way ,
 in D ( http://www.digitalmars.com/d/index.html ) 
*/

class Fib {
	int fibr(int n) {
	  if (n <= 2) return(1);
	  else return fibr(n-1)+fibr(n-2);
	}
}

int main (char[][] args) {
	if (args.length<3) {
		printf ("fib <num> <loopCount>");
		return -1;
	}
	int num = atoi (args[1]);
	int loopCount = atoi (args[2]);
	Fib f = new Fib(); 
	int result;
	d_time theStart = getUTCtime();
	for (int i = 0; i < loopCount; i++) {
		result = f.fibr(num);
	}
	d_time theEnd = getUTCtime();
	float totalTimeInSeconds = (theEnd - theStart) / TicksPerSecond;
	printf("fibOO(%i)=%i  (%i times in %f s)", num, result, loopCount, 
totalTimeInSeconds);
	return 0;
}
---------------------- cut here --------------------------------
April 13, 2005
Re: speed of polymorphic calls
I tried this out just to test a theory and the results surprised me.  I altered
the code a bit to this:

import std.c.stdio;
import std.c.stdlib;
import std.c.string;
import std.c.time;

/* 
Compute fibonacci recursively, in an OO-way ,
in D ( http://www.digitalmars.com/d/index.html ) 
*/

class Fib {
int fibr(int n) {
if (n <= 2) return(1);
else return fibr(n-1)+fibr(n-2);
}
}

int main (char[][] args) {
if (args.length<3) {
printf ("fib <num> <loopCount>");
return -1;
}
int num = atoi (args[1]);
int loopCount = atoi (args[2]);
Fib f = new Fib(); 
int result;
clock_t theStart = clock();
for (int i = 0; i < loopCount; i++) {
result = f.fibr(num);
}
clock_t theEnd = clock();
double totalTimeInSeconds = cast(double) (theEnd - theStart) / CLOCKS_PER_SEC;
printf("fibOO(%i)=%i  (%i times in %f s)", num, result, loopCount,
totalTimeInSeconds);
return 0;
}

I compiled with options:

-release -O -inline

and ran:

fib 12 1000000

twice.  The average time was ~4.4 seconds.  Next, I edited the code to make
Fin.fibr a final method, re-compiled, and re-ran.  The result?  An average time
of ~8.8 seconds!  Why in the world is declaring the method "final" doubling
execution time?


Sean
April 14, 2005
Re: speed of polymorphic calls
Sean Kelly <sean@f4.ca> wrote:

[...]
> Why in the world
> is declaring the method "final" doubling execution time?

By a mistake of yours?

My test did not confirm your result, because declaring `fibr' 
`final' caused a significant drop of
11% execution time compiled by dmd 0.120 from 2.422s to 2.156s
15% execution time compiled by gdc 0.10  from 1.953s to 1.656s

-manfred
April 14, 2005
Re: speed of polymorphic calls
Marcio <mqmnews321@sglebs.com> wrote:

[...]
>          D computes fib(12) in a loop of 1,000,000 times in 7
>          seconds. 
[...]

Did you use `-release -O -inline'? This seems to reduce execution 
time by more than 60%.

-manfred
April 14, 2005
Re: speed of polymorphic calls
In article <d3kjd9$2sc7$1@digitaldaemon.com>, Manfred Nowak says...
>
>Sean Kelly <sean@f4.ca> wrote:
>
>[...]
>> Why in the world
>> is declaring the method "final" doubling execution time?
>
>By a mistake of yours?
>
>My test did not confirm your result, because declaring `fibr' 
>`final' caused a significant drop of
>11% execution time compiled by dmd 0.120 from 2.422s to 2.156s
>15% execution time compiled by gdc 0.10  from 1.953s to 1.656s

I must have messed something up doing my first test.  When I run it now, the
standard implementation averages ~4.0 seconds and adding 'final' reduces the
time to ~3.3 seconds.


Sean
April 14, 2005
Re: speed of polymorphic calls
How were they compiled?

DMD should have been: dmd -O -inline -release <file_name>

I get the same results with DMD vs. the latest SmartEiffel v2-1 downloaded from
the loria Site (with the same code as below).

If I finalize or make the fibr() method private, DMD is actually faster.

I compiled the SE code with: compile -boost -no_split -O3 fib.e -o fib_e

The compiler options for DMD can be viewed by just typing 'dmd' at the command
line. They are also here: http://digitalmars.com/d/dcompiler.html

- Dave

In article <Xns9637C09FD665Fmyidtoken@63.105.9.61>, Marcio says...
>
>Hi,
>
>    	I was comparing the speed of polymorphic calls in D versus other 
>languages (SmartEiffel, for example) and noticed that D's numbers do not 
>shine.
>
>    	Having a Smalltalk background, I used the favorite little benchmark 
>of Smalltalkers: fibonacci (tests polymorphic sends and integer 
>arithmetic). The sources for SmartEiffel and D are at the end.
>
>    	SmartEiffel 2.1b3 computes fib(12) in a loop of 1,000,000 times in 
>3.04 seconds.
>
>    	D computes fib(12) in a loop of 1,000,000 times in 7 seconds.
>
>    	(Java JRE 1.4.1_01 does it in 3.73 seconds thanks to its JIT)
>
>    	Is there any hope that the generated D code will be 
>improved/optimized ? Maybe D's OO runtime could incorporate Polymorphic 
>Inline Caches etc ? ( http://c2.com/cgi/wiki?PolymorphicInlineCaches )
>
>    	Note: if you switch to a pure functional approach (get rid of the 
>class around fibr etc) the number comes down to 2 seconds.
>
>    	Cheers,
>
>marcio
>
>---------------------- cut here --------------------------------
>class FIBOO -- eiffel wants all uppercase letters for class name
>
>creation
>    make
>
>feature
>	make is
>		local
>			n, count, fib_value, i: INTEGER;
>			start_time, end_time : MICROSECOND_TIME
>		do
>			print( "Fibonacci (recursive, OO) in SmartEiffel%N" )
>			n := argument(1).to_integer
>			count := argument(2).to_integer
>
> 			start_time.update
>			from i := 0 until i = count loop
>	 		   fib_value := fibr (n)
>			   i := i + 1
>			end
>			end_time.update
>			print( "%N " + count.to_string + " times:  fibOO(" + 
>n.to_string + ")= " + fib_value.to_string)
>			print( "%N time (s)= " + (start_time.elapsed_seconds 
>(end_time)).to_string)
>		end -- make
>
>	fibr (n : INTEGER) : INTEGER is
>	    do
>	      if (n <= 2) then
>	         Result := 1
>	      else
>	         Result := fibr (n-1) + fibr (n-2)
>	      end
>	    end -- fibr
>
>end -- class FIBOO
>---------------------- cut here --------------------------------
>
>import std.c.stdio;
>import std.string;
>import std.date;
>
>/* 
>  Compute fibonacci recursively, in an OO-way ,
>  in D ( http://www.digitalmars.com/d/index.html ) 
>*/
>
>class Fib {
>	int fibr(int n) {
>	  if (n <= 2) return(1);
>	  else return fibr(n-1)+fibr(n-2);
>	}
>}
>
>int main (char[][] args) {
>	if (args.length<3) {
>		printf ("fib <num> <loopCount>");
>		return -1;
>	}
>	int num = atoi (args[1]);
>	int loopCount = atoi (args[2]);
>	Fib f = new Fib(); 
>	int result;
>	d_time theStart = getUTCtime();
>	for (int i = 0; i < loopCount; i++) {
>		result = f.fibr(num);
>	}
>	d_time theEnd = getUTCtime();
>	float totalTimeInSeconds = (theEnd - theStart) / TicksPerSecond;
>	printf("fibOO(%i)=%i  (%i times in %f s)", num, result, loopCount, 
>totalTimeInSeconds);
>	return 0;
>}
>---------------------- cut here --------------------------------
>
>
April 14, 2005
Re: speed of polymorphic calls
In article <d3kjd9$2sc7$1@digitaldaemon.com>, Manfred Nowak says...
>
>Sean Kelly <sean@f4.ca> wrote:
>
>[...]
>> Why in the world
>> is declaring the method "final" doubling execution time?
>
>By a mistake of yours?
>
>My test did not confirm your result, because declaring `fibr' 
>`final' caused a significant drop of
>11% execution time compiled by dmd 0.120 from 2.422s to 2.156s
>15% execution time compiled by gdc 0.10  from 1.953s to 1.656s
>
>-manfred

What compiler options?

W/o 'final' or 'private' DMD is the same as GDC (well, 5% diff, not 25% like
your results).

W/ final or private, DMD is actually a tad bit faster.

- Dave
April 14, 2005
Re: speed of polymorphic calls
Dave <Dave_member@pathlink.com> wrote in
news:d3kn2c$2ul7$1@digitaldaemon.com: 

> 
> How were they compiled?

   	D: I had used just -O

   	SmartEiffel: with this BAT:

rem Don't forget to activate the MS tools first:
rem  "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin
\vcvars32.bat"

compile -boost -no_gc -relax -no_split fiboo



> DMD should have been: dmd -O -inline -release <file_name>


   	With -O -inline -release Now I get this:

[OO]
dmd -O -inline -release fibOO.d
C:\dmd\bin\..\..\dm\bin\link.exe fibOO,,,user32+kernel32/noi;

fibOO 12 1000000
fibOO(12)=144  (1000000 times in 4.000000 s)


[non-OO]
dmd -O -inline -release fib.d
C:\dmd\bin\..\..\dm\bin\link.exe fib,,,user32+kernel32/noi;

fib 12 1000000
fib(12)=144  (1000000 times in 2.000000 s)


D's OO version is still a bit slower than SmartEiffel:

fibOO 12 1000000
Fibonacci (recursive, OO) in SmartEiffel
1000000 times:  fibOO(12)= 144
time (s)= 3.766000



And still a bit slower than Java (source at the end):

java FibOO 12 1000000
Fibonacci (recursive, OO) in Java
1000000 times: fibOO(12)= 144
time(ms)= 3641


I am guessing that SmartEiffel is doing its awesome compile-time 
optimizations of polymorphic calls (when they are not really polymorphic), 
and Java is doing its smart JIT thing.

Question is: shouldn't D be as fast as (or faster than) SmartEiffel ?

marcio
---------

public class FibOO {
//	Fibonacci benchmark done in an OO way	

 public int fibr(int n) {
	if (n <= 2)
		return 1;
	else
		return fibr(n - 1) + fibr(n - 2);
 }

 public static void main(String[] args) {
	System.out.println("Fibonacci (recursive, OO) in Java");
	int n = Integer.parseInt(args[0]);
	int count = Integer.parseInt(args[1]);
	FibOO fibOO= new FibOO();
	int fib = 0;
	long start = System.currentTimeMillis();
	for (int i = 0 ; i < count ; i++) {
		fib = fibOO.fibr(n);
	}
	long stop = System.currentTimeMillis();
	System.out.println(" " + count + " times: fibOO(" + n + ")= " + fib);
	System.out.println(" time(ms)= " + (stop - start));
 }
}
April 14, 2005
Re: speed of polymorphic calls
Yes - there does seem to be a penalty for OO in any case for this test. The 
reason is probably because all methods are considered virtual in D by 
default. One of the things pointed out by others in this thread is that 
"finalizing" the class or method gets rid of some of this penalty and makes 
up the difference between D and either SE or Java.

For example final class Fib { ... } or final fibr(...)

IIRC, it's been a goal stated in the past for the compiler to 
"auto-finalize" if it can. Because of the language design this is something 
D compilers can readily do, even across source files. But it hasn't been 
implemented yet - the D compiler is not at v1.0 yet (but that's not to imply 
that v1.0 will do auto-finalization either - I don't know).

If you're interested in taking a look at how D does vs. other languages in 
general, the best source right now is probably:

http://shootout.alioth.debian.org/great/benchmark.php?test=all&lang=all&sort=fullcpu

Thanks,
- Dave

"Marcio" <mqmnews321@sglebs.com> wrote in message 
news:Xns96385C92189CDmyidtoken@63.105.9.61...
> Dave <Dave_member@pathlink.com> wrote in
> news:d3kn2c$2ul7$1@digitaldaemon.com:
>
>>
>> How were they compiled?
>
>    D: I had used just -O
>
>    SmartEiffel: with this BAT:
>
> rem Don't forget to activate the MS tools first:
> rem  "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin
> \vcvars32.bat"
>
> compile -boost -no_gc -relax -no_split fiboo
>
>
>
>> DMD should have been: dmd -O -inline -release <file_name>
>
>
>    With -O -inline -release Now I get this:
>
> [OO]
> dmd -O -inline -release fibOO.d
> C:\dmd\bin\..\..\dm\bin\link.exe fibOO,,,user32+kernel32/noi;
>
> fibOO 12 1000000
> fibOO(12)=144  (1000000 times in 4.000000 s)
>
>
> [non-OO]
> dmd -O -inline -release fib.d
> C:\dmd\bin\..\..\dm\bin\link.exe fib,,,user32+kernel32/noi;
>
> fib 12 1000000
> fib(12)=144  (1000000 times in 2.000000 s)
>
>
> D's OO version is still a bit slower than SmartEiffel:
>
> fibOO 12 1000000
> Fibonacci (recursive, OO) in SmartEiffel
> 1000000 times:  fibOO(12)= 144
> time (s)= 3.766000
>
>
>
> And still a bit slower than Java (source at the end):
>
> java FibOO 12 1000000
> Fibonacci (recursive, OO) in Java
> 1000000 times: fibOO(12)= 144
> time(ms)= 3641
>
>
> I am guessing that SmartEiffel is doing its awesome compile-time
> optimizations of polymorphic calls (when they are not really polymorphic),
> and Java is doing its smart JIT thing.
>
> Question is: shouldn't D be as fast as (or faster than) SmartEiffel ?
>
> marcio
> ---------
>
> public class FibOO {
> // Fibonacci benchmark done in an OO way
>
>  public int fibr(int n) {
> if (n <= 2)
> return 1;
> else
> return fibr(n - 1) + fibr(n - 2);
>  }
>
>  public static void main(String[] args) {
> System.out.println("Fibonacci (recursive, OO) in Java");
> int n = Integer.parseInt(args[0]);
> int count = Integer.parseInt(args[1]);
> FibOO fibOO= new FibOO();
> int fib = 0;
> long start = System.currentTimeMillis();
> for (int i = 0 ; i < count ; i++) {
> fib = fibOO.fibr(n);
> }
> long stop = System.currentTimeMillis();
> System.out.println(" " + count + " times: fibOO(" + n + ")= " + fib);
> System.out.println(" time(ms)= " + (stop - start));
>  }
> }
April 14, 2005
Re: speed of polymorphic calls
"Dave" <Dave_member@pathlink.com> wrote in
news:d3m1oa$15de$1@digitaldaemon.com: 

> IIRC, it's been a goal stated in the past for the compiler to 
> "auto-finalize" if it can. Because of the language design this is
> something D compilers can readily do, even across source files. But it
> hasn't been implemented yet - the D compiler is not at v1.0 yet (but
> that's not to imply that v1.0 will do auto-finalization either - I
> don't know). 


 It will be interesting to see when it gets added.

 I also wonder if global analysis & this optimization can be done at all if 
you allow the system to be extended via "OO DLLs". I mean components written 
in your OO language which can be linked in at runtime. For example, OTI/IBM 
Smalltalk's ICX stuff, or Java JARs etc. Does D support this ? Does it plan 
to ? Interestingly enough, SmartEiffel does not support this feature, but 
only a "monolythic app". It will be interesting to see what D will be capable 
of doing in this area... speed versus runtime OO extensions... Will it be 
able to do both ?

marcio
« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home