Thread overview | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
April 13, 2005 speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Hi, I was comparing the speed of polymorphic calls in D versus other languages (SmartEiffel, for example) and noticed that D's numbers do not shine. Having a Smalltalk background, I used the favorite little benchmark of Smalltalkers: fibonacci (tests polymorphic sends and integer arithmetic). The sources for SmartEiffel and D are at the end. SmartEiffel 2.1b3 computes fib(12) in a loop of 1,000,000 times in 3.04 seconds. D computes fib(12) in a loop of 1,000,000 times in 7 seconds. (Java JRE 1.4.1_01 does it in 3.73 seconds thanks to its JIT) Is there any hope that the generated D code will be improved/optimized ? Maybe D's OO runtime could incorporate Polymorphic Inline Caches etc ? ( http://c2.com/cgi/wiki?PolymorphicInlineCaches ) Note: if you switch to a pure functional approach (get rid of the class around fibr etc) the number comes down to 2 seconds. Cheers, marcio ---------------------- cut here -------------------------------- class FIBOO -- eiffel wants all uppercase letters for class name creation make feature make is local n, count, fib_value, i: INTEGER; start_time, end_time : MICROSECOND_TIME do print( "Fibonacci (recursive, OO) in SmartEiffel%N" ) n := argument(1).to_integer count := argument(2).to_integer start_time.update from i := 0 until i = count loop fib_value := fibr (n) i := i + 1 end end_time.update print( "%N " + count.to_string + " times: fibOO(" + n.to_string + ")= " + fib_value.to_string) print( "%N time (s)= " + (start_time.elapsed_seconds (end_time)).to_string) end -- make fibr (n : INTEGER) : INTEGER is do if (n <= 2) then Result := 1 else Result := fibr (n-1) + fibr (n-2) end end -- fibr end -- class FIBOO ---------------------- cut here -------------------------------- import std.c.stdio; import std.string; import std.date; /* Compute fibonacci recursively, in an OO-way , in D ( http://www.digitalmars.com/d/index.html ) */ class Fib { int fibr(int n) { if (n <= 2) return(1); else return fibr(n-1)+fibr(n-2); } } int main (char[][] args) { if (args.length<3) { printf ("fib <num> <loopCount>"); return -1; } int num = atoi (args[1]); int loopCount = atoi (args[2]); Fib f = new Fib(); int result; d_time theStart = getUTCtime(); for (int i = 0; i < loopCount; i++) { result = f.fibr(num); } d_time theEnd = getUTCtime(); float totalTimeInSeconds = (theEnd - theStart) / TicksPerSecond; printf("fibOO(%i)=%i (%i times in %f s)", num, result, loopCount, totalTimeInSeconds); return 0; } ---------------------- cut here -------------------------------- |
April 13, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marcio | I tried this out just to test a theory and the results surprised me. I altered the code a bit to this: import std.c.stdio; import std.c.stdlib; import std.c.string; import std.c.time; /* Compute fibonacci recursively, in an OO-way , in D ( http://www.digitalmars.com/d/index.html ) */ class Fib { int fibr(int n) { if (n <= 2) return(1); else return fibr(n-1)+fibr(n-2); } } int main (char[][] args) { if (args.length<3) { printf ("fib <num> <loopCount>"); return -1; } int num = atoi (args[1]); int loopCount = atoi (args[2]); Fib f = new Fib(); int result; clock_t theStart = clock(); for (int i = 0; i < loopCount; i++) { result = f.fibr(num); } clock_t theEnd = clock(); double totalTimeInSeconds = cast(double) (theEnd - theStart) / CLOCKS_PER_SEC; printf("fibOO(%i)=%i (%i times in %f s)", num, result, loopCount, totalTimeInSeconds); return 0; } I compiled with options: -release -O -inline and ran: fib 12 1000000 twice. The average time was ~4.4 seconds. Next, I edited the code to make Fin.fibr a final method, re-compiled, and re-ran. The result? An average time of ~8.8 seconds! Why in the world is declaring the method "final" doubling execution time? Sean |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly <sean@f4.ca> wrote:
[...]
> Why in the world
> is declaring the method "final" doubling execution time?
By a mistake of yours?
My test did not confirm your result, because declaring `fibr'
`final' caused a significant drop of
11% execution time compiled by dmd 0.120 from 2.422s to 2.156s
15% execution time compiled by gdc 0.10 from 1.953s to 1.656s
-manfred
|
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marcio | Marcio <mqmnews321@sglebs.com> wrote:
[...]
> D computes fib(12) in a loop of 1,000,000 times in 7
> seconds.
[...]
Did you use `-release -O -inline'? This seems to reduce execution time by more than 60%.
-manfred
|
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manfred Nowak | In article <d3kjd9$2sc7$1@digitaldaemon.com>, Manfred Nowak says... > >Sean Kelly <sean@f4.ca> wrote: > >[...] >> Why in the world >> is declaring the method "final" doubling execution time? > >By a mistake of yours? > >My test did not confirm your result, because declaring `fibr' >`final' caused a significant drop of >11% execution time compiled by dmd 0.120 from 2.422s to 2.156s >15% execution time compiled by gdc 0.10 from 1.953s to 1.656s I must have messed something up doing my first test. When I run it now, the standard implementation averages ~4.0 seconds and adding 'final' reduces the time to ~3.3 seconds. Sean |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marcio | How were they compiled? DMD should have been: dmd -O -inline -release <file_name> I get the same results with DMD vs. the latest SmartEiffel v2-1 downloaded from the loria Site (with the same code as below). If I finalize or make the fibr() method private, DMD is actually faster. I compiled the SE code with: compile -boost -no_split -O3 fib.e -o fib_e The compiler options for DMD can be viewed by just typing 'dmd' at the command line. They are also here: http://digitalmars.com/d/dcompiler.html - Dave In article <Xns9637C09FD665Fmyidtoken@63.105.9.61>, Marcio says... > >Hi, > > I was comparing the speed of polymorphic calls in D versus other >languages (SmartEiffel, for example) and noticed that D's numbers do not shine. > > Having a Smalltalk background, I used the favorite little benchmark >of Smalltalkers: fibonacci (tests polymorphic sends and integer arithmetic). The sources for SmartEiffel and D are at the end. > > SmartEiffel 2.1b3 computes fib(12) in a loop of 1,000,000 times in >3.04 seconds. > > D computes fib(12) in a loop of 1,000,000 times in 7 seconds. > > (Java JRE 1.4.1_01 does it in 3.73 seconds thanks to its JIT) > > Is there any hope that the generated D code will be >improved/optimized ? Maybe D's OO runtime could incorporate Polymorphic Inline Caches etc ? ( http://c2.com/cgi/wiki?PolymorphicInlineCaches ) > > Note: if you switch to a pure functional approach (get rid of the >class around fibr etc) the number comes down to 2 seconds. > > Cheers, > >marcio > >---------------------- cut here -------------------------------- >class FIBOO -- eiffel wants all uppercase letters for class name > >creation > make > >feature > make is > local > n, count, fib_value, i: INTEGER; > start_time, end_time : MICROSECOND_TIME > do > print( "Fibonacci (recursive, OO) in SmartEiffel%N" ) > n := argument(1).to_integer > count := argument(2).to_integer > > start_time.update > from i := 0 until i = count loop > fib_value := fibr (n) > i := i + 1 > end > end_time.update > print( "%N " + count.to_string + " times: fibOO(" + >n.to_string + ")= " + fib_value.to_string) > print( "%N time (s)= " + (start_time.elapsed_seconds >(end_time)).to_string) > end -- make > > fibr (n : INTEGER) : INTEGER is > do > if (n <= 2) then > Result := 1 > else > Result := fibr (n-1) + fibr (n-2) > end > end -- fibr > >end -- class FIBOO >---------------------- cut here -------------------------------- > >import std.c.stdio; >import std.string; >import std.date; > >/* > Compute fibonacci recursively, in an OO-way , > in D ( http://www.digitalmars.com/d/index.html ) >*/ > >class Fib { > int fibr(int n) { > if (n <= 2) return(1); > else return fibr(n-1)+fibr(n-2); > } >} > >int main (char[][] args) { > if (args.length<3) { > printf ("fib <num> <loopCount>"); > return -1; > } > int num = atoi (args[1]); > int loopCount = atoi (args[2]); > Fib f = new Fib(); > int result; > d_time theStart = getUTCtime(); > for (int i = 0; i < loopCount; i++) { > result = f.fibr(num); > } > d_time theEnd = getUTCtime(); > float totalTimeInSeconds = (theEnd - theStart) / TicksPerSecond; > printf("fibOO(%i)=%i (%i times in %f s)", num, result, loopCount, >totalTimeInSeconds); > return 0; >} >---------------------- cut here -------------------------------- > > |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manfred Nowak | In article <d3kjd9$2sc7$1@digitaldaemon.com>, Manfred Nowak says... > >Sean Kelly <sean@f4.ca> wrote: > >[...] >> Why in the world >> is declaring the method "final" doubling execution time? > >By a mistake of yours? > >My test did not confirm your result, because declaring `fibr' >`final' caused a significant drop of >11% execution time compiled by dmd 0.120 from 2.422s to 2.156s >15% execution time compiled by gdc 0.10 from 1.953s to 1.656s > >-manfred What compiler options? W/o 'final' or 'private' DMD is the same as GDC (well, 5% diff, not 25% like your results). W/ final or private, DMD is actually a tad bit faster. - Dave |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | Dave <Dave_member@pathlink.com> wrote in news:d3kn2c$2ul7$1@digitaldaemon.com: > > How were they compiled? D: I had used just -O SmartEiffel: with this BAT: rem Don't forget to activate the MS tools first: rem "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin \vcvars32.bat" compile -boost -no_gc -relax -no_split fiboo > DMD should have been: dmd -O -inline -release <file_name> With -O -inline -release Now I get this: [OO] dmd -O -inline -release fibOO.d C:\dmd\bin\..\..\dm\bin\link.exe fibOO,,,user32+kernel32/noi; fibOO 12 1000000 fibOO(12)=144 (1000000 times in 4.000000 s) [non-OO] dmd -O -inline -release fib.d C:\dmd\bin\..\..\dm\bin\link.exe fib,,,user32+kernel32/noi; fib 12 1000000 fib(12)=144 (1000000 times in 2.000000 s) D's OO version is still a bit slower than SmartEiffel: fibOO 12 1000000 Fibonacci (recursive, OO) in SmartEiffel 1000000 times: fibOO(12)= 144 time (s)= 3.766000 And still a bit slower than Java (source at the end): java FibOO 12 1000000 Fibonacci (recursive, OO) in Java 1000000 times: fibOO(12)= 144 time(ms)= 3641 I am guessing that SmartEiffel is doing its awesome compile-time optimizations of polymorphic calls (when they are not really polymorphic), and Java is doing its smart JIT thing. Question is: shouldn't D be as fast as (or faster than) SmartEiffel ? marcio --------- public class FibOO { // Fibonacci benchmark done in an OO way public int fibr(int n) { if (n <= 2) return 1; else return fibr(n - 1) + fibr(n - 2); } public static void main(String[] args) { System.out.println("Fibonacci (recursive, OO) in Java"); int n = Integer.parseInt(args[0]); int count = Integer.parseInt(args[1]); FibOO fibOO= new FibOO(); int fib = 0; long start = System.currentTimeMillis(); for (int i = 0 ; i < count ; i++) { fib = fibOO.fibr(n); } long stop = System.currentTimeMillis(); System.out.println(" " + count + " times: fibOO(" + n + ")= " + fib); System.out.println(" time(ms)= " + (stop - start)); } } |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marcio | Yes - there does seem to be a penalty for OO in any case for this test. The reason is probably because all methods are considered virtual in D by default. One of the things pointed out by others in this thread is that "finalizing" the class or method gets rid of some of this penalty and makes up the difference between D and either SE or Java. For example final class Fib { ... } or final fibr(...) IIRC, it's been a goal stated in the past for the compiler to "auto-finalize" if it can. Because of the language design this is something D compilers can readily do, even across source files. But it hasn't been implemented yet - the D compiler is not at v1.0 yet (but that's not to imply that v1.0 will do auto-finalization either - I don't know). If you're interested in taking a look at how D does vs. other languages in general, the best source right now is probably: http://shootout.alioth.debian.org/great/benchmark.php?test=all&lang=all&sort=fullcpu Thanks, - Dave "Marcio" <mqmnews321@sglebs.com> wrote in message news:Xns96385C92189CDmyidtoken@63.105.9.61... > Dave <Dave_member@pathlink.com> wrote in news:d3kn2c$2ul7$1@digitaldaemon.com: > >> >> How were they compiled? > > D: I had used just -O > > SmartEiffel: with this BAT: > > rem Don't forget to activate the MS tools first: > rem "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin > \vcvars32.bat" > > compile -boost -no_gc -relax -no_split fiboo > > > >> DMD should have been: dmd -O -inline -release <file_name> > > > With -O -inline -release Now I get this: > > [OO] > dmd -O -inline -release fibOO.d > C:\dmd\bin\..\..\dm\bin\link.exe fibOO,,,user32+kernel32/noi; > > fibOO 12 1000000 > fibOO(12)=144 (1000000 times in 4.000000 s) > > > [non-OO] > dmd -O -inline -release fib.d > C:\dmd\bin\..\..\dm\bin\link.exe fib,,,user32+kernel32/noi; > > fib 12 1000000 > fib(12)=144 (1000000 times in 2.000000 s) > > > D's OO version is still a bit slower than SmartEiffel: > > fibOO 12 1000000 > Fibonacci (recursive, OO) in SmartEiffel > 1000000 times: fibOO(12)= 144 > time (s)= 3.766000 > > > > And still a bit slower than Java (source at the end): > > java FibOO 12 1000000 > Fibonacci (recursive, OO) in Java > 1000000 times: fibOO(12)= 144 > time(ms)= 3641 > > > I am guessing that SmartEiffel is doing its awesome compile-time optimizations of polymorphic calls (when they are not really polymorphic), and Java is doing its smart JIT thing. > > Question is: shouldn't D be as fast as (or faster than) SmartEiffel ? > > marcio > --------- > > public class FibOO { > // Fibonacci benchmark done in an OO way > > public int fibr(int n) { > if (n <= 2) > return 1; > else > return fibr(n - 1) + fibr(n - 2); > } > > public static void main(String[] args) { > System.out.println("Fibonacci (recursive, OO) in Java"); > int n = Integer.parseInt(args[0]); > int count = Integer.parseInt(args[1]); > FibOO fibOO= new FibOO(); > int fib = 0; > long start = System.currentTimeMillis(); > for (int i = 0 ; i < count ; i++) { > fib = fibOO.fibr(n); > } > long stop = System.currentTimeMillis(); > System.out.println(" " + count + " times: fibOO(" + n + ")= " + fib); > System.out.println(" time(ms)= " + (stop - start)); > } > } |
April 14, 2005 Re: speed of polymorphic calls | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | "Dave" <Dave_member@pathlink.com> wrote in news:d3m1oa$15de$1@digitaldaemon.com: > IIRC, it's been a goal stated in the past for the compiler to "auto-finalize" if it can. Because of the language design this is something D compilers can readily do, even across source files. But it hasn't been implemented yet - the D compiler is not at v1.0 yet (but that's not to imply that v1.0 will do auto-finalization either - I don't know). It will be interesting to see when it gets added. I also wonder if global analysis & this optimization can be done at all if you allow the system to be extended via "OO DLLs". I mean components written in your OO language which can be linked in at runtime. For example, OTI/IBM Smalltalk's ICX stuff, or Java JARs etc. Does D support this ? Does it plan to ? Interestingly enough, SmartEiffel does not support this feature, but only a "monolythic app". It will be interesting to see what D will be capable of doing in this area... speed versus runtime OO extensions... Will it be able to do both ? marcio |
Copyright © 1999-2021 by the D Language Foundation