Thread overview
Efficient dynamic dispatch without virtual function tables: the SmallEiffel compiler
Mar 07, 2008
Craig Black
Mar 07, 2008
Craig Black
Mar 07, 2008
Robert Fraser
Mar 07, 2008
Kris
Mar 07, 2008
Craig Black
Mar 07, 2008
bearophile
Mar 07, 2008
Craig Black
Mar 08, 2008
Robert Fraser
March 07, 2008
I found this article on the internet.  It concurs with my theories about the overhead of virtual function calls.  Here's the abstract:

SmallEiffel is an Eiffel compiler which uses a fast simple type inference
mechanism to remove most late binding calls, replacing them by static
bindings. Starting from the system's entry point, it compiles only
statically living code, which saves compiling and then removing dead code.
As the whole system is analyzed at compile time, multiple inheritance and
genericity do not cause any overhead. SmallEiffel features a coding scheme
which eliminates the need for virtual function tables. Dynamic dispatch is
implemented without any array access but uses a simple static binary branch
code. We show that this implementation makes it possible to use modern
hardware very efficiently. It also allows us to inline more calls even when
dynamic dispatch is required. Some more dispatch sites are removed after the
type inference algorithm has been performed, if the different branches of a
dispatch site lead to the same code. The advantage of this approach is that
it greatly speeds up execution time and considerably decreases the amount of
generated code.
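Roughly, the "simple static binary branch code" the abstract describes could look like the sketch below. The type ids, class names, and functions are an invented illustration, not SmallEiffel's actual output: the compiler assigns each concrete type an integer id and replaces the vtable lookup at each dispatch site with a small binary search over those ids, ending in a direct (and therefore inlinable) call.

```cpp
#include <cassert>

// Hypothetical type ids a whole-program compiler might assign.
enum TypeId { ID_TRIANGLE = 1, ID_SQUARE = 2, ID_CIRCLE = 3 };

// No vtable pointer: just a plain type tag plus data.
struct Shape { TypeId id; double size; };

// One statically bound implementation per concrete type.
double area_triangle(const Shape& s) { return 0.5 * s.size * s.size; }
double area_square(const Shape& s)   { return s.size * s.size; }
double area_circle(const Shape& s)   { return 3.14159265 * s.size * s.size; }

// The dispatch site: a static binary branch over the type id instead of
// an indirect call through a function-pointer table.
double area(const Shape& s) {
    if (s.id <= ID_SQUARE) {
        if (s.id == ID_TRIANGLE) return area_triangle(s);
        return area_square(s);
    }
    return area_circle(s);
}
```

Because every branch ends in a direct call, the hardware's branch predictor can do its job and the compiler is free to inline any of the targets, which an indirect vtable call defeats.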
-Craig




March 07, 2008
BTW, I am VERY interested in this stuff.  I am already kludging my C++ code to improve performance by reducing the number of virtual function calls performed.  (The most common functions get called non-virtually, and the rest are called virtually.)  If this feature were built into a compiler, it would greatly simplify things.  It would allow me to use virtual functions more freely without worrying so much about the overhead involved.
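The kludge described above usually amounts to manual devirtualization: test for the common concrete type once and make a qualified, non-virtual call, falling back to the vtable for everything else. A minimal sketch (class names are hypothetical, not from any real codebase):

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int work() { return 0; }
};

struct Common : Base {
    int work() { return 1; }  // the override that is called most often
};

// Check for the frequent type, then call it non-virtually: the
// Common:: qualification suppresses the vtable lookup, so the call
// can be inlined. Rare types still dispatch through the vtable.
int call_work(Base* p) {
    if (Common* c = dynamic_cast<Common*>(p))
        return c->Common::work();  // direct, inlinable call
    return p->work();              // ordinary virtual call
}
```

In practice a cached type tag is often used instead of `dynamic_cast`, since the cast itself has a cost; the structure of the optimization is the same either way.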

Another idea:  a feature like this would probably require each function in the virtual hierarchy to be assigned a precedence.  The precedence would represent how often the function is expected to be called.  Functions with a higher precedence could be optimized to be called directly, whereas lower-precedence functions could be called via a vtable.  The precedence could be determined by profiling the code or by specifying it explicitly.  Perhaps something like:

class A
{
public:
   virtual void func() { ... }
};

class B : public A
{
public:
   precedence(1) void func() { ... }
};

class C : public A
{
public:
   precedence(2) void func() { ... }
};
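One way a compiler might use those precedence hints is to order the type tests at each call site, checking the hottest override first and leaving the coldest case as the fallback. A hand-written sketch of what the generated dispatch could look like (the type ids and free functions are invented for illustration; `precedence` is not real C++):

```cpp
#include <cassert>

enum TypeId { ID_A, ID_B, ID_C };

struct Obj { TypeId id; };

// One statically bound body per class in the hierarchy.
int func_A(Obj&) { return 0; }
int func_B(Obj&) { return 1; }
int func_C(Obj&) { return 2; }

// Hypothetical compiler output for "obj.func()": B has precedence 1,
// so its test comes first and is the well-predicted fast path; C
// (precedence 2) is second; the base class body is the fallback.
int dispatch_func(Obj& obj) {
    if (obj.id == ID_B) return func_B(obj);
    if (obj.id == ID_C) return func_C(obj);
    return func_A(obj);
}
```

Profile-guided optimization in today's compilers does something closely related (speculative devirtualization ordered by observed frequency), just without an explicit annotation.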

-Craig


March 07, 2008
Craig Black wrote:
> I found this article on the internet.  It concurs with my theories about the overhead of virtual function calls.  Here's the abstract:
> 
> SmallEiffel is an Eiffel compiler which uses a fast simple type inference mechanism to remove most late binding calls, replacing them by static bindings. Starting from the system's entry point, it compiles only statically living code, which saves compiling and then removing dead code. As the whole system is analyzed at compile time, multiple inheritance and genericity do not cause any overhead. SmallEiffel features a coding scheme which eliminates the need for virtual function tables. Dynamic dispatch is implemented without any array access but uses a simple static binary branch code. We show that this implementation makes it possible to use modern hardware very efficiently. It also allows us to inline more calls even when dynamic dispatch is required. Some more dispatch sites are removed after the type inference algorithm has been performed, if the different branches of a dispatch site lead to the same code. The advantage of this approach is that it greatly speeds up execution time and considerably decreases the amount of generated code.
> -Craig

This has been something I always wondered whether it was possible or not, and I guess it is ;-P. I would *love* to see this in D.

March 07, 2008
"Craig Black" <cblack@ara.com> wrote in message news:fqrt7n$1aj2$1@digitalmars.com...
>I found this article on the internet.  It concurs with my theories about the overhead of virtual function calls.  Here's the abstract:
>
> SmallEiffel is an Eiffel compiler which uses a fast simple type inference
> mechanism to remove most late binding calls, replacing them by static
> bindings. Starting from the system's entry point, it compiles only
> statically living code, which saves compiling and then removing dead code.
> As the whole system is analyzed at compile time, multiple inheritance and
> genericity do not cause any overhead. SmallEiffel features a coding scheme
> which eliminates the need for virtual function tables. Dynamic dispatch is
> implemented without any array access but uses a simple static binary
> branch code. We show that this implementation makes it possible to use
> modern hardware very efficiently. It also allows us to inline more calls
> even when dynamic dispatch is required. Some more dispatch sites are
> removed after the type inference algorithm has been performed, if the
> different branches of a dispatch site lead to the same code. The advantage
> of this approach is that it greatly speeds up execution time and
> considerably decreases the amount of generated code.
> -Craig


We've seen the results of this quite clearly in D, where XML DOM parsing is actually faster than SAX parsing (which should cause some folks to do a double-take about now). The reason is that the DOM parser avoids vcalls, whereas the SAX parser currently does not. That overhead is sufficient to offset all the extra work performed by the DOM parser. Yeah, this is perhaps a special case, but an interesting one all the same.


March 07, 2008
"Kris" <foo@bar.com> wrote in message news:fqrvp5$1iu8$1@digitalmars.com...
>
> "Craig Black" <cblack@ara.com> wrote in message news:fqrt7n$1aj2$1@digitalmars.com...
>>I found this article on the internet.  It concurs with my theories about the overhead of virtual function calls.  Here's the abstract:
>>
>> SmallEiffel is an Eiffel compiler which uses a fast simple type inference mechanism to remove most late binding calls, replacing them by static bindings. Starting from the system's entry point, it compiles only statically living code, which saves compiling and then removing dead code. As the whole system is analyzed at compile time, multiple inheritance and genericity do not cause any overhead. SmallEiffel features a coding scheme which eliminates the need for virtual function tables. Dynamic dispatch is implemented without any array access but uses a simple static binary branch code. We show that this implementation makes it possible to use modern hardware very efficiently. It also allows us to inline more calls even when dynamic dispatch is required. Some more dispatch sites are removed after the type inference algorithm has been performed, if the different branches of a dispatch site lead to the same code. The advantage of this approach is that it greatly speeds up execution time and considerably decreases the amount of generated code. -Craig
>
>
> We've seen the results of this quite clearly in D, where XML DOM parsing is actually faster than SAX parsing (which should cause some folks to do a double-take about now). The reason is that the DOM parser avoids vcalls, whereas the SAX parser currently does not. That overhead is sufficient to offset all the extra work performed by the DOM parser. Yeah, this is perhaps a special case, but an interesting one all the same.

Interesting case.  I would disagree that this is a special case.  Virtual function calls are a common bottleneck in object-oriented code.


March 07, 2008
Craig Black:
>Some more dispatch sites are removed after the type inference algorithm has been performed,

ShedSkin has one of the most powerful type inference engines around, but it gets rather slow. So the things you describe probably come at a cost in compile time too.


> Virtual function calls are a common bottleneck in object-oriented code.

Do you have any statistics to support what you say here? I have read an article that says that in typical OOP C++ code virtual calls don't slow things down by much:
"The Direct Cost of Virtual Function Calls in C++" (1996, oldish):
http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf

Bye,
bearophile
March 07, 2008
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:fqs79a$26cq$1@digitalmars.com...
> Craig Black:
>>Some more dispatch sites are removed after the type inference algorithm has been performed,
>
> ShedSkin has one of the most powerful type inference engines around, but it gets rather slow. So the things you describe probably come at a cost in compile time too.
>
>
>> Virtual function calls are a common bottleneck in object-oriented code.
>
> Do you have any statistics to support what you say here? I have read an
> article that says that in typical OOP C++ code virtual calls don't slow
> things down by much:
> "The Direct Cost of Virtual Function Calls in C++" (1996, oldish):
> http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf
>
> Bye,
> bearophile

As far as hardware is concerned, the longer the pipelines, the worse the penalty for virtual functions.  And modern Intel CPUs have incredibly long pipelines.  I do a lot of OOP and know from experience how bad this bottleneck can become.  Reducing the number of virtual functions for optimal performance is a big pain in the ass.

Here's an article that talks about reducing virtual function calls in C++. It talks specifically about doing this for game AI.

http://aigamedev.com/programming-tips/optimize-virtual-functions

-Craig


March 08, 2008
bearophile wrote:
> ShedSkin has one of the most powerful type inference engines around, but it gets rather slow. So the things you say too probably require a cost in compile time.

A fast compiler is rather necessary for debug/ad-hoc builds, but for release builds (where optimization is turned on), compile time is much less of an issue (unless you're trying to profile something and making frequent changes...). In other words: as long as it's a compiler option, this isn't a problem at all.