On 13 April 2012 15:53, Kagamin <spam@here.lot> wrote:
On Sunday, 8 April 2012 at 12:02:10 UTC, Alex Rønne Petersen wrote:
This sounds important to me. If it is also possible to do the work with
generated tables, and not calling thousands of indirect functions in
someone's implementation, it would be nice to reserve that possibility.
Indirect function calls in hot loops make me very nervous for non-x86
machines.

Yes, I agree here. The last thing we need is a huge amount of kinda-sorta-virtual function calls on ARM, MIPS, etc. It may work fine on x86, but anywhere else, it's really not what you want in a GC.
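
To make the contrast above concrete, here is a minimal sketch of the two scanning styles, written in C++ for illustration. Everything in it (`TypeLayout`, `scanWithTable`, `scanNode`, the `Node` type) is a hypothetical name invented for this example and is not taken from any D runtime or GC proposal. It also assumes all data pointers share one representation, which holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical generated table: where the pointer fields of a type live.
struct TypeLayout {
    std::size_t size;                     // object size in bytes
    std::vector<std::size_t> ptrOffsets;  // byte offsets of pointer fields
};

// Table-driven scan: a plain loop over data, no indirect call per object.
inline void scanWithTable(const void* obj, const TypeLayout& layout,
                          std::vector<const void*>& out) {
    const char* base = static_cast<const char*>(obj);
    for (std::size_t off : layout.ptrOffsets) {
        assert(off + sizeof(void*) <= layout.size);
        const void* p;
        std::memcpy(&p, base + off, sizeof p);  // assumes uniform pointer representation
        if (p) out.push_back(p);
    }
}

// Callback-driven scan: the collector must call through a function pointer
// for every object, and the target is unknown until run time.
using ScanFn = void (*)(const void* obj, std::vector<const void*>& out);

// Example type with two pointer fields.
struct Node {
    int value;
    Node* left;
    Node* right;
};

// The table a compiler could generate for Node (written by hand here).
inline TypeLayout nodeLayout() {
    return TypeLayout{sizeof(Node), {offsetof(Node, left), offsetof(Node, right)}};
}

// The per-type function a callback-based design would need instead.
inline void scanNode(const void* obj, std::vector<const void*>& out) {
    const Node* n = static_cast<const Node*>(obj);
    if (n->left) out.push_back(n->left);
    if (n->right) out.push_back(n->right);
}
```

Both versions find the same pointers. The difference is that the table version keeps the hot loop free of indirect branches, while the callback version pays for one per scanned object.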

What's the problem with virtual calls on ARM?

No other processors have branch prediction units anywhere near as sophisticated as those on modern x86. On those machines, any call through a function pointer stalls the pipeline, and pipelines are getting longer all the time. PPC has even more associated costs and hazards.
Most processors can only perform trivial binary branch prediction around an 'if'.
It also places a burden on the icache (which can't prefetch an unknown target) and, of course, on the dcache, both of which are much less sophisticated than on x86 as well.
The compiler can't do anything about code locality (which improves icache usage), since the target is unknown at compile time. There are also pipeline stalls introduced by the sequence of indirect pointer lookups preceding any virtual call.
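
The "sequence of indirect pointer lookups" can be spelled out by hand. The sketch below models virtual dispatch with an explicit table in C++. The names (`Shape`, `ShapeVTable`, `areaIndirect`, `areaDirect`) are made up for illustration, and real vtable layouts are ABI-specific, so treat this as an approximation of what a typical implementation does rather than a description of any particular compiler.

```cpp
#include <cassert>

struct Shape;

// One slot per "virtual" function.
struct ShapeVTable {
    int (*area)(const Shape*);
};

struct Shape {
    const ShapeVTable* vptr;  // every object carries a pointer to its table
    int w;
    int h;
};

inline int rectArea(const Shape* s) { return s->w * s->h; }
inline int triArea(const Shape* s)  { return s->w * s->h / 2; }

static const ShapeVTable kRectVTable = {&rectArea};
static const ShapeVTable kTriVTable  = {&triArea};

// Virtual-style call: two dependent loads, then an indirect branch.
// The CPU cannot know the target until both loads have completed.
inline int areaIndirect(const Shape* s) {
    const ShapeVTable* vt = s->vptr;       // load 1: object -> table
    int (*fn)(const Shape*) = vt->area;    // load 2: table -> function address
    return fn(s);                          // indirect branch to unknown target
}

// Final-style call: the target is fixed at compile time, so the compiler
// is free to inline it and place the code next to the caller.
inline int areaDirect(const Shape* s) {
    return rectArea(s);
}
```

With a `Shape` of width 4 and height 3, both `areaIndirect` and `areaDirect` compute 12 for the rectangle table. Only the indirect version has to wait on memory before it knows where to jump.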
Virtuals are possibly the worst hazard to modern CPUs, and the hardest to detect or profile: because their cost is spread evenly throughout the entire code base, you can never gauge their true impact on your performance. You also can't easily measure the effect of icache misses on your code; suffice it to say, you will have MANY more in virtual-heavy code.

While I'm at it: 'final:' and a 'virtual' keyword, please ;)