On 10 January 2012 08:09, Martin Nowak <dawg@dawgfoto.de> wrote:

On 07.01.2012 at 21:44, Piotr Szturmaj <bncrbme@jadamspam.pl> wrote:

The idea is to have versions of code that depend on the environment and never change during runtime, _without_ resorting to if statements. Such a runtime version statement would be valid only inside function bodies.

Examples of such versions may be:
* supported SIMD CPU extensions MMX, SSE, SSE2, etc.
* AMD vs Intel CPU, to use instructions that are not available on both
* different OS versions (XP/Vista/7, Linux kernel versions)

Why this instead of the current if statement?
* some additional speed, since it avoids repeated checks in frequent operations
* making specific executables (e.g. SSE4 only) by limiting the set of supported runtime options at compile time

Code example:

void main()
{
    version(rt_SSE4)
    {
        ...
    }
    else version(rt_SSE2)
    {
        ...
    }
    else
    {
        // portable code
    }
}

In this example, the program checks the supported extensions only once, before calling main(). Then it modifies the function code so that only the matching versions are executed.

Runtime version identifiers may be set inside shared static constructors of modules (this implies that rt-versions may not be used inside them). SIMD extensions would preferably be set by druntime with the help of core.cpuid.
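For comparison, this is roughly what can be written today with plain flags and if statements. It is only a sketch: rtSSE4, rtSSE2 and selectedPath are hypothetical names standing in for the proposed rt_SSE4 / rt_SSE2 identifiers, and the only library calls assumed are the sse2 and sse41 properties of core.cpuid.

```d
import core.cpuid : sse2, sse41;

// Hypothetical stand-ins for the proposed rt_SSE4 / rt_SSE2 identifiers.
__gshared bool rtSSE4;
__gshared bool rtSSE2;

shared static this()
{
    // Runs once, before main(). The flags never change afterwards.
    rtSSE4 = sse41;
    rtSSE2 = sse2;
}

// Returns the name of the code path an if-chain would pick on this machine.
string selectedPath()
{
    if (rtSSE4)
        return "SSE4";
    else if (rtSSE2)
        return "SSE2";
    else
        return "portable";
}
```

The proposal would replace the if-chain in selectedPath with rt-version blocks, so that the check itself disappears from the generated code.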

The code modification mechanism is up to the implementation. One that comes to mind is having the compiler insert unconditional jumps and then fixing them up before calling main().

An additional advantage is the possibility of generating executables for particular environments. This may help reduce executable size when targeting a specific CPU, especially on a constrained/embedded system. Many cpuid checks inside druntime could also be avoided.

Just thinking out loud :)

Because it could only patch non-inlined code, you might as well
use lazy binding with thunks.

// use a thread-local static instead of __gshared to make it re-entrant safe
__gshared R function(Args) doSomething = &setThunk;

// first call lands here, rebinds the pointer, then forwards the call
R setThunk(Args args)
{
    if (sse4)
    {
        doSomething = &sse4Impl;
    }
    else if (sse2)
    {
        doSomething = &sse2Impl;
    }
    else
    {
        doSomething = &nativeImpl;
    }

    return doSomething(args);
}
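A concrete version of the thunk might look like the sketch below. The three sum functions are hypothetical placeholders with identical bodies (real ones would contain SIMD code), and fastSum / bindFastSum are made-up names; only core.cpuid's sse2 and sse41 properties are assumed to exist.

```d
import core.cpuid : sse2, sse41;

// Hypothetical stand-ins for real SIMD kernels; all three compute the same sum.
int sse4Sum(const(int)[] a) { int s; foreach (x; a) s += x; return s; }
int sse2Sum(const(int)[] a) { int s; foreach (x; a) s += x; return s; }
int portableSum(const(int)[] a) { int s; foreach (x; a) s += x; return s; }

// Starts out pointing at the thunk; the first call rebinds it.
__gshared int function(const(int)[]) fastSum = &bindFastSum;

int bindFastSum(const(int)[] a)
{
    if (sse41)
        fastSum = &sse4Sum;
    else if (sse2)
        fastSum = &sse2Sum;
    else
        fastSum = &portableSum;

    // Forward the original call to the implementation just selected.
    return fastSum(a);
}
```

Callers simply write fastSum(data). The first call pays for the CPU check, and every later call is a single indirect call through the pointer.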

Much simpler, thread safe and more efficient.

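import std.typecons : tuple;

// one table of function pointers per instruction set;
// named fields are needed for _impl.foo to work
__gshared SSE2 = tuple!("foo", "bar")(&sse2Foo, &sse2Bar);
__gshared SSE4 = tuple!("foo", "bar")(&sse4Foo, &sse4Bar);
// assumed portable fallbacks, named by analogy with the tables above
__gshared Native = tuple!("foo", "bar")(&nativeFoo, &nativeBar);
__gshared typeof(SSE2)* _impl;

shared static this()
{
    if (sse4)
    {
        _impl = &SSE4;
    }
    else if (sse2)
    {
        _impl = &SSE2;
    }
    else
    {
        _impl = &Native;
    }
}

// call sites go through the table chosen at startup
_impl.foo(args);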
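
The same table idea can also be written with a plain struct of function pointers instead of a tuple. This is only a sketch: Impl, the foo/bar signatures, the placeholder function bodies and the table names are all made up for illustration, and core.cpuid's sse2 and sse41 are the only library calls assumed.

```d
import core.cpuid : sse2, sse41;

// Hypothetical dispatch table: one function pointer per operation.
struct Impl
{
    int function(int) foo;
    int function(int, int) bar;
}

// Placeholder implementations; real ones would differ per instruction set.
int sse4Foo(int x) { return x * 2; }
int sse4Bar(int a, int b) { return a + b; }
int sse2Foo(int x) { return x * 2; }
int sse2Bar(int a, int b) { return a + b; }
int portableFoo(int x) { return x * 2; }
int portableBar(int a, int b) { return a + b; }

__gshared Impl sse4Table = Impl(&sse4Foo, &sse4Bar);
__gshared Impl sse2Table = Impl(&sse2Foo, &sse2Bar);
__gshared Impl portableTable = Impl(&portableFoo, &portableBar);
__gshared Impl* impl;

shared static this()
{
    // Pick the table once, before main(); it is never changed afterwards.
    if (sse41)
        impl = &sse4Table;
    else if (sse2)
        impl = &sse2Table;
    else
        impl = &portableTable;
}
```

Call sites then read impl.foo(x) or impl.bar(a, b). Compared with the per-function thunk, all operations are switched together in one place.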