July 29, 2012
std.variant is so incredibly slow! It's practically unusable for anything,
which requires even a tiny bit of performance.
Benchmark done under -noboundscheck -inline -release -O:

import std.stdio;
import std.variant;
import std.datetime;

void on()
{
    auto var = Variant(5);
    int i = var.get!int;
}

void off()
{
    auto var = 5;
    int i = var;
}

void main()
{
    writeln(benchmark!(on, off)(100000));
}

The result is:

[TickDuration(25094), TickDuration(98)]

There are tons of cases where simple typeless data storage is necessary: no type information, no type checking, just low-level storage upon which Variant and other dynamic-type constructs can be built.

I want to ask the community: what's the best way to store any variable in typeless storage and get a reference back to it, given its static type, with no type checking and minimal overhead compared to statically typed storage, while keeping the storage type-safe (postblits, garbage collection...)?

-- 
Bye,
Gor Gyolchanyan.
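
A minimal sketch of the kind of storage asked about above, under stated assumptions rather than as a definitive design: `RawStorage`, `put`, `get`, `clear`, and `Tracked` are hypothetical names, not part of Phobos. The sketch assumes a fixed capacity, 16-byte alignment, and a non-copyable container (copying would need a stored postblit hook as well). Retrieval does no type checking at all, while the destructor of the stored value is still run through a type-erased function pointer.

```d
import std.conv : emplace;
import std.stdio : writeln;

// Hypothetical typeless storage; capacity and alignment are assumptions.
struct RawStorage(size_t capacity)
{
    // void[] (rather than ubyte[]) so the GC treats the bytes as possible pointers.
    align(16) void[capacity] data;
    // Type-erased destructor of whatever is currently stored (null if empty).
    void function(void* p) destroyer;

    // Copying the container is disabled in this sketch.
    @disable this(this);

    void put(T)(T value)
    {
        static assert(T.sizeof <= capacity, "value does not fit");
        static assert(T.alignof <= 16, "value is over-aligned");
        clear();                                  // destroy the previous value
        emplace(cast(T*) data.ptr, value);        // copy-construct (runs postblit)
        destroyer = function(void* p) { destroy(*cast(T*) p); };
    }

    // Unchecked access: the caller must name the type that was stored.
    ref T get(T)()
    {
        return *cast(T*) data.ptr;
    }

    void clear()
    {
        if (destroyer !is null)
        {
            destroyer(data.ptr);
            destroyer = null;
        }
    }

    ~this() { clear(); }
}

// Helper type that counts postblits and destructor calls.
struct Tracked
{
    static int copies, destroyed;
    int id;
    this(this) { ++copies; }
    ~this() { ++destroyed; }
}

void main()
{
    RawStorage!32 s;
    s.put(42);
    writeln(s.get!int());
    s.put(Tracked(7));
    writeln(s.get!Tracked().id);
}
```

Calling `get` with a type other than the one last stored is undefined behaviour in this sketch; that is the price of dropping the type check.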


July 29, 2012
On 7/29/12 8:17 AM, Gor Gyolchanyan wrote:
> std.variant is so incredibly slow! It's practically unusable for
> anything, which requires even a tiny bit of performance.

You do realize you actually benchmark against a function that does nothing, right? Clearly there are ways in which we can improve std.variant to the point initialization costs assignment of two words, but this benchmark doesn't help. (Incidentally I just prepared a class at C++ and Beyond on benchmarking, and this benchmark makes a lot of the mistakes described therein...)


Andrei
July 29, 2012
On Sun, Jul 29, 2012 at 6:17 PM, Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org> wrote:

> On 7/29/12 8:17 AM, Gor Gyolchanyan wrote:
>
>> std.variant is so incredibly slow! It's practically unusable for anything, which requires even a tiny bit of performance.
>>
>
> You do realize you actually benchmark against a function that does nothing, right? Clearly there are ways in which we can improve std.variant to the point initialization costs assignment of two words, but this benchmark doesn't help. (Incidentally I just prepared a class at C++ and Beyond on benchmarking, and this benchmark makes a lot of the mistakes described therein...)
>
>
> Andrei
>

I do compare it with nothing, just to see how many times slower it is than
statically typed storage. The point is that Variant is extremely
slow.
All I want is to find out how to implement a very fast typeless storage
with maximum performance and type safety.

-- 
Bye,
Gor Gyolchanyan.


July 29, 2012
> I do compare it with nothing, just to see how many times slower it is than
> statically typed storage. The point is that Variant is extremely
> slow.
> All I want is to find out how to implement a very fast typeless storage
> with maximum performance and type safety.

Isn't that a paradox? What kind of type safety do you expect from a typeless storage?
July 29, 2012
On 29-Jul-12 18:17, Andrei Alexandrescu wrote:
> On 7/29/12 8:17 AM, Gor Gyolchanyan wrote:
>> std.variant is so incredibly slow! It's practically unusable for
>> anything, which requires even a tiny bit of performance.
>
> You do realize you actually benchmark against a function that does
> nothing, right? Clearly there are ways in which we can improve
> std.variant to the point initialization costs assignment of two words,
> but this benchmark doesn't help. (Incidentally I just prepared a class
> at C++ and Beyond on benchmarking, and this benchmark makes a lot of the
> mistakes described therein...)
>
>
> Andrei


This should be more relevant then:

//fib.d
import std.datetime, std.stdio, std.variant;

auto fib(Int)()
{
	Int a = 1, b = 1;
	for(size_t i=0; i<100; i++){
		Int c = a + b;
		a = b;
		b = c;
	}
	return a;	
}

void main()
{
	writeln(benchmark!(fib!int, fib!long, fib!Variant)(10_000));
}


dmd -O -inline -release fib.d

Output:

[TickDuration(197), TickDuration(276), TickDuration(93370107)]

I'm horrified. Who was working on std.variant enhancements? Please chime in.

-- 
Dmitry Olshansky
July 29, 2012
On Sun, Jul 29, 2012 at 6:43 PM, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> On 29-Jul-12 18:17, Andrei Alexandrescu wrote:
>
>> On 7/29/12 8:17 AM, Gor Gyolchanyan wrote:
>>
>>> std.variant is so incredibly slow! It's practically unusable for anything, which requires even a tiny bit of performance.
>>>
>>
>> You do realize you actually benchmark against a function that does nothing, right? Clearly there are ways in which we can improve std.variant to the point initialization costs assignment of two words, but this benchmark doesn't help. (Incidentally I just prepared a class at C++ and Beyond on benchmarking, and this benchmark makes a lot of the mistakes described therein...)
>>
>>
>> Andrei
>>
>
>
> This should be more relevant then:
>
> //fib.d
> import std.datetime, std.stdio, std.variant;
>
> auto fib(Int)()
> {
>         Int a = 1, b = 1;
>         for(size_t i=0; i<100; i++){
>                 Int c = a + b;
>                 a = b;
>                 b = c;
>         }
>         return a;
> }
>
> void main()
> {
>         writeln(benchmark!(fib!int, fib!long, fib!Variant)(10_000));
> }
>
>
> dmd -O -inline -release fib.d
>
> Output:
>
> [TickDuration(197), TickDuration(276), TickDuration(93370107)]
>
> I'm horrified. Who was working on std.variant enhancements? Please chime in.
>
> --
> Dmitry Olshansky
>

Thank you for demonstrating my point. :-)

-- 
Bye,
Gor Gyolchanyan.


July 29, 2012
On Sun, Jul 29, 2012 at 6:41 PM, Tobias Pankrath <tobias@pankrath.net> wrote:

>> I do compare it with nothing, just to see how many times slower it is than
>> statically typed storage. The point is that Variant is extremely
>> slow.
>> All I want is to find out how to implement a very fast typeless storage
>> with maximum performance and type safety.
>>
>
> Isn't that a paradox? What kind of type safety do you expect from a typeless storage?
>

The kind that doesn't forget about postblits and destructors. This isn't a paradox.

-- 
Bye,
Gor Gyolchanyan.


July 29, 2012
On Sunday, 29 July 2012 at 14:43:09 UTC, Dmitry Olshansky wrote:
> On 29-Jul-12 18:17, Andrei Alexandrescu wrote:
>> On 7/29/12 8:17 AM, Gor Gyolchanyan wrote:
>>> std.variant is so incredibly slow! It's practically unusable for
>>> anything, which requires even a tiny bit of performance.
>>
>> You do realize you actually benchmark against a function that does
>> nothing, right? Clearly there are ways in which we can improve
>> std.variant to the point initialization costs assignment of two words,
>> but this benchmark doesn't help. (Incidentally I just prepared a class
>> at C++ and Beyond on benchmarking, and this benchmark makes a lot of the
>> mistakes described therein...)
>>
>>
>> Andrei
>
>
> This should be more relevant then:
>
> //fib.d
> import std.datetime, std.stdio, std.variant;
>
> auto fib(Int)()
> {
> 	Int a = 1, b = 1;
> 	for(size_t i=0; i<100; i++){
> 		Int c = a + b;
> 		a = b;
> 		b = c;
> 	}
> 	return a;	
> }
>
> void main()
> {
> 	writeln(benchmark!(fib!int, fib!long, fib!Variant)(10_000));
> }
>
>
> dmd -O -inline -release fib.d
>
> Output:
>
> [TickDuration(197), TickDuration(276), TickDuration(93370107)]
>
> I'm horrified. Who was working on std.variant enhancements? Please chime in.

I thought these results were a bit strange, so I converted them to seconds. This gave me:

[3.73e-06, 3.721e-06, 2.97281]

One million inner loop iterations in under 4 microseconds? My processor's frequency isn't measured in THz, so something strange must be going on here. In order to find out what it was, I changed the code to this:

    writeln(benchmark!(fib!int, fib!long)(1000_000_000)[]
        .map!"a.nsecs() * 1.0e-9");

and used a profiler on it. The relevant part of the output is:

    0.00 :	  445969:       test   %r12d,%r12d
    0.00 :	  44596c:       je     445975 <_D3std8date
   46.67 :	  44596e:       inc    %ebx
    0.00 :	  445970:       cmp    %r12d,%ebx
    0.00 :	  445973:       jb     44596e <_D3std8date
    0.00 :	  445975:       lea    -0x18(%rbp),%rdi
    0.00 :	  445979:       callq  45a048 <_D3std8date
    0.00 :	  44597e:       mov    %rax,0x0(%r13)
    0.00 :	  445982:       lea    -0x18(%rbp),%rdi
    0.00 :	  445986:       callq  459fb4 <_D3std8date
    0.00 :	  44598b:       xor    %ebx,%ebx
    0.00 :	  44598d:       test   %r12d,%r12d
    0.00 :	  445990:       je     445999 <_D3std8date
   53.33 :	  445992:       inc    %ebx
    0.00 :	  445994:       cmp    %r12d,%ebx
    0.00 :	  445997:       jb     445992 <_D3std8date


As you can see, most of the time is spent in two loops with empty bodies, so your code is benchmarking Variant against nothing, too. Adding asm{ nop; } to fib changes the output to this:

[0.00437154, 0.00444938, 3.03917]

Which is still a huge difference.
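
One common way to keep an optimizer from deleting the measured work, sketched here under assumptions: the result of every call is stored in a module-level variable, and the loop bound is a runtime variable instead of a literal. The names `sink`, `iterations`, `runInt`, and `runLong` are hypothetical, and the sketch uses `std.datetime.stopwatch.benchmark` from current Phobos rather than the 2012 `std.datetime.benchmark` used in this thread. It covers only the `int` and `long` baselines, and whether a given compiler still folds part of the work away has to be checked in the generated code.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Module-level sink: storing results here makes the computation observable.
long sink;
// Runtime loop bound, so the loop is harder to fold into a constant.
int iterations = 100;

Int fib(Int)()
{
    Int a = 1, b = 1;
    foreach (i; 0 .. iterations)
    {
        Int c = a + b;   // wraps on overflow for large iteration counts
        a = b;
        b = c;
    }
    return a;
}

// Wrappers that consume the result instead of discarding it.
void runInt()  { sink += fib!int(); }
void runLong() { sink += fib!long(); }

void main()
{
    auto results = benchmark!(runInt, runLong)(10_000);
    writeln(results[0].total!"usecs", " us (int), ",
            results[1].total!"usecs", " us (long)");
    // Printing the sink keeps the stores from being dead as well.
    writeln("sink = ", sink);
}
```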

July 29, 2012
On 7/29/12 10:43 AM, Dmitry Olshansky wrote:
> I'm horrified. Who was working on std.variant enhancements? Please chime
> in.

I guess you just volunteered! When I looked at it this morning I noticed a few signs of bit rot, e.g. opAssign returns by value and such. (Only planting a "ref" there improves performance a good amount.)

Variant has a simple design with (in case of int) an int and a pointer to a function. Many of its operations incur an indirect call through that pointer. This makes operations slower than the time-honored design of using an integral tag and switching on it, but offers in return the ability to hold any type without needing to enumerate all types explicitly.

We can use the pointer to function as a tag for improving performance on primitive types.


Andrei
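
The trade-off described above can be sketched as follows. This is a simplified illustration, not std.variant's actual internals: `HandlerBox`, `TaggedBox`, `Op`, `Tag`, `intHandler`, `add`, and `addFast` are all hypothetical names. The first design routes every operation through a function pointer, the second switches on an integral tag, and `addFast` shows the idea of comparing the function pointer as if it were a tag to get a fast path for a primitive type.

```d
import std.stdio : writeln;

// Hypothetical operation codes understood by a handler.
enum Op { add }

// Design 1: payload plus a pointer to a type-specific handler.
struct HandlerBox
{
    long payload;
    void function(Op op, ref long self, long other) handler;
}

// Handler for integer payloads; reached only through the pointer.
void intHandler(Op op, ref long self, long other)
{
    final switch (op)
    {
        case Op.add: self += other; break;
    }
}

// Design 2: payload plus an integral tag; all types must be enumerated.
enum Tag { integer }

struct TaggedBox
{
    long payload;
    Tag tag;
}

// Every operation is an indirect call.
void add(ref HandlerBox a, long b)
{
    a.handler(Op.add, a.payload, b);
}

// The switch is a direct branch that the compiler can inline.
void add(ref TaggedBox a, long b)
{
    final switch (a.tag)
    {
        case Tag.integer: a.payload += b; break;
    }
}

// Function pointer used as a tag: fast path for a known primitive type,
// indirect call for everything else.
void addFast(ref HandlerBox a, long b)
{
    if (a.handler is &intHandler)
        a.payload += b;
    else
        a.handler(Op.add, a.payload, b);
}

void main()
{
    auto h = HandlerBox(1, &intHandler);
    add(h, 2);
    addFast(h, 4);
    auto t = TaggedBox(1, Tag.integer);
    add(t, 2);
    writeln(h.payload, " ", t.payload);
}
```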


July 29, 2012
On 7/29/12 2:57 PM, jerro wrote:
> I thought these results were a bit strange, so I converted them to
> seconds. This gave me:
>
> [3.73e-06, 3.721e-06, 2.97281]

Very nice analysis!

Andrei