February 14, 2013
On Thursday, February 14, 2013 13:27:30 monarch_dodra wrote:
> Actually, (and IMO, this is a very big problem), these structures are *always* null by default. There is no easy way to "default initialize" structs in D :(

You mean there's no way to default-construct them. They're always default- initialized to their init value if you don't explicitly initialize them.

- Jonathan M Davis
February 14, 2013
Dmitry Olshansky wrote:
>> D2 (DMD 2.059, -O):             11,389,556
>
> I'd add :
>
> -release -inline
>
> or it may not inline away temporary copies.

Thank you for the suggestion.  However, the result after "-O -release -inline" is still 11,389,556.
February 14, 2013
When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.

I expect performance will increase substantially even with the extra heap allocations.

Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.

--rt
February 14, 2013
On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb@ucora.com> wrote:

> When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.
>
> I expect performance will increase substantially even with the extra heap allocations.
>
> Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.

If it was pass by ref, then rbt.insert(5) would not work.  And in fact, I wouldn't want things passed by ref if the element is int.  I have to admit, I did not consider expensive postblits when I designed it.  Almost all my testing is with integer types.

Unfortunately, D is in this state where taking a ref parameter means strictly lvalue.  So passing rvalues will not work.  D does not have the notion that C++ has where const type& can accept both rvalue and lvalue.  We desperately need something similar.

-Steve
February 14, 2013
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:
> On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb@ucora.com> wrote:
>
>> When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.
>>
>> I expect performance will increase substantially even with the extra heap allocations.
>>
>> Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.
>
> If it was pass by ref, then rbt.insert(5) would not work.  And in fact, I wouldn't want things passed by ref if the element is int.  I have to admit, I did not consider expensive postblits when I designed it.  Almost all my testing is with integer types.

You can make it work by implementing two insert functions, one with pass by ref and one with pass by value. The pass by value implementation simply calls the pass by ref version, and everything from there is pass by ref where it makes sense as an optimization.

I understand the implementation difficulties, because we have multiple situations to consider and optimize for, such as simple POD which can often pass by value efficiently, but then we have complex structs that do not pass by value efficiently and should be passed by ref instead, but there's no way to detect this easily (I suppose you could try looking at the size and adjust the pass-by method somehow, but ...), and finally if you accept a class then you have different needs because a class is a pointer not a POD value, again I suppose you can detect if a class is being passed and adjust the implementation but that introduces more complexity ...

In some ways this is where C++ got it right from a consistency POV, where a class is no different from a struct, so whatever works for a class also works for a struct. In D, structs and classes operate in fundamentally different ways making implementing generics more difficult to do.

> Unfortunately, D is in this state where taking a ref parameter means strictly lvalue.  So passing rvalues will not work.  D does not have the notion that C++ has where const type& can accept both rvalue and lvalue.  We desperately need something similar.
>
> -Steve

Yes, I agree.

--rt
February 14, 2013
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:
> On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb@ucora.com> wrote:
>
>> When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.
>>
>> I expect performance will increase substantially even with the extra heap allocations.
>>
>> Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.
>
> If it was pass by ref, then rbt.insert(5) would not work.  And in fact, I wouldn't want things passed by ref if the element is int.  I have to admit, I did not consider expensive postblits when I designed it.  Almost all my testing is with integer types.
>
> Unfortunately, D is in this state where taking a ref parameter means strictly lvalue.  So passing rvalues will not work.  D does not have the notion that C++ has where const type& can accept both rvalue and lvalue.  We desperately need something similar.
>
> -Steve

There are discussions for such thing for almost a year - but nothing has changed so far.
February 14, 2013
On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:
>
> There are discussions for such thing for almost a year - but nothing has changed so far.

I think its finally on the priority radar, but there's still other things in more dire need to be resolved first, like @property and lack of shared library support, and probably something else too ...

--rt
February 14, 2013
Rob T wrote:
> When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.

Thank you all for assistance, I've finally pinned it down!  It's the default magic "a < b" that devours the most copies of my precious structs!  Once I change the declaration:

-----
alias RedBlackTree !(element) container;
-----

to

-----
bool lessfun (ref element a, ref element b)
{
	return a < b;
}

alias RedBlackTree !(element, lessfun) container;
-----

... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556.  While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls.  And it is now less of a bottleneck in my original program.

This raises the question to the default behavior of std.functional.binaryFun.  Sure, for integers, class references and small types, the current way is the best.  On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value.  Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default.  Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments.

-----
Ivan Kazmenko.
February 14, 2013
Ivan Kazmenko:

> Thank you all for assistance, I've finally pinned it down!

This is great. The first step to solve a problem is to find it. The second step is to formalize it. So I suggest you to create a small benchmark that shows what you mean (it should not include RedBlackTree) and put it in Bugzilla, with some timings.

Bye,
bearophile
February 14, 2013
On Thursday, 14 February 2013 at 20:20:59 UTC, Rob T wrote:
> On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:
>>
>> There are discussions for such thing for almost a year - but nothing has changed so far.
>
> I think its finally on the priority radar, but there's still other things in more dire need to be resolved first, like @property and lack of shared library support, and probably something else too ...
>
> --rt

Yes, not to mention
  - A package system for Phobos
  - Half Float Datatypes
  - JSON
  - bug fixes and merging of new features in D1 (although no further development of D1 should be done since December 31)
and probably something else too.