std.container.RedBlackTree versus C++ std::set (page 5) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » Learn » std.container.RedBlackTree versus C++ std::set (page 5)

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Steven Schveighoffer
in reply to monarch_dodra

Steven Schveighoffer

Posted in reply to monarch_dodra

On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra@gmail.com> wrote:

> Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)

Huh?

a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).

HOWEVER, when you get to the point of executing a == b, it's implemented as !(a < b) && !(b < a), which could more efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0.

In the case of red black tree however, the nature of traversal makes it so that you only have the redundant call once (when you have found the insertion spot, you already have called one half, you just have to call the other half).

For dcollections, I don't accept a < b, I accept the int-returning compare function (which I think is pass by ref, so shouldn't have this problem).  The std.container.RedBlackTree is a port of the same code, but adapted to the more phobos-style functions.

-Steve

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Dan
in reply to Steven Schveighoffer

Dan

Posted in reply to Steven Schveighoffer

On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:
> On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra@gmail.com> wrote:
>
>> Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)
>
> Huh?
>
> a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).

Yes. But isn't opCmp still more work than just a<b? It must be because it returns three states instead of two. So, I think the general warning on opCmp is warranted especially with structs. For structs we could use compile time reflection to provide automatic efficient opCmp behavior. Section 3.1 here outlines a way:

https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf

Any comments appreciated.

Thanks
Dan

>
> HOWEVER, when you get to the point of executing a == b, it's implemented as !(a < b) && !(b < a), which could more efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0.

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by monarch_dodra
in reply to Steven Schveighoffer

monarch_dodra

Posted in reply to Steven Schveighoffer

On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:
> On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra@gmail.com> wrote:
>
>> Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)
>
> Huh?
>
> a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).

Right... sorry.

But still, calling opCmp is usually a bit more expensive that a simpler function dedicated to giving you a < b.

99% of the time, I agree that it doesn't matter mutch, and the gain in semantic power trumps performance, but for a dedicated algorithm, eg sort, there is a definite performance difference...

> HOWEVER, when you get to the point of executing a == b, it's implemented  as !(a < b) && !(b < a), which could more efficiently be rewritten as  a.opEquals(b) or a.opCmp(b) == 0.

And that.

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Dan
in reply to Jonathan M Davis

Dan

Posted in reply to Jonathan M Davis

On Thursday, 14 February 2013 at 20:53:26 UTC, Jonathan M Davis wrote:
>
> And Walter and Andrei both seem to think that having expensive postlbits is a
> design mistake. The range stuff sort of tries to support it with the move*
> primitives, but Andrei's been wanting to argue for just assuming that
> postblits are cheap. And Walter was actually arguing at one point for making
> it illegal (probably by getting rid of postblit all together - I don't
> remember the details). Plenty of the rest of us don't agree, but to some
> extent, it is true that having an expensive postblit is asking for it. On a
> related note, we still need a solution for dealing with const postblits
> (probably be introducing copy constructors - IIRC Andrei was considering
> phasing out postblits in favor of copy constructors, which would be
> unnecessary, but we do need a way to deal with deep copying const structs
> which hold reference types which need to be deep copied.
>

When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?

Regarding replacing expensive postblits with copy constructors - how does that help? The expense will remain the same - if you need a transitive copy, you need a transitive copy.

Thanks
Dan

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Namespace
in reply to Dan

Namespace

Posted in reply to Dan

> When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?

Make a new thread where Walter and Andrei may submit an official staement.
There are certainly other interested and this thread is probably not for more suitable.

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Dan
in reply to Steven Schveighoffer

Dan

Posted in reply to Steven Schveighoffer

On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:
>
> If it was pass by ref, then rbt.insert(5) would not work.  And in fact, I wouldn't want things passed by ref if the element is int.

What is the reason for the second objection - is just performance?
Is performance really an issue?

From this thread it looks like fundamental types by ref or value is not really a big deal in terms of performance. OTOH - as size and/or postblit execution gets expensive the cost *is* significant.
---------------------------
4 bytes: using cref_(int size) took 29[ms]
4 bytes: using inref(int size) took 29[ms]
4 bytes: using in___(int size) took 30[ms]

8 bytes: using cref_(int size) took 29[ms]
8 bytes: using inref(int size) took 28[ms]
8 bytes: using in___(int size) took 31[ms]
...

128 bytes: using cref_(int size) took 29[ms]
128 bytes: using inref(int size) took 29[ms]
128 bytes: using in___(int size) took 290[ms]
---------------------------


> I have to admit, I did not consider expensive postblits when I designed it.  Almost all my testing is with integer types.
>

For performance, it seems by ref should always be preferred in generic code because you can not know the cost of postblit - or copy construction if that were a future solution for structs. Granted the no rvalue passing is a pain - but is it a big deal in library/generic code?

Thanks,
Dan

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Jonathan M Davis
in reply to Dan

Jonathan M Davis

Posted in reply to Dan

On Friday, February 15, 2013 19:14:21 Dan wrote:
> When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?

Yeah, they're unlikely to say much until something has actually been decided. Frequently, if you try and pin them down on stuff like that, you won't get an answer, especially if you ask out of the blue. But regardless, I wouldn't expect postblits to be gotten rid of (because that would break code). Rather, they would be discouraged, and newer code would be written to use copy constructors instead. But nothing has been decided for certain AFAIK.

> Regarding replacing expensive postblits with copy constructors - how does that help? The expense will remain the same - if you need a transitive copy, you need a transitive copy.

The problem has nothing to do with const. The problem const and immutable. It's impossible to write a const or immutable postblit constructor, because postblit operates by doing the copy and then allowing you to change it, whereas const and immutable cannot be changed. You need to construct the copy as const or immutable, which means you need to make the changes when you make the copy, which basically means that you need a copy constructor.

Efficiency comes in when arguing whether a postblit or copy constructor is appropriate in the first place or whether they can be implemented efficiently. Walter seems to be of the opinion that they should only be used in cases where they're actually efficient (i.e. you never use them for deep copying, only for handling stuff like ref-counting). He's much more in favor of COW. But much as he thinks that way, I don't think that there's any real risk of that being codified in the language at this point. Not only would it be very difficult to restrict postblit like that (if not impossible), but it would break tons of code (which Walter is very much against), and most everyone would scream (and several already did when he brought it up).

Andrei likes the idea of basically requiring that ranges have O(1) postblits/copy constructors so that algorithms can assume that copying is cheap, but he's never decided that we're going that way for certain, and even if we did, it would just mean that we wouldn't write code which cared about the expense of copying ranges, and anyone who wrote ranges that were expensive to copy would take a performance hit. Honestly, I suspect that that's mostly where we are already. And given how ranges work, an expensive postblit doesn't really make sense anyway.

No, I think that the only real concern here is if/when we'll get copy constructors so that we can copy const or immutable structs that require postblits or copy constructors. Walter and Andrei are very concerned about achieving and maintaining stability of the language and library, so any major breaking changes (especially to the language) would need to have very good reasons and would likely be discussed in detail in the newsgroup first. So, while I think that some of what they've said would be of some concern if D were still being designed (as opposed to being stabilized), given where we are, it's unlikely to be a concern.

- Jonathan M Davis

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Jonathan M Davis
in reply to Dan

Jonathan M Davis

Posted in reply to Dan

On Friday, February 15, 2013 20:12:08 Dan wrote:
> Granted the no rvalue passing is a pain - but is it a big deal in library/generic code?

Yes, it's a big deal. Any place that it's part of an API, it's a big deal. It forces you to create what are essentially useless variables all over the place if you require ref. We _will_ have a solution similar to const& at some point, and that's what the correct way to go to solve this problem will be, but that situation still needs to be sorted out (and the situation with DIP 25 - http://wiki.dlang.org/DIP25 - could have an impact on that given that it's also has to do with changes to ref).

- Jonathan M Davis

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Jonathan M Davis
in reply to Namespace

Jonathan M Davis

Posted in reply to Namespace

On Friday, February 15, 2013 19:43:07 Namespace wrote:
> > When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?
> 
> Make a new thread where Walter and Andrei may submit an official
> staement.
> There are certainly other interested and this thread is probably
> not for more suitable.

They're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made.

- Jonathan M Davis

February 15, 2013

Re: std.container.RedBlackTree versus C++ std::set

Posted by Namespace
in reply to Jonathan M Davis

Namespace

Posted in reply to Jonathan M Davis

> They're not going to do that until they've decided, and AFAIK, they haven't.
> Attempting to essentially confront them and get them to say what they plan to
> do generally doesn't get you anywhere - especially since they're unlikely to
> say much until they actually are planning to do it, and it's usually the case
> that no decision has been made.
>
> - Jonathan M Davis

I did not mean the const& thing. This is hopeless and will certainly take some months or years.
My post was based on "[...] probably by getting rid of postblit all together.".

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation