struct vs. class containers (was: Re: dcollections 1.0 and 2.0a beta released)
May 23, 2010
I've thought for a very long time about the class vs. struct choice in a container, and I came to a startling conclusion: it (almost) doesn't matter. Could be either, and the tradeoffs involved are nonessential. Here they are:

1. Using a class makes implementing members easier because there's no need to do work through an additional member. With a struct, you need a pimpl approach. For example:

struct Array {
    struct Impl {
        ...
    }
    Impl * impl;
    ...
    @property length() const { return impl.length; }
    ...
}

2. struct gives you more power in managing the collection's own memory, as long as the collection doesn't escape item addresses or other addresses of internal handles. So all other things being equal, struct has a net advantage.

3. The creation syntaxes are different. For Phobos, I suggest adding a simple function make() to std.algorithm. make!T(a, b, c) returns a newly constructed object T by invoking the constructor with arguments a, b, and c. That way we can make client code virtually agnostic of the class/struct choice for a container.

That's it. Otherwise, one could use either to build a container. Let me note that I have reached the conclusion that containers should be at best reference types, with a meta-constructor Value!(C) that takes a container C and makes it into a value type.
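To make point 3 concrete, here is a minimal sketch of what such a make() might look like. It is only an illustration under stated assumptions, not an existing Phobos function; CBox and SBox are hypothetical types invented for the demo.

```d
// Hypothetical make(): one creation syntax for classes and structs alike.
// Sketch only; this is not taken from Phobos.
T make(T, Args...)(Args args)
{
    static if (is(T == class))
        return new T(args);   // class: allocate on the GC heap
    else
        return T(args);       // struct: construct and return by value
}

// Hypothetical demo types: the same payload as a class and as a struct.
class CBox
{
    int v;
    this(int v) { this.v = v; }
}

struct SBox
{
    int v;
    this(int v) { this.v = v; }
}
```

With this in place, make!CBox(42) and make!SBox(42) read the same at the call site, which is the whole point: client code does not need to know which kind of type it is creating.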

This result is the end of a long journey. I am quite sure I have it right. If anything, I am amazed at how much dogma I had to unlearn before reaching it. And I'm glad I did - this is very D-like: I wasn't looking for a class-based solution, and I wasn't looking for a struct-based solution. I was only looking for the truth - and surprise, this slightly asymmetric duality came forth.

It seems Steve took the weekend off. At this point I'm poised to prepare a list of items that would condition putting dcollections in phobos. But I'd rather have Steve (and everyone) acquire their own motivation from my arguments, instead of making changes as a concession.


Andrei
May 23, 2010
Andrei Alexandrescu:

> 1. Using a class makes implementing members easier because there's no need to do work through an additional member. With a struct, you need a pimpl approach. For example:
> 
> struct Array {
>      struct Impl {
>          ...
>      }
>      Impl * impl;
>      ...
>      @property length() const { return impl.length; }
>      ...
> }

I am far from being as expert as you, and while I have read about PImpl in C++, I don't understand what you are saying here. I am sorry.


> 3. The creation syntaxes are different.

In C# you use 'new' for both structs allocated on the stack and classes allocated on the heap: http://msdn.microsoft.com/en-us/library/51y09td4%28VS.71%29.aspx


> Let me note that I have reached the conclusion that containers should be at best reference types, with a meta-constructor Value!(C) that takes a container C and makes it into a value type.

Does Value!() use static introspection and placement new to instantiate the given class on the stack? :-)

Bye and thank you,
bearophile
May 24, 2010
On 2010-05-23 17:36:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> I've thought for a very long time about the class vs. struct choice in a container, and I came to a startling conclusion: it (almost) doesn't matter. Could be either, and the tradeoffs involved are nonessential.

I'm starting to wonder whether this discussion really belongs in D.announce. Perhaps you should repost this on the general newsgroup and we shall discuss it there?

This newsgroup was obviously the right place for the initial dcollections announcement, but we've sidetracked quite a lot since then.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

May 24, 2010
On 05/23/2010 05:06 PM, bearophile wrote:
> Andrei Alexandrescu:
>
>> 1. Using a class makes implementing members easier because there's no
>> need to do work through an additional member. With a struct, you need a
>> pimpl approach. For example:
>>
>> struct Array {
>>       struct Impl {
>>           ...
>>       }
>>       Impl * impl;
>>       ...
>>       @property length() const { return impl.length; }
>>       ...
>> }
>
> I am far from being an expert as you, and while I have read about the PImpl in C++, I don't understand what you are saying here. I am sorry.

If you want to define a class you go like this:

class CBlah
{
    private int payload_;
    @property int payload() { return payload_; }
    @property void payload(int p) { payload_ = p; }
}

Right? The class has by birth reference semantics:

auto obj = new CBlah;
obj.payload = 42;
auto obj1 = obj;
obj1.payload = 43;
assert(obj.payload == 43);

If you want to define a struct with ref semantics you define it like this:

struct SBlah
{
    private struct Impl
    {
        private int payload_;
    }
    private Impl * pImpl;
    @property int payload() { return pImpl.payload_; }
    @property void payload(int p) { pImpl.payload_ = p; }
}

auto obj = SBlah();
obj.pImpl = new SBlah.Impl; // meh, a crappy detail that we need to
                            // discuss later
obj.payload = 42;
auto obj1 = obj;
obj1.payload = 43;
assert(obj.payload == 43);

That's what I said.

>> 3. The creation syntaxes are different.
>
> In C# you use 'new' for both structs allocated on the stack and classes allocated on the heap:
> http://msdn.microsoft.com/en-us/library/51y09td4%28VS.71%29.aspx

That's pretty retarded (sorry retard). There was opportunity there to get rid of new altogether.

>> Let me note that I have reached the conclusion that containers should
>> be at best reference types, with a meta-constructor Value!(C) that takes
>> a container C and makes it into a value type.
>
> Does Value!() use static introspection and placement new to instantiate the given class on the stack? :-)

No. It does copy it automatically. There should be no notable difference between Value!(Array!int) and std::vector<int>.
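A rough sketch of the idea, under assumptions: the wrapped container C is a class and offers a dup() method that returns a deep copy. IntBag is a hypothetical container made up for the demo, and this is not the actual Value!(C) design, only one way it could behave.

```d
// Hypothetical reference-type container, used only for the demo.
class IntBag
{
    int[] items;

    // Deep copy; Value!(C) below assumes C offers this.
    IntBag dup()
    {
        auto b = new IntBag;
        b.items = items.dup;
        return b;
    }
}

// Sketch of Value!(C): wraps a reference container and duplicates it
// whenever the wrapper itself is copied, giving value semantics.
struct Value(C)
{
    C payload;

    this(C c) { payload = c; }

    // Postblit: runs on the new copy right after the bitwise copy.
    this(this)
    {
        if (payload !is null)
            payload = payload.dup();
    }

    // Forward member access to the wrapped container.
    alias payload this;
}
```

Copying a Value!IntBag then behaves like copying a std::vector<int>: the two copies no longer share elements.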


Andrei
May 24, 2010
On 2010-05-23 23.36, Andrei Alexandrescu wrote:
> I've thought for a very long time about the class vs. struct choice in a
> container, and I came to a startling conclusion: it (almost) doesn't
> matter. Could be either, and the tradeoffs involved are nonessential.
> Here they are:
>
> 1. Using a class makes implementing members easier because there's no
> need to do work through an additional member. With a struct, you need a
> pimpl approach. For example:
>
> struct Array {
>     struct Impl {
>         ...
>     }
>     Impl * impl;
>     ...
>     @property length() const { return impl.length; }
>     ...
> }

Isn't that just what "alias this" and "opDispatch" are for?
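Something along these lines, as a sketch only. The members of Impl, the create() helper and the payload property are all made up for illustration; only the Array/Impl/impl shape comes from the quoted example.

```d
struct Array
{
    // Hypothetical payload; the real Impl is elided in the quoted example.
    static struct Impl
    {
        int[] data;
        @property size_t length() const { return data.length; }
        void put(int x) { data ~= x; }
    }

    Impl* impl;

    // Stand-in for a real constructor: allocates the shared payload.
    static Array create()
    {
        Array a;
        a.impl = new Impl;
        return a;
    }

    // Any member not found on Array is looked up on the payload instead,
    // so length and put need no hand-written forwarding functions.
    @property ref Impl payload() { return *impl; }
    alias payload this;
}
```

Copies of such an Array share one Impl, so it keeps reference semantics while the forwarding boilerplate disappears.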

> 2. struct gives you more power in managing collection's own memory, as
> long as the collection doesn't escape item addresses or other addresses
> of internal handles. So all other things being equal, struct has a net
> advantage.
>
> 3. The creation syntaxes are different. For Phobos, I suggest adding a
> simple function make() to std.algorithm. make!T(a, b, c) returns a newly
> constructed object T by invoking the constructor with arguments a, b,
> and c. That way we can make client code virtually agnostic of the
> class/struct choice for a container.
>
> That's it. Otherwise, one could use either to build a container. Let me
> note that I have reached the conclusion that containers should be at
> best reference types, with a meta-constructor Value!(C) that takes a
> container C and makes it into a value type.
>
> This result is the end of a long journey. I am quite sure I have it
> right. If anything, I am amazed at how much dogma I had to unlearn
> before reaching it. And I'm glad I did - this is very D-like: I wasn't
> looking for a class-based solution, and I wasn't looking for a
> struct-based solution. I was only looking for the truth - and surprise,
> this slightly asymmetric duality came forth.
>
> It seems Steve took the weekend off. At this point I'm poised to prepare
> a list of items that would condition putting dcollections in phobos. But
> I'd rather have Steve (and everyone) acquire their own motivation from
> my arguments, instead of operating changes as a concession.
>
>
> Andrei


-- 
/Jacob Carlborg
May 24, 2010
On 2010-05-24 00.06, bearophile wrote:
> Andrei Alexandrescu:
>
>> 1. Using a class makes implementing members easier because there's no
>> need to do work through an additional member. With a struct, you need a
>> pimpl approach. For example:
>>
>> struct Array {
>>       struct Impl {
>>           ...
>>       }
>>       Impl * impl;
>>       ...
>>       @property length() const { return impl.length; }
>>       ...
>> }
>
> I am far from being an expert as you, and while I have read about the PImpl in C++, I don't understand what you are saying here. I am sorry.
>
>
>> 3. The creation syntaxes are different.
>
> In C# you use 'new' for both structs allocated on the stack and classes allocated on the heap:
> http://msdn.microsoft.com/en-us/library/51y09td4%28VS.71%29.aspx

I've never liked that; I always end up thinking that it may be allocating on the heap.

>> Let me note that I have reached the conclusion that containers should
>> be at best reference types, with a meta-constructor Value!(C) that takes
>> a container C and makes it into a value type.
>
> Does Value!() use static introspection and placement new to instantiate the given class on the stack? :-)
>
> Bye and thank you,
> bearophile


-- 
/Jacob Carlborg
May 24, 2010
On Sun, 23 May 2010 17:36:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> I've thought for a very long time about the class vs. struct choice in a container, and I came to a startling conclusion: it (almost) doesn't matter. Could be either, and the tradeoffs involved are nonessential. Here they are:
>
> 1. Using a class makes implementing members easier because there's no need to do work through an additional member. With a struct, you need a pimpl approach. For example:
>
> struct Array {
>      struct Impl {
>          ...
>      }
>      Impl * impl;
>      ...
>      @property length() const { return impl.length; }
>      ...
> }
>
> 2. struct gives you more power in managing collection's own memory, as long as the collection doesn't escape item addresses or other addresses of internal handles. So all other things being equal, struct has a net advantage.

Yes, but most containers are node-based, so as long as you use structs for nodes, you are generally in control of the bulk of allocation (dcollections does this).

>
> 3. The creation syntaxes are different. For Phobos, I suggest adding a simple function make() to std.algorithm. make!T(a, b, c) returns a newly constructed object T by invoking the constructor with arguments a, b, and c. That way we can make client code virtually agnostic of the class/struct choice for a container.
>

Classes have builtin object monitors.  But you could handle that via a struct as well.  The language support wouldn't be there though.

> That's it. Otherwise, one could use either to build a container. Let me note that I have reached the conclusion that containers should be at best reference types, with a meta-constructor Value!(C) that takes a container C and makes it into a value type.

The thing that classes really give you is the language support.  Structs need to be hand-crafted to support the same syntax.  Classes are enforced as always being reference types and always being on the heap.  The Value!(C) is questionable because creating the head of a container on the stack leads to easily escaped stack references.

But yeah, a struct as a 'smart pointer' could work, as long as you don't 'auto-create' like AA's do.

>
> It seems Steve took the weekend off.

My son's 2nd birthday party was Saturday, and I had visitors, sorry :)  I did some responses, but at some point if you want to remain married, you have to pay attention to your family.

-Steve
May 24, 2010
On 05/24/2010 10:58 AM, Steven Schveighoffer wrote:
> On Sun, 23 May 2010 17:36:52 -0400, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>
>> I've thought for a very long time about the class vs. struct choice in
>> a container, and I came to a startling conclusion: it (almost) doesn't
>> matter. Could be either, and the tradeoffs involved are nonessential.
>> Here they are:
>>
>> 1. Using a class makes implementing members easier because there's no
>> need to do work through an additional member. With a struct, you need
>> a pimpl approach. For example:
>>
>> struct Array {
>>     struct Impl {
>>         ...
>>     }
>>     Impl * impl;
>>     ...
>>     @property length() const { return impl.length; }
>>     ...
>> }
>>
>> 2. struct gives you more power in managing collection's own memory, as
>> long as the collection doesn't escape item addresses or other
>> addresses of internal handles. So all other things being equal, struct
>> has a net advantage.
>
> Yes, but most containers are node-based, so as long as you use structs
> for nodes, you are generally in control of the bulk of allocation
> (dcollections does this).

I'm saying, the net advantage of struct is that it can safely and deterministically release memory in its destructor.
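A minimal sketch of what I mean, assuming the struct owns a malloc'ed block and is never copied. Buffer and the liveBuffers counter are hypothetical names; the counter exists only so the effect of the destructor can be observed.

```d
import core.stdc.stdlib : malloc, free;

// Counts buffers that are currently allocated (demo instrumentation only).
int liveBuffers;

struct Buffer
{
    private int* ptr;
    private size_t len;

    this(size_t n)
    {
        ptr = cast(int*) malloc(n * int.sizeof);
        if (ptr is null)
            return;            // allocation failed: leave the buffer empty
        len = n;
        ++liveBuffers;
    }

    // No copies, so exactly one owner ever frees the block.
    @disable this(this);

    // Runs as soon as the struct goes out of scope, not at some later GC cycle.
    ~this()
    {
        if (ptr !is null)
        {
            free(ptr);
            ptr = null;
            --liveBuffers;
        }
    }
}
```

A class instance gives no such guarantee: its finalizer runs whenever the collector gets around to it, if at all.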

>> 3. The creation syntaxes are different. For Phobos, I suggest adding a
>> simple function make() to std.algorithm. make!T(a, b, c) returns a
>> newly constructed object T by invoking the constructor with arguments
>> a, b, and c. That way we can make client code virtually agnostic of
>> the class/struct choice for a container.
>>
>
> Classes have builtin object monitors. But you could handle that via a
> struct as well. The language support wouldn't be there though.

Oh, and the monitor isn't needed so that's an unused word there. I'd forgotten about it.

>> That's it. Otherwise, one could use either to build a container. Let
>> me note that I have reached the conclusion that containers should be
>> at best reference types, with a meta-constructor Value!(C) that takes
>> a container C and makes it into a value type.
>
> The thing that classes really give you is the language support. Structs
> need to be hand-crafted to support the same syntax. Classes are enforced
> as always being reference types and always being on the heap.

I'd agreed classes are more convenient because they make the implementation straightforward.

> The
> Value!(C) is questionable because creating the head of a container on
> the stack leads to easily escaped stack references.

I don't understand this. Could you give an example?

> But yeah, a struct as a 'smart pointer' could work, as long as you don't
> 'auto-create' like AA's do.
>
>>
>> It seems Steve took the weekend off.
>
> My son's 2nd birthday party was Saturday, and I had visitors, sorry :) I
> did some responses, but at some point if you want to remain married, you
> have to pay attention to your family.

Congrats. And yes, wise words :o).


Andrei

May 24, 2010
On Mon, 24 May 2010 12:18:02 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 05/24/2010 10:58 AM, Steven Schveighoffer wrote:
>> On Sun, 23 May 2010 17:36:52 -0400, Andrei Alexandrescu
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> I've thought for a very long time about the class vs. struct choice in
>>> a container, and I came to a startling conclusion: it (almost) doesn't
>>> matter. Could be either, and the tradeoffs involved are nonessential.
>>> Here they are:
>>>
>>> 1. Using a class makes implementing members easier because there's no
>>> need to do work through an additional member. With a struct, you need
>>> a pimpl approach. For example:
>>>
>>> struct Array {
>>>     struct Impl {
>>>         ...
>>>     }
>>>     Impl * impl;
>>>     ...
>>>     @property length() const { return impl.length; }
>>>     ...
>>> }
>>>
>>> 2. struct gives you more power in managing collection's own memory, as
>>> long as the collection doesn't escape item addresses or other
>>> addresses of internal handles. So all other things being equal, struct
>>> has a net advantage.
>>
>> Yes, but most containers are node-based, so as long as you use structs
>> for nodes, you are generally in control of the bulk of allocation
>> (dcollections does this).
>
> I'm saying, the net advantage of struct is that it can safely and deterministically release memory in its destructor.

OK.

>
>>> 3. The creation syntaxes are different. For Phobos, I suggest adding a
>>> simple function make() to std.algorithm. make!T(a, b, c) returns a
>>> newly constructed object T by invoking the constructor with arguments
>>> a, b, and c. That way we can make client code virtually agnostic of
>>> the class/struct choice for a container.
>>>
>>
>> Classes have builtin object monitors. But you could handle that via a
>> struct as well. The language support wouldn't be there though.
>
> Oh, and the monitor isn't needed so that's an unused word there. I'd forgotten about it.

Why isn't the monitor needed?  Will we no longer be able to use a globally shared container object?

>
>>> That's it. Otherwise, one could use either to build a container. Let
>>> me note that I have reached the conclusion that containers should be
>>> at best reference types, with a meta-constructor Value!(C) that takes
>>> a container C and makes it into a value type.
>>
>> The thing that classes really give you is the language support. Structs
>> need to be hand-crafted to support the same syntax. Classes are enforced
>> as always being reference types and always being on the heap.
>
> I'd agreed classes are more convenient because they make the implementation straightforward.
>
>> The
>> Value!(C) is questionable because creating the head of a container on
>> the stack leads to easily escaped stack references.
>
> I don't understand this. Could you give an example?

For example, dcollections' TreeMap has an 'end' node that defines the end of the container.  It actually is the root of the RB tree and all the data is on the left child of the 'end' node.  This makes it always the last node iterated.

The end node is actually a member of the class; it's not allocated separately on the heap, for efficiency and a less complex constructor.  So let's say TreeMap could be on the stack; here's a situation where an escape would occur:

auto foo()
{
   TreeMap!(string, uint) tmap; // allocated on stack
   ... // fill up tmap;
   return tmap["hi"..tmap.end]; // oops! end is part of the range, and it was allocated on the stack.
}

In dcollections, all the classes have some sort of members that are associated with the entire container, not the elements.  These would disappear once a stack-based container exited scope, but there may be dangling references returned.

To get around this, you could ensure that all data was allocated on the heap, but then wouldn't Value!(C) be almost useless?  What data would be allocated on the stack that is part of the container that would never be referenced by the heap-based parts?

-Steve
May 24, 2010
Andrei Alexandrescu:
> Oh, and the monitor isn't needed so that's an unused word there. I'd forgotten about it.

Some kind of property or pragma for classes could be invented to remove the monitor from all instances of that class. I don't know if this can be done, if it is useful, or if it's worth the implementation effort.

Bye,
bearophile