Thread overview
struct vs class
Nov 14, 2010
spir
Nov 14, 2010
bearophile
Nov 14, 2010
div0
Nov 14, 2010
spir
Nov 14, 2010
Lutger Blijdestijn
November 14, 2010
Hello,


There seem to be two main differences between structs and classes:
1. struct instances are direct values and implement value semantics, while class instances are referenced (actually "pointed to")
2. classes can be subtyped/subclassed in a simple way; structs cannot really be subtyped -- but there is the "alias this" hack
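
To make the two points concrete, here is a minimal sketch in D. All type and function names (PointS, PointC, Base, Derived, takesBase, ...) are made up for illustration only.

```d
struct PointS { int x; }   // value type
class  PointC { int x; }   // reference type

// Point 1: assigning a struct copies its fields.
int structOriginalAfterCopy()
{
    PointS a = PointS(1);
    PointS b = a;   // b is an independent copy
    b.x = 2;
    return a.x;     // a is unaffected
}

// Point 1: assigning a class instance copies only the reference.
int classOriginalAfterAlias()
{
    PointC a = new PointC;
    a.x = 1;
    PointC b = a;   // a and b denote the same object
    b.x = 2;
    return a.x;     // the change is visible through a
}

// Point 2: "alias this" gives a struct a form of subtyping.
struct Base
{
    int id;
    int twice() { return id * 2; }
}

struct Derived
{
    Base base;
    alias base this;   // members not found in Derived are looked up in base
    int extra;
}

int takesBase(Base b) { return b.id; }   // accepts a Derived via implicit conversion
```

With alias this, a Derived can be passed where a Base is expected, but there is no virtual dispatch; that is the sense in which it is only a partial substitute for class inheritance.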

I am trying to understand what the rationale should be for choosing one or the other form of structured type. Leaving aside subtyping considerations, that is, concentrating on the consequences of the first point above, here are my views on the topic:

* On the semantic side:
An element should be referenced when its meaning is that of a "thing", an entity that has an identity distinct from its value; a thing remains "itself" whatever its evolution in time; it can also be multiply referenced. For instance, a visual form that can change would be a thing.
An element should be a value type when it provides information about a thing; a value makes no sense by itself, it is bound to what it describes an aspect of; referencing a value is meaningless, only copying makes sense. For instance, the position & color of a visual form should be values.

* On the efficiency side:
Struct instances have fast creation, because they are allocated on the stack (*)? Class instances are costly to create. But then, passing class instances around is light, since only a reference is copied, while struct instances are copied at the field level.
Both of these points may conflict with the semantic considerations above: we may want to use structs for fast creation, but if they ever mean "things", we must think about referencing them manually and/or using ref parameters. We may want to use classes for light passing, but if they mean values, we must either never assign them or manually copy their content. It's playing with fire: very dangerous risks of semantic breaks in both cases...


Here is an example: a module implementing a general-purpose tree/node structure.
Let us say there are both Tree & Node types -- the nodes do not implement global methods, only the tree does. So, a node just holds an element and a set of child nodes, possibly with methods to manipulate these fields. A tree in addition has methods to traverse nodes, search, insert/remove, whatever...
What kinds should Node & Tree be? Why? Are there sensible alternatives? If yes, what are the advantages and drawbacks of each? In what cases?
(questions, questions, questions...)


Denis

(*) Is this true? And why is the stack more efficient than the heap?

-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

November 14, 2010
spir:

> a value makes no sense by itself, it is bound to what it describes an aspect of; referencing a value is meaningless, only copying makes sense. For instance, the position & color of a visual form should be values.

Structs may have a meaning by themselves, all kinds of member functions, and you may manage structs by pointer.


> Struct instances have fast creation, because allocated on the stack (*)?

You may allocate them on the heap too, with "new". And if you put a struct inside an object instance, among its fields, then the struct is allocated where the class instance is allocated, generally (but not always) on the heap. Structs may also be allocated in the static section, like static variables, and as global constants too.
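
A small sketch of these allocation sites (Vec, Shape, globalVec, makeOnHeap and sumLocal are names invented for this example):

```d
struct Vec { double x, y; }

class Shape
{
    Vec position;   // stored inside the Shape instance, wherever that instance lives
}

Vec globalVec = Vec(0, 0);   // module-level variable: static storage, not the stack

// `new` on a struct yields a pointer to heap-allocated memory.
Vec* makeOnHeap()
{
    auto p = new Vec;
    p.x = 1;
    p.y = 2;
    return p;
}

// A local struct variable typically lives on the stack.
double sumLocal()
{
    Vec v = Vec(3, 4);
    return v.x + v.y;
}
```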


> Class instances are costly to create. But then, passing class instances around is light, since only a reference is copied, while struct instances are copied at the field level.

But structs may also be passed around by ref, and by pointer.
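
For instance (Counter and the three bump functions are hypothetical names, just to show the three ways of passing):

```d
struct Counter { int n; }

void bumpCopy(Counter c)     { c.n++; }   // works on a private copy; the caller sees nothing
void bumpRef(ref Counter c)  { c.n++; }   // modifies the caller's struct in place
void bumpPtr(Counter* c)     { c.n++; }   // same effect through an explicit pointer
```

Only the first form copies the fields; the other two cost about as much as passing a class reference.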


> What kinds should be Node & Tree? Why? Are there sensible alternatives? If yes, what are the advantages and drawback of each? In what cases?

If efficiency is not a strong requirement, then using classes is the simpler solution. Otherwise you want to use structs plus various kinds of C hacks.


> (*) Is this true? And why is the stack more efficient than the heap?

In Java the GC keeps the objects in separate groups (generations); the first generation is managed like a stack, so allocation is almost as fast as stack allocation. But the current D GC is more primitive: it uses a more complex allocation scheme that requires more time. Allocation on a stack is easy, you ideally only need to increment an index/pointer...
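
To show what "only increment an index" means, here is a toy bump allocator over a fixed buffer. It is only a sketch of the idea, not how the real machine stack or any GC is implemented; BumpStack and its members are invented names.

```d
struct BumpStack
{
    ubyte[] buffer;   // backing storage
    size_t top;       // index of the first free byte

    // Returns a slice of `size` bytes, or null if the buffer is exhausted.
    ubyte[] allocate(size_t size)
    {
        if (size > buffer.length - top)
            return null;
        auto block = buffer[top .. top + size];
        top += size;          // the whole cost of allocating: one addition
        return block;
    }

    // Frees everything allocated after `mark` in one step.
    void releaseTo(size_t mark) { top = mark; }
}
```

There is no search for a free block and no per-block bookkeeping, which is why this is so much cheaper than a general-purpose heap allocation.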

Bye,
bearophile
November 14, 2010
On 14/11/2010 11:08, spir wrote:
> Hello,
>
>
> There seems to be 2 main differences between structs & classes:
> 1. structs instances are direct values, implement value semantics;
> while class instances are referenced (actually "pointed")
> 2. classes can be subtyped/subclassed in a simple way; structs
> cannot be really subtyped -- but there is the "alias this" hack
>
> I am trying to understand what should be the rationale behind choosing
> one or the other form of structured type. Letting aside subtyping
> considerations,
> meaning concentrating on the consequences of the first point above.

> Here are my views on the topic:
>
> * On the semantic side:
> An element should be referenced when its meaning is of a "thing",
> an entity that has an identity distinct from its value; a things
> remains "itself" whatever its evolution in time; it can also be
> multiply referenced. For instance, a visual form that can change
> would be a thing.

> An element should be a value type when it provides information
> about a thing; a value makes no sense by itself, it is bound to
> what it describes an aspect of; referencing a value is meaningless,
> only copying makes sense. For instance, the position &
> color of a visual form should be values.

One of the best descriptions of the differences I've seen so far.

> * On the efficiency side:
> Struct instances have fast creation, because allocated on the stack
> (*)? Class instances are costly to create. But then, passing
> class instances around is light, since only a reference is
> copied, while struct instances are copied at the field level.

Structs get allocated on the stack when used as local vars, they also get embedded in classes when they are members. i.e. if you add a 40 byte struct to a class, the class instance size goes up by ~40 bytes (depending on padding).

You can still new a Struct if needed and you can always use ref & plain pointers as well, so you can easily blur the line of Structs not having identity.

Structs don't have a virtual table pointer or a monitor pointer, so they can be smaller than class instances.
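
You can check the size difference yourself. A quick sketch (PairS and PairC are made-up types holding the same two fields):

```d
struct PairS { int a; int b; }
class  PairC { int a; int b; }

// Size of the struct: just its fields (two 4-byte ints).
enum structSize = PairS.sizeof;

// Size of a class instance: the fields plus the hidden
// virtual table pointer and monitor pointer.
enum classSize = __traits(classInstanceSize, PairC);
```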

You can also precisely control the content and layout of a struct, which you can't do with a class; so you'd have to use structs when doing vertices in 3D programs that you want to pass to OpenGL/DirectX.
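
Something like the following, where the Vertex format is purely an assumed example and not any particular API's required layout:

```d
// Hypothetical interleaved vertex: 3 floats position, 2 floats texture
// coordinate, 4 bytes colour. Fields are laid out in declaration order.
struct Vertex
{
    float[3] position;   // offset 0
    float[2] texCoord;   // offset 12
    ubyte[4] color;      // offset 20
}

// Compile-time checks that the layout is what the graphics side expects.
static assert(Vertex.sizeof == 24);
static assert(Vertex.texCoord.offsetof == 12);
static assert(Vertex.color.offsetof == 20);
```

An array of such structs is one contiguous block of memory with a known stride, which is the kind of buffer you hand to a graphics API.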

> Both of these points may conflict with semantic considerations above:
> we may want to use structs for fast creation, but if ever they mean
> "things", we must think at referencing them manually and/or using
> ref parameters. We may want to use classes for light passing,
> but if they mean values, we must either never assign them or
> manually copy their content. It's playing with fire: very
> dangerous risks of semantic breaks in both cases...

Perhaps, but they are tools to implement a program design; it's the program design that should be driving your choices, not abstract semantic considerations. Go down that route and you'll never get anything done.

> Here is an example: a module implementing general-purpose tree/node structure.
> Let us say there are both a Tree & Node types -- the nodes do not implement
> global methods, only the tree does. So, a node just holds an element
> and a set of child nodes, possibly methods to manipulate these fields.
> A tree in addition has methods to traverse nodes, search,
> insert/remove, whatever...

> What kinds should be Node & Tree? Why? Are there sensible alternatives? If yes, what
> are the advantages and drawback of each? In what cases?
> (questions, questions, questions...)

Well, Tree would probably be a class, as you are likely to be passing it around a lot, and Node types would be structs.

For Nodes you don't need inheritance, you want to keep their size as small as possible, and they are not visible outside the context of the tree.

And on the other hand, you probably don't want virtual methods in your tree either, so you could make that a smart pointer type struct as well and then you've got a struct masquerading as a reference type...
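
A rough sketch of the first arrangement (Tree as a class, Node as a struct). The int element type and the method names rootNode, addChild and count are my own choices for illustration; a real module would probably be templated and would hide the Node pointers.

```d
struct Node
{
    int element;
    Node*[] children;   // child nodes, held by pointer
}

class Tree
{
    private Node* root;

    this(int rootElement)
    {
        root = new Node;
        root.element = rootElement;
    }

    Node* rootNode() { return root; }

    // Appends a new child under `parent` and returns it.
    Node* addChild(Node* parent, int element)
    {
        auto n = new Node;
        n.element = element;
        parent.children ~= n;
        return n;
    }

    // Total number of nodes, counted by a recursive traversal.
    size_t count() { return countFrom(root); }

    private size_t countFrom(Node* n)
    {
        size_t total = 1;
        foreach (child; n.children)
            total += countFrom(child);
        return total;
    }
}
```

Because Tree is a class, assigning or passing it copies only a reference, so every holder sees the same nodes. The struct variant mentioned above would keep the same Node* root inside a struct instead.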


>
> Denis
>
> (*) Is this true? And why is the stack more efficient than the heap?

You can allocate from the stack in amortised constant time; usually a very fast time. If you need 10 bytes of stack, you just subtract 10 from the stack pointer which is a single ASM instruction and as long as you don't hit the bottom of the stack you are done.

If you do hit the bottom of the stack (on Windows/Linux) you'll attempt to access non-accessible memory; this will invoke the OS's memory manager.

If you've got enough free memory and free address space you'll be given a new stack page and things will continue, otherwise you'll get a stack overflow exception/seg fault.

Allocating new memory runs through the D runtime and in principle may take an unbounded amount of time to find/allocate the requested size. If your program has a very fragmented heap and you've got loads of memory allocated, it might take the allocator a very long time to determine it doesn't have any free memory and then finally go to the OS for more memory.



-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
November 14, 2010
On Sun, 14 Nov 2010 12:02:35 +0000
div0 <div0@sourceforge.net> wrote:

> > Both of these points may conflict with semantic considerations above:
>  > we may want to use structs for fast creation, but if ever they mean
>  > "things", we must think at referencing them manually and/or using
>  > ref parameters. We may want to use classes for light passing,
>  > but if they mean values, we must either never assign them or
>  > manually copy their content. It's playing with fire: very
>  > dangerous risks of semantic breaks in both cases...
> 
> Perhaps, but they are tools to implement a program design;
> it's the program design that should be driving your choices not abstract
> semantic considerations. Go down that route and you'll never get
> anything done.

I do not really get what you mean by "program design" as opposed to "semantic considerations". I tend to think that "program design" should precisely be driven by "semantic considerations" -- and that it's the main purpose of good language design to allow this as straightforwardly as possible. A bad language is for me one in which one can hardly express what is meant -- leading to what I call "semantic distortion" ;-).
So, what is "program design" for you?

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

November 14, 2010
spir wrote:

> On Sun, 14 Nov 2010 12:02:35 +0000
> div0 <div0@sourceforge.net> wrote:
> 
>> > Both of these points may conflict with semantic considerations above:
>>  > we may want to use structs for fast creation, but if ever they mean
>>  > "things", we must think at referencing them manually and/or using
>>  > ref parameters. We may want to use classes for light passing,
>>  > but if they mean values, we must either never assign them or
>>  > manually copy their content. It's playing with fire: very
>>  > dangerous risks of semantic breaks in both cases...
>> 
>> Perhaps, but they are tools to implement a program design;
>> it's the program design that should be driving your choices not abstract
>> semantic considerations. Go down that route and you'll never get
>> anything done.
> 
> I do not really get what you mean with "program design" as opposed to "semantic considerations". I tend to think that "program design" should precisely be driven by "semantic considerations" -- and that it's the main purpose of good language design to allow this as straightforwardly as possible. A bad language is for me one in which one can hardly express what is meant -- leading to what I call "semantic distortion" ;-). So, what is "program design" for you?
> 
> Denis

If I may add my 2 cents: program design is where, in another response, you chose to pass a cursor into a source text by reference and have methods update that reference directly instead of returning a new cursor. You have chosen a class, but it could be done with a pointer inside a struct as well.
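
A sketch of what I mean by the struct variant. I don't know your actual cursor type, so Cursor, pos, atEnd and next are assumed names:

```d
struct Cursor
{
    string text;
    private size_t* pos;   // position lives on the heap; copies of Cursor share it

    this(string text)
    {
        this.text = text;
        pos = new size_t;  // starts at 0
    }

    bool atEnd() { return *pos >= text.length; }

    // Returns the current character and advances the shared position.
    char next() { return text[(*pos)++]; }
}
```

The struct itself is copied by value, but every copy points at the same position, so a function that receives the cursor and advances it is advancing the caller's cursor too, just as with a class.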