Why are structs and classes so different?

May 15, 2022

Posted by Kevin Bailey

I've done some scripting in D over the years but I never dug into D until recently. I'm going through Learning D and I was reminded that structs and classes are so different.

- struct methods are non-virtual while class methods are virtual
- Thus, structs can't inherit, because how would you find the child's destructor given a parent pointer?
- On the stack, structs are by-value but classes are by-reference

I'm trying to understand why it is this way. I assume that there's some benefit for designing it this way. I'm hoping that it's not simply accidental, historical or easier for the compiler writer.

One problem that this causes is that I have to remember different rules when using them. This creates the additional load of learning and remembering which types are which from someone else's library.

A bigger problem is that, if I have a struct that I suddenly want to inherit from, I have to change all my code. In addition to that work, in both of these cases, one could easily do it wrong:

// Fine with a struct, fatal with a class.
Foo foo;

At least in C++, the compiler would complain. In D, not even a warning.

Why is it this way? What is the harm of putting a class object on the stack?

May 15, 2022

Re: Why are structs and classes so different?

Posted by Alain De Vos
in reply to Kevin Bailey

Can I summarize?
Structs are value objects which live on the stack.
Class instances are reference objects which live on the heap.

May 15, 2022

Re: Why are structs and classes so different?

Posted by Guillaume Piolat
in reply to Kevin Bailey

On Sunday, 15 May 2022 at 15:26:40 UTC, Kevin Bailey wrote:
> I'm trying to understand why it is this way. I assume that there's some benefit for designing it this way. I'm hoping that it's not simply accidental, historical or easier for the compiler writer.

Perhaps someone more informed will chime in, but there is a reason to avoid object inheritance with value types, and force them to be reference types.

https://stackoverflow.com/questions/274626/what-is-object-slicing

If we want to avoid that problem, then objects with inheritance and virtual functions have to be reference types.

But you still need value types. So now you have both struct and class, like in C# (Hejlsberg, 2000).

For an escape hatch, D has library ways to have structs with virtual functions (there is a DUB package for that), and classes on the stack (Scoped!T, RefCounted!T, a __traits).

May 15, 2022

Re: Why are structs and classes so different?

Posted by Mike Parker
in reply to Kevin Bailey

On Sunday, 15 May 2022 at 15:26:40 UTC, Kevin Bailey wrote:

> I'm trying to understand why it is this way. I assume that there's some benefit for designing it this way. I'm hoping that it's not simply accidental, historical or easier for the compiler writer.

There's a problem that arises with pass-by-value subclasses called "object slicing". Effectively, it's possible to "slice off" members of superclasses. It's one of many pitfalls to be avoided in C++. There you typically find the convention of structs being used as POD (Plain Old Data) types (i.e., no inheritance) and classes when you want inheritance, in which case they are passed around to functions as references.

For reference:
https://stackoverflow.com/questions/274626/what-is-object-slicing

D basically bakes the C++ convention into the language, with a class system inspired by Java.
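To make the "no slicing" point concrete, here is a minimal sketch with hypothetical `Animal` and `Cat` classes (the names are illustrative, not from any library). Because a class variable in D is only a reference, assigning a `Cat` to an `Animal` variable copies the reference and leaves the object whole:

```d
import std.stdio;

// Hypothetical types, used only for illustration.
class Animal {
    string sound() { return "..."; }
}

class Cat : Animal {
    override string sound() { return "meow"; }
}

// The parameter is "by value", but what gets copied is the reference.
string describe(Animal a) {
    return a.sound();
}

void main() {
    Cat cat = new Cat;
    Animal a = cat;          // copies the reference, not the object
    writeln(describe(a));    // virtual dispatch still reaches Cat.sound
    writeln(a is cat);       // both names refer to the same instance
}
```

The equivalent by-value assignment in C++ would copy only the `Animal` part of the object; in D there is no syntax that does that to a class instance.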

> One problem that this causes is that I have to remember different rules when using them. This creates the additional load of learning and remembering which types are which from someone else's library.

Of all the complexity we need to remember as programmers, this is fairly low on the totem pole. Simple rule: if you need (or want) inheritance, use classes. If not, use structs.

> A bigger problem is that, if I have a struct that I suddenly want to inherit from, I have to change all my code.

You should generally know up front if you need inheritance or not. In cases where you change your mind, you'll likely find that you have very little code to change. Variable declarations, sure. And if you were passing struct instances to functions, you'd want to change the function signatures, but that should be the lion's share of what you'd need to change. After all, D doesn't use the -> syntax for struct pointers, so any members you access would be via the dot operator.
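As a rough illustration of why member access rarely needs to change, the sketch below uses two hypothetical types, `PointS` and `PointC`. Access through a struct pointer and through a class reference both use the dot operator:

```d
// Hypothetical types, used only for illustration.
struct PointS { int x; }
class  PointC { int x; }

// Access through a struct pointer uses the dot, not ->.
int readS(PointS* p) { return p.x; }

// Access through a class reference looks exactly the same.
int readC(PointC c) { return c.x; }

void main() {
    auto s = PointS(1);
    auto c = new PointC;
    c.x = 2;
    assert(readS(&s) == 1);
    assert(readC(c) == 2);
}
```

The function bodies are identical; only the declarations and signatures differ between the struct and class versions.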

> In addition to that work, in both of these cases, one could easily do it wrong:
>
> // Fine with a struct, fatal with a class.
> Foo foo;
>
> At least in C++, the compiler would complain. In D, not even a warning.

You generally find out about that pretty quickly in development, though. That's a good reason to get into the habit of implementing and running unit tests, so if you do make changes and overlook something like this, then your tests will catch it if normal operation of the program doesn't.

> Why is it this way? What is the harm of putting a class object on the stack?

I've answered the "why" above. As to the second question, there's no harm in putting a class on the stack:

import std.stdio;

class Clazz {
    ~this() { writeln("Bu-bye"); }
}

void clazzOnStack() {
    writeln("Entered");
    scope c = new Clazz;
    writeln("Leaving");
}

void main()
{
    clazzOnStack();
    writeln("Back in main");
}

You'll find here that the destructor of c in clazzOnStack is called when the function exits, just as if it were a struct. scope in a class variable declaration will cause the class to be allocated on the stack.

Note, though, that c is still a reference to the instance. You aren't manipulating the class instance directly. If you were to pass c to a function doSomething that accepts a Clazz handle, it makes no difference that the instance is allocated on the stack. doSomething would neither know nor care. c is a handle, so you aren't passing the instance directly and it doesn't matter where it's allocated.

There's more to the story than just reference type vs. value type. Structs have deterministic destruction, classes by default do not (scope can give it to you as demonstrated above). See my blog post on the topic for some info. (And I'm reminded I need to write the next article in that series; time goes by too fast).
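For comparison with the class example above, here is a small sketch of deterministic destruction with a struct. `Guard` and the `events` log are hypothetical names introduced only for this illustration:

```d
import std.stdio;

string[] events;   // records the order in which things happen

// Hypothetical struct, used only for illustration.
struct Guard {
    string name;
    ~this() { events ~= "destroyed " ~ name; }
}

void useGuard() {
    auto g = Guard("g");
    events ~= "in scope";
}   // g's destructor runs here, when the scope ends

void main() {
    useGuard();
    events ~= "after call";
    writeln(events);
}
```

The destructor entry lands between "in scope" and "after call", with no `scope` keyword and no garbage collector involved.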

Everyone has their own criteria for when to choose class and when to choose struct. For me, I default to struct. I consider beforehand if I need inheritance, and if yes, then I ask myself if I can get by without deterministic destruction. There are ways to simulate inheritance with structs, and ways to have more control over destruction with classes, so there are options either way.

May 15, 2022

Re: Why are structs and classes so different?

Posted by Ali Çehreli
in reply to Kevin Bailey

On 5/15/22 08:26, Kevin Bailey wrote:

> structs and classes are so different.

I think a more fundamental question is why structs and classes both exist at all. If they could be the same, one kind would be sufficient. And the answer is there are value types and there are reference types in programming.

Ali

May 15, 2022

Re: Why are structs and classes so different?

Posted by Kevin Bailey
in reply to Mike Parker

Hi Mike (and Guillaume, since you posted the same link),

Thanks for the long explanation.

I've been programming in C++ full time for 32 years, so I'm familiar with slicing. It doesn't look to me like there's a concern here.

There seem to be a couple different questions here. I suspect that you answered a different one than I asked.

One question is, how should we pass objects - by value or by reference? In C++, you can do either, of course, but you take your chances if you pass by value - both in safety AND PERFORMANCE. The bottom line is that no one passes by value, even for PODs (although we may return even large objects.)

But I asked a different question: Why can't I put a class object on the stack? What's the danger?

Note that operating on that object hasn't changed. If I pass by reference, it's no different than if I had created a reference.

One might say, Well, if D creates by value, then it has to pass by value. But it doesn't; it has the 'ref' keyword.

One might want to avoid passing by value accidentally. Ok, one could have D pass class objects by reference implicitly.

I don't like things silently changing like that, so one might have D forbid all but pass by 'ref' or pointer for class objects.

In any case, this doesn't quite address the instantiate by value issue.

If there's still a case where putting an object on the stack breaks, I would greatly appreciate seeing a few lines of example code.

I hope Ali's answer isn't the real reason. I would be sad if D risked seg faults just to make class behavior "consistent".

thx

May 15, 2022

Re: Why are structs and classes so different?

Posted by Ola Fosheim Grøstad
in reply to Kevin Bailey

On Sunday, 15 May 2022 at 20:05:05 UTC, Kevin Bailey wrote:

> I've been programming in C++ full time for 32 years, so I'm familiar with slicing. It doesn't look to me like there's a concern here.

Yes, slicing is not the issue. Slicing is a problem if you do "assignments" through a reference that is typed as the superclass… so references won't help.

The original idea might have been that structs were value-types, but that is not the case in D. They are not, you can take the address…

So what you effectively have is that structs follow C layout rules and D classes are not required to (AFAIK), but there is an ABI and C++ layout classes, so that freedom is somewhat limited… D classes also have a mutex for monitors.

In an ideal world one might be tempted to think that classes were ideal candidates for alternative memory management mechanisms since they are allocated on the heap. Sadly this is also not true since D is a system level programming language and you get to bypass that "type characteristic" and can force them onto the stack if you desire to do so…

May 15, 2022

Re: Why are structs and classes so different?

Posted by Mike Parker
in reply to Kevin Bailey

On Sunday, 15 May 2022 at 20:05:05 UTC, Kevin Bailey wrote:

>
> One question is, how should we pass objects - by value or by reference? In C++, you can do either, of course, but you take your chances if you pass by value - both in safety AND PERFORMANCE. The bottom line is that no one passes by value, even for PODs (although we may return even large objects.)

Pass struct instances by ref or by value as needed, just as you do in C++.

For classes, you never have direct access to the instance. Your class reference is a handle (a pointer) that is always passed by value.

>
> But I asked a different question: Why can't I put a class object on the stack? What's the danger?

I answered that one. You can put a class on the stack with `scope`. There is no danger in that.

If you're wanting direct access to the class instance, like you would have with a struct, you don't have that in D. Classes are modeled on Java, not C++.

>
> Note that operating on that object hasn't changed. If I pass by reference, it's no different than if I had created a reference.

Again, you never have direct access to the object instance. You always access it through the handle.

>
> One might say, Well, if D creates by value, then it has to pass by value. But it doesn't; it has the 'ref' keyword.

Everything is passed by value unless the `ref` keyword is present.
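A short sketch of what that means in practice, using a hypothetical `Counter` class and `Pair` struct. The class handle is copied on each call, so the callee can mutate the shared instance but cannot rebind the caller's variable unless the parameter is `ref`:

```d
// Hypothetical types, used only for illustration.
class  Counter { int n; }
struct Pair    { int a; }

void bump(Counter c)          { c.n++; }            // mutates the shared instance
void rebind(Counter c)        { c = new Counter; }  // changes only the local copy of the handle
void rebindRef(ref Counter c) { c = new Counter; }  // changes the caller's handle

void bumpCopy(Pair p)     { p.a++; }   // works on a copy of the struct
void bumpRef(ref Pair p)  { p.a++; }   // works on the caller's struct

void main() {
    auto c = new Counter;
    auto original = c;

    bump(c);
    assert(c.n == 1);

    rebind(c);
    assert(c is original);             // caller's handle is untouched

    rebindRef(c);
    assert(c !is original && c.n == 0);

    auto p = Pair(0);
    bumpCopy(p);
    assert(p.a == 0);
    bumpRef(p);
    assert(p.a == 1);
}
```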

>
> One might want to avoid passing by value accidentally. Ok, one could have D pass class objects by reference implicitly.

How do you pass by value accidentally? By forgetting the `ref` keyword?

>
> I don't like things silently changing like that, so one might have D forbid all but pass by 'ref' or pointer for class objects.

I don't understand where you're coming from here. How can things silently change?

>
> I hope Ali's answer isn't the real reason. I would be sad if D risked seg faults just to make class behavior "consistent".
>

Where is the risk of seg faults? Are you referring to the fact that class references are default initialized to null?
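If that is the concern, the difference can be sketched as follows, with hypothetical types `S` and `C`. A struct declaration yields a usable value, while a class declaration yields only a null reference until something is assigned to it:

```d
// Hypothetical types, used only for illustration.
struct S { int x; int get() { return x; } }
class  C { int x; int get() { return x; } }

bool classDefaultIsNull() {
    C c;                 // only a reference; no instance exists yet
    return c is null;
}

int structDefaultValue() {
    S s;                 // a complete value with default-initialized fields
    return s.get();
}

void main() {
    assert(structDefaultValue() == 0);
    assert(classDefaultIsNull());

    C c = new C;         // now there is an instance to call methods on
    assert(c.get() == 0);
}
```

Calling `get()` on the null reference is the case the original `Foo foo;` snippet warns about.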

May 15, 2022

Re: Why are structs and classes so different?

Posted by Ali Çehreli
in reply to Kevin Bailey

On 5/15/22 13:05, Kevin Bailey wrote:

> I've been programming in C++ full time for 32 years

Hi from an ex-C++'er. :) I managed to become at least a junior expert in C++ between 1996-2015. I haven't used C++ since then.

I still think my answer is the real one. My implied question remains: Why does C++ have a struct and class distinction? I know they have different default access specifications but does that warrant two kinds?

I claim there are two types in C++ as well: value types and reference types. And types of an inheritance hierarchy are by convention reference types. As others reminded on this thread, C++ programmers follow guidelines to treat types of hierarchies as reference types.

> so I'm familiar
> with slicing. It doesn't look to me like there's a concern here.

Slicing renders types of class hierarchies reference types. They can't be value types because nobody wants to pass a Cat sliced as an Animal. It's always a programmer error.

All D does is (just like C# did) appreciate the differences between these two kinds and utilize existing keywords.

> One question is, how should we pass objects - by value or by reference?
> In C++, you can do either, of course, but you take your chances if you
> pass by value - both in safety AND PERFORMANCE.

D is very different from C++ when it comes to that topic:

- Since classes are reference types, there is no issue with performance whatsoever: It is just a pointer copy behind the scenes.

- Since structs are value types, they can be shallow-copied without any concern. (D disallows self-referencing structs.) Only when it matters, one writes the copy constructor or the post-blit. (And this happens very rarely.)

- rvalues are moved by default. They don't get copied at all. (Only for structs because classes don't have rvalues.)
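As a sketch of the "only when it matters" case from the list above: a struct holding a slice is copied shallowly by default, so both copies see the same elements, and a postblit can be added when independent copies are wanted. `Shallow` and `Deep` are hypothetical names for this illustration:

```d
// Hypothetical types, used only for illustration.
struct Shallow {
    int[] data;          // the default copy duplicates the slice, not the elements
}

struct Deep {
    int[] data;
    this(this) { data = data.dup; }   // postblit: runs on the freshly made copy
}

void main() {
    auto a = Shallow([1, 2, 3]);
    auto b = a;
    b.data[0] = 99;
    assert(a.data[0] == 99);   // both copies share the same elements

    auto c = Deep([1, 2, 3]);
    auto d = c;
    d.data[0] = 99;
    assert(c.data[0] == 1);    // the postblit gave d its own elements
}
```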

> The bottom line is that
> no one passes by value, even for PODs (although we may return even large
> objects.)

I know it very well. In reality, nobody should care unless it matters semantically: Only if the programmer wants to pass an object by reference it should be done so. For example, to mutate an object or store a reference to it.

You must be familiar with the following presentation by Herb Sutter on how parameter passing is a big problem. (Yet, nobody realizes until a speaker like Herb Sutter makes a presentation about it.)

  https://www.youtube.com/watch?v=qx22oxlQmKc&t=923s

Such concerns don't exist in D, especially after fixing the "in parameters" feature. Semantically, the programmer should say "this is an input to this function". The programmer should not be concerned whether the number of bytes is over a threshold for that specific CPU or whether the copy constructor may be expensive.

D does not have such issues. The programmer can do this:

- Compile with -preview=in

- Mark function parameters as in (the ones that are input):

auto foo(in A a, in B b) {
  // ...
}

The compiler should deal with how to pass parameters. The programmer provides the semantics and D follows these rules:

  https://dlang.org/spec/function.html#in-params
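A slightly fuller sketch of an `in` parameter, using a hypothetical `Big` struct. The function only declares that `b` is an input; how the argument is actually passed is left to the compiler (with -preview=in it may choose by-reference for a type this size):

```d
// Hypothetical type, used only for illustration.
struct Big {
    int[64] values;
}

// `in` marks b as a read-only input to the function.
int sum(in Big b) {
    int total;
    foreach (v; b.values)
        total += v;
    // b.values[0] = 1;   // would not compile: an `in` parameter is const
    return total;
}

void main() {
    Big b;
    b.values[] = 1;        // set all 64 elements to 1
    assert(sum(b) == 64);
}
```

The call site looks the same whether the compiler copies the struct or passes its address.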

Although one of my colleagues advises me not to be negative towards C++, having about 20 years of experience with C++, I am confident C++ got this wrong and D got this right. D programmers don't write move constructors or move assignment. Such concepts don't even exist.

In summary, if a programmer has to think about pass-by-reference, that programmer has been conditioned to think that way. It has always been wrong. Passing by reference should have been about semantics. (Herb Sutter uses the word "intent" in that presentation.)

> But I asked a different question: Why can't I put a class object on the
> stack? What's the danger?

There is no danger. One way I like is std.typecons.scoped:

import std.stdio;
import std.typecons;

class C {
  ~this() {
    writeln(__FUNCTION__);
  }
}

void main() {
  {
    auto c = scoped!C();
  }
  writeln("after scope");
}

> Note that operating on that object hasn't changed. If I pass by
> reference, it's no different than if I had created a reference.

(Off-topic: I always wonder whether pass-by-reference comes with performance cost. After all, the members of by-reference struct will have to be accessed through a pointer, right? Shouldn't pass-by-value be faster for certain types? I think so but I never bothered to check the size threshold below which to confidently pass-by-value.)

> One might say, Well, if D creates by value, then it has to pass by
> value. But it doesn't; it has the 'ref' keyword.

That's only when one wants to pass a reference to an object. I blindly pass structs by-value. The reason is, I don't think any struct is really large to cost byte copying. It's just shallow copy and it works. (Note that there are not many copy constructors in D.)

> I hope Ali's answer isn't the real reason. I would be sad if D risked
> seg faults just to make class behavior "consistent".

I don't understand the seg fault either but my answer was to underline the fact that D sees two distinct kinds of types: value types and reference types. C++ does have reference types as well but they are implied by convention. Otherwise the programmer hits the slicing issue.

Ali

May 15, 2022

Re: Why are structs and classes so different?

Posted by H. S. Teoh
in reply to Kevin Bailey

On Sun, May 15, 2022 at 08:05:05PM +0000, Kevin Bailey via Digitalmars-d-learn wrote: [...]
> But I asked a different question: Why can't I put a class object on the stack? What's the danger?
[...]

You can. Use core.lifetime.emplace.

Though even there, there's the theoretical problem of stack corruption: if you have an emplaced class object O and you try to assign a derived class object to it, you could end up trashing your stack (the derived object doesn't fit in the stack space allocated to store only a base class instance).  Generally, though, the language would prevent this. In D this doesn't happen because emplace just gives you the class instance as a reference (to a stack location, but nonetheless), and reassignment just updates the reference, it doesn't actually overwrite the base class instance.
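A minimal sketch of the emplace approach, under a few assumptions: `Base` and `Derived` are hypothetical classes, and the buffer is declared as an array of `size_t` on the assumption that pointer alignment is sufficient for `Base`. It also shows that reassigning the reference does not write into the stack buffer:

```d
import core.lifetime : emplace;

// Hypothetical classes, used only for illustration.
class Base {
    int id;
    this(int id) { this.id = id; }
    string kind() { return "Base"; }
}

class Derived : Base {
    long[8] extra;   // makes Derived larger than Base
    this(int id) { super(id); }
    override string kind() { return "Derived"; }
}

string demo() {
    enum size = __traits(classInstanceSize, Base);

    // Stack buffer just big enough for a Base; size_t elements keep it
    // pointer-aligned (assumed to be enough alignment for Base).
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] storage;
    void[] chunk = storage[];

    Base b = emplace!Base(chunk, 1);       // the instance lives in `storage`
    assert(cast(void*) b is chunk.ptr);
    assert(b.kind() == "Base");

    b = new Derived(2);                    // rebinds the reference only
    assert(cast(void*) b !is chunk.ptr);   // `storage` was not overwritten
    return b.kind();
}

void main() {
    assert(demo() == "Derived");
}
```

The reference returned by emplace must not outlive `storage`; this sketch only uses it inside the function that owns the buffer.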


T

-- 
If it tastes good, it's probably bad for you.
