August 15, 2007
Deewiant wrote:
> Bill Baxter wrote:
>> In large part it all stems from the decision to make classes and structs
>> different.  That seems nice in theory, but as time goes on the two are
>> becoming more and more alike.  Classes can now be stack-allocated with
>> scope.  Structs are supposedly going to get constructors, and maybe even
>> destructors.  It kind of makes me think that maybe separating structs
>> and classes was a bad idea.  I have yet to ever once say to myself
>> "thank goodness structs and classes are different in D!", though plenty
>> of times I've muttered "dangit why can't I just get a default copy
>> constructor for this class" or "shoot, these static opCalls are
>> annoying" or "dang, I wish I could get a weak reference to a struct", etc.
>>
> 
> I think the separation was a good idea. I like structs not having hidden fields,
> being laid out in memory exactly as I write the code, and having no performance
> overhead due to vtables, etc. 

Well, in C++ you only get those things if you declare one or more functions as virtual.  D automates flagging functions with 'virtual', so you couldn't do the same thing in D.  But you could imagine a D with a 'nonvirtual' keyword that if stuck into a class would make it POD.
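
For anyone who wants to see the C++ side of this concretely, here is a minimal sketch. The type names `Plain` and `WithVirtual` are made up for illustration, and the size checks assume a mainstream compiler (the standard itself does not promise exact sizes).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types, invented for this illustration.
struct Plain {
    int x;
    int y;
    int sum() const { return x + y; }  // non-virtual: adds no hidden field
};

struct WithVirtual {
    int x;
    int y;
    virtual int sum() const { return x + y; }  // virtual: object gains a hidden vtable pointer
    virtual ~WithVirtual() = default;
};

// Plain keeps C-style layout and can be copied bytewise.
static_assert(std::is_standard_layout<Plain>::value, "Plain has C-compatible layout");
static_assert(std::is_trivially_copyable<Plain>::value, "Plain is bytewise copyable");
static_assert(!std::is_polymorphic<Plain>::value, "Plain has no virtual functions");

// One 'virtual' is enough to lose those properties.
static_assert(std::is_polymorphic<WithVirtual>::value, "WithVirtual is polymorphic");
static_assert(!std::is_standard_layout<WithVirtual>::value, "layout is no longer plain");
static_assert(!std::is_trivially_copyable<WithVirtual>::value, "no bytewise copy guarantee");

// Size checks: true on mainstream compilers, not guaranteed by the standard.
void check_layout() {
    assert(sizeof(Plain) == 2 * sizeof(int));     // exactly the declared fields
    assert(sizeof(WithVirtual) > sizeof(Plain));  // room for the hidden pointer
}
```

The two types are used identically at the call site (`obj.sum()`), which is the property being discussed: the switch is one keyword in the declaration.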

> In C++, the only difference is that structs
> default to public while classes default to private. Where's the use in that?

Very little point in it, I agree.  C++ could have just stuck with 'struct' and been just fine.  Part of my point was that maybe we didn't really need two different things in D either.

> Just use a class for everything - and most people do. I like my POD datatype,
> thank you very much.

I like my POD datatypes too.  And I like the fact that in C++ I can change from POD to not-POD by adding one little word to the class/struct ("virtual").  Almost all code that relies on that POD type will still work if it gains a vtable and one or more virtual methods.  In D it's much more difficult.  static opCalls need to be changed to this() constructors, and every bit of code that uses the thing will have to be changed -- mostly Foo*'s will need to be changed to Foo's.  But some places may be better off changing to scope Foo.  So you need to look closely at the code you're changing.

> With that said, I agree that hiding class pointers may not have been a good
> idea. Still, I can't think of many points which would make it a particularly bad
> idea.

I guess the main down side is just the inconsistency of value/ptr usages --  Foo/Foo* on the one hand and scope Foo/Foo on the other.  It's easy to get used to lack of pointers in Java because it's consistent.  But in D you have to go back and forth between C-like and Java-like line-by-line in your code.

Put it this way, if I download someone's container library, I don't want to have to think about whether their CoolSet is a class or a struct in order to create one by value.  Note that built-in containers are basically struct-like, so in some ways it makes sense for user defined containers to be structs, too.  And C++ STL containers are almost always used by-value (i.e. you almost never see "new std::vector()").  But on the other hand a container may want to use interfaces or inheritance in the implementation, and writing static opCalls is just a pain, so it also makes sense to use a class.  The fact that it could reasonably be either class or struct means there's one more bit of information I have to keep in mind for every user-defined type in my code.

> Defaulting to stack allocation somehow would be better: it's easy to write
> "auto" instead of "scope" and forget about it. But then, the performance in such
> cases rarely matters, and when it does, you do notice such things because you're
> looking for them.

I don't think the performance is as much of an issue as consistency.

--bb
August 15, 2007
Gregor Richards wrote:
> Bill Baxter wrote:
>> Gregor Richards wrote:
>>> #1 advantage of making them always by-value: Ridiculously inconsistent use of by-value passing, ref passing and pointer passing.
>>>
>>> void foo(Thing a);
>>> void foo(ref Thing a);
>>> void foo(Thing *a);
>>>
>>> Wait, that's not an advantage at all, that's C++.
>>
>> I don't get your point.  Each of those things has different uses in C++.
>>
>> // if Thing is small/PoD and you don't want changes to affect caller
>> void foo(Thing a);
>>
>> // if Thing is bigger, you want to allow caller-visible changes, and you require calling with non-null
>> void foo(ref Thing a);
>>
>> // same as above, but you want to allow null too
>> void foo(Thing *a);
>>
>>
>> The ref-can't-be-null thing may not hold for D, but it's true of C++.
>>
>> --bb
> 
> This just makes writing code more complicated. It's difficult to remember whether some function wants a reference or a pointer, and you have to be careful about how you treat them because they're fundamentally different (see the operator overloading post). The advantage is virtually nil, and it would create this huge complication. That's why it's terrible in C++, and that's why it would be terrible in D.

D functions can also take pointers, values or references, so I don't see that D is really that different in this respect.  I guess you mean that, *if* you're talking about a class type, then you can be pretty certain you won't need to take an address when calling a function, as in foo(&theThing).  Yeh, I guess that is a positive point of hiding pointers.  Of course it comes at the expense of making it impossible to pass a class by value.  You could also just outlaw pass-by-value for classes, and automatically take the address of any value classes passed to functions expecting a class pointer.  So I don't think it's strictly necessary to hide the pointer in order to get this benefit you speak of.
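
A minimal C++ sketch of the three calling conventions under discussion, in case it helps. `Thing` and the function names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>

struct Thing { int value; };  // hypothetical type for illustration

// By value: the callee works on a copy, so the caller's object is untouched.
void set_by_value(Thing t) { t.value = 1; }

// By reference: changes are visible to the caller, and a valid call
// always binds to an actual object.
void set_by_ref(Thing& t) { t.value = 2; }

// By pointer: changes are visible to the caller, and null is an allowed
// "no object" argument that the callee has to check for.
bool set_by_ptr(Thing* t) {
    if (t == nullptr) return false;
    t->value = 3;
    return true;
}
```

The call sites differ as well: `set_by_value(a)` and `set_by_ref(a)` look identical, while `set_by_ptr(&a)` needs the explicit address-of. That last difference is the one hiding the pointer removes for class types.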

--bb
August 15, 2007
Bill Baxter wrote:
> I like my POD datatypes too.  And I like the fact that in C++ I can change from POD to not-POD by adding one little word to the class/struct ("virtual").  Almost all code that relies on that POD type will still work if it gains a vtable and one or more virtual methods.  In D it's much more difficult.  static opCalls need to be changed to this() constructors, and every bit of code that uses the thing will have to be changed -- mostly Foo*'s will need to be changed to Foo's.  But some places may be better off changing to scope Foo.  So you need to look closely at the code you're changing.
> 

You're right, explicit virtual helps. Classes could default to virtual, and structs to non-virtual. Now _that_ would be handy. Newbies or people who don't care about this low-level stuff could just use classes for everything as it is now, but struct use would be simpler.

> Put it this way, if I download someone's container library, I don't want to have to think about whether their CoolSet is a class or a struct in order to create one by value.

This I agree with. Even type inference doesn't solve this problem.

What I like about C++ is that "new" means "allocate on the heap". In D, you have to add "unless storing as 'scope'".

> I don't think the performance is as much of an issue as consistency.

I don't see the consistency problem between stack and heap allocation.

In fact, upon further reflection, what heap allocation has going for it is that it's more consistent: you can do "Obj o = new Obj" followed by a "return o", and it works, just as you can do "int x = 5" followed by "return x". If stack allocation were the default, "return o" would be a problem, because it's a pointer to the now-invalid stack.

-- 
Remove ".doesnotlike.spam" from the mail address.
August 15, 2007
Reiner Pope wrote:
> Bill Baxter wrote:
> 
> How about allowing operator overloads on classes which must be treated as reference types?
> 
> ByValue* a;
> // can't overload opAdd usefully:
> assert(a.opAdd(5) != a + 5);
> 
> ByRef b;
> // but we can if the pointer is hidden
> assert(b.opAdd(5) == b + 5);

That's a good point.  But it does kind of fall into the category of "no need to type '*' everywhere".  You could just say *a + 5.

Or in C++ you could use a reference: ByValue&.

But yeh, you pretty much need real reference types like in C++ if you don't hide the pointer.  Another alternative might be to make a distinction between traditional raw memory pointers and pointers to objects.  Then a+5 could automatically dereference 'a' if it's an object pointer.  And if you want to do pointer arithmetic you would need some special syntax.  But now that I think about it, that 'object pointer' is really just a reference type by another name.  It's a pointer that acts like a value by automatically dereferencing when needed.
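
Roughly what the C++ version looks like, as a sketch. `ByValue` is borrowed from the quoted example; the free `operator+` stands in for D's opAdd, and the helper function names are invented here.

```cpp
#include <cassert>

struct ByValue {  // name borrowed from the quoted example
    int n;
};

// Overloaded + on the object type (the C++ counterpart of opAdd).
ByValue operator+(const ByValue& lhs, int rhs) { return ByValue{lhs.n + rhs}; }

int add_through_pointer(ByValue* a) {
    // 'a + 5' would be plain pointer arithmetic here, not the overload,
    // so the pointer has to be dereferenced explicitly.
    return (*a + 5).n;
}

int add_through_reference(ByValue& r) {
    // A reference behaves like the object itself: 'r + 5' calls the overload.
    return (r + 5).n;
}
```

Both functions reach the same overload; the only difference is whether the dereference is spelled out by the programmer or implied by the reference type.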

--bb
August 15, 2007
Deewiant wrote:
> I don't see the consistency problem between stack and heap allocation.
> 
> In fact, upon further reflection, what heap allocation has going for it is that
> it's more consistent: you can do "Obj o = new Obj" followed by a "return o", and
> it works, just as you can do "int x = 5" followed by "return x". If stack
> allocation were the default, "return o" would be a problem, because it's a
> pointer to the now-invalid stack.

...unless storing as 'scope'.  But anyway, I'm not sure what you mean: if by-value were the default, then 'return o' would return a copy of 'o' not a pointer to it.  And even if it did return a pointer to it, it's no different than trying to return a pointer to a scope class, or to any other function-local data.  So maybe you mean "more safe" rather than "more consistent"?
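
In C++ terms the distinction looks something like the sketch below. `Obj` is taken from the quoted example; the function names are hypothetical, and `std::unique_ptr` is used only so the heap case cleans up after itself.

```cpp
#include <cassert>
#include <memory>

struct Obj { int id; };

// Returning by value hands the caller its own object (copied, moved or
// elided); nothing refers to the callee's stack frame afterwards.
Obj make_by_value() {
    Obj o{42};
    return o;
}

// Heap allocation: the object outlives the function that created it.
std::unique_ptr<Obj> make_on_heap() {
    return std::unique_ptr<Obj>(new Obj{42});
}

// The unsafe case is deliberately left commented out:
// Obj* make_dangling() { Obj o{42}; return &o; }  // pointer to a dead stack object
```

So the hazard is tied to returning the address of a local, not to by-value being the default.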

--bb
August 15, 2007
I agree, IMHO this was a mistake.  With automatic type deduction, the lack of '*' doesn't really save any typing anymore, at least for declarations, which (I suspect) are by far the most common use:
	auto temp = new MyClass;

It saves (a tiny bit of) typing on function declarations, and avoids the need for a copy constructor, but we could have eliminated copy constructors simply by making pass-by-value class arguments to functions illegal.

But the worst of it all, IMHO, is what you (and others) have pointed out: the lack of clarity about whether a declaration is a struct-by-value or class-by-reference.  That means that if a container was originally implemented as a struct, you can't ever change it to a class, since that would require rewriting all of the code that uses it.  Doesn't that violate one of the basic principles of Object Orientation - the hiding of implementation?

Russ

Bill Baxter wrote:
> I'm starting to seriously wonder if it was a good idea to hide the pointers to classes.  It seemed kinda neat when I was first using D that I could avoid typing *s all the time.  But as time wears on it's starting to seem more like a liability.
> 
> Bad points:
> - Harder to tell whether you're dealing with a pointer or not
>   (c.f. the common uninitialized 'MyObject obj;' bug)
> - To build on the stack, have to use 'scope'
> 
> Good points:
> + No need to type '*' everywhere when you use class objects
> + ??? anything else ???
> 
> 
> I guess I feel like that if D were Java, I'd never see the pointers, "Foo foo;" would always mean a heap-allocated thing, and everything would be cool. But with D sometimes "Foo foo" means a stack allocated thing, and sometimes it doesn't.  I'm starting to think the extra cognitive load isn't really worth it.
> 
> In large part it all stems from the decision to make classes and structs different.  That seems nice in theory, but as time goes on the two are becoming more and more alike.  Classes can now be stack-allocated with scope.  Structs are supposedly going to get constructors, and maybe even destructors.  It kind of makes me think that maybe separating structs and classes was a bad idea.  I have yet to ever once say to myself "thank goodness structs and classes are different in D!", though plenty of times I've muttered "dangit why can't I just get a default copy constructor for this class" or "shoot, these static opCalls are annoying" or "dang, I wish I could get a weak reference to a struct", etc.
> 
> --bb
August 15, 2007
Russell Lewis wrote:
> But the worst of it all, IMHO, is what you (and others) have pointed
> out: the lack of clarity about whether a declaration is a
> struct-by-value or class-by-reference.  That means that if a container
> was originally implemented as a struct, you can't ever change it to a
> class, since that would require rewriting all of the code that uses it.
>   Doesn't that violate one of the basic principles of Object Orientation
> - the hiding of implementation?

I would tend to disagree. Classes are reference types, structs are value types. When I refactor something from a class to a struct, I *expect* the behavior of function arguments to switch from by-reference to by-value.
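
As a rough C++ analogy of that switch (not D code): `Counter` and both functions are hypothetical, and `std::shared_ptr` merely stands in for a D class reference.

```cpp
#include <cassert>
#include <memory>

struct Counter { int count = 0; };  // hypothetical type for illustration

// Value semantics (roughly a D struct): the parameter is a copy,
// so the caller's Counter does not change.
void bump_copy(Counter c) { c.count += 1; }

// Reference semantics (roughly a D class variable): only the handle is
// copied, and caller and callee share one object.
void bump_shared(std::shared_ptr<Counter> c) { c->count += 1; }
```

The bodies of the two functions are nearly identical; what changes is whether the caller sees the increment.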

Furthermore, since classes *cannot* be passed by value, it makes no sense to have additional syntax, which will lead to beginners writing 'Object o;' instead of 'Object* o;' - which will not compile under your suggestion.
August 15, 2007
Bill Baxter wrote:
> Deewiant wrote:
>> In fact, upon further reflection, what heap allocation has going for it is that it's more consistent: you can do "Obj o = new Obj" followed by a "return o", and it works, just as you can do "int x = 5" followed by "return x". If stack allocation were the default, "return o" would be a problem, because it's a pointer to the now-invalid stack.
> 
> ...unless storing as 'scope'.  But anyway, I'm not sure what you mean: if by-value were the default, then 'return o' would return a copy of 'o' not a pointer to it.

D'oh! Good point.

> And even if it did return a pointer to it, it's no different than trying to return a pointer to a scope class, or to any other function-local data.  So maybe you mean "more safe" rather than "more consistent"?
> 

In accordance with my "d'oh" above, yes, that's what I meant. ;-)

August 15, 2007
Deewiant wrote:
> 
> What I like about C++ is that "new" means "allocate on the heap". In D, you have
> to add "unless storing as 'scope'".

I'm not sure I agree.  Stack allocation from the presence of "scope" is a QOI issue, it's not guaranteed in the spec.  All "scope" really says is "destroy what this reference refers to when the reference goes out of scope."
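
A C++ analogy for "destroyed at end of scope, wherever it lives", as a sketch only. D's `scope` is not defined in terms of `std::unique_ptr`; `Tracked` and the function name are invented for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records when its destructor runs.
struct Tracked {
    explicit Tracked(bool* destroyed) : destroyed_(destroyed) {}
    ~Tracked() { *destroyed_ = true; }
    bool* destroyed_;
};

bool destroyed_at_scope_exit() {
    bool destroyed = false;
    {
        // The object lives on the heap, yet its lifetime is tied to this
        // block: the owning handle destroys it when the block ends.
        std::unique_ptr<Tracked> t(new Tracked(&destroyed));
        if (destroyed) return false;  // still alive inside the block
    }
    return destroyed;  // destructor has run by now
}
```

The guarantee here is about when destruction happens; where the storage comes from is a separate matter, which is the distinction being drawn.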


Sean
August 15, 2007
Tristam MacDonald wrote:
> Russell Lewis wrote:
>> But the worst of it all, IMHO, is what you (and others) have pointed out: the lack of clarity about whether a declaration is a struct-by-value or class-by-reference.  That means that if a container was originally implemented as a struct, you can't ever change it to a class, since that would require rewriting all of the code that uses it.   Doesn't that violate one of the basic principles of Object Orientation - the hiding of implementation?

Tristam, your response felt a little "hotter" than what I prefer.  If I sounded a little flamish in my post, I apologize.  If I'm misreading you, I also apologize!

> I would tend to disagree. Classes are reference types, structs are value types. When I refactor something from a class to a struct, I *expect* the behavior of function arguments to switch from by-reference to by-value.

I think that you made my point for me...if refactoring requires lots of rewrites, then the use-side of the code has too much insight into the implementation.  I don't see any fundamental reason why classes need to be reference types, other than history.  (Yes, they are currently all on the heap, even scope variables, but even that is a decision that I could change, if there was a reason for it.)

That's MHO, anyhow.

> Furthermore, since classes *cannot* be passed by value, it makes no sense to have additional syntax, which will lead to beginners writing 'Object o;' instead of 'Object* o;' - which will not compile under your suggestion.

What's wrong with
	Object o;
?  If I had a magic wand, I would declare that syntax to be equivalent to the current syntax:
	scope Object o = new Object;
or, better, to make it a stack variable.


So the current implementation requires "extra syntax" as well (for scope declarations).  It's a question of which extra syntax is better.  IMHO, making classes consistent with structs is more desirable than eliminating a few stray asterisks.  The beginners who stumble across the "can't pass a class as a by-value function parameter" rule can be easily educated with a well-written error message emitted by the compiler.


I don't have a magic wand, and I don't want everybody to have to rewrite their old code.  But I continue to hope that this might change in 2.0, or maybe 3.0. :)