August 16, 2007
Walter Bright wrote:
> Brad Roberts wrote:
>> On Wed, 15 Aug 2007, Walter Bright wrote:
>>
>>> Bill Baxter wrote:
>>>> I'm starting to seriously wonder if it was a good idea to hide the pointers
>>>> to classes.  It seemed kinda neat when I was first using D that I could
>>>> avoid typing *s all the time.  But as time wears on it's starting to seem
>>>> more like a liability.
>>> Having classes be by reference only has some serious advantages:
>>>
>>> 1) Classes are polymorphic and inheritable. By making them reference only, the
>>> "slicing" problem inherent in C++ is completely and cleanly avoided.
>>>
>>> 2) You can't write a class in C++ without paying attention to assignment
>>> overloading and copy constructors. With classes as a reference type, these
>>> issues are simply irrelevant.
>>
>> They're _less_ relevant.  There's still the valid use case of acting like a value to avoid instance sharing, where copy construction and assignment are relevant.  This is where the author of the class makes that decision, rather than the user that you talk about in other responses to this thread.
> 
> I strongly feel that for that use case, one should be using a value type, i.e. a struct, not a class. C++ classes are neither one nor the other, thereby doing neither well.
> 
> 
>> That's the distinction between 'by value' vs 'by reference' and 'as a value' vs 'as references', i.e. access vs behavior.  I like a strong enforcement and distinction between the access part, but I do believe that it should be possible for a class writer to achieve value semantics.
> 
> I don't agree, I think there is much to be gained by drawing a strong distinction. Value and reference types are *fundamentally* different. Classes are an OOP type, and giving them value semantics introduces all the gotchas C++ has with them (like the slicing problem).

The problem I see is that it's really not always clear whether a value or reference type is desired.  Currently I'm thinking about container classes.  Value semantics are convenient and efficient for small containers.  D's built-in containers are basically structs, as is often pointed out.  If you create lots of little Sets here and there, you don't want each Set to require two allocations (one for the object itself, and another when some content is added to it).  Only one allocation is really required there.  But you might also want your Set to implement some interfaces (a la Tango).  Then you're forced into a class, even though you don't particularly want anyone to subclass your Set, and even though you'd rather use it primarily as a value type (for efficiency, and because the intent is a 'final' class, so there will be no derived classes and thus no slicing issues).


--bb
August 16, 2007
Bill Baxter wrote:
> I'm starting to seriously wonder if it was a good idea to hide the pointers to classes.  It seemed kinda neat when I was first using D that I could avoid typing *s all the time.  But as time wears on it's starting to seem more like a liability.


The simple distinction between reference types and value types has irked me from the very beginning.  As a programmer using something that somebody else wrote, I shouldn't have to know its storage type. Distinctions between the two pop up here and there throughout D code. For example, value_type[] and ref_type[] have completely different copy behaviors.
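
To make the copy-behavior difference concrete, here is a rough C++ analogue rather than D itself; `Counter` and both function names are made up for illustration. A vector of values copies its elements, while a vector of shared pointers copies only the references, so both vectors end up pointing at the same objects:

```cpp
#include <memory>
#include <vector>

// Hypothetical element type used only for this sketch.
struct Counter {
    int n = 0;
};

// Value elements: copying the vector copies every Counter.
int original_after_value_copy() {
    std::vector<Counter> a(1);       // one Counter with n == 0
    std::vector<Counter> b = a;      // independent copy of the element
    b[0].n = 42;                     // only b's element changes
    return a[0].n;                   // a's element is untouched
}

// Reference-like elements: copying the vector copies the pointers only.
int original_after_reference_copy() {
    std::vector<std::shared_ptr<Counter>> a;
    a.push_back(std::make_shared<Counter>());
    std::vector<std::shared_ptr<Counter>> b = a;  // same Counter, two handles
    b[0]->n = 42;                    // visible through a as well
    return a[0]->n;
}
```

The first function returns 0 and the second returns 42, which is the kind of surprise a user runs into when the storage kind of the element type is not obvious at the point of use.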


> 
> Bad points:
> - Harder to tell whether you're dealing with a pointer or not
>   (c.f. the common uninitialized 'MyObject obj;' bug)
> - To build on the stack, have to use 'scope'
> 
> Good points:
> + No need to type '*' everywhere when you use class objects
> + ??? anything else ???


I was thinking recently of an interesting syntax twist...  Always require & when trying to get to an address, and use the raw variable name and "." to refer to whatever is pointed to.  Obtaining an address would require &.

It's an interesting change, but would likely be too confusing to switch over.  Somehow I bet even suggesting the idea will mark me as a crackpot :)
August 16, 2007
Bill Baxter, on August 16 at 10:07, you wrote to me:
> So yeh, you've eliminated slicing, but I'm not really convinced it was such a huge problem that it warranted a syntax upheaval in the first place.

I don't have slicing problems either, but I've seen a lot of code with that problem from novice (and not-so-novice) programmers, and it was really hard to debug. One place where it is fairly common is exception handling.
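
A minimal C++ sketch of the exception-handling case, assuming a small made-up hierarchy (`BaseError`, `DerivedError`, and the two helper functions are illustrative names, not from any library). Catching by value copies only the base part of the thrown object, so the override is lost; catching by reference keeps it:

```cpp
#include <string>

// Hypothetical exception hierarchy for this sketch.
struct BaseError {
    virtual ~BaseError() = default;
    virtual std::string name() const { return "BaseError"; }
};

struct DerivedError : BaseError {
    std::string name() const override { return "DerivedError"; }
};

// Catch by value: the handler's parameter is a BaseError copied from the
// thrown DerivedError, so the derived part is sliced away.
std::string caught_by_value() {
    std::string result;
    try {
        throw DerivedError();
    } catch (BaseError e) {
        result = e.name();
    }
    return result;
}

// Catch by reference: the handler sees the original DerivedError object.
std::string caught_by_reference() {
    std::string result;
    try {
        throw DerivedError();
    } catch (const BaseError& e) {
        result = e.name();
    }
    return result;
}
```

Here `caught_by_value()` yields "BaseError" while `caught_by_reference()` yields "DerivedError". Both versions compile, which is part of why the bug is easy to miss.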

-- 
Leandro Lucarella (luca) | Collective blog: http://www.mazziblog.com.ar/blog/
 .------------------------------------------------------------------------,
  \  GPG: 5F5A8D05 // F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05 /
   '--------------------------------------------------------------------'
Love is like an orthopedic queen.
	-- Poroto
August 16, 2007
Jason House wrote:
> Bill Baxter wrote:
>> I'm starting to seriously wonder if it was a good idea to hide the pointers to classes.  It seemed kinda neat when I was first using D that I could avoid typing *s all the time.  But as time wears on it's starting to seem more like a liability.
> 
> 
> The simple distinction between reference types and value types has irked me from the very beginning.  As a programmer using something that somebody else wrote, I shouldn't have to know its storage type. Distinctions between the two pop up here and there throughout D code. For example, value_type[] and ref_type[] have completely different copy behaviors.
> 
> 
>>
>> Bad points:
>> - Harder to tell whether you're dealing with a pointer or not
>>   (c.f. the common uninitialized 'MyObject obj;' bug)
>> - To build on the stack, have to use 'scope'
>>
>> Good points:
>> + No need to type '*' everywhere when you use class objects
>> + ??? anything else ???
> 
> 
> I was thinking recently of an interesting syntax twist...  Always require & when trying to get to an address, and use the raw variable name and "." to refer to whatever is pointed to.  Obtaining an address would require &.
> 
> It's an interesting change, but would likely be too confusing to switch over.  Somehow I bet even suggesting the idea will mark me as a crackpot :)

Well realistically none of this is likely to change anyway, so you're free to suggest anything you want.  :-)
I was thinking about something like that as well, though.  But couldn't really think how to make it useful.

My thinking was that in D we have classes sort of "shifted" by one from structs

          pointer-to-pointer   pointer   value
struct    &(&p)                &p        p
struct*   &p                   p         *p
class     &p                   p         N/A

So instead of that make structs default to pointers too, to shift everything back to be the same:

          pointer-to-pointer   pointer   value
struct#   &(&p)                &p        p
struct    &p                   p         *p
class     &p                   p         N/A

'#' (or whatever -- perhaps re-use 'scope') becomes the "by value" indicator.  So to do a struct-by-value you'd declare:

  MyStruct# Foo;

while
  MyStruct Bar;

would be a pointer/reference just like MyClass is.

Some problems are:
- what do you do about built-in value types like 'int'?  Doesn't make sense to make 'int x' be a pointer type, IMHO.  And if the built-in types behave differently from structs have you gained much?
- for structs, as-value should be the common case, so you'll have to use a '#' most every time you use a struct.
- it still won't make the struct -> class transition all that much easier unless classes gain the ability to be passed around by value.

--bb
August 16, 2007
Walter Bright wrote:
> Gregor Richards wrote:
>> OK, OK, I guess I should respond with an argument from computer science as well :)
>>
>> In the normal definition of Object Orientation, an object is a means of storing a context in which operations can be performed. The abstraction behind this (yay we're modeling the universe in code but realistically we aren't yay) is irrelevant, as fundamentally OO is just a means of storing and passing contexts. Because this is a context, it makes no sense whatsoever to pass it around with duplication - duplicating contexts is nonsense.
>>
>> structs are sort of a hack for compatibility and/or optimization. They are not contexts, they are means of creating more complicated values. While a "Point" could be a struct, really being a more complicated value, an "NPC" would always be a class, since it is a context.
>>
>> The fact that this is inconsistent with C++ is irrelevant: D is more to the spirit of good OO.
> 
> You put your finger on the very good reason why polymorphic, inheritable types in D are restricted to being reference types, not value types. OOP requires this characteristic.

For one, rather restricted, notion of OOP.  There are many, many views of what constitutes OOP in the PL community.

> In C++, an OOP class can be used/misused by the user as a value type or a reference type, all out of the purview of the class designer. The class designer must control this, not the class user.

It's normal in C++ to make "entity" classes (those that you're calling reference types) noncopyable.  It's also normal to make base classes abstract.  Thus idioms easily prevent the basic misuses.
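
A sketch of those two idioms together, assuming C++11 or later for `= delete` (older code would declare the copy operations private or use a helper such as boost::noncopyable). `Entity`, `Npc`, and `describe_entity` are hypothetical names for illustration:

```cpp
#include <string>
#include <type_traits>

// Abstract, noncopyable base: it cannot be instantiated on its own and
// cannot be copied, so there is nothing to slice into.
class Entity {
public:
    virtual ~Entity() = default;
    virtual std::string describe() const = 0;

    Entity(const Entity&) = delete;
    Entity& operator=(const Entity&) = delete;

protected:
    Entity() = default;
};

// Concrete entity: copying is implicitly disabled because the base forbids it.
class Npc : public Entity {
public:
    std::string describe() const override { return "npc"; }
};

// The compiler enforces the idiom; these checks document that fact.
static_assert(std::is_abstract<Entity>::value, "Entity must be abstract");
static_assert(!std::is_copy_constructible<Npc>::value, "Npc must not be copyable");
static_assert(!std::is_copy_assignable<Npc>::value, "Npc must not be assignable");

// Entities are handed around by reference only.
std::string describe_entity(const Entity& e) { return e.describe(); }
```

With this in place, an attempt such as `Npc a; Npc b = a;` is rejected at compile time, while `describe_entity(a)` still dispatches polymorphically.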

I can't think of any reason why a value type would object to the identity of its objects being used (though functional languages often hide object identities).

-- James
August 16, 2007
Johan Granberg wrote:
> Russell Lewis wrote:
>> I don't see any fundamental reason why classes need to
>> be reference types, other than history.
> 
> What about this situation.
> 
> //begin C++
> 
> class A{
>         int val;
> };
> class B:public A{
>         int foo;
> };
> 
> int main(int argc,char**argvs){
>         A a;
>         B b;
>         a=b;//HERE what happens to b's member foo?
> }
> 
> //end C++
> 
> it's my impression that D's classes are reference types to avoid that specific problem.

// begin non-naive C++

class A { // a base class
public:
  int val;
protected:
  A() {}
};
class B : public A {
public:
  int foo;
};
int main() {
  A a; // compilation error here.
  B b;
  a = b; // mu
}

// end C++

It's a slight exaggeration to say that much of D's design exists to compensate for programmers finding it too much trouble/too hard to learn idioms needed to use C++ safely.

However, value semantics (such as comparing for equality) don't work well with polymorphism.  C++ avoids this problem when used conventionally; I'm not sure if D falls into the same trap as most OO languages by allowing equality comparisons between objects of different classes.
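
To illustrate the equality trap, here is a minimal sketch written in the reference-style OO manner (not the conventional C++ value style). `Point`, `ColoredPoint`, and `equals` are made-up names for illustration. Comparing across classes gives an asymmetric result:

```cpp
// Hypothetical polymorphic base with a virtual equality test.
struct Point {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}
    virtual ~Point() = default;

    // Compares coordinates only; it knows nothing about derived state.
    virtual bool equals(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// Derived class that adds state and tightens the comparison.
struct ColoredPoint : Point {
    int color;
    ColoredPoint(int x, int y, int color) : Point(x, y), color(color) {}

    // Equal only to another ColoredPoint with matching color.
    bool equals(const Point& other) const override {
        const ColoredPoint* cp = dynamic_cast<const ColoredPoint*>(&other);
        return cp != nullptr && Point::equals(other) && color == cp->color;
    }
};
```

Given `Point p(1, 2)` and `ColoredPoint c(1, 2, 3)`, `p.equals(c)` is true but `c.equals(p)` is false, so the relation is no longer symmetric.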

-- James
August 16, 2007
Walter Bright wrote:
> Johan Granberg wrote:
>> Russell Lewis wrote:
>>> I don't see any fundamental reason why classes need to
>>> be reference types, other than history.
>>
>> What about this situation.
>>
>> //begin C++
>>
>> class A{
>>         int val;
>> };
>> class B:public A{
>>         int foo;
>> };
>>
>> int main(int argc,char**argvs){
>>         A a;
>>         B b;
>>         a=b;//HERE what happens to b's member foo?
>> }
>>
>> //end C++
>>
>> it's my impression that D's classes are reference types to avoid that specific problem.
> 
> That's known as the 'slicing' problem. It's pernicious in that it can be extremely hard to expose via testing or code reviews, yet will expose the program to unpredictable behavior.

It's trivially detected by various automated tools, which can flag any non-abstract base class.  (Such classes almost invariably indicate bad design in any case.)  Clearly it would be simple for a compiler to detect when a concrete class was used as a base class.  There's no need to remove value semantics in order to solve this problem; it's something of a sledgehammer solution.

-- James

August 16, 2007
Walter Bright wrote:
> Bill Baxter wrote:
>> I'm starting to seriously wonder if it was a good idea to hide the pointers to classes.  It seemed kinda neat when I was first using D that I could avoid typing *s all the time.  But as time wears on it's starting to seem more like a liability.
> 
> Having classes be by reference only has some serious advantages:
> 
> 1) Classes are polymorphic and inheritable. By making them reference only, the "slicing" problem inherent in C++ is completely and cleanly avoided.

Trivially avoided in C++ also.

> 2) You can't write a class in C++ without paying attention to assignment overloading and copy constructors. With classes as a reference type, these issues are simply irrelevant.

Trivial in C++: entity types are declared as "noncopyable", base classes are made abstract, and problems disappear.

> 3) Value types just don't work for polymorphic behavior. They must be by
> reference. There's no way in C++ to ensure that your class instances are
> used properly by reference only (hence (2)). In fact, in C++, it's
> *extra work* to use them properly.

Declaring them as noncopyable seems to avoid the problems to which you refer, and should be normal for entity types in C++.

(I'm not sure which extra work you're referring to: using
some form of GC for memory management?)

> Value types are fundamentally different from reference types. D gives you the choice.

C++ gives you *more* choice by allowing easier migration between the two without imposing performance or syntactic differences.

-- James
August 16, 2007
James Dennett Wrote:

> It's trivially detected by various automated tools, which can flag any non-abstract base class.  (Such classes almost invariably indicate bad design in any case.)

Now that's just patently untrue. I've never done any serious work in C++ outside of school, so perhaps that's a belief some people hold in the C++ world, but I doubt it's a very common one.

I'd say it's about an 80-20 split for code I write and 60-40 in standard libraries. That is, 80% of the time that I'm extending something I am indeed extending an abstract base, but 20% of the time extending a concrete class is helpful and, in the case of libraries you can't change, necessary.

Here's a real-world example. At work recently (I don't think this violates my NDA...) I was asked to write a fake POP & IMAP server (it gives a certain number of new messages per hour to every account for load testing) in Java. I have a class UserAcct which is used to track persistent accounts so that UIDs remain contiguous. When I added support for IMAP idle, some additional information was needed to handle idling subscriptions/send events/etc. My options were to either create an "imap idle subscription" class to wrap a subscription with a pointer to the UserAcct class (composition) or to extend the UserAcct with a SubscribedUserAcct (only a small fraction of the users would have IMAP idle enabled), or just to add some extra fields to UserAcct.

I selected the second option (extension) here. Arguably, composition may have been the better choice from a design standpoint, and I certainly respect this opinion. However:

- Composition removes polymorphism: The obvious one. If I needed to override or change a function of the base class, this would not be an option with composition. In addition, were I using a wrapper class, I could no longer keep a single map of "UserAcct" objects; I would need two different data structures, one for the wrappers and one for the accounts.

- Composition costs additional memory: the extra 8 bytes (in Java and D, not sure about C++) of overhead for the additional object, plus another 4 for the reference (actually, that averages to 6 because the Sun JVM likes to align classes on 8-byte word boundaries). Since I had recently run into some OutOfMemory errors when it was loaded with 50,000+ accounts, I was loath to incur any more heap usage than I had to.
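
A rough sketch of the extension option, written in C++ rather than the Java of the actual project. Only the class names `UserAcct` and `SubscribedUserAcct` come from the description above; the `supports_idle` method, the `AcctMap` alias, and the account names are assumptions for illustration. The point is that one map of base-class pointers holds both kinds of account:

```cpp
#include <map>
#include <memory>
#include <string>

// Concrete base class: an ordinary account.
class UserAcct {
public:
    virtual ~UserAcct() = default;
    // Hypothetical hook; plain accounts have no IMAP idle subscription.
    virtual bool supports_idle() const { return false; }
};

// Extension for the small fraction of accounts with IMAP idle enabled.
class SubscribedUserAcct : public UserAcct {
public:
    bool supports_idle() const override { return true; }
};

// A single data structure covers both kinds of account.
using AcctMap = std::map<std::string, std::unique_ptr<UserAcct>>;

AcctMap make_accounts() {
    AcctMap accounts;
    accounts["alice"] = std::unique_ptr<UserAcct>(new UserAcct());
    accounts["bob"] = std::unique_ptr<UserAcct>(new SubscribedUserAcct());
    return accounts;
}

// Counts accounts with idle support without knowing their concrete type.
int count_idle(const AcctMap& accounts) {
    int n = 0;
    for (const auto& entry : accounts) {
        if (entry.second->supports_idle()) {
            ++n;
        }
    }
    return n;
}
```

With a wrapper class instead, the subscribed accounts would live in a second container alongside this one.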

Anyways, this is getting a bit off-topic. However a sweeping statement like "all base classes should be abstract" just seems very wrong to me.

I personally think D's behavior is perfect, though I think non-polymorphic struct inheritance (i.e. as syntactic sugar for mixins) and constructors would be nice. The distinction, IMO, shows D to be a more mature language with different use-cases for structs and classes clearly defined.

That, and you've mentioned specific idioms in a few of your messages. Idioms are great, but not everybody follows them (especially people new to a language). Making something part of the language standardizes it. That's why having unittests, asserts, preconditions, etc., in D is such a boon: while that could all be done using a library, there'd probably be three or four different libraries out there to do it and 75% of users wouldn't know about them or use those features at all.
August 16, 2007
On Thu, 16 Aug 2007 01:35:17 +0400, Walter Bright <newshound1@digitalmars.com> wrote:

> Bill Baxter wrote:
>> I'm starting to seriously wonder if it was a good idea to hide the pointers to classes.  It seemed kinda neat when I was first using D that I could avoid typing *s all the time.  But as time wears on it's starting to seem more like a liability.
>
> Having classes be by reference only has some serious advantages:
>
> 1) Classes are polymorphic and inheritable. By making them reference only, the "slicing" problem inherent in C++ is completely and cleanly avoided.

From my experience, this is a problem of C++ beginners.

> 2) You can't write a class in C++ without paying attention to assignment overloading and copy constructors. With classes as a reference type, these issues are simply irrelevant.
>
> 3) Value types just don't work for polymorphic behavior. They must be by reference. There's no way in C++ to ensure that your class instances are used properly by reference only (hence (2)). In fact, in C++, it's *extra work* to use them properly.

But on the other hand, all types in C++ (built-in or user-defined) are first-class citizens. This is especially important in generic programming, where I can write:

template< class T >
class ValueHolder {
  T * m_value;
public :
  ValueHolder() : m_value( new T() ) {}
  ...
};

and this code will work with int, with std::string, and with ValueHolder<SomeAnotherType> alike.
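
For reference, a self-contained sketch of how such a holder might be fleshed out. The destructor, the copy operations, and the `get` accessor are assumptions added for illustration; the original snippet elides those details:

```cpp
#include <string>

template <class T>
class ValueHolder {
    T* m_value;

public:
    // Value-initialises the held object: 0 for int, "" for std::string.
    ValueHolder() : m_value(new T()) {}

    // Deep copy so that each holder owns its own T (assumed semantics).
    ValueHolder(const ValueHolder& other) : m_value(new T(*other.m_value)) {}

    ValueHolder& operator=(const ValueHolder& other) {
        if (this != &other) {
            *m_value = *other.m_value;
        }
        return *this;
    }

    ~ValueHolder() { delete m_value; }

    // Hypothetical accessors, not part of the original snippet.
    T& get() { return *m_value; }
    const T& get() const { return *m_value; }
};
```

The same template instantiates unchanged for a built-in type, a library class, and itself, for example `ValueHolder<int>`, `ValueHolder<std::string>`, and `ValueHolder<ValueHolder<int>>`.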

> Value types are fundamentally different from reference types. D gives you the choice.

Coming from languages where there is no distinction between various kinds of types (e.g. C++/Eiffel/Ruby), D looks very strange here. And this is one of the difficulties in learning D.

-- 
Regards,
Yauheni Akhotnikau