Thread overview
Design questions : Copying and primitive-likes
Feb 22, 2003: davids
Feb 22, 2003: Sean L. Palmer
Feb 26, 2003: Farmer
Mar 07, 2003: Farmer
Mar 08, 2003: David Simon
Mar 08, 2003: Farmer
February 22, 2003
I'm pretty excited by D; I've always liked C++, but have been annoyed by the known problems that implementations were prevented from fixing due to C compatibility. I really like the idea of a language similar in spirit and feel to C++, sans the known problems and plus a few additional handy features.

But looking over the design, a couple of things puzzle me. I'm not sure if they've been discussed before here, as there's no search mechanism on the web interface and my ISP doesn't have NNTP, so if I'm inadvertently restarting an old flame war, hand me a link and I'll go quiet-like. On to the questions:

First off, I notice that none of the operators you'd need in order to create classes that act like arrays and associative arrays are in the list of overloadable operators. Is this omission intentional or is it just a matter of waiting till someone gets around to implementing it?

In the design overview, I notice that the Java concept of allocating all class objects to the heap is adopted. This has some definite advantages: doing things on the heap by default means that the safest behavior polymorphically is the standard one, and it avoids unnecessary copying.

However, I don't really understand why this hasn't simply been made the default as opposed to the only option, and I also don't understand why primitives are made to behave so differently from class objects. This introduces some problems:

- There's no template-safe way to make a copy of something. Sure, you could have a standard function 'clone()' or somesuch, but having it done by the language is better since it's official, and it could be implemented by D itself in most cases.
- Similarly, there's no template-safe way to make a reference to something. This is because the behavior of class objects differs from primitives; if you do A = B in a template where A and B are of the parameterized type, this will produce different results depending on whether it's a primitive or a class object, and there's no safe way to check except to specialize for each individual primitive type the template supports.
- You cannot make a class object that behaves just like a primitive object in terms of assignment, since you cannot overload operator= (or is this just an accidental omission from the overloadable ops list?)
- Things using Object cannot work with primitives.
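
To make the assignment complaint concrete, here is a minimal sketch written in present-day D syntax, which differs from the 2003 compiler discussed in this thread. The names `assignCopy` and `Counter` are hypothetical and exist only for illustration.

```d
// Hypothetical helper: "copies" its argument using plain assignment.
T assignCopy(T)(T source)
{
    T target = source;   // value copy for int, reference copy for a class
    return target;
}

// Hypothetical class used only to show reference semantics.
class Counter
{
    int value;
    this(int value) { this.value = value; }
}

void main()
{
    int a = 1;
    int b = assignCopy(a);
    b = 2;
    assert(a == 1);          // primitive: b is an independent copy

    auto c = new Counter(1);
    auto d = assignCopy(c);
    d.value = 2;
    assert(c.value == 2);    // class: c and d name the same object
}
```

The same template body therefore copies in one instantiation and aliases in the other, which is the inconsistency described in the list above.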

A good example of a case for class objects needing to have primitive-like semantics is a math library with types for rectangles, points, and so on. These types are not suitable for D structs, since they need methods, but they are not suitable for D classes either, since they're small enough to be considered primitive-like and to be noticeably more efficient on the stack, and since a user of this library could easily be using them all over the place as temporary variables in their own algorithms.


February 22, 2003
<davids@argia.net> wrote in message news:b3779f$fu5$1@digitaldaemon.com...
> I'm pretty excited by D; I've always liked C++, but have been annoyed by the known problems that implementations were prevented from fixing due to C compatibility. I really like the idea of a language similar in spirit and feel to C++, sans the known problems and plus a few additional handy features.

If you ask me, D seems more like a cross between Java and C++ and C.  Some features of all 3.  It's not attempting to go the same route C++ went, even though it has generics and operator overloading.

> But looking over the design, a couple of things puzzle me. I'm not sure if they've been discussed before here, as there's no search mechanism on the web interface and my ISP doesn't have NNTP, so if I'm inadvertently restarting an old flame war, hand me a link and I'll go quiet-like. On to the questions:

Discussion is always good.

> First off, I notice that none of the operators you'd need in order to create classes that act like arrays and associative arrays are in the list of overloadable operators. Is this omission intentional or is it just a matter of waiting till someone gets around to implementing it?

There was a vote, and not enough people voted for them I guess.  I think it's handy to overload array access.  I'd want to be able to overload higher-dimensional forms too, such as operator [int x, int y, int z].
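
As a rough illustration of the multi-argument indexing wished for here, the following sketch uses present-day D syntax, not what the 2003 compiler accepted. The `Grid` type and its members are hypothetical.

```d
// Hypothetical 2-D grid stored in a flat array.
struct Grid
{
    int width;
    int height;
    int[] cells;

    this(int width, int height)
    {
        this.width = width;
        this.height = height;
        cells = new int[width * height];
    }

    // Two-argument index, used as grid[x, y]; returning by ref
    // lets the same overload serve reads and writes.
    ref int opIndex(int x, int y)
    {
        return cells[y * width + x];
    }
}

void main()
{
    auto g = Grid(3, 2);
    g[2, 1] = 42;
    assert(g[2, 1] == 42);
    assert(g.cells[5] == 42);   // row 1, column 2 of a 3-wide grid
}
```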

If you can overload array indexing you should also be able to overload iteration.  There's no standard iteration mechanism yet in D though.  There was some debate a long time ago about being able to apply an operation designed for an array element to all the elements in an array or slice, but this has never actually been implemented yet.  That's the closest thing to iteration that has been seriously considered so far that I've seen.  Lots of talk about it and many proposals but nothing that looks like a possible consensus has been approached.  Do you have any suggestions?

> In the design overview, I notice that the Java concept of allocating all class objects to the heap is adopted. This has some definite advantages: doing things on the heap by default means that the safest behavior polymorphically is the standard one, and it avoids unnecessary copying.

For smallish classes it creates unnecessary allocation.  You can always pass structs by reference.  What it does is make it safer to share pointers to objects.  It doesn't completely solve any safety or robustness problems, and doesn't make things much easier for itself say for implementing a much more efficient form of garbage collection.  Still the same old malloc with a spiffy new generic garbage collector bolted on almost as an afterthought.

> However, I don't really understand why this hasn't simply been made the default as opposed to the only option, and I also don't understand why primitives are made to behave so differently from class objects. This introduces some problems:
>
> - There's no template-safe way to make a copy of something. Sure, you could have a standard function 'clone()' or somesuch, but having it done by the language is better since it's official, and it could be implemented by D itself in most cases.

Agreed.  There are different forms of operator == (and === (that's 3 ='s) ) which compare by value or by reference, but it's unclear what they do if applied to the wrong kind of type (what does === do on a value type?)  But operator = is used to copy by value *and* for copy by reference.  Yes it can be confusing.  There is no standard way of getting what most would call a deep copy of a class object.

> - Similarly, there's no template-safe way to make a reference to something. This is because the behavior of class objects differs from primitives; if you do A = B in a template where A and B are of the parameterized type, this will produce different results depending on whether it's a primitive or a class object, and there's no safe way to check except to specialize for each individual primitive type the template supports.

I'd personally prefer more integration of types myself.

> - You cannot make a class object that behaves just like a primitive object in terms of assignment, since you cannot overload operator= (or is this just an accidental omission from the overloadable ops list?)

D assumes that assignment isn't something you would want to overload.   I don't really agree.

> - Things using Object cannot work with primitives.

Yeah there's no boxing/unboxing like in C#.  Yet.

> A good example of a case for class objects needing to have primitive-like semantics is a math library with types for rectangles, points, and so on. These types are not suitable for D structs, since they need methods, but they are not suitable for D classes either, since they're small enough to be considered primitive-like and to be noticeably more efficient on the stack, and since a user of this library could easily be using them all over the place as temporary variables in their own algorithms.

D structs can have methods and overloaded operators.  They just can't have constructors and destructors.  In fact, D structs are intended for just such primitives as points and rectangles.
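
A small sketch of such a value type, assuming present-day D syntax (the operator-overload spelling in the 2003 compiler was different). `Point` and its members are hypothetical names.

```d
import std.math : sqrt;

// Hypothetical point type with a method and an overloaded '+'.
struct Point
{
    double x = 0;
    double y = 0;

    // Ordinary method on a value type.
    double length() const
    {
        return sqrt(x * x + y * y);
    }

    // Present-day spelling of an overloaded binary '+'.
    Point opBinary(string op)(Point rhs) const if (op == "+")
    {
        return Point(x + rhs.x, y + rhs.y);
    }
}

void main()
{
    Point a = Point(3, 4);
    Point b = a;               // struct assignment copies the fields
    b.x = 0;
    assert(a.x == 3);          // a is unaffected by the change to b
    assert(a.length() == 5);

    Point c = a + Point(1, 1);
    assert(c.x == 4 && c.y == 5);
}
```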

Sean


February 26, 2003
"Sean L. Palmer" <seanpalmer@directvinternet.com> wrote in news:b37dbj$lbe$1@digitaldaemon.com:

> <davids@argia.net> wrote in message news:b3779f$fu5$1@digitaldaemon.com...
>> In the design overview, I notice that the Java concept of allocating all class objects to the heap is adopted. This has some definite advantages: doing things on the heap by default means that the safest behavior polymorphically is the standard one, and it avoids unnecessary copying.
> 
> For smallish classes it creates unnecessary allocation.  You can
> always pass structs by reference.  What it does is make it safer to
> share pointers to objects.  It doesn't completely solve any safety or
> robustness problems, and doesn't make things much easier for itself
> say for implementing a much more efficient form of garbage collection.
>  Still the same old malloc with a spiffy new generic garbage collector
> bolted on almost as an afterthought.
> 
IMO having classes on the gc heap and structs on stack (you cannot bring
them on the gc heap, only the malloc heap) makes life much easier:
-virtual methods are always dispatched dynamically
-no risk of slicing off data when passing parameters by value
-no risk of slicing off data when creating arrays or hashtables for objects
-objects can be returned efficiently by functions
-behaviour of assignments is consistent and predictable for maintainers
-no risk of returning invalid pointers to objects on the stack

Actually, since DMD 0.56 you can allocate objects on the stack, but I have not checked that out yet. I think this feature should be removed from D. It looks unsafe and incomplete, e.g. I want objects embedded within other objects without costly indirection. Anyway, a compiler could automatically detect when it is safe to put objects onto the stack. I read a paper, "Marmot: An Optimizing Compiler for Java", about a native Java compiler that did this.


>> However, I don't really understand why this hasn't simply been made the default as opposed to the only option, and I also don't understand why primitives are made to behave so differently from class objects. This introduces some problems:
>>
>> - There's no template-safe way to make a copy of something.
Maybe there is one, I think you can specialize the templates to simple types and classes (types derived from Object), but a bug in the compiler currently prevents this.


>> Sure, you could have a standard function 'clone()' or somesuch, but having it done by the language is better since it's official, and it could be implemented by D itself in most cases.
Most objects cannot simply be copied; it requires extra thought by the programmer to make things work right.
If you want to make shallow copies or deep copies of your objects you can create a clone method. Implementation is straightforward:
1) Move all member variables that should be shallow-copied into a struct.
2) Use this struct as the member variable of the class.
3) Create a new instance in the clone method and assign the struct member to the new instance.
4) If some members (e.g. some pointers) require a deep copy, you must copy each of them and assign these copies to the new instance.

The code generated by DMD is as efficient as the implicit copy constructor
of C++-compilers, as far as I can judge that.
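
The four steps above might look roughly like this. It is a sketch in present-day D syntax, not code from the original post; `Account`, `Fields`, and `history` are hypothetical names.

```d
// Hypothetical class following the clone recipe.
class Account
{
    // Steps 1 and 2: members that may be shallow-copied live in a struct,
    // and the class holds that struct as a single member.
    struct Fields
    {
        int id;
        int balance;
    }
    Fields fields;

    // A member that needs a deep copy (step 4).
    int[] history;

    Account clone()
    {
        auto copy = new Account;      // step 3: create a new instance
        copy.fields = fields;         // one struct assignment copies all plain members
        copy.history = history.dup;   // step 4: duplicate the array contents
        return copy;
    }
}

void main()
{
    auto a = new Account;
    a.fields.id = 7;
    a.history = [1, 2, 3];

    auto b = a.clone();
    b.fields.id = 8;
    b.history[0] = 99;

    assert(a.fields.id == 7);     // original fields untouched
    assert(a.history[0] == 1);    // original array untouched
}
```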


> 
> Agreed.  There are different forms of operator == (and === (that's 3 ='s) ) which compare by value or by reference, but it's unclear what they do if applied to the wrong kind of type (what does === do on a value type?)  But operator = is used to copy by value *and* for copy by reference.  Yes it can be confusing.  There is no standard way of getting what most would call a deep copy of a class object.
We could use a new operator := for making copywise assignment and operator = for copy-only-reference semantic. But I think, that even operator === should be removed. It will cause just too many bugs for too many people.

	Object o;
	if (o==null)  //BUG must be if (o===null)
	{}
At least if the D compiler would issue a warning here, it would help in
most cases.
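
For comparison, here is a sketch of the two kinds of comparison in present-day D, where the identity operator is spelled `is` rather than `===`. The `Name` class is hypothetical.

```d
// Hypothetical class with value-based equality.
class Name
{
    string text;
    this(string text) { this.text = text; }

    override bool opEquals(Object other)
    {
        auto rhs = cast(Name) other;   // null if other is not a Name
        return rhs !is null && rhs.text == text;
    }
}

void main()
{
    auto a = new Name("d");
    auto b = new Name("d");
    Name c;                  // class references default to null

    assert(a == b);          // value comparison through opEquals
    assert(a !is b);         // identity: two distinct objects
    assert(c is null);       // identity test against null, no method call on c
}
```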


> 
>> - Similarly, there's no template-safe way to make a reference to something. This is because the behavior of class objects differs from primitives; if you do A = B in a template where A and B are of the parameterized type, this will produce different results depending on whether it's a primitive or a class object, and there's no safe way to check except to specialize for each individual primitive type the template supports.
> 
> I'd personally prefer more integration of types myself.
> 
>> - You cannot make a class object that behaves just like a primitive object in terms of assignment, since you cannot overload operator= (or is this just an accidental omission from the overloadable ops list?)
> 
> D assumes that assignment isn't something you would want to overload.
>  I don't really agree.
> 
>> - Things using Object cannot work with primitives.
> 
> Yeah there's no boxing/unboxing like in C#.  Yet.
I don't see how boxing/unboxing as done in C# helps anything, except for attracting Visual Basic people. Primitive types and Object types are made so different for performance reasons. D could make everything behave as an Object, but not the other way round. The boxing/unboxing thing of C# is just syntactic sugar for creating a wrapper object for the primitive value, generated by the compiler behind your back. I guess that if templates had been part of the initial release of .NET, C# would not have boxing/unboxing support. I think boxing is intended mainly for container classes.

>> A good example of a case for class objects needing to have primitive-like semantics is a math library with types for rectangles, points, and so on. These types are not suitable for D structs, since they need methods, but they are not suitable for D classes either, since they're small enough to be considered primitive-like and to be noticeably more efficient on the stack, and since a user of this library could easily be using them all over the place as temporary variables in their own algorithms.
> 
> D structs can have methods and overloaded operators.  They just can't have constructors and destructors.  In fact, D structs are intended for just such primitives as points and rectangles.
I guess constructors and destructors could be added to structs without performance loss, as long as any copy constructor or assignment operator is disallowed. But why is Walter against them? Just because people would ask for copy constructors as soon as he adds constructors? That does not seem rational to me.


Just some thoughts.

Farmer
March 07, 2003
I wrote
> IMO having classes on the gc heap and structs on stack (you cannot bring them on the gc heap, only the malloc heap) makes life much easier:
> -virtual methods are always dispatched dynamically
> -no risk of slicing off data when passing parameters by value
> -no risk of slicing off data when creating arrays or hashtables for objects
> -objects can be returned efficiently by functions
> -behaviour of assignments is consistent and predictable for maintainers
> -no risk of returning invalid pointers to objects on the stack

The primary problem with arrays (or hashtables, depending on the implementation) is not slicing off data, but corrupt data when accessing array elements polymorphically. But for D this is currently not an issue, as covariance of arrays is not allowed.



Just a correction.

Farmer

March 08, 2003
>If you want to make shallow copies or deep copies of your objects you can create a clone method. Implementation is straightforward:
>1) Move all member variables that should be shallow-copied into a struct.
>2) Use this struct as the member variable of the class.
>3) Create a new instance in the clone method and assign the struct member to the new instance.
>4) If some members (e.g. some pointers) require a deep copy, you must copy each of them and assign these copies to the new instance.

The approach you have above has some problems; it doesn't work with templates unless they understand your convention for naming the clone function, and accessing the members of the substruct requires a bit. Another issue is that it's just sort of using a property of struct (copying) as a property of class by using a struct; but why not just make this a property of classes in the first place?

The biggest problem I had was not whether or not things are placed on the heap or the stack, but that class objects have different semantics for the same operation as non-class objects not on the gc heap. This just seems inconsistent, and it forces any coder writing templates to very carefully avoid any use of the '=' operator that assumes a copy was or was not made, which is pretty much every use that I can think of...

This has no chance of happening so late in the design phase, but what might've done this would be to use copy-on-write for class objects, and to give them = syntax for copying. Then, they'd be consistent with regular stack objects in terms of assignment, but you'd still get polymorphic behavior naturally in containers and arguments, avoid slicing, and avoid unnecessary copies.

The best way to do class copying automatically is probably to differentiate owning and non-owning pointers (this could also make auto-serialization systems easier). This is what a lot of smart pointer systems in C++ do, including Boost's and the standard library auto_ptr. The difference between deep copy and shallow copy isn't important; the point of making a copy is to create an entirely separate object from the original, which pretty much implies a deep copy (or something that acts like one, like a copy-on-write).


March 08, 2003
Hi,

comments are embedded.

David Simon <David_member@pathlink.com> wrote in news:b4c929$312g$1@digitaldaemon.com:

>>If you want to make shallow copies or deep copies of your objects you can create a clone method. Implementation is straightforward:
>>1) Move all member variables that should be shallow-copied into a struct.
>>2) Use this struct as the member variable of the class.
>>3) Create a new instance in the clone method and assign the struct member to the new instance.
>>4) If some members (e.g. some pointers) require a deep copy, you must copy each of them and assign these copies to the new instance.
> 
> The approach you have above has some problems; it doesn't work with templates unless they understand your convention for naming the clone function, and accessing the members of the substruct requires a bit. Another issue is that it's just sort of using a property of struct (copying) as a property of class by using a struct; but why not just make this a property of classes in the first place?

One design goal of D is to keep the standard rather simple.

I only see two minor problems:
-More typing for programmers (solution: wasting less time in pointless
meetings, so one *has* plenty of time for typing D code ;-).
-Templates must understand a clonable convention: That's possible with
interfaces in D.
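
One way the interface-based convention could be sketched, using present-day D template constraints that did not exist in the 2003 compiler. `Cloneable`, `Node`, and `copyOf` are hypothetical names, not part of any standard library.

```d
// Hypothetical convention: classes that can be copied implement this.
interface Cloneable
{
    Object clone();
}

// Hypothetical class following the convention.
class Node : Cloneable
{
    int value;
    this(int value) { this.value = value; }
    Object clone() { return new Node(value); }
}

// Chosen for types that implement the convention.
T copyOf(T)(T source) if (is(T : Cloneable))
{
    return cast(T) source.clone();
}

// Chosen for everything else, where plain assignment already copies.
T copyOf(T)(T source) if (!is(T : Cloneable))
{
    return source;
}

void main()
{
    auto n = new Node(1);
    auto m = copyOf(n);
    m.value = 2;
    assert(n.value == 1);    // the clone is a separate object

    int i = copyOf(5);
    assert(i == 5);
}
```

A template that needs copy semantics for its parameter can then call `copyOf` instead of relying on `=`.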

> The biggest problem I had was not whether or not things are placed on the heap or the stack, but that class objects have different semantics for the same operation as non-class objects not on the gc heap. This just seems inconsistent [...]
That's the best point about it. Class objects are objects, everything else is not an object. No superficial unification of language types. An int is not an object; it's more likely to be a register in the CPU ;-).  The programming model that is induced by D's semantics for objects is likely to be as much fun as it is in Java or C#.

>, and it forces any coder writing templates to
> very carefully avoid any use of the '=' operator that assumes a copy was or was not made, which is pretty much every use that I can think of...
Could you please post just one of them?
I wanted to write a template that unifies the copying behaviour of the
template parameter. Other templates could use this template if they
required the unified behaviour for their parameters. But I did not succeed
for two reasons:
-due to a bug in the compiler, I could not specialize for objects. It's
fixed now.
-I was not able to think of an example that requires a unified behaviour.



> 
> This has no chance of happening so late in the design phase, but what might've done this would be to use copy-on-write for class objects, and to give them = syntax for copying. Then, they'd be consistent with regular stack objects in terms of assignment, but you'd still get polymorphic behavior naturally in containers and arguments, avoid slicing, and avoid unnecessary copies.
> 
> The best way to do class copying automatically is probably to differentiate owning and non-owning pointers (this could also make auto-serialization systems easier). This is what a lot of smart pointer systems in C++ do, including Boost's and the standard library auto_ptr. The difference between deep copy and shallow copy isn't important; the point of making a copy is to create an entirely separate object from the original, which pretty much implies a deep copy (or something that acts like one, like a copy-on-write).
> 

Don't think that the concepts you mentioned here would be useful for D. D's GC deals with the issues solved by smart-pointers and COW classes in C++, in a much simpler way.



Farmer.