Thread overview
For 2.0, or D++: A Thought about Class References & Pointers
Aug 25, 2004  Russ Lewis
Aug 26, 2004  Sean Kelly
Aug 26, 2004  Russ Lewis
Aug 26, 2004  Matthias Becker
Aug 26, 2004  Russ Lewis
Aug 26, 2004  Ilya Minkov
Aug 26, 2004  Russ Lewis
Aug 26, 2004  Ilya Minkov
Aug 26, 2004  Sai
Aug 26, 2004  Sai
Aug 26, 2004  Sai
Aug 29, 2004  Ben Hinkle
Aug 29, 2004  Mike Swieton
Aug 29, 2004  Russ Lewis
Aug 29, 2004  Ben Hinkle
Aug 30, 2004  Russ Lewis
Aug 30, 2004  Ilya Minkov
Sep 02, 2004  Matthias Becker
Sep 02, 2004  Regan Heath
Sep 02, 2004  Sean Kelly
Sep 03, 2004  Russ Lewis
Aug 31, 2004  Sha Chancellor
Sep 01, 2004  ninjadroid
Sep 01, 2004  Russ Lewis
Sep 01, 2004  Regan Heath
Sep 02, 2004  ninjadroid
Sep 02, 2004  Regan Heath
Sep 02, 2004  ninjadroid
August 25, 2004
I'm becoming increasingly convinced that not requiring * for class reference variables was a mistake.  In other words, I think that the syntax for declaring a new class variable on the heap should have been:
	MyClass *ptr = new MyClass;
Note that I'm not suggesting that you should be able to use class variables directly (since that would require copy constructors and other nightmares) - instead, I'm suggesting that we keep the same language semantics, but simply require a '*' for all class reference variables.



I have a couple of reasons for this:

1) Templates
The first one I remember coming across was template code.  Say you want to work with something by reference.  If the thing is a class, then you must use the syntax "T foo" while if it is anything else, you must use the syntax "T* foo".  It would be better if there was consistency.  If there is an '*', then it is a pointer to something.  If there is not, then it is a literal variable.

2) Newbie learning curve
Many people have posted questions on the newsgroup, asking why their code segfaults when they write code like this:
	MyClass foo;
	foo.DoStuff();
We don't know how many more people just got turned off to D and never asked the question here.  If the '*' were required, then newbies would learn quickly that the right thing to do is:
	MyClass *foo = new MyClass;
	foo.DoStuff();
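
A minimal sketch of that pitfall, assuming current D syntax (the 2004 compiler may have differed in details). `MyClass`, `doStuff`, `calls` and `isUnallocated` are placeholder names invented for the illustration. A class variable declared without `new` is only a null reference, so calling a method through it fails at run time rather than at compile time:

```d
import std.stdio;

class MyClass
{
    int calls;                  // counts how often doStuff has run

    void doStuff() { ++calls; }
}

// Returns true when the reference does not point at an object yet.
bool isUnallocated(MyClass c)
{
    return c is null;
}

void main()
{
    MyClass foo;                // only a reference; no object exists yet
    writeln(isUnallocated(foo));

    foo = new MyClass;          // allocate the object on the GC heap
    foo.doStuff();              // safe now that foo refers to an object
    writeln(isUnallocated(foo), " ", foo.calls);
}
```

Calling `foo.doStuff()` before the `new` line is the case that produces the segfault described above.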

3) Objects on Heap
We've had many long discussions about objects on the heap.  The fact that D lacked them led to 'auto' variables, and still people want more.  Frankly, I think that objects on the heap make a lot of sense sometimes;  we could have them and still avoid the copy constructor stuff.

What if the compiler let you write this code:
	MyClass foo;
which was syntax sugar for this:
	/* on stack */ MyClass _implicit_foo;
	/* ref */ MyClass *foo = &_implicit_foo;

The class object would be constructed on the stack.  Like 'auto', it would be automatically destroyed when the variable goes out of scope.

The 'foo' variable would work pretty much like an "out" parameter to a function.  Internally, it would be a pointer; however, you could use it like a value variable.  So you could write this code:
	MyClass foo;
	MyClass *ptr = &foo;

Moreover, you make these rules:
	* It is illegal to ever assign a value to a class variable.
	  (You can assign values to pointers to classes.)
	  So this code is illegal:
		MyClass a,b;
		a = b; /* syntax error */
	* You can never have a class variable as a function arg.
	  (You can have a pointer to a class as an arg.)
Then, we don't have the copy constructor problem.  If you want to pass an object to a function, you must pass a pointer to it, not the actual value.



***NOTE NOTE NOTE NOTE***
I want to say that I am still very much in support of D.  I think that it is the best language out there (by far) and I use it as much as I'm able.  So please, I'm not saying that "D is broken" - far from it. However, I think that the next generation language, be it D 2.0 or D++, should consider this argument and maybe make the change.

August 26, 2004
In article <cgj8er$2lat$1@digitaldaemon.com>, Russ Lewis says...
>
>1) Templates

What about "inout T foo"

>2) Newbie learning curve

While I'm a C/C++ person, I don't agree with this argument.  The same could be said of Java and yet folks have picked it up easily enough.  I think this is just a language detail that people have to learn.

>3) Objects on Heap
>We've had many long discussions about objects on the heap.  The fact
>that D lacked them led to 'auto' variables, and still people want more.

I think all people want is some guarantee that an auto object they construct is on the stack, and that is something I am not convinced is necessary.  If something is declared as auto then I think it's fair to let the compiler decide where to put it.  The stack is a completely viable option for compilers that want to optimize performance a bit.

>Then, we don't have the copy constructor problem.  If you want to pass an object to a function, you must pass a pointer to it, not the actual value.

I don't know, I kind of like D's current semantics.  But it might be interesting to experiment with some sort of official copy constructor semantics.  Or perhaps COW is the way to go for most instances.  Either way, I'm not sure I'm keen on bringing more C pointer terminology into D than it already has.


Sean


August 26, 2004
>I have a couple of reasons for this:
>
>1) Templates
>The first one I remember coming across was template code.  Say you want
>to work with something by reference.  If the thing is a class, then you
>must use the syntax "T foo" while if it is anything else, you must use
>the syntax "T* foo".  It would be better if there was consistency.  If
>there is an '*', then it is a pointer to something.  If there is not,
>then it is a literal variable.

What about the inconsistency of op= ? Sometimes it's a flat copy, sometimes a
reference assignment.
So do we then get
foo = bar; // reference assignment
*foo = *bar;  // flat copy
???


August 26, 2004
Russ Lewis schrieb:
> I'm becoming increasingly convinced that not requiring * for class reference variables was a mistake.  In other words, I think that the syntax for declaring a new class variable on the heap should have been:
>     MyClass *ptr = new MyClass;

Yikes.

> I have a couple of reasons for this:
> 
> 1) Templates

Walter's official position was that one usually needs two templates to cover both classes and structs.

In general, I think templates play an important role in D by allowing code to be duplicated parametrically. That wasn't possible in Delphi, which led to unsafe containers or to every project writing out (duplicating manually) ThisTypeList, ThatTypeList and ThisTypeBlah en masse; that's what we get to avoid. But I'm not so sure that giving templates the same role as in C++, as the universal hole-filler for everything, makes much sense.

> 2) Newbie learning curve

The newbies would just not learn D if we do as you suggest. :) Pointers to classes and the combined struct/class semantics are among the most confusing things for newbies in C++, so the Delphi and Java syntax and semantics that we have now are much more newbie-safe. Your suggestion is far more confusing, I believe. I'm just trying to put myself in the position of a newbie.

Another thing is that if we do as you suggest, people coming from C++ would require that we implement complete C++ struct/class semantics, and not just allow objects to be used by pointer.

> 3) Objects on Heap
> We've had many long discussions about objects on the heap.  The fact that D lacked them led to 'auto' variables, and still people want more.  Frankly, I think that objects on the heap make a lot of sense sometimes;  we could have them and still avoid the copy constructor stuff.

Objects on the heap? they are on the heap. Perhaps you mean on the stack?

And hey, it is so common to read in C++ libraries "do not construct this on the stack or it breaks". There are just so many things that are better off on the heap, especially in D, where the performance of the garbage collector scan can be greatly improved for the heap but not much for the stack.

Besides, the implementation would require jumping through hoops and would have an adverse effect on performance, since D tries to reduce the number of implicit finalization blocks, which C++ inserts all over the place. That's apparently the reason why Walter doesn't want structs to have destructors.

Perhaps some better support for auto classes would be desirable. I'd say that is preferable to adding more stack functionality to ordinary classes or destructors to structs, since it would not have the "surprising" adverse performance effect, and the usage is somewhat better documented that way. And it's less ugly. :)

> Moreover, you make these rules:
>     * It is illegal to ever assign a value to a class variable.
>       (You can assign values to pointers to classes.)
>       So this code is illegal:
>         MyClass a,b;
>         a = b; /* syntax error */
>     * You can never have a class variable as a function arg.
>       (You can have a pointer to a class as an arg.)
> Then, we don't have the copy constructor problem.  If you want to pass an object to a function, you must pass a pointer to it, not the actual value.

Hmmm. What all this gives us is that we have to decorate all the common use cases with a "*", and wherever the traditional syntax is left in place it is a bug. If we do anything like that, it should be a special syntax for the special case, not the other way around.

> ***NOTE NOTE NOTE NOTE***
> I want to say that I am still very much in support of D.  I think that it is the best language out there (by far) and I use it as much as I'm able.  So please, I'm not saying that "D is broken" - far from it. However, I think that the next generation language, be it D 2.0 or D++, should consider this argument and maybe make the change.

No one is accusing you of anything. After all, these are just opinions. Luckily, we have Walter, who would not let anyone just spoil his language. :)

-eye
August 26, 2004
I guess the problem is with consistency.

Either be consistent in the syntax for (stack- or heap-based) memory allocation, or clearly document all the possible syntaxes.

We have neither right now. A few pages of HTML documentation with scattered information and no search facility is really discouraging.

I have complete faith in Walter; I believe that whatever syntax he has decided on, there must be some rationale for it.

So I would be very happy if Walter or someone else could dedicate a page to all the possible syntaxes. Being a proactive person, let me start.

Please correct me if I am wrong, and please add entries if I missed some syntaxes.

(I am trying to embed HTML in this post)


<table border="0" cellpadding="0" cellspacing="0" style="border-collapse:
collapse" bordercolor="#111111" width="85%" id="AutoNumber1">
<tr>
<td width="54%"><i><b>Syntax</b></i></td>
<td width="21%"><i><b>'var' is reference or 'value' ?</b></i></td>
<td width="28%"><i><b>if reference,<br>
&nbsp;is memory allocated ?</b></i></td>
</tr>
<tr>
<td width="54%"><i><b>For Classes</b></i></td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%">Class_type var;</td>
<td width="21%">reference</td>
<td width="28%">no</td>
</tr>
<tr>
<td width="54%">Class_type var = new Class_type();</td>
<td width="21%">reference</td>
<td width="28%">yes</td>
</tr>
<tr>
<td width="54%">&nbsp;</td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%"><i><b>For Structs</b></i></td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%">Struct_type var;</td>
<td width="21%">value</td>
<td width="28%">-</td>
</tr>
<tr>
<td width="54%">Struct_type* var;</td>
<td width="21%">reference</td>
<td width="28%">no</td>
</tr>
<tr>
<td width="54%">Struct_type* var = new Struct_type;</td>
<td width="21%">reference </td>
<td width="28%">yes</td>
</tr>
<tr>
<td width="54%">&nbsp;</td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%"><i><b>For Arrays</b></i></td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%">Static_array[100] var;</td>
<td width="21%">reference</td>
<td width="28%">yes</td>
</tr>
<tr>
<td width="54%">Dynamic_array[] var;</td>
<td width="21%">reference</td>
<td width="28%">yes</td>
</tr>
<tr>
<td width="54%">Dynamic_array[] var = new Dynamic_array[];&nbsp; </td>
<td width="21%">reference</td>
<td width="28%">yes</td>
</tr>
<tr>
<td width="54%">&nbsp;</td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%"><i><b>For Primitives</b></i></td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
<tr>
<td width="54%">Primitive_type var; </td>
<td width="21%">value</td>
<td width="28%">-</td>
</tr>
<tr>
<td width="54%">Primitive_type* var;</td>
<td width="21%">reference</td>
<td width="28%">no</td>
</tr>
<tr>
<td width="54%">&nbsp;</td>
<td width="21%">&nbsp;</td>
<td width="28%">&nbsp;</td>
</tr>
</table>



August 26, 2004
I am sorry, the HTML didn't get formatted; maybe this whole message is in <pre> tags.

Here is the text version:

--------------------------------------------------------------------------------
Syntax                                        'var' is reference  if reference,
                                              or 'value' ?        is memory
                                                                  allocated ?

---- For Classes ---------------------------------------------------------------
Class_type var;                               reference           no
Class_type var = new Class_type();            reference           yes

---- For Structs ---------------------------------------------------------------
Struct_type var;                              value               -
Struct_type* var;                             reference           no
Struct_type* var = new Struct_type;           reference           yes

---- For Arrays ----------------------------------------------------------------
Static_array[100] var;                        reference           yes
Dynamic_array[] var;                          reference           yes
Dynamic_array[] var = new Dynamic_array[];    reference           yes

---- For Primitives ------------------------------------------------------------
Primitive_type var;                           value               -
Primitive_type* var;                          reference           no

--------------------------------------------------------------------------------


August 26, 2004
I am sorry again ...... it looks like the lines got truncated
after 80(?) characters on a line.

Here is another version .... looks better and easy to
add more entries.

---- For Classes ---------------------------------------

Syntax                           : Class_type var;
Reference or value               : reference
if reference is memory allocated : no


Syntax                           : Class_type var = new Class_type();
Reference or value               : reference
if reference is memory allocated : yes


---- For Structs ---------------------------------------

Syntax                           : Struct_type var;
Reference or value               : value
if reference is memory allocated : -

Syntax                           : Struct_type* var;
Reference or value               : reference
if reference is memory allocated : no

Syntax                           : Struct_type* var = new Struct_type;
Reference or value               : reference
if reference is memory allocated : yes

---- For Arrays  ---------------------------------------

Syntax                           : Static_array[100] var;
Reference or value               : reference
if reference is memory allocated : yes

Syntax                           : Dynamic_array[] var;
Reference or value               : reference
if reference is memory allocated : yes

Syntax                           : Dynamic_array[] var = new Dynamic_array[];
Reference or value               : reference
if reference is memory allocated : yes

---- For Primitives ------------------------------------

Syntax                           : Primitive_type var;
Reference or value               : value
if reference is memory allocated : -

Syntax                           : Primitive_type* var;
Reference or value               : reference
if reference is memory allocated : no

--------------------------------------------------------



August 26, 2004
Matthias Becker wrote:
>>I have a couple of reasons for this:
>>
>>1) Templates
>>The first one I remember coming across was template code.  Say you want to work with something by reference.  If the thing is a class, then you must use the syntax "T foo" while if it is anything else, you must use the syntax "T* foo".  It would be better if there was consistency.  If there is an '*', then it is a pointer to something.  If there is not, then it is a literal variable.
> 
> What about the inconsistency of op= ? Sometimes it's a flat copy, sometimes a
> reference assignment.
> So do we then get
> foo = bar; // reference assignment
> *foo = *bar;  // flat copy
> ???

I'm not 100% clear whether you're agreeing with me, or arguing. :)  So I'll try to state my view about this subject...

This example is just another reason why lacking the '*' makes templates hard to write.  If you write this code:
	void foo(T)(T arg) {
		T mine = arg;
		...
	}
is 'mine' a copy of the value in arg, or a copy of a reference?  It would be better if we knew if we were assigning a reference or a value.
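
To make the ambiguity concrete, here is a small sketch, assuming current D template syntax. `PointStruct`, `PointClass` and `localIsIndependentCopy` are names made up for the example. The same template body copies a value when `T` is a struct and copies a reference when `T` is a class, and nothing in the template text tells you which:

```d
import std.stdio;

struct PointStruct { int x; }   // value type
class  PointClass  { int x; }   // reference type

// Copies its argument into a local, changes the local, and reports
// whether the argument is unaffected by that change.
bool localIsIndependentCopy(T)(T arg)
{
    T mine = arg;       // value copy for a struct, reference copy for a class
    mine.x = 42;
    return arg.x != 42;
}

void main()
{
    PointStruct s;                  // s.x starts at 0
    auto c = new PointClass;        // c.x starts at 0

    writeln(localIsIndependentCopy(s));   // struct: 'mine' was a separate copy
    writeln(localIsIndependentCopy(c));   // class: 'mine' aliased the same object
}
```

With the struct, the caller's data is untouched; with the class, the write through `mine` is visible to the caller as well.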

August 26, 2004
Ilya Minkov wrote:
>> 3) Objects on Heap
>> We've had many long discussions about objects on the heap.  The fact that D lacked them led to 'auto' variables, and still people want more.  Frankly, I think that objects on the heap make a lot of sense sometimes;  we could have them and still avoid the copy constructor stuff.
> 
> Objects on the heap? they are on the heap. Perhaps you mean on the stack?

Oops!  Yes, of course, you're right.

August 26, 2004
Sean Kelly wrote:
> In article <cgj8er$2lat$1@digitaldaemon.com>, Russ Lewis says...
> 
>>1) Templates
> 
> What about "inout T foo"

And if you declare local variables?  inout doesn't help me then.

>>2) Newbie learning curve
> 
> While I'm a C/C++ person, I don't agree with this argument.  The same could be
> said of Java and yet folks have picked it up easily enough.  I think this is
> just a language detail that people have to learn.

I understand that argument; yet I ask the question: why design a language element such that people "have" to learn it?  Why do something that almost guarantees that people coming from C++ will struggle with segfaults for a while before they get working code?  I think that if we do what I suggest here, the compiler can instruct the coder with its syntax error messages:

If we don't allow objects on the stack (as below), then when a newbie first writes:
	MyClass foo;
the compiler can respond with an error "ERROR: Class objects cannot be allocated on the stack.  Use a pointer variable, and allocate an object with operator new instead."  Bingo!  The newbie immediately knows what to do, with no segfaults.

OTOH, if we allow objects on the stack, then C++ programmers come over and it just works the way they expect.  The first time that they try to assign something:
	MyClass bar,baz;
	bar = baz;
the compiler can tell them "ERROR: Class variables cannot be assigned. Use pointers to classes instead."  And now the C++ programmers know exactly how to use the language as well.
