Jump to page: 1 2 3
Thread overview
Proposal: Operator overloading without temporaries
Mar 27, 2006
Don Clugston
Mar 27, 2006
Craig Black
Mar 27, 2006
Dave
Mar 27, 2006
Charles
Mar 27, 2006
Sean Kelly
Mar 28, 2006
kris
Mar 28, 2006
pragma
Mar 28, 2006
Sean Kelly
Mar 28, 2006
Charles
Mar 27, 2006
Walter Bright
Mar 27, 2006
Regan Heath
Mar 27, 2006
Sean Kelly
Mar 28, 2006
James Dunne
Mar 28, 2006
Don Clugston
Mar 28, 2006
Sean Kelly
Mar 28, 2006
Norbert Nemec
Mar 28, 2006
Don Clugston
Mar 28, 2006
Norbert Nemec
Mar 28, 2006
James Dunne
Mar 28, 2006
xs0
Mar 28, 2006
James Dunne
Mar 28, 2006
Sean Kelly
Mar 29, 2006
Norbert Nemec
Mar 28, 2006
Oskar Linde
Apr 02, 2006
Bruno Medeiros
Apr 03, 2006
Don Clugston
Apr 04, 2006
Bruno Medeiros
Apr 05, 2006
Don Clugston
March 27, 2006
Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.

For example,
X = A - ((B*C) + D)* E;

becomes:
T1 = B * C;
T2 = T1 + D;
T3 = T2 * E;
T4 = A - T3;
X = T4;
Four objects were created, whereas only one was strictly required.
In C++, there are libraries like Blitz++ which use complicated expression templates in order to avoid these creating these temporaries, and provide performance comparable with FORTRAN. I think D can do much better...
Note that temporaries are avoided when using the opXXXAssign() operators like +=.

===========
  Proposal
===========
(1) Allow the compiler to assume that b = b + c  can be replaced with b += c. (In C++, operator + and operator += are just symbols, the compiler doesn't know that there is any relationship between them).
In the example above, this would allow the compiler to generate:
T1 = B * C;
T1 += D;
T1 *= E;

and we have eliminated two of the three temporaries.
(2). Fill in the gaps in the operator overloading table by introducing
opAddAssign_r, opSubAssign_r, etc.

Just as A.opSubAssign(B)
is the operation  A -= B  or equivalently  A = A - B, similarly

A.opSubAssign_r(B)
would mean
A = B - A.
and would only occur when temporaries are generated in expressions. Like -=, it's an operation which can frequently be performed very efficiently, but at present the language has no way of expressing it.

Our original example then becomes:

T1 = B.opMul(C);
T1.opAddAssign(D);
T1.opMulAssign(E);
T1.opSubAssign_r(A);
X = T1;
... and all the useless temporaries are gone!

More formally, when the expression tree for an expression is generated:
With a binary operator XXX, operating on left & right nodes:

if (the left node is *not* an original leaf node) {
   // the left node is a temporary, does not need to be preserved.
   // we don't care if the right node is a temporary or not
   look for opXXXAssign().
} else if (the the right node is not an original leaf node) {
   // the right node is a temporary
   look for opXXXAssign_r()
} else {
  // both left and right nodes are leaf nodes, we have to
  // create a temporary
   look for opXXX(), just as it does now.
}

These rules also cope with the situation where temporaries are required:
eg
X = (A*B) + (C*D);
becomes
T1 = A*B;
T2 = C*D;
T1 += T2;
X = T1;

If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.

There are a couple of decisions to be made:
(I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.
(II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.
(III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.

Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.

Comments?
March 27, 2006
"Don Clugston" <dac@nospam.com.au> wrote in message news:e087or$dgm$1@digitaldaemon.com...
> Comments?

I really, really like this idea.  Granted, I don't do too much with operator overloading, but this seems like a very solid improvement to it.


March 27, 2006
Beautiful! An elegant solution to a long standing and annoying problem.

-Craig


March 27, 2006
Hear, hear! And my opinions are (I) yes, (II) yes and (III) no. And I'd like to
suggest that this optimization would be applied to built-in's like strX = strX ~
strY; => strX ~= strY;

The problem this addresses has driven me nuts in the C++ world as I've maintained/optimized code. (i.e.: operator += is defined but not used where it could be).

- Dave

In article <e087or$dgm$1@digitaldaemon.com>, Don Clugston says...
>
>Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
>
>For example,
>X = A - ((B*C) + D)* E;
>
>becomes:
>T1 = B * C;
>T2 = T1 + D;
>T3 = T2 * E;
>T4 = A - T3;
>X = T4;
>Four objects were created, whereas only one was strictly required.
>In C++, there are libraries like Blitz++ which use complicated
>expression templates in order to avoid these creating these temporaries,
>and provide performance comparable with FORTRAN. I think D can do much
>better...
>Note that temporaries are avoided when using the opXXXAssign() operators
>like +=.
>
>===========
>   Proposal
>===========
>(1) Allow the compiler to assume that b = b + c  can be replaced with b
>+= c. (In C++, operator + and operator += are just symbols, the compiler
>doesn't know that there is any relationship between them).
>In the example above, this would allow the compiler to generate:
>T1 = B * C;
>T1 += D;
>T1 *= E;
>
>and we have eliminated two of the three temporaries.
>(2). Fill in the gaps in the operator overloading table by introducing
>opAddAssign_r, opSubAssign_r, etc.
>
>Just as A.opSubAssign(B)
>is the operation  A -= B  or equivalently  A = A - B, similarly
>
>A.opSubAssign_r(B)
>would mean
>A = B - A.
>and would only occur when temporaries are generated in expressions. Like
>-=, it's an operation which can frequently be performed very
>efficiently, but at present the language has no way of expressing it.
>
>Our original example then becomes:
>
>T1 = B.opMul(C);
>T1.opAddAssign(D);
>T1.opMulAssign(E);
>T1.opSubAssign_r(A);
>X = T1;
>... and all the useless temporaries are gone!
>
>More formally, when the expression tree for an expression is generated: With a binary operator XXX, operating on left & right nodes:
>
>if (the left node is *not* an original leaf node) {
>    // the left node is a temporary, does not need to be preserved.
>    // we don't care if the right node is a temporary or not
>    look for opXXXAssign().
>} else if (the the right node is not an original leaf node) {
>    // the right node is a temporary
>    look for opXXXAssign_r()
>} else {
>   // both left and right nodes are leaf nodes, we have to
>   // create a temporary
>    look for opXXX(), just as it does now.
>}
>
>These rules also cope with the situation where temporaries are required:
>eg
>X = (A*B) + (C*D);
>becomes
>T1 = A*B;
>T2 = C*D;
>T1 += T2;
>X = T1;
>
>If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.
>
>There are a couple of decisions to be made:
>(I) should the compiler use opAdd() and generate a temporary, if
>opAddAssign_r() doesn't exist, to preserve existing behaviour? I think
>the answer to this is YES.
>(II) should the compiler use opAdd() and generate a temporary, if
>oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm
>inclined to answer YES.
>(III) If the code includes +=, and there is an opAdd() but no
>opAddAssign(), should the compiler accept this, and just generate an
>opAdd() followed by an assignment?? This would mean that opAdd() would
>generate the += operation as well as +, while opAddAssign() would be a
>performance enhancement. (It would still be possible to have
>opAddAssign() without opAdd(), to have += but not +, but it would not be
>possible to have + without +=). This would mean that += would be
>*purely* syntactic sugar.
>
>Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.
>
>Comments?


March 27, 2006
Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.

1. yes, 2. yes, 3. over my head :).

Charlie



Don Clugston wrote:
> Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
> 
> For example,
> X = A - ((B*C) + D)* E;
> 
> becomes:
> T1 = B * C;
> T2 = T1 + D;
> T3 = T2 * E;
> T4 = A - T3;
> X = T4;
> Four objects were created, whereas only one was strictly required.
> In C++, there are libraries like Blitz++ which use complicated expression templates in order to avoid these creating these temporaries, and provide performance comparable with FORTRAN. I think D can do much better...
> Note that temporaries are avoided when using the opXXXAssign() operators like +=.
> 
> ===========
>   Proposal
> ===========
> (1) Allow the compiler to assume that b = b + c  can be replaced with b += c. (In C++, operator + and operator += are just symbols, the compiler doesn't know that there is any relationship between them).
> In the example above, this would allow the compiler to generate:
> T1 = B * C;
> T1 += D;
> T1 *= E;
> 
> and we have eliminated two of the three temporaries.
> (2). Fill in the gaps in the operator overloading table by introducing
> opAddAssign_r, opSubAssign_r, etc.
> 
> Just as A.opSubAssign(B)
> is the operation  A -= B  or equivalently  A = A - B, similarly
> 
> A.opSubAssign_r(B)
> would mean
> A = B - A.
> and would only occur when temporaries are generated in expressions. Like -=, it's an operation which can frequently be performed very efficiently, but at present the language has no way of expressing it.
> 
> Our original example then becomes:
> 
> T1 = B.opMul(C);
> T1.opAddAssign(D);
> T1.opMulAssign(E);
> T1.opSubAssign_r(A);
> X = T1;
> ... and all the useless temporaries are gone!
> 
> More formally, when the expression tree for an expression is generated:
> With a binary operator XXX, operating on left & right nodes:
> 
> if (the left node is *not* an original leaf node) {
>    // the left node is a temporary, does not need to be preserved.
>    // we don't care if the right node is a temporary or not
>    look for opXXXAssign().
> } else if (the the right node is not an original leaf node) {
>    // the right node is a temporary
>    look for opXXXAssign_r()
> } else {
>   // both left and right nodes are leaf nodes, we have to
>   // create a temporary
>    look for opXXX(), just as it does now.
> }
> 
> These rules also cope with the situation where temporaries are required:
> eg
> X = (A*B) + (C*D);
> becomes
> T1 = A*B;
> T2 = C*D;
> T1 += T2;
> X = T1;
> 
> If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.
> 
> There are a couple of decisions to be made:
> (I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.
> (II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.
> (III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.
> 
> Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.
> 
> Comments?
March 27, 2006
I think it's a great idea.


March 27, 2006
On Mon, 27 Mar 2006 10:29:13 +0200, Don Clugston <dac@nospam.com.au> wrote:
> Comments?

I was wondering the exact same thing recently:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/35382

Do you think we gain anything by using the lhs variable where possible. i.e.

Instead of:

T1 = B.opMul(C);
T1.opAddAssign(D);
T1.opMulAssign(E);
T1.opSubAssign_r(A);
X = T1;

We have:

X = B;
X.opMulAssign(C);
X.opAddAssign(D);
X.opMulAssign(E);
X.opSubAssign_r(A);

It seems to me that this results in 1 less temporary and therefore 1 less assignment.

Of course, it doesn't help/work in cases where there is no existing lhs, i.e.

  foo(A - ((B*C) + D)* E);

Regan
March 27, 2006
Don Clugston wrote:
> Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
...
> If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.

Very nice.  And much better than expression templates.  I'm all for it.

> There are a couple of decisions to be made:
> (I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.

Yes.

> (II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.

Yes.

> (III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.
> 
> Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.

I'd say no to this initially, and see how things sort out.  It may be that this turns out to be desirable and it may not.  But theoretically, I'd prefer to know when an OpAssign fn is required that I haven't provided than to have the compiler silently accept the syntax anyway.


Sean
March 27, 2006
Charles wrote:
> Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.

I have a feeling this may be a lot more difficult in C++ because of ADL--there are simply a lot more functions to be evaluated when building expression trees.  Also, the standard doesn't seem to consider things from a compiler writer's perspective, which this three-value code optimization requires.


Sean
March 28, 2006
Sean Kelly wrote:
> Charles wrote:
> 
>> Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.
> 
> 
> I have a feeling this may be a lot more difficult in C++ because of ADL--there are simply a lot more functions to be evaluated when building expression trees.  Also, the standard doesn't seem to consider things from a compiler writer's perspective, which this three-value code optimization requires.
> 
> 
> Sean

ADL? Does that stand for Attention Deficit Level?
« First   ‹ Prev
1 2 3