Proposal: Operator overloading without temporaries - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Proposal: Operator overloading without temporaries

Thread overview

Proposal: Operator overloading without temporaries
Mar 27, 2006 Don Clugston
Mar 27, 2006 Jarrett Billingsley
Mar 27, 2006 Craig Black
Mar 27, 2006 Dave
Mar 27, 2006 Charles
Mar 27, 2006 Sean Kelly
Mar 28, 2006 kris
Mar 28, 2006 pragma
Mar 28, 2006 Sean Kelly
Mar 28, 2006 Charles
Mar 27, 2006 Walter Bright
Mar 27, 2006 Regan Heath
Mar 27, 2006 Sean Kelly
Mar 28, 2006 James Dunne
Mar 28, 2006 Don Clugston
Mar 28, 2006 Sean Kelly
Mar 28, 2006 Norbert Nemec
Mar 28, 2006 Don Clugston
Mar 28, 2006 Norbert Nemec
Mar 28, 2006 James Dunne
Mar 28, 2006 xs0
Mar 28, 2006 James Dunne
Mar 28, 2006 Sean Kelly
Mar 29, 2006 Norbert Nemec
Mar 28, 2006 Oskar Linde
Apr 02, 2006 Bruno Medeiros
Apr 03, 2006 Don Clugston
Apr 04, 2006 Bruno Medeiros
Apr 05, 2006 Don Clugston

March 27, 2006

Proposal: Operator overloading without temporaries

Posted by Don Clugston

Don Clugston

Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.

For example,
X = A - ((B*C) + D)* E;

becomes:
T1 = B * C;
T2 = T1 + D;
T3 = T2 * E;
T4 = A - T3;
X = T4;
Four objects were created, whereas only one was strictly required.
In C++, there are libraries like Blitz++ which use complicated expression templates in order to avoid these creating these temporaries, and provide performance comparable with FORTRAN. I think D can do much better...
Note that temporaries are avoided when using the opXXXAssign() operators like +=.

===========
  Proposal
===========
(1) Allow the compiler to assume that b = b + c  can be replaced with b += c. (In C++, operator + and operator += are just symbols, the compiler doesn't know that there is any relationship between them).
In the example above, this would allow the compiler to generate:
T1 = B * C;
T1 += D;
T1 *= E;

and we have eliminated two of the three temporaries.
(2). Fill in the gaps in the operator overloading table by introducing
opAddAssign_r, opSubAssign_r, etc.

Just as A.opSubAssign(B)
is the operation  A -= B  or equivalently  A = A - B, similarly

A.opSubAssign_r(B)
would mean
A = B - A.
and would only occur when temporaries are generated in expressions. Like -=, it's an operation which can frequently be performed very efficiently, but at present the language has no way of expressing it.

Our original example then becomes:

T1 = B.opMul(C);
T1.opAddAssign(D);
T1.opMulAssign(E);
T1.opSubAssign_r(A);
X = T1;
... and all the useless temporaries are gone!

More formally, when the expression tree for an expression is generated:
With a binary operator XXX, operating on left & right nodes:

if (the left node is *not* an original leaf node) {
   // the left node is a temporary, does not need to be preserved.
   // we don't care if the right node is a temporary or not
   look for opXXXAssign().
} else if (the the right node is not an original leaf node) {
   // the right node is a temporary
   look for opXXXAssign_r()
} else {
  // both left and right nodes are leaf nodes, we have to
  // create a temporary
   look for opXXX(), just as it does now.
}

These rules also cope with the situation where temporaries are required:
eg
X = (A*B) + (C*D);
becomes
T1 = A*B;
T2 = C*D;
T1 += T2;
X = T1;

If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.

There are a couple of decisions to be made:
(I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.
(II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.
(III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.

Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.

Comments?

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Jarrett Billingsley
in reply to Don Clugston

Jarrett Billingsley

Posted in reply to Don Clugston

"Don Clugston" <dac@nospam.com.au> wrote in message news:e087or$dgm$1@digitaldaemon.com...
> Comments?

I really, really like this idea.  Granted, I don't do too much with operator overloading, but this seems like a very solid improvement to it.

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Craig Black
in reply to Don Clugston

Craig Black

Posted in reply to Don Clugston

Beautiful! An elegant solution to a long standing and annoying problem.

-Craig

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Dave
in reply to Don Clugston

Dave

Posted in reply to Don Clugston

Hear, hear! And my opinions are (I) yes, (II) yes and (III) no. And I'd like to
suggest that this optimization would be applied to built-in's like strX = strX ~
strY; => strX ~= strY;

The problem this addresses has driven me nuts in the C++ world as I've maintained/optimized code. (i.e.: operator += is defined but not used where it could be).

- Dave

In article <e087or$dgm$1@digitaldaemon.com>, Don Clugston says...
>
>Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
>
>For example,
>X = A - ((B*C) + D)* E;
>
>becomes:
>T1 = B * C;
>T2 = T1 + D;
>T3 = T2 * E;
>T4 = A - T3;
>X = T4;
>Four objects were created, whereas only one was strictly required.
>In C++, there are libraries like Blitz++ which use complicated
>expression templates in order to avoid these creating these temporaries,
>and provide performance comparable with FORTRAN. I think D can do much
>better...
>Note that temporaries are avoided when using the opXXXAssign() operators
>like +=.
>
>===========
>   Proposal
>===========
>(1) Allow the compiler to assume that b = b + c  can be replaced with b
>+= c. (In C++, operator + and operator += are just symbols, the compiler
>doesn't know that there is any relationship between them).
>In the example above, this would allow the compiler to generate:
>T1 = B * C;
>T1 += D;
>T1 *= E;
>
>and we have eliminated two of the three temporaries.
>(2). Fill in the gaps in the operator overloading table by introducing
>opAddAssign_r, opSubAssign_r, etc.
>
>Just as A.opSubAssign(B)
>is the operation  A -= B  or equivalently  A = A - B, similarly
>
>A.opSubAssign_r(B)
>would mean
>A = B - A.
>and would only occur when temporaries are generated in expressions. Like
>-=, it's an operation which can frequently be performed very
>efficiently, but at present the language has no way of expressing it.
>
>Our original example then becomes:
>
>T1 = B.opMul(C);
>T1.opAddAssign(D);
>T1.opMulAssign(E);
>T1.opSubAssign_r(A);
>X = T1;
>... and all the useless temporaries are gone!
>
>More formally, when the expression tree for an expression is generated: With a binary operator XXX, operating on left & right nodes:
>
>if (the left node is *not* an original leaf node) {
>    // the left node is a temporary, does not need to be preserved.
>    // we don't care if the right node is a temporary or not
>    look for opXXXAssign().
>} else if (the the right node is not an original leaf node) {
>    // the right node is a temporary
>    look for opXXXAssign_r()
>} else {
>   // both left and right nodes are leaf nodes, we have to
>   // create a temporary
>    look for opXXX(), just as it does now.
>}
>
>These rules also cope with the situation where temporaries are required:
>eg
>X = (A*B) + (C*D);
>becomes
>T1 = A*B;
>T2 = C*D;
>T1 += T2;
>X = T1;
>
>If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.
>
>There are a couple of decisions to be made:
>(I) should the compiler use opAdd() and generate a temporary, if
>opAddAssign_r() doesn't exist, to preserve existing behaviour? I think
>the answer to this is YES.
>(II) should the compiler use opAdd() and generate a temporary, if
>oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm
>inclined to answer YES.
>(III) If the code includes +=, and there is an opAdd() but no
>opAddAssign(), should the compiler accept this, and just generate an
>opAdd() followed by an assignment?? This would mean that opAdd() would
>generate the += operation as well as +, while opAddAssign() would be a
>performance enhancement. (It would still be possible to have
>opAddAssign() without opAdd(), to have += but not +, but it would not be
>possible to have + without +=). This would mean that += would be
>*purely* syntactic sugar.
>
>Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.
>
>Comments?

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Charles
in reply to Don Clugston

Charles

Posted in reply to Don Clugston

Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.

1. yes, 2. yes, 3. over my head :).

Charlie



Don Clugston wrote:
> Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
> 
> For example,
> X = A - ((B*C) + D)* E;
> 
> becomes:
> T1 = B * C;
> T2 = T1 + D;
> T3 = T2 * E;
> T4 = A - T3;
> X = T4;
> Four objects were created, whereas only one was strictly required.
> In C++, there are libraries like Blitz++ which use complicated expression templates in order to avoid these creating these temporaries, and provide performance comparable with FORTRAN. I think D can do much better...
> Note that temporaries are avoided when using the opXXXAssign() operators like +=.
> 
> ===========
>   Proposal
> ===========
> (1) Allow the compiler to assume that b = b + c  can be replaced with b += c. (In C++, operator + and operator += are just symbols, the compiler doesn't know that there is any relationship between them).
> In the example above, this would allow the compiler to generate:
> T1 = B * C;
> T1 += D;
> T1 *= E;
> 
> and we have eliminated two of the three temporaries.
> (2). Fill in the gaps in the operator overloading table by introducing
> opAddAssign_r, opSubAssign_r, etc.
> 
> Just as A.opSubAssign(B)
> is the operation  A -= B  or equivalently  A = A - B, similarly
> 
> A.opSubAssign_r(B)
> would mean
> A = B - A.
> and would only occur when temporaries are generated in expressions. Like -=, it's an operation which can frequently be performed very efficiently, but at present the language has no way of expressing it.
> 
> Our original example then becomes:
> 
> T1 = B.opMul(C);
> T1.opAddAssign(D);
> T1.opMulAssign(E);
> T1.opSubAssign_r(A);
> X = T1;
> ... and all the useless temporaries are gone!
> 
> More formally, when the expression tree for an expression is generated:
> With a binary operator XXX, operating on left & right nodes:
> 
> if (the left node is *not* an original leaf node) {
>    // the left node is a temporary, does not need to be preserved.
>    // we don't care if the right node is a temporary or not
>    look for opXXXAssign().
> } else if (the the right node is not an original leaf node) {
>    // the right node is a temporary
>    look for opXXXAssign_r()
> } else {
>   // both left and right nodes are leaf nodes, we have to
>   // create a temporary
>    look for opXXX(), just as it does now.
> }
> 
> These rules also cope with the situation where temporaries are required:
> eg
> X = (A*B) + (C*D);
> becomes
> T1 = A*B;
> T2 = C*D;
> T1 += T2;
> X = T1;
> 
> If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.
> 
> There are a couple of decisions to be made:
> (I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.
> (II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.
> (III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.
> 
> Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.
> 
> Comments?

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Walter Bright
in reply to Don Clugston

Walter Bright

Posted in reply to Don Clugston

I think it's a great idea.

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Regan Heath
in reply to Don Clugston

Regan Heath

Posted in reply to Don Clugston

On Mon, 27 Mar 2006 10:29:13 +0200, Don Clugston <dac@nospam.com.au> wrote:
> Comments?

I was wondering the exact same thing recently:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/35382

Do you think we gain anything by using the lhs variable where possible. i.e.

Instead of:

T1 = B.opMul(C);
T1.opAddAssign(D);
T1.opMulAssign(E);
T1.opSubAssign_r(A);
X = T1;

We have:

X = B;
X.opMulAssign(C);
X.opAddAssign(D);
X.opMulAssign(E);
X.opSubAssign_r(A);

It seems to me that this results in 1 less temporary and therefore 1 less assignment.

Of course, it doesn't help/work in cases where there is no existing lhs, i.e.

  foo(A - ((B*C) + D)* E);

Regan

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Sean Kelly
in reply to Don Clugston

Sean Kelly

Posted in reply to Don Clugston

Don Clugston wrote:
> Background: Operator overloading, in the form it exists in C++ and currently in D, inherently results in sub-optimal code, because it always results in unnecessary temporary objects being created.
...
> If this were implemented, it would permanently eradicate (for D) the most significant advantage which Fortran has managed to retain over object-oriented languages. And I really don't think it would be difficult to implement, or have negative side-effects.

Very nice.  And much better than expression templates.  I'm all for it.

> There are a couple of decisions to be made:
> (I) should the compiler use opAdd() and generate a temporary, if opAddAssign_r() doesn't exist, to preserve existing behaviour? I think the answer to this is YES.

Yes.

> (II) should the compiler use opAdd() and generate a temporary, if oppAddAssign() doesn't exist, to preserve existing behaviour? Again, I'm inclined to answer YES.

Yes.

> (III) If the code includes +=, and there is an opAdd() but no opAddAssign(), should the compiler accept this, and just generate an opAdd() followed by an assignment?? This would mean that opAdd() would generate the += operation as well as +, while opAddAssign() would be a performance enhancement. (It would still be possible to have opAddAssign() without opAdd(), to have += but not +, but it would not be possible to have + without +=). This would mean that += would be *purely* syntactic sugar.
> 
> Decision III would be a little more difficult to implement and is of less obvious merit, I only mention it as a possibility.

I'd say no to this initially, and see how things sort out.  It may be that this turns out to be desirable and it may not.  But theoretically, I'd prefer to know when an OpAssign fn is required that I haven't provided than to have the compiler silently accept the syntax anyway.


Sean

March 27, 2006

Re: Proposal: Operator overloading without temporaries

Posted by Sean Kelly
in reply to Charles

Sean Kelly

Posted in reply to Charles

Charles wrote:
> Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.

I have a feeling this may be a lot more difficult in C++ because of ADL--there are simply a lot more functions to be evaluated when building expression trees.  Also, the standard doesn't seem to consider things from a compiler writer's perspective, which this three-value code optimization requires.


Sean

March 28, 2006

Re: Proposal: Operator overloading without temporaries

Posted by kris
in reply to Sean Kelly

kris

Posted in reply to Sean Kelly

Sean Kelly wrote:
> Charles wrote:
> 
>> Wow looks good ... too good.  How could this have gone un[noticed|implemented] in the  last 20 years ?  I'm anxious to here Walters take.
> 
> 
> I have a feeling this may be a lot more difficult in C++ because of ADL--there are simply a lot more functions to be evaluated when building expression trees.  Also, the standard doesn't seem to consider things from a compiler writer's perspective, which this three-value code optimization requires.
> 
> 
> Sean

ADL? Does that stand for Attention Deficit Level?

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation