September 20, 2013 References

Running this code I would expect to get "ref" three times, but ...
-------
import std.stdio;
struct A {
    int[128] data;
    ref A opAdd(const ref A a) {
        A cp = this;
        cp.data[] += a.data[];
        return cp;
    }
}

void fun(A a) { writeln("copy"); }
void fun(const ref A a) { writeln("ref"); }

void main() {
    A a, b;
    fun(a + b); //prints "copy"
    fun(A()); //prints "copy"
    fun(a); //prints "copy"
}
-------
After some tests I concluded that it is not possible to pass the result of a+b as a reference (which is what interests me most), and this creates an evident performance problem. Is there any solution that I missed?
September 20, 2013 Re: References

Posted in reply to andrea9940

Remove the "ref" in "ref A opAdd(const ref A a) {"
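
To illustrate the suggestion, here is a minimal sketch of the operator returning by value. It is written with opBinary (the form used in a later reply in this thread) rather than opAdd, on the assumption that a current D compiler is used; `cp` is a local variable, so returning it by reference would refer to storage that no longer exists once the function returns.

```d
import std.stdio;

struct A {
    int[128] data;

    // No `ref` on the return type: `cp` lives on this function's stack,
    // so the result has to be handed back by value.
    A opBinary(string op : "+")(const ref A rhs) const {
        A cp = this;              // copy of the left operand
        cp.data[] += rhs.data[];  // element-wise add of the right operand
        return cp;
    }
}

void main() {
    A a, b;
    a.data[] = 1;
    b.data[] = 2;
    A c = a + b;
    writeln(c.data[0]); // 3
    writeln(a.data[0]); // 1 (the operands are left unchanged)
}
```

The result of `a + b` is then a temporary (an rvalue), which is the case the following replies discuss.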
September 20, 2013 Re: References

Posted in reply to andrea9940

On Friday, 20 September 2013 at 09:36:18 UTC, andrea9940 wrote:
> Running this code I would expect to get "ref" three times, but ...
>
> -------
> import std.stdio;
> struct A {
>     int[128] data;
>     ref A opAdd(const ref A a) {
>         A cp = this;
>         cp.data[] += a.data[];
>         return cp;
>     }
> }
>
> void fun(A a) { writeln("copy"); }
> void fun(const ref A a) { writeln("ref"); }
>
> void main() {
>     A a, b;
>     fun(a + b); //prints "copy"
>     fun(A()); //prints "copy"
>     fun(a); //prints "copy"

This prints 'ref' if you change fun(A a) to fun(const A a). The const ref overload isn't preferred over A a because const needs an implicit conversion.

> }
> -------
>
> After some tests I concluded that it is not possible to pass the result of a+b as a reference (which is what interests me most), and this creates an evident performance problem.
> Is there any solution that I missed?

a + b is an rvalue and will be moved into fun(A a). This is even faster than ref. ;)
A() is also moved.

If you have two functions, one with const A a and one with ref const A a, the latter accepts lvalues, which are passed by ref, and the former accepts rvalues, which are moved. So no performance problem. ;)

If your function is a template, you can use 'auto ref', which automatically generates two versions of your function: one with and one without ref.
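
The 'auto ref' option mentioned above could look roughly like the sketch below. It assumes a current D compiler and an opBinary-based operator that returns by value; to make the chosen variant visible, `fun` here returns a string instead of printing, which is a change made only for illustration.

```d
import std.stdio;

struct A {
    int[128] data;

    // Returns by value, so `a + b` is an rvalue.
    A opBinary(string op : "+")(const ref A rhs) const {
        A cp = this;
        cp.data[] += rhs.data[];
        return cp;
    }
}

// `auto ref` is only allowed on template functions, hence the empty
// template parameter list. The compiler instantiates a by-ref version
// for lvalue arguments and a by-value version for rvalue arguments.
string fun()(auto ref const A a) {
    // __traits(isRef, a) tells which variant was instantiated.
    return __traits(isRef, a) ? "ref" : "value";
}

void main() {
    A a, b;
    writeln(fun(a + b)); // value
    writeln(fun(A()));   // value
    writeln(fun(a));     // ref
}
```

This keeps a single function body while lvalues are still passed by reference; the rvalue cases take the by-value path, which is the situation the rest of the thread examines at the assembly level.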
September 20, 2013 Re: References

Posted in reply to Namespace

On Friday, 20 September 2013 at 09:44:51 UTC, Namespace wrote:

> This prints 'ref' if you change fun(A a) to fun(const A a). The const ref overload isn't preferred over A a because const needs an implicit conversion.
Thanks for the tip.

> a + b is an rvalue and will be moved into fun(A a). This is even faster than ref. ;)
> A() is also moved.
Mmm... here is the assembly generated (with -release -O); it seems to me that the struct is always copied.

; fun(a + b)
LEA EAX,[EBP-604]
PUSH EAX
LEA EAX,[EBP-404]
PUSH EAX
LEA EAX,[EBP-804]
CALL 00402010 ;opAdd()
MOV EBX,EAX ;pointer to the stack
MOV ECX,80
ADD EBX,1FC
004020C3: PUSH DWORD PTR DS:[EBX] ; copy !
SUB EBX,4 ; copy !
LOOP SHORT 004020C3 ; copy !
CALL 00402048 ; fun(const A a)

;fun(A())
PUSH 80
MOV EAX,OFFSET 00422930
PUSH EAX
CALL 00405DB0
MOV DWORD PTR SS:[EBP-4],EAX
MOV ECX,DWORD PTR SS:[EBP-4]
XOR EAX,EAX
MOV DWORD PTR DS:[ECX],EAX
MOV EAX,DWORD PTR SS:[EBP-4]
MOV DWORD PTR DS:[EAX+4],0
; a lot of movs later ...
MOV EAX,DWORD PTR SS:[EBP-4]
MOV DWORD PTR DS:[EAX+1FC],0
MOV ESI,DWORD PTR SS:[EBP-4]
LEA EDI,[EBP-204]
MOV ECX,80
REP MOVS DWORD PTR ES:[EDI],DWORD PTR DS:[ESI]
ADD ESP,8
; ^ a lot of instructions to set a struct to zero imo
LEA EBX,[EBP-8]
MOV ECX,80
0040271B: PUSH DWORD PTR DS:[EBX] ; copy !
SUB EBX,4 ; copy !
LOOP SHORT 0040271B ; copy !
CALL 00402048 ;fun(const A a)

; fun(a)
LEA EAX,[EBP-804] ;load the reference
CALL 00402060 ; fun(const ref A a)
September 20, 2013 Re: References

Posted in reply to andrea9940

On Friday, 20 September 2013 at 10:29:24 UTC, andrea9940 wrote:
> On Friday, 20 September 2013 at 09:44:51 UTC, Namespace wrote:
>
>> This prints 'ref' if you change fun(A a) to fun(const A a). The const ref overload isn't preferred over A a because const needs an implicit conversion.
> Thanks for the tip.
>
>> a + b is an rvalue and will be moved into fun(A a). This is even faster than ref. ;)
>> A() is also moved.
> Mmm... here is the assembly generated (with -release -O), it seems to me that the struct is always copied.
>
> ; fun(a + b)
> LEA EAX,[EBP-604]
> PUSH EAX
> LEA EAX,[EBP-404]
> PUSH EAX
> LEA EAX,[EBP-804]
> CALL 00402010 ;opAdd()
> MOV EBX,EAX ;pointer to the stack
> MOV ECX,80
> ADD EBX,1FC
> 004020C3: PUSH DWORD PTR DS:[EBX] ; copy !
> SUB EBX,4 ; copy !
> LOOP SHORT 004020C3 ; copy !
> CALL 00402048 ; fun(const A a)
>
> ;fun(A())
> PUSH 80
> MOV EAX,OFFSET 00422930
> PUSH EAX
> CALL 00405DB0
> MOV DWORD PTR SS:[EBP-4],EAX
> MOV ECX,DWORD PTR SS:[EBP-4]
> XOR EAX,EAX
> MOV DWORD PTR DS:[ECX],EAX
> MOV EAX,DWORD PTR SS:[EBP-4]
> MOV DWORD PTR DS:[EAX+4],0
> ; a lot of movs later ...
> MOV EAX,DWORD PTR SS:[EBP-4]
> MOV DWORD PTR DS:[EAX+1FC],0
> MOV ESI,DWORD PTR SS:[EBP-4]
> LEA EDI,[EBP-204]
> MOV ECX,80
> REP MOVS DWORD PTR ES:[EDI],DWORD PTR DS:[ESI]
> ADD ESP,8
> ; ^ a lot of instructions to set a struct to zero imo
> LEA EBX,[EBP-8]
> MOV ECX,80
> 0040271B: PUSH DWORD PTR DS:[EBX] ; copy !
> SUB EBX,4 ; copy !
> LOOP SHORT 0040271B ; copy !
> CALL 00402048 ;fun(const A a)
>
> ; fun(a)
> LEA EAX,[EBP-804] ;load the reference
> CALL 00402060 ; fun(const ref A a)
If you have doubts about the assembly, please feel free to open a bug report.
And for performance:
----
import std.stdio;

struct A {
public:
    uint id;

    this(uint id) {
        this.id = id;
        writeln("CTor A with ", id);
    }

    this(this) {
        writeln("Postblit A with ", this.id);
    }

    ~this() {
        writeln("DTor A with ", this.id);
    }

    A opBinary(string op : "+")(ref const A a) {
        return A(this.id + a.id);
    }
}

void func(const A a) {
    writeln("Value call with A::", a.id);
}

void func(ref const A a) {
    writeln("Ref call with A::", a.id);
}

void main()
{
    A a = A(42);
    A b = A(23);
    func(a + b);
    func(A(1337));
    func(a);
}
----
Output:
----
CTor A with 42
CTor A with 23
CTor A with 65
Value call with A::65
DTor A with 65
CTor A with 1337
Value call with A::1337
DTor A with 1337
Ref call with A::42
DTor A with 23
DTor A with 42
----
No Postblit call.
September 20, 2013 Re: References

Posted in reply to Namespace

> Output:
> ----
> CTor A with 42
> CTor A with 23
> CTor A with 65
> Value call with A::65
> DTor A with 65
> CTor A with 1337
> Value call with A::1337
> DTor A with 1337
> Ref call with A::42
> DTor A with 23
> DTor A with 42
> ----
>
> No Postblit call.

The behavior is correct, but at the assembly level the compiler still creates a copy on the stack.

D code: http://pastebin.com/sTLhnNdV
Disassembly: http://pastebin.com/5emmwqJc

I'll post a bug report. Thanks for everything.
Copyright © 1999-2021 by the D Language Foundation