May 04, 2013 Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Thanks to the many recent threads on this, and the DIPs on it, everyone was pretty much up to speed and ready to find a resolution. This resolution only deals with the memory safety issue.
The first point is that rvalues are turned into references by the simple expedient of creating a temporary, copying the rvalue into the temporary, and taking the address of that temporary. Therefore, the issue is really about returning references to stack variables that have gone out of scope. From a memory safety standpoint, this is unacceptable, as D strives to be a memory safe language. The solution in other languages of "just don't do that" is invalid for D.
Cases where this can occur:
Case A:
    ref T fooa(ref T t) { return t; }
    ref T bar() { T t; return fooa(t); }
Case B:
    ref T foob(ref U u) { return u.t; } // note that T is derivable from U
    ref T bar() { U u; return foob(u); }
Case C:
    struct S { T t; ref T fooc() { return t; } }
    ref T bar() { S s; return s.fooc(); }
Case D: Returning ref to uplevel local:
    ref T food() {
        T t;
        ref T bar() { return t; }
        return bar();
    }
Case E: Transitively calling other functions:
    ref T fooe(T t) { return fooa(t); }
Observations:
1. Always involves a return statement.
2. The return type must always be the type of the stack variable or a type derived from a stack variable's type via safe casting or subtyping.
3. Returning rvalues is the same issue, as rvalues are always turned into local stack temporaries.
4. Whether a function returns a ref derived from a parameter or not is not reflected in the function signature.
5. Always involves passing a local by ref to a function that returns by ref, and that function gets called in a return statement.
Scope Ref
http://wiki.dlang.org/DIP35 is one solution, but Andrei and I argued strongly against it due to the perceived complexity the user would face with it. I also argued against it due to Case C (where would the scope annotation go) and the possibility that functions returning ref would have to appear in pairs - one with scope ref parameters, the other without - and the copy/pasta duplication of the function bodies (which appears in C++ const& functions). Andrei & I argued that we needed to make it work with just ref annotations.
Static Compiler Detection (in @safe mode):
1. Do not allow taking the address of a local variable, unless doing a safe type 'paint' operation.
2. In some cases, such as nested, private, and template functions, the source is always available so the compiler can error on those. Because of the .di file problem, doing this with auto return functions is problematic.
3. Issue an error on return statements where the expression may contain a ref to a local that is going out of scope, taking into account the observations.
Runtime Detection
There are still a few cases that the compiler cannot statically detect. For these a runtime check is inserted, which compares the returned ref pointer to see if it lies within the stack frame of the exiting function, and if it does, halts the program. The cost will be a couple of CMP instructions and an LEA. These checks would be omitted if the -noboundscheck compiler switch was provided.
The runtime check would not be on all ref returning functions. It'll only be on those where the compiler cannot prove a ref to a local is not being returned.
The good thing about the runtime detection is that ref's use is restricted enough that merely executing all the code paths will check all the possibilities.
|
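The runtime check described in the post above can be pictured as a plain address-range test. The following is only an illustrative sketch in D, not the code the compiler would emit: `liesWithin` is a hypothetical helper, and a local buffer stands in for the exiting function's stack frame, since real frame bounds are not portably available to user code.

```d
/// Hypothetical stand-in for the inserted check: does `p` lie inside the
/// half-open address range [lo, hi)?
bool liesWithin(const(void)* p, const(void)* lo, const(void)* hi)
{
    return lo <= p && p < hi;
}

int global; // module-level variable, never part of a stack frame

void main()
{
    // Assumption for illustration: this buffer plays the role of the frame
    // that is about to be destroyed.
    ubyte[64] frame;
    const(void)* lo = frame.ptr;
    const(void)* hi = frame.ptr + frame.length;

    assert(liesWithin(&frame[10], lo, hi)); // a ref into the frame: would halt
    assert(!liesWithin(&global, lo, hi));   // a ref elsewhere: fine to return
    assert(!liesWithin(hi, lo, hi));        // one past the end is outside
}
```

In the scheme described above, a test of this kind would only be needed on returns the compiler cannot prove safe statically.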
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
> Thanks to the many recent threads on this, and the dips on it, everyone was pretty much up to speed and ready to find a resolution. This resolution only deals with the memory safety issue.
And to anybody who couldn't make it to DConf: You definitely missed something here. There were literally hours and hours of heated, yet focused debate about the issue. Although with all the smart people around, we should have probably tackled some much bigger problem, say world poverty… ;)
David
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On 5/4/13 2:56 PM, David Nadlinger wrote:
> On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
>> Thanks to the many recent threads on this, and the dips on it,
>> everyone was pretty much up to speed and ready to find a resolution.
>> This resolution only deals with the memory safety issue.
>
> And to anybody who couldn't make it to DConf: You definitely missed
> something here. There were literally hours and hours of heated, yet
> focused debate about the issue. Although with all the smart people
> around, we should have probably tackled some much bigger problem, say
> world poverty… ;)
>
> David
Next year.
Andrei
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
> Runtime Detection
>
> There are still a few cases that the compiler cannot statically detect. For these a runtime check is inserted, which compares the returned ref pointer to see if it lies within the stack frame of the exiting function, and if it does, halts the program. The cost will be a couple of CMP instructions and an LEA. These checks would be omitted if the -noboundscheck compiler switch was provided.
Thanks for taking the time to detail the solution, I was quite curious.
Runtime Detection and opt-out with "-noboundscheck" is a stroke of genius!
"couple of CMP instructions"
It should be possible to reduce this to only one with the "normal" unsigned range check idiom, no?
Looking forward to hearing more cool news. :)
|
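The unsigned range check idiom mentioned in this post folds both bounds tests into a single comparison. A minimal sketch, assuming the frame bounds are available as integer addresses with lo <= hi; the function names are made up for illustration and nothing here is taken from the compiler:

```d
/// Two-comparison form: lo <= p && p < hi.
bool inRangeTwoCmp(size_t p, size_t lo, size_t hi)
{
    return lo <= p && p < hi;
}

/// Single-comparison form. If p < lo, the unsigned subtraction wraps around
/// to a very large value, so one unsigned compare covers both bounds.
/// Assumes lo <= hi.
bool inRangeOneCmp(size_t p, size_t lo, size_t hi)
{
    return p - lo < hi - lo;
}

void main()
{
    enum size_t lo = 0x1000;
    enum size_t hi = 0x1040;

    // Both forms agree on values around each bound and at the extremes.
    foreach (p; [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1, size_t(0), size_t.max])
        assert(inRangeOneCmp(p, lo, hi) == inRangeTwoCmp(p, lo, hi));
}
```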
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tove | > Runtime Detection and opt-out with "-noboundscheck" is a stroke of genius!
>
Thanks. ;-)
Araq wrote in January:
You can also look at how Algol solved this over 40 years ago: Insert a runtime check that the escaping reference does not point to the current stack frame which is about to be destroyed. The check should be very cheap at runtime but it can be deactivated in a release build for efficiency just like it is done for array indexing.
http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d@puremagic.com?page=6
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Araq | On 5/4/13 4:15 PM, Araq wrote:
>> Runtime Detection and opt-out with "-noboundscheck" is a stroke of
>> genius!
>>
>
> Thanks. ;-)
>
> Araq wrote in January:
>
> You can also look at how Algol solved this over 40 years ago:
> Insert a runtime check that the escaping reference does not point
> to the current stack frame which is about to be destroyed. The
> check should be very cheap at runtime but it can be deactivated
> in a release build for efficiency just like it is done for array
> indexing.
>
> http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d@puremagic.com?page=6
Whoa. Kudos!
Andrei
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | So just to be clear, "ref" parameters can now take rvalues?
There's one minor problem I see with this:
    S currentVar;
    void makeCurrent(ref S var) {
        currentVar = var;
    }
    makeCurrent(getRValue());
If "makeCurrent" knew that "var" was an rvalue it could avoid calling "postblit" on currentVar, because it's simply a move operation, thus saving a potentially costly deep copy operation and extra destructor call.
|
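One existing way to tell the two situations apart in D is to overload on ref: lvalues bind to the `ref` overload and rvalues to the by-value one, which can then move its parameter. The sketch below illustrates that idiom only and is not part of the resolution discussed in this thread; the `postblits` counter and the `lastOverload` variable are hypothetical additions used to make the behavior observable.

```d
import std.algorithm.mutation : move;

struct S
{
    int payload;
    static int postblits;        // counts copies, for illustration only
    this(this) { ++postblits; }  // postblit runs whenever an S is copied
}

S currentVar;
string lastOverload;

// Chosen for lvalues: the caller keeps its object, so a copy is needed.
void makeCurrent(ref S var)
{
    lastOverload = "ref";
    currentVar = var;            // copy: the postblit runs
}

// Chosen for rvalues: the temporary belongs to us, so it can be moved.
void makeCurrent(S var)
{
    lastOverload = "value";
    currentVar = move(var);      // move: no postblit
}

S getRValue() { return S(42); }

void main()
{
    S named = S(7);
    makeCurrent(named);
    assert(lastOverload == "ref" && currentVar.payload == 7);
    assert(S.postblits == 1);

    makeCurrent(getRValue());
    assert(lastOverload == "value" && currentVar.payload == 42);
    assert(S.postblits == 1);    // unchanged: the rvalue was moved
}
```

The cost of this approach is the pair of near-identical function bodies, which is the kind of duplication the original post wanted to avoid.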
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 5/4/2013 1:20 PM, Andrei Alexandrescu wrote:
> On 5/4/13 4:15 PM, Araq wrote:
>>> Runtime Detection and opt-out with "-noboundscheck" is a stroke of
>>> genius!
>>>
>>
>> Thanks. ;-)
>>
>> Araq wrote in January:
>>
>> You can also look at how Algol solved this over 40 years ago:
>> Insert a runtime check that the escaping reference does not point
>> to the current stack frame which is about to be destroyed. The
>> check should be very cheap at runtime but it can be deactivated
>> in a release build for efficiency just like it is done for array
>> indexing.
>>
>> http://forum.dlang.org/thread/mailman.3107.1356856707.5162.digitalmars-d@puremagic.com?page=6
>>
>
> Whoa. Kudos!
Araq for the win!
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | You mean DIP 36, not DIP 35. ;)
Any estimates as to when the whole thing will be implemented? So dmd 2.064, 2.070, etc.?
|
May 04, 2013 Re: Rvalue references - The resolution | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tove | On Saturday, 4 May 2013 at 19:40:36 UTC, Tove wrote:
> On Saturday, 4 May 2013 at 18:33:04 UTC, Walter Bright wrote:
>> Runtime Detection
>>
>> There are still a few cases that the compiler cannot statically detect. For these a runtime check is inserted, which compares the returned ref pointer to see if it lies within the stack frame of the exiting function, and if it does, halts the program. The cost will be a couple of CMP instructions and an LEA. These checks would be omitted if the -noboundscheck compiler switch was provided.
>
> Thanks for taking the time to detail the solution, I was quite curious.
>
> Runtime Detection and opt-out with "-noboundscheck" is a stroke of genius!
>
> "couple of CMP instructions"
> It should be possible to reduce this to only one with the "normal" unsigned range check idiom, no?
>
> Looking forward to hearing more cool news. :)
It shouldn't be expensive. Additionally, consider that returning by reference is quite rare in practice.
Due to D's semantics, returning by reference isn't a performance improvement (you get full performance returning by value in D), so you only return by reference when you intend to keep identity (i.e., when you intend to modify a given value, in containers for instance).
I still think this is inferior to Rust's solution and would like to see ref as an equivalent of the Rust borrowed pointer. It achieves the same safety at compile time instead of at runtime, and incurs no extra complexity except in some very rare cases (when you have a function taking several arguments by ref and also returning by ref, and the lifetime of the returned ref isn't the union of the lifetimes of the ref parameters, a very specific case).
Talking with people at DConf, it seems that many of them didn't know how Rust solves that issue, so I'm not sure if we should validate the proposal.
At first glance, it seems that the proposal allows for rather painless later inclusion of the concept of borrowed pointers. If we can ensure that this is effectively the case, I'm definitely for it.
But we shouldn't close the door to that concept. After all, D is about doing as much as possible at compile time, and when we have the choice to trade a runtime check against a compile time one, we must go for it.
|
Copyright © 1999-2021 by the D Language Foundation