Borrowing and Ownership

I finally got around to writing up some thoughts on @safe borrowing and ownership in D. I didn't spend nearly enough time on this post, so the proposal is probably not optimal yet and likely misses a few details. The TL;DR is that `scope` pointers and built-in references should behave like Rust borrowed pointers. (Except lifetimes will be tracked through function calls and data structures a lot less precisely, at least initially.) The meaning of `T*` should not change from what it is today.

First, note that even though there is a lot of confusion around this, `@safe` is currently not inherently broken. It provides memory safety (modulo implementation bugs in the compiler). The problem we want to solve is that @safe code does not support exposing direct references into the guts of data structures that use memory management schemes other than tracing GC. @trusted is currently broken, however (see further below in this post).


Basic assumptions:
- We want to start with simple rules that ensure memory safety of slightly more expressive @safe code instead of comprehensive ones that ensure both safety and very high expressiveness. (I have more ambitious ideas than what I discuss here, but I doubt those are realistic for D right now.)
- With DIP 1021 accepted, `scope` is headed to mean controlled lifetime without mutable aliasing (and `ref` implies `scope`).
- Tracing GC is a successful way to write @safe programs and should continue to be supported as an option.

In particular, @live is a dead end, because:
- It either provides no guarantees or it breaks memory safety of @safe code.
- It wants to change the meaning of `T*` based on a function attribute.
- It breaks D programs that want to use the GC.

The next steps should instead be roughly as follows:

Clarify the meaning of `T*` in impure `@safe` code:

- A non-`scope` built-in pointer in impure `@safe` code points to a value with effectively unlimited lifetime (e.g. a GC pointer or a pointer into the data segment) and unrestricted aliasing. The same holds true for non-`scope` class references. This is true today, but should be explicitly stated in the language specification to prevent confusion.

- In @system code, `T*` is a pointer with arbitrary lifetime, and @trusted code needs to ensure @safe code cannot access a `T*` whose lifetime may be less than the last possible time that @safe code might access the pointer.
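
As a small illustration of what a non-`scope` `T*` already means in today's `@safe` code, here is a sketch; the names `answer`, `fromGC` and `fromDataSegment` are made up for this example:

```d
// Module-level immutable data lives in the data segment for the whole run.
immutable int answer = 42;

// A GC allocation: the pointee stays alive as long as a reference exists.
int* fromGC() @safe
{
    return new int(1);
}

// Taking the address of a module-level immutable is accepted in @safe code.
immutable(int)* fromDataSegment() @safe
{
    return &answer;
}

// By contrast, the compiler rejects returning the address of a local:
// int* escape() @safe { int local; return &local; }

@safe unittest
{
    auto p = fromGC();
    auto q = p;      // unrestricted aliasing: both pointers stay usable
    *q = 5;
    assert(*p == 5);
    assert(*fromDataSegment() == 42);
}
```

Both pointers can be copied and stored anywhere without any lifetime bookkeeping, which is the property the proposal wants to keep for plain `T*`.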

Improve `@trusted`:

- The problem with `@trusted` is that it has no defense against `@safe` code destroying its invariants or accessing raw pointers that are only meant to be manipulated by `@trusted` code. There should therefore be a way to mark data as `@trusted` (or equivalent), such that `@safe` code can not access it.
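
A minimal sketch of this problem in current D. `Slice`, `makeSlice`, `release` and `corrupt` are hypothetical names, not taken from any library. The `@trusted` accessor relies on `len` matching the allocation, yet `@safe` code is free to overwrite `len`:

```d
import core.stdc.stdlib : calloc, free;

// Hypothetical wrapper around a C-allocated int buffer (illustration only).
struct Slice
{
    int* ptr;    // raw pointer, meant to be manipulated by @trusted code only
    size_t len;  // invariant: ptr points to at least len ints

    // Trusted on the assumption that the invariant above holds.
    int get(size_t i) @trusted
    {
        assert(i < len, "index out of range");
        return ptr[i];
    }
}

Slice makeSlice(size_t n) @trusted
{
    return Slice(cast(int*) calloc(n, int.sizeof), n);
}

void release(ref Slice s) @trusted
{
    free(s.ptr);
    s = Slice.init;
}

// Compiles as @safe today: writing an integer field is not unsafe by itself,
// but it silently breaks the invariant that `get` depends on.
void corrupt(ref Slice s) @safe
{
    s.len = size_t.max;
}
```

After `corrupt` runs, the bounds check in `get` no longer protects anything, even though only the `@trusted` functions were supposed to need an audit.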

Change the meaning of `scope`:

- `scope` should apply to all types of data equally, not only built-in pointers and references. The most obvious use case for this is @safe interfacing with a C library that exposes handles as structs with an integer field but specifies undefined behavior if those handles are mismanaged. Not everything that is a manually-managed reference to something is a built-in pointer or reference.

- Non-immutable non-scope values may not be assigned to `scope` values. In particular, non-`immutable` `scope` member functions cannot accept a non-`scope` receiver. This is necessary, because otherwise you immediately break the aliasing guarantee DIP 1021 aims to introduce.

- `scope` on a struct does not imply its fields are `scope`. (It is perfectly fine to store a GC pointer within something with a scoped lifetime.)

- Fields can be `scope`. `scope` fields cannot be accessed through a non-`scope` receiver. The lifetime of `scope` fields ends when the lifetime of the enclosing object ends.

- `scope` has to be a type constructor.

- A non-`scope` pointer cannot be dereferenced if that would yield a `scope` value. (However, such a `scope` value can be moved somewhere else through a non-scope pointer.)
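
To illustrate the first point of this list: in current D, `scope` has no effect on a value without indirections, so an integer handle can escape freely. A minimal sketch, where `Handle`, `leaked` and `use` are hypothetical names:

```d
// Hypothetical C-style handle: just an integer, no pointers inside.
struct Handle
{
    int id;
}

Handle leaked; // thread-local module variable

// Accepted as @safe today: `scope` is ignored for parameters that have no
// indirections, so nothing prevents the handle from escaping.
void use(scope Handle h) @safe
{
    leaked = h;
}

@safe unittest
{
    use(Handle(3));
    assert(leaked.id == 3); // the "scoped" handle outlived the call
}
```

Under the rules proposed above, the assignment to `leaked` would presumably be rejected, because `scope` would apply to `Handle` just as it does to a pointer.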

Add borrowing rules:

- When copying a mutable `scope` value to another mutable `scope` value, access to the original value has to be disabled until the copy's lifetime ends.

- When copying a mutable `scope` value to a `const` `scope` value, the original value has to become `const` until the copy's lifetime ends.

- When copying a `const` `scope` value to a `const` `scope` value, the original value only has to outlive the copy.

- In particular, when taking the address of a value on the stack, the resulting `scope`d pointer will restrict access to that variable according to those rules until its lifetime ends. The `return` annotation can be used to track such assignments through function calls.

- For stack values, data flow analysis can be used to detect values that can be temporarily promoted to `scope`. Overloaded functions should prefer the `scope` overload.

Example: Library implementation of Unique pointers with @safe borrowing (`const`/`immutable`/`class` interactions left out for simplicity):

---
struct Unique(T){
    @trusted private scope T* payload;
    @disable this(this);
    auto borrow()@trusted return{ // (`return` refers to `ref this`)
        // potentially many references to unique pointer exist,
        // need runtime check
        // here, we'll just temporarily null out the Unique reference.
        static struct Borrowed{
            @trusted private scope Unique!T* self;
            @trusted private scope T* payload;
            @disable this(this);
            ~this()@trusted{ self.payload=payload; }
            return scope(T*) borrow()@trusted scope{
                return payload;
            }
            alias borrow this;
        }
        auto borrowed=payload;
        payload=null;
        return scope(Borrowed)(&this,borrowed);
    }
    scope(T*) borrow()@trusted scope return{
        // only one reference to unique pointer exists,
        // just return payload
        // note that while this does not actually return
        // a reference to `this`, we want the calling `@safe`
        // code to treat it as if it did, so that this can be
        // a `@trusted` function
        return payload;
    }
    ~this(){
        destroy((()@trusted=>payload)());
        ()@trusted{
            free(payload);
            payload=null;
        }
    }
    alias borrow this; // enable implicit borrowing
}
Unique!T makeUnique(T,A...)(A args){
    auto p=malloc(...);
    ...;
    return Unique!T(p);
}
---

---
void main(){
    auto p=makeUnique!int(3);
    ++*p; // ok, p is temporarily promoted to `scope` and `++` is
          // evaluated on a borrowed p.
    {
        scope Unique!int* q=[p].ptr;
        ++*p; // error, p is borrowed by q
    }
    ++*p; // ok, q went out of scope
    Unique!int* q=[p].ptr; // ok
    ++*p; // ok
    // however, this line used the non-scope overload of `borrow`, as
    // `p` can no longer be promoted to `scope`
    auto r=q; // ok
    ++**q; // ok
    static void foo(ref int x, Unique!int* y){
       assert((*y).borrow() is null); // reference disabled temporarily
       ++x; // ok
    }
    foo((*q).borrow(),r);
    foo((*r).borrow(),q);
}
---

Similar strategies work for manually-allocated arrays and reference counting.
For @safe reference counting for mutable payloads, there always needs to be a runtime check on borrow, similar to the first implementation of the `borrow` function above. This could be implemented by reserving a bit in the reference count for keeping track of such mutable borrows. To enable both const and mutable borrows, one would probably need two reference counts, one for normal references and one for const borrows. (Note that Rust uses similar runtime checks for safe reference counting.)
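
One possible reading of "reserving a bit in the reference count", sketched in current D. The `RcState` type and its member names are assumptions made for illustration; a real implementation would also have to own the payload and handle const borrows:

```d
// Hypothetical control block: the top bit of `bits` marks an active mutable
// borrow, the remaining bits hold the reference count.
struct RcState
{
    enum size_t borrowBit = cast(size_t) 1 << (size_t.sizeof * 8 - 1);
    size_t bits = 1; // one owner, no borrow

    @safe pure nothrow @nogc:

    size_t count() const { return bits & ~borrowBit; }
    bool borrowed() const { return (bits & borrowBit) != 0; }

    void retain() { ++bits; }

    void release()
    {
        assert(count > 0, "release without owner");
        --bits;
    }

    // Runtime check on borrow: refuse if a mutable borrow is already active.
    bool tryBorrowMut()
    {
        if (borrowed) return false;
        bits |= borrowBit;
        return true;
    }

    void endBorrowMut()
    {
        assert(borrowed, "no active borrow");
        bits &= ~borrowBit;
    }
}

@safe unittest
{
    RcState s;
    s.retain();                // a second reference appears
    assert(s.tryBorrowMut());  // first mutable borrow succeeds
    assert(!s.tryBorrowMut()); // a second one is refused at run time
    s.endBorrowMut();
    assert(s.count == 2);      // the count is unaffected by borrowing
}
```

Here a failed borrow is reported through the return value; asserting or throwing would be equally plausible choices.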

The main drawback of this proposal is that it doesn't separate control of lifetime from control of aliasing. Doing so, however, would require adding another type qualifier, and it has no precedent in Rust.

On 27.10.19 23:36, Timon Gehr wrote:
> - The problem with `@trusted` is that it has no defense against `@safe` code destroying its invariants or accessing raw pointers that are only meant to be manipulated by `@trusted` code. There should therefore be a way to mark data as `@trusted` (or equivalent), such that `@safe` code can not access it.

I know that the exact syntax isn't important (yet). Anyway: @safe code can call @trusted functions. So it would be odd if @safe code weren't allowed to access @trusted data. I think `@system` would be a better fit for restricting access.

On 28/10/2019 11:36 AM, Timon Gehr wrote:
> - The problem with `@trusted` is that it has no defense against `@safe` code destroying its invariants or accessing raw pointers that are only meant to be manipulated by `@trusted` code. There should therefore be a way to mark data as `@trusted` (or equivalent), such that `@safe` code can not access it.

This seems artificially restrictive for this proposal. However, we could instead split this off into its own DIP allowing attributes to act like visibility modifiers for variables. I may not be convinced that this is required, but following it through to completion would be a good idea if it's done at all.

> Change the meaning of `scope`:
>
> - `scope` should apply to all types of data equally, not only built-in pointers and references. The most obvious use case for this is @safe interfacing with a C library that exposes handles as structs with an integer field but specifies undefined behavior if those handles are mismanaged. Not everything that is a manually-managed reference to something is a built-in pointer or reference.

A primary use case for this type of system is system handles such as a window: it would force the handle to remain on a single thread and allow it to be deallocated automatically when done, replacing refcounting (which is perfectly OK but doesn't look great).

On Sunday, 27 October 2019 at 22:36:30 UTC, Timon Gehr wrote:
> - The problem with `@trusted` is that it has no defense against `@safe` code destroying its invariants or accessing raw pointers that are only meant to be manipulated by `@trusted` code. There should therefore be a way to mark data as `@trusted` (or equivalent), such that `@safe` code can not access it.

Would it be possible to accomplish this by putting the @trusted code and data in its own module, and using private? Assuming that the outstanding loopholes that allow bypassing private in @safe code are fixed, at least.

On 28.10.19 01:23, Paul Backus wrote:
> On Sunday, 27 October 2019 at 22:36:30 UTC, Timon Gehr wrote:
>> - The problem with `@trusted` is that it has no defense against `@safe` code destroying its invariants or accessing raw pointers that are only meant to be manipulated by `@trusted` code. There should therefore be a way to mark data as `@trusted` (or equivalent), such that `@safe` code can not access it.
>
> Would it be possible to accomplish this by putting the @trusted code and data in its own module, and using private? Assuming that the outstanding loopholes that allow bypassing private in @safe code are fixed, at least.

Not really, because one can always add a @safe function to that module. The official sales pitch for @safe says that you only have to audit @trusted functions, but not @safe functions, to locate all memory safety issues in your program.

On 27.10.19 23:36, Timon Gehr wrote:
>     ~this(){
>         destroy((()@trusted=>payload)());
>         ()@trusted{
>             free(payload);
>             payload=null;
>         }
>     }

Of course, this should be:

    ~this(){
        destroy((()@trusted=>payload)());
        ()@trusted{
            free(payload);
            payload=null;
        }();
    }
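
The correction hinges on the immediately invoked `@trusted` lambda idiom: the trailing `()` is what actually calls the lambda. A small self-contained sketch in current D, with `roundTrip` as a made-up name:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical helper: stores a value in C-allocated memory and reads it back.
int roundTrip(int value) @safe
{
    // Expression form: the lambda is called right away and yields the pointer.
    int* p = (() @trusted => cast(int*) malloc(int.sizeof))();
    if (p is null) return value; // allocation failed, nothing to free

    *p = value;
    immutable result = *p;

    // Statement form: without the trailing `()` this lambda would never be
    // called and the memory would not be freed.
    () @trusted { free(p); p = null; }();

    return result;
}

@safe unittest
{
    assert(roundTrip(7) == 7);
}
```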

On Sunday, 27 October 2019 at 22:36:30 UTC, Timon Gehr wrote:
> [snip]
>
> The main drawback of this proposal is that it doesn't separate control of lifetime and control of aliasing, doing so would however require adding another type qualifier and does not have precedent in Rust.

I'm a little confused by this. What type qualifier would need to be added and having what properties?

Thank you for posting this. I think it's the 4th scheme so far for D! We certainly have an embarrassment of riches. Personally, I've been making progress on a prototype of my scheme. It bears a lot of resemblance to yours.

On Monday, 28 October 2019 at 03:42:16 UTC, Walter Bright wrote:
> Thank you for posting this. I think it's the 4th scheme so far for D! We certainly have an embarrassment of riches.
>
> Personally, I've been making progress on a prototype of my scheme. It bears a lot of resemblance to yours.

So @live isn't a thing anymore? Or did I misread this:

> In particular, @live is a dead end, because: ...
