November 20, 2019
On 11/20/2019 4:16 AM, Timon Gehr wrote:
> - What do you want to achieve with borrowing/ownership in D?

I want to prevent the following common issues with pointer code:

1. use after free
2. neglecting to free
3. double free

4. safe casting to immutable
5. safe conversion to/from a shared pointer
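
A minimal sketch of items 1 to 3, assuming plain `malloc`/`free` from `core.stdc.stdlib`. The names `makeInt` and `readThenFree` are hypothetical and exist only for illustration. The function below is the correct pattern, and the commented-out lines mark where each mistake would creep in. As stated above, the aim of `@live` is for the compiler to reject those mistakes. The attribute is left off here so the sketch does not depend on any particular compiler version or switch.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: allocate a single int with plain malloc.
int* makeInt(int value)
{
    auto p = cast(int*) malloc(int.sizeof);
    assert(p !is null, "allocation failed");
    *p = value;
    return p;
}

// Correct usage: one allocation, one read, exactly one free.
int readThenFree()
{
    int* p = makeInt(42);
    int v = *p;   // read while p still owns the memory
    free(p);      // ownership ends here
    // *p = 1;    // (1) use after free
    // free(p);   // (3) double free
    return v;
    // Leaving out the free(p) above would be (2) neglecting to free.
}
```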


> - What can already be done with @live? (Ideally with runnable code examples.)

The test cases included with the PR should give an idea.


> - How will I write a compiler-checked memory safe program that uses varied allocation strategies, including plain malloc,

I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md.

> tracing GC

I don't see any value OB adds to tracing GC. A GC is an entirely different solution.

> and reference counting?

The main difficulty (as you pointed out) with RC is holding on to an interior pointer via one reference while another reference frees what it's pointing to.
There's been a lot of progress with this with the addition of DIP25, DIP1000, and DIP1021. This further improves it by making the protections transitive.
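
The interior-pointer hazard can be sketched as follows. `RcInt`, `freedPayloads`, and `interiorPointerOutlivesPayload` are hypothetical names, not library types, and the counter is there only so the sketch can observe the free without dereferencing a dangling pointer.

```d
import core.stdc.stdlib : malloc, free;

size_t freedPayloads; // counts payloads released by RcInt's destructor

// Hypothetical minimal reference-counted int, for illustration only.
struct RcInt
{
    private int* payload;
    private size_t* count;

    static RcInt make(int value)
    {
        RcInt r;
        r.payload = cast(int*) malloc(int.sizeof);
        r.count = cast(size_t*) malloc(size_t.sizeof);
        assert(r.payload !is null && r.count !is null, "allocation failed");
        *r.payload = value;
        *r.count = 1;
        return r;
    }

    // Postblit: every copy adds a reference.
    this(this) { if (count) ++*count; }

    // Destructor: the last reference frees the payload.
    ~this()
    {
        if (count && --*count == 0)
        {
            free(payload);
            free(count);
            ++freedPayloads;
        }
    }

    // Hands out an interior pointer; nothing ties its lifetime to the count.
    int* get() { return payload; }
}

bool interiorPointerOutlivesPayload()
{
    auto a = RcInt.make(5);
    int* interior = a.get();          // interior pointer escapes the wrapper
    immutable before = freedPayloads;
    a = RcInt.init;                   // last reference dropped, payload freed
    // `interior` is still non-null but now dangles; it is not dereferenced.
    return interior !is null && freedPayloads == before + 1;
}
```

Nothing in the types stops `interior` from being used after the assignment, which is the gap the lifetime checks are meant to close.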


> Right now, the only use I can see for @live is as an incomplete and unsound linting tool in @system code.

The unsoundness is in dealing with thrown exceptions, which I have some ideas on how to deal with, and conflating different allocators, which I don't have a good idea on.

> It doesn't make @safe code any more expressive. To me, added expressiveness in @safe code is the whole point of a borrowing scheme.

Technically, @live works by adding restrictions, not expressiveness.
November 21, 2019
On Wed, Nov 20, 2019 at 6:20 PM Walter Bright via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On 11/19/2019 11:49 PM, Manu wrote:
> > I haven't read thoroughly yet, although I have been following along
> > the way and understand the goals... but I really can't not say
> > straight up that I think `@live` is very upsetting to me.
> > I hate to bike-shed, but at face value `@live` is a very unintuitive
> > name. It offers me no intuition what to expect, and I have at no point
> > along this process had any idea what it means or why you chose that
> > word, and I think that's an immediate red flag.
>
> The "live" refers to the data flow analysis which discovers which pointers are "live" or "dead" at any point in the flow graph. This is critical for what O/B is trying to do.
>
> `@ownerBorrow` just seems a little awkward :-)
>
> Andrei proposed `@live`, and I like it. It's short, sweet, and sounds good.
>
> > Are you really certain there's no way to do this without adding yet more attributes?
>
> We'll never be able to compile C-like code if we force an O/B system on all the code. There has to be a way to distinguish, like what `pure` does. D would be unusable if everything had to be `pure`. My understanding of O/B is you're going to have to redesign code and data structures to use it effectively. I.e. it'll break everything. Rust has a powerful enough marketing machine to convince people that redesigning your programs is a Good Thing (tm) and perhaps it is, but we don't have the muscle to do that.

Is there a path perhaps where you only attribute parameters instead of the whole function? We already have `scope`, `return`, etc on parameters... ?

> > It would be better if an attribute was not required
> > for this... we're already way overloaded in that department.
> > Timon appeared to have a competing proposal which didn't add an
> > attribute. I never saw your critique of his work, how do your relative
> > approaches compare?
>
> I don't have a good understanding of Timon's work yet.

Okay. I'd like to see a rigorous comparison somewhere with the key differences called out.
November 21, 2019
On 20.11.19 23:45, Walter Bright wrote:
> On 11/20/2019 4:16 AM, Timon Gehr wrote:
>> - What do you want to achieve with borrowing/ownership in D?
> 
> I want to prevent the following common issues with pointer code:
> 
> 1. use after free
> 2. neglecting to free
> 3. double free
> ...

GC prevents those, and those problems cannot appear in @safe code.
@live doesn't prevent them at the interface between @live and non-@live code.

> 4. safe casting to immutable
> 5. safe conversion to/from a shared pointer
> ...

What about user-defined types? What about allowing internal pointers into manually-managed memory to be exposed in @safe code?

> 
>> - What can already be done with @live? (Ideally with runnable code examples.)
> 
> The test cases included with the PR should give an idea.
> ...

Well, they are compiler tests, not use cases.

> 
>> - How will I write a compiler-checked memory safe program that uses varied allocation strategies, including plain malloc,
> 
> I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md.
> ...

I.e., it is not planned that we will be able to write such programs?

>> tracing GC
> 
> I don't see any value OB adds to tracing GC. A GC is an entirely different solution.
> ...

The worry is that @live _removes_ value from tracing GC. If every pointer owns its data, how do I express a pointer to GC-owned memory? Do I need to write a "smart" pointer data type that's just a shallow wrapper for a GC pointer? Also, if I do that, how do I make sure different GC-backed pointers don't lend out the same owning pointer at the same time?

>> and reference counting?
> 
> The main difficulty (as you pointed out) with RC is holding on to an interior pointer via one reference while another reference frees what it's pointing to.

Yup.

> There's been a lot of progress with this with the addition of DIP25, DIP1000, and DIP1021. This further improves it by making the protections transitive.
> ...

As far as I can tell, @live doesn't bring us closer to @safe RC, because it applies to built-in pointers instead of library-defined smart pointers. I think this is completely backwards. Every owning pointer also needs to know the allocation strategy. Therefore, allowing built-in pointers to own their memory is vastly less useful than allowing library-defined smart pointers to do so.

> 
>> Right now, the only use I can see for @live is as an incomplete and unsound linting tool in @system code.
> 
> The unsoundness is in dealing with thrown exceptions, which I have some ideas on how to deal with,

That's not the only problem. @live code can call non-@live code and obtain pointers from such code. Therefore, @safe @live code is useless, as all accessible pointers mustn't be owning anyway.

> and conflating different allocators, which I don't have a good idea on.
> ...

Do the checks for library-defined smart pointers instead of built-in pointers. Built-in pointers shouldn't care about lifetime nor allocator.

>> It doesn't make @safe code any more expressive. To me, added expressiveness in @safe code is the whole point of a borrowing scheme.
> 
> Technically, @live works by adding restrictions, not expressiveness.

The point of adding restrictions is to gain expressiveness. It's why type systems are a good idea. In this case, the point of borrowing restrictions should be to enable @safe code to manipulate interior pointers into manually-managed data structures.
November 20, 2019
On 11/20/2019 2:57 PM, Manu wrote:
> Is there a path perhaps where you only attribute parameters instead of
> the whole function? We already have `scope`, `return`, etc on
> parameters... ?

Perhaps. Seems rather tedious, though.
November 20, 2019
On 11/20/2019 5:13 PM, Walter Bright wrote:
> On 11/20/2019 2:57 PM, Manu wrote:
>> Is there a path perhaps where you only attribute parameters instead of
>> the whole function? We already have `scope`, `return`, etc on
>> parameters... ?
> 
> Perhaps. Seems rather tedious, though.

And less auditable. `scope` and `return` are backed up with compiler checks when they aren't used. `@live` is conceptually different - it adds checks, it does not compensate for added checks.
November 21, 2019
On Wednesday, 20 November 2019 at 06:57:55 UTC, mipri wrote:
>snip

Wow, I never even attempted to compile dmd before despite how long I've hung around here... Never really appreciated just how fast compilation speed is. The entire compiler finishes in 9.4 seconds. Puts C/C++-based projects to shame...
November 21, 2019
On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright wrote:
>
> I originally was going to add null to the data flow analysis. But I realized it would be rather useless:
>
>   T* foo();
>
>   T* p = foo(); // is p null or not?
>
> Very quickly, the flow analysis would drop into "dunno if it is null or not" so it just won't be worth much.

You might only have to maintain this "undefined state" for a short period
of time:

T* foo();
T* p = foo();  //Undefined state

if(p)
{
    //We opened Schrodinger's pointer and know it's not null anymore.
}
else
{
    //We know it's null here.
}

//Back to undefined state - we shouldn't deref p out here.

And honestly, knowing whether a pointer is in an undefined state is still a very
useful piece of information that the flow analysis would provide.
Dereferencing a pointer of unknown provenance should be an error in @live code,
no different than null.
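
D does not track null state in the type system today, so the check-before-dereference discipline can only be approximated in library code. A minimal sketch, assuming a hypothetical wrapper `Unchecked` and hypothetical helpers `withChecked` and `valueOrDefault` (none of these are Phobos APIs):

```d
// Hypothetical wrapper (not a Phobos type). `raw` is private, so code in
// other modules cannot dereference it directly.
struct Unchecked(T)
{
    private T* raw;
}

// The only public access path: the caller supplies one handler for the
// non-null case and one for the null case.
auto withChecked(alias onValue, alias onNull, T)(Unchecked!T p)
{
    return p.raw !is null ? onValue(*p.raw) : onNull();
}

// Example caller: read the value, or fall back to -1 when the pointer is null.
int valueOrDefault(Unchecked!int p)
{
    return p.withChecked!((int v) => v, () => -1);
}
```

Because `private` in D is module-level, this only enforces the check for callers in other modules. Flow analysis in the compiler would not have that limitation.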

>
> In order to make non-null checking actually work, the language semantics would likely need to change to make:
>
>    T*    a non-null pointer
>    T?*   an optionally null pointer
>
> or something like that. Two different pointer types would need to exist.

Maybe. As an alternative to that syntax, consider a pointer with a "notnull" attribute (storage class?) and a corresponding "notnull" function attribute for the return type.

I'd expect runtime checks to be inserted in debug mode for the function return and removed in release mode.

>
> Something like this is orthogonal to what @live is trying to do, so I put it on the shelf for the time being.

It's orthogonal, but very useful for correctness. I'm anxious to see what comes of it...

...but hopefully not with ?*s scattered everywhere :)
November 21, 2019
On Thursday, 21 November 2019 at 03:47:21 UTC, Doc Andrew wrote:
>>    T*    a non-null pointer
>>    T?*   an optionally null pointer

Yes, optional pointers would be amazing, even though it breaks the syntax in most existing modules that use pointers.

> alternative to that syntax, consider a pointer with a "notnull" attribute (storage class?) and a corresponding "notnull" function attribute for the return type.

Ideally, the non-nullable pointer is the default, and the nullable pointer should be the oddball that requires explicit annotation.

Thus the main benefit of the "notnull" attribute would be preserving existing D, certainly a huge benefit.

Reason: I'd estimate that 80% to 95% of all pointer code can be written with non-nullable pointers. No proof, only a hunch. Many functions already return non-null rather than maybe-null; the type system merely doesn't know it yet. Even if we return a null pointer, our caller, on receiving the maybe-null pointer from us, tends to check it immediately, and thus either converts it to non-nullable right away or returns early.

Whenever the common case requires annotation, we foster annotation salad:

    int notnull* foo() const pure @safe nothrow @nogc

With class references, non-nullability is even more common than with pointers. Typically, null class references appear only as member variables during construction of another class:

    class A { /* ... */ }
    class B {
        A a;
        this() {
            /* here is the only time that a is ever null */
            a = new A();
        }
    }

> It's orthogonal, but very useful for correctness.

Yes, it's entirely orthogonal. :-)

@live is conceptually different enough from non-nullability that it feels best to solve them separately. @live feels closer to @safe, and even those two turned out to be independent.

Exciting topic, hard to fit onto existing D codebases.

-- Simon
November 21, 2019
Am Thu, 21 Nov 2019 03:47:21 +0000 schrieb Doc Andrew:

> On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright wrote:
>>
>> I originally was going to add null to the data flow analysis. But I realized it would be rather useless:
>>
>>   T* foo();
>>
>>   T* p = foo(); // is p null or not?
>>
>> Very quickly, the flow analysis would drop into "dunno if it is null or not" so it just won't be worth much.
> 
> You might only have to maintain this "undefined state" for a short period of time:
> 
> T* foo();
> T* p = foo();  //Undefined state
> 
> if(p)
> {
>      //We opened Schrodinger's pointer and know it's not null anymore.
> }
> else {
>      //We know it's null here.
> }
> 

TypeScript does that. In addition, it disallows dereferencing if the pointer is in the undefined state. So it forces you to check a pointer before you dereference it. I always found that to be quite useful.

-- 
Johannes
November 21, 2019
On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright wrote:
> In order to make non-null checking actually work, the language semantics would likely need to change to make:
>
>    T*    a non-null pointer
>    T?*   an optionally null pointer
>
> or something like that. Two different pointer types would need to exist.

If you're going to go with introducing new meanings like that, why not `&T` for the non-null pointer (to match Rust usage and the implicit comparison with taking the address of a stack variable)?

That would make it easier to introduce non-null pointers as opt-in without breaking existing code, and avoid any unintuitive differences with C APIs.

[By the way, isn't my bikeshed a lovely colour this season? :-)]

> Something like this is orthogonal to what @live is trying to do, so I put it on the shelf for the time being.

Sure, it makes sense to split the data flow analysis from the question of how any given pointer has its value set.

That said, it intuitively feels like a non-null pointer type might want to be implicitly @live.  One could avoid imposing that, but it's hard to think of a situation where one would want the one without the other.

The one thing that Rust does with its references, which I find a little bit _too_ limiting, is the "single mutable reference" constraint.  They do this by default in order to avoid various multithreading failure cases, but it blocks some of the easy design options one could use in single-threaded code.

_As a default_ that probably makes sense, but it would be good to have an opt-out for code designs that are not going to have those failure cases.