Prototype of Ownership/Borrowing System for D (page 6) - D Programming Language Discussion Forum

November 22, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Walter Bright
in reply to mipri

I'm going to suggest that further discussion of null be in a new thread; this thread is about the prototype O/B system.

November 22, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Doc Andrew
in reply to mipri

On Friday, 22 November 2019 at 06:43:01 UTC, mipri wrote:
>
> I'd sell it not as "it forces you to test" but "it reliably
> promotes nullable types to non-nullable types when this is safe"
>
> Kotlin example:
>
>   fun Int.test() = println("definitely not null")
>   fun Int?.test() = println("could be null")
>
>   fun main() {
>       val n: Int? = 10
>       n.test()
>       if (n != null)
>           n.test()
>   }
>
> Output:
>
>   could be null
>   definitely not null
>

That's pretty cool.

> So if you test a nullable type for null, everything else in the
> control path works with a non-nullable type. The effect on the
> program is that instead of having defensive tests throughout your
> program, you can have islands of code that definitely don't have to
> test, and only have the tests exactly where you need them. If you
> go too far in not having tests, the compiler will catch this as an
> error.
>
> I've never used TypeScript, but Kotlin's handling of nullable types
> is nice to the point I'd rather use the billion dollar mistake than
> an optional type.

Yes, after all, pointers are already an "optional" type, where null stands in
for None (or whatever).

>
> IMO what's really exciting about this though is that the compiler
> smarts that you need for it are *just short* of what you'd need for
> refinement types, so instead of just
>
>   T?  <-- it could be T or null
>
> you can have
>
>   T(5) <-- this type is statically known to be associated with the
>            number 5.
>
> And functions can be defined so that it's a compile-time error to
> pass a value to them without, somewhere in the statically known
> control path, doing something that associates them with 5.
>
> Similarly to how you can't just hand a char[4] to a D function with
> a signature that's asking for a char[5] - but for any type, for any
> purpose, and with automatic promotion of a char[???] to char[5] in
> the same manner that Kotlin promotes Int? to Int
>
> Add a SAT solver, and "this array is associated with a number that
> is definitely less than whatever this size_t is associated with",
> and you get compile-time guarantees that you don't need bounds
> checking for definite control paths in your program. That's cool.
> ATS does that: it's a compile-time error in ATS to look at argv[2]
> without checking argc.
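
The static-array comparison in the quoted text can be tried in D as it stands today. A minimal sketch, assuming made-up names (`takesFive`, `dyn`, `fixed`): the length is part of the static array type, so a mismatch is rejected at compile time, while a runtime-length slice has to pass a hand-written length check before it can be treated as a `char[5]`. Unlike Kotlin's promotion of `Int?` to `Int`, nothing here is promoted automatically.

```d
// Hypothetical example: a function that accepts exactly five chars.
size_t takesFive(char[5] buf)
{
    return buf.length;
}

void main()
{
    char[4] four = "abcd";
    char[5] five = "abcde";

    // The length is part of the static array type: char[4] is not char[5].
    static assert(!__traits(compiles, takesFive(four)));
    assert(takesFive(five) == 5);

    // A slice only knows its length at run time, so it is checked by hand
    // and then copied into a fixed-size array.
    char[] dyn = "hello".dup;
    if (dyn.length == 5)
    {
        char[5] fixed = dyn[0 .. 5];
        assert(takesFive(fixed) == 5);
    }
}
```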

Stand by for some more discussion on formal verification... :)

I'm working on a program I'm calling ProveD (for now, at least) to do source-to-source translation using libdparse to generate Why3ML code and verification conditions that can be proven with Why3 and whatever SMT solvers you have installed. I don't have much to show yet, but I should have a PoC that others can hack on in a few months.

A lot of the effort I've spent on it so far is fighting with Ubuntu's broken Why3/Z3/Alt-Ergo/CVC4 packages (pro tip: don't use them) and goofing around with the OCaml package manager OPAM to get everything set up to where I can start running proofs.

I was going to try and get a little further before starting a thread for discussion, but if it comes up as a result of "The null Thread", that might be a good place to start.

-Doc

November 22, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Doc Andrew
in reply to Robert M. Münch

On Friday, 22 November 2019 at 09:10:43 UTC, Robert M. Münch wrote:
>
> So, the name is related to how it's done, not what it gives to the user. That's like selling a thing and stating: "Look this was milled, turned etc." but not telling the customer what it does.
>
>> `@ownerBorrow` just seems a little awkward :-)
>
> Well, why not make it catchy: @borrow
>
> I'm sure 99.99% of all people would get a hint what this is about.

I always kind of thought @live would be a transitional thing until we figured out safe-by-default, and ownership/borrowed pointers would just be "the way."

November 22, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by jmh530
in reply to Doc Andrew

On Friday, 22 November 2019 at 14:48:11 UTC, Doc Andrew wrote:
> [snip]
>
> I always kind of thought @live would be a transitional thing until we figured out safe-by-default, and ownership/borrowed pointers would just be "the way."

While I like the idea of a safe-by-default compiler flag and I'm warming to @live as described by this thread, I'm not sure I'm convinced of the idea of @safe or @live being the default. The nice thing about @system being the default is that anyone from another language can immediately start doing stuff.

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Timon Gehr
in reply to Walter Bright

On 21.11.19 12:29, Walter Bright wrote:
> On 11/20/2019 4:59 PM, Timon Gehr wrote:
>> On 20.11.19 23:45, Walter Bright wrote:
>>> On 11/20/2019 4:16 AM, Timon Gehr wrote:
>>>> - What do you want to achieve with borrowing/ownership in D?
>>>
>>> I want to prevent the following common issues with pointer code:
>>>
>>> 1. use after free
>>> 2. neglecting to free
>>> 3. double free
>>> ...
>>
>> GC prevents those,
> 
> That's right. The GC is memory safe.
> 
>> and those problems cannot appear in @safe code.
> 
> @safe code has to call free() some time when manually managing memory.
> ...

@safe code cannot call free, because free is not @safe. In particular, I'm not supposed to free a GC pointer or a pointer into the static data segment. @live is useless in @safe code.
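
A minimal sketch of that first claim, assuming the C `free` binding from `core.stdc.stdlib`: the compiler itself can be asked whether an @safe function literal may call it.

```d
import core.stdc.stdlib : free, malloc;

void main() @system
{
    // An @safe function literal that calls free() is rejected...
    static assert(!__traits(compiles, () @safe { free(null); }));

    // ...while the same call is accepted in @system code.
    static assert(__traits(compiles, () @system { free(null); }));

    void* p = malloc(16);
    free(p); // allowed here: the programmer, not the compiler, vouches for p
}
```

So any @safe code that ends up releasing malloc'd memory has to go through a @trusted wrapper that someone has checked by hand.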

If you want @live to mean: "do these additional checks", that is fine, if people indeed want to write @system code with those checks without a guarantee that their code is safe if the checks pass.

>> @live doesn't prevent them at the interface between @live and non-@live code.
> 
> @live relies on any function it calls obeying @live conventions for its interface. This allows incremental adoption of @live code.
> ...

What it allows is one of the following:

1. a split of the language in two parts that cannot interoperate safely, in a language that claims to support memory safety.

2. @live checks provide no guarantees because they are optional and you can't rely on your callers to obey your desired borrowing/ownership interface.

If you want incremental adoption, just add the missing language features, and let them interoperate with existing code. New code will be written to take advantage of the new features. Don't change the meaning of existing language features based on a function attribute.

> 
>> What about user-defined types? What about allowing internal pointers into manually-managed memory to be exposed in @safe code?
> 
> Exposing an internal pointer in @live code is considered "borrowing" from the root of its container.
> ...

It makes no sense to let the caller decide. The entity exposing the internal pointer should say whether it is borrowed out or not. The data structure manages its invariants, not the caller.

> 
>>>> - How will I write a compiler-checked memory safe program that uses varied allocation strategies, including plain malloc,
>>>
>>> I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md.
>>> ...
>>
>> I.e., it is not planned that we will be able to write such programs?
> 
> I believe I covered that in ob.md. What am I missing?
> ...

You have previously attacked and dismissed my _sound_ designs for ostensibly not being checkable by the compiler (even though they are). I don't understand why this is not a concern for _your_ designs, which freely admit to being uncheckable and unsound. You can't say "@safe means memory safe and this is checked by the compiler" and at the same time "@live @safe relies on unchecked conventions to ensure memory safety across the @live boundary".

> 
>> The worry is that @live _removes_ value from tracing GC. If every pointer owns its data, how do I express a pointer to GC-owned memory? Do I need to write a "smart" pointer data type that's just a shallow wrapper for a GC pointer? Also, if I do that, how do I make sure different GC-backed pointers don't lend out the same owning pointer at the same time?
> 
> @live does not distinguish a GC-allocated raw pointer from a malloc-allocated raw pointer. This means you'll be able to write generic @live code that can handle both equally.

I don't think this is the case. The GC-allocated raw pointer allows aliasing while @live does not allow aliasing. They have incompatible aliasing restrictions. It's like having a mutable and an immutable reference to the same memory location.

> Of course, if all you're using is the GC, you won't need to bother with @live at all.
> ...

I hope _nobody_ will have to bother with @live, but if they will, it will inevitably infect libraries and suddenly, yes, I will have to deal with it. Also, the GC is not _all_ I am using. I am using the GC when that makes sense and I am not using the GC when using the GC does not make sense. For example, I have code that is unsafe because it uses compile-time reflection to gain access to the internal array backing std.container.Array, because std.container.Array has no safe way to lend out that array to a caller, so it does not do it at all. Note that for this use case, it would be enough if there was some way for std.container.Array to state to the compiler that no invalidating operation may be called while the reference to the internal array is borrowed out. If std.container.Array can rely on all @safe code being checked that way, the function that borrows out the internal reference can be @safe. If it can't rely on that, no such @safe function can be written.

> 
>>> There's been a lot of progress with this with the addition of DIP25, DIP1000, and DIP1012. This further improves it by making the protections transitive.
>> As far as I can tell, @live doesn't bring us closer to @safe RC, because it applies to built-in pointers instead of library-defined smart pointers. I think this is completely backwards. Every owning pointer also needs to know the allocation strategy. Therefore, allowing built-in pointers to own their memory is vastly less useful than allowing library-defined smart pointers to do so.
> 
> Nothing about @live stops programmers from using library-defined smart pointers.

What about the fact that it is _optional_ for a /caller/ to respect the smart pointer's desired ownership restrictions? That's very restrictive for the smart pointer! It won't be able to provide @safe borrowing functionality.

> The smart pointer would be the owning pointer,

Why do you _need_ an unsound @live construct to let a smart pointer be an owning pointer?

> and if it exposed an internal pointer that internal pointer would be treated as "borrowing" from the owner and further access to the smart pointer would be denied until the borrower's last use.
> ...

Great. I want that in @safe code _if the smart pointer requests it_. No @live needed.

> 
>>> and conflating different allocators, which I don't have a good idea on.
>> Do the checks for library-defined smart pointers instead of built-in pointers. Built-in pointers shouldn't care about lifetime nor allocator.
> 
> People use raw pointers, and that isn't going away (because performance).

How about because the `new` operator returns pointers?

If there is a difference in performance between a T* and a `struct { T* payload; }` that's an issue with the backend and/or the ABI.

> Telling people "just use smart pointers" is like telling C++ people to do that.

C++ does not have @safe!

> It doesn't work reliably.
> ...

It works in @safe code because the smart pointer will be the only way for the @safe code to manually manage memory. E.g.:

struct MP(T){ // owning, malloc-backed pointer
    private T* payload;
    @disable this();
    @disable this(T*); // can't construct
    @disable this(this); // can't copy (move-only type; therefore track
                         // this type like you track pointers in @live now)
    pragma(inline,true){
        private @system ~this(){} // only current module can drop
                                  // values, in @system or @trusted code
        ref T borrow()return{ return *payload; }
    }
    // can borrow out internal pointer
    alias borrow this;
}

@safe MP!T malloc(); // type tracks allocator
@trusted void free(MP!T); // @safe because pointer is known to be unique and malloc'd

In order to call the safe free function you have to pass a pointer that was allocated with the matching smart pointer type. Note that by "smart" in this case I just mean it knows about the underlying allocator and it prevents the pointer from being leaked. There is no runtime behavior, it's all in the types.

I.e., this pointer type would use language features to precisely tell the compiler what restrictions its users have to obey. In this case, they may not invent new MP's, they may not copy MP's and they have to explicitly dispose of the MP. This is essentially what @live now does for all raw pointers, but maybe some data types only need a subset of the restrictions. In particular, raw pointers in @safe code need none of those restrictions.
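
A runnable approximation of the idea, as a sketch only: `Owned`, `make` and `dispose` are made-up names, and this version is weaker than the `MP` design above because it does not forbid default construction, does not force disposal, and does not stop a borrowed reference from outliving the owner. It only shows the parts current D can already express: copying is rejected at compile time and ownership is handed over with an explicit move.

```d
import core.stdc.stdlib : free, malloc;
import std.algorithm.mutation : move;

/// Hypothetical move-only owner of a single malloc'd T.
struct Owned(T)
{
    private T* payload;

    @disable this(this); // copying an owner is a compile-time error

    /// Lend out the pointee by reference.
    ref T borrow() { return *payload; }
}

/// Allocate with malloc; the return type records the allocator.
Owned!T make(T)(T value) @trusted
{
    Owned!T owner;
    owner.payload = cast(T*) malloc(T.sizeof);
    assert(owner.payload !is null, "malloc failed");
    *owner.payload = value;
    return owner; // moved out, never copied
}

/// Consume the owner and release its memory.
void dispose(T)(Owned!T owner) @trusted
{
    free(owner.payload);
}

void main()
{
    auto p = make(41);
    p.borrow() += 1;
    assert(p.borrow() == 42);

    // A second owner of the same memory cannot be created by copying.
    static assert(!__traits(compiles, { auto q = p; }));

    dispose(move(p)); // ownership is handed over explicitly
}
```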

There are potential issues if you try to borrow from some entity that is potentially referenced from somewhere else, so that should be disallowed. To bridge the gap, you can implement runtime checks, like Rust's cell does: https://doc.rust-lang.org/std/cell/
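
A rough sketch of what such a runtime check could look like in D. `Checked` and `withBorrow` are invented for this illustration and are not an existing library API; the type keeps a flag and refuses a second borrow while the first is still active, which is the general idea behind Rust's `RefCell`, much simplified.

```d
/// Hypothetical wrapper that checks exclusive borrowing at run time.
struct Checked(T)
{
    private T value;
    private bool borrowed;

    /// Runs `use` with exclusive access; returns false if already borrowed.
    bool withBorrow(scope void delegate(ref T) use)
    {
        if (borrowed)
            return false;
        borrowed = true;
        scope (exit) borrowed = false; // released even if `use` throws
        use(value);
        return true;
    }
}

void main()
{
    Checked!int c;
    bool innerSucceeded = true;

    const outerSucceeded = c.withBorrow(delegate(ref int v) {
        v = 7;
        // A nested borrow of the same cell is refused at run time.
        innerSucceeded = c.withBorrow(delegate(ref int w) { w = 8; });
    });

    assert(outerSucceeded);
    assert(!innerSucceeded);
    assert(c.value == 7);
}
```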

This is discussed in my ownership/borrowing post. (Note that my ownership/borrowing post assumes that `scope` pointers and `ref` pointers cannot alias. Aliasing restrictions could also be moved into a separate attribute for backwards-compatibility and better expressiveness.)

> The checks on smart pointers can be done with RAII and reference counting, and the dips already implemented.
> ...

Not sure what this is referring to.

> 
>> The point of adding restrictions is to gain expressiveness. It's why type systems are a good idea. In this case, the point of borrowing restrictions should be to enable @safe code to manipulate interior pointers into manually-managed data structures.
> 
> They can do that now as long as the container only exposes interior pointers as 'ref'.
> 

Not really, because aliasing is not considered. @live just assumes it does not exist and non-@live can introduce it arbitrarily. @live is all checks, no derived guarantees.

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by mipri
in reply to Timon Gehr

On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
> If you want @live to mean: "do these additional checks", that
> is fine, if people indeed want to write @system code with those
> checks without a guarantee that their code is safe if the
> checks pass.

It's a compile-time guarantee that a class of error can't occur
within the code so marked. If a non-@live caller makes use of
some @live code, and introduces his own errors, they're his
own error. @live remains in the language as a tool that the
calling code might use to gain the same protection. As it
expands in a code base, so shrink the places where these
errors may still be found.

There's not a choice between absolute perfect guarantees (with
some other design) vs. a complete absence of guarantees (with
this one). The choice is between language defaults, in how
easily the guarantees can be defeated, in what you can expect
of other people's code. This is why everyone doesn't say that
Rust has no real safety because you can always drop down an
unsafe {} block and do whatever you want, and why Rust hasn't
gotten a reputation as an unsafe language even as bugs are
occasionally found in the unsafe blocks of its standard
library.  unsafe {} isn't the default; since you have to opt
into errors it's very easy to avoid opting in; and you can
expect that other people will make sparing use of unsafe {}.

In D, @system is the default and you have to opt in to various
protections, so it's a very different language, but @live is in
line with that language. I think it's also in line with any
language that still wants to be able to make reasonable use of
foreign code written in unsafe languages--a famous source of
annoyance for Rust.

> I hope _nobody_ will have to bother with @live,

It's really hard to see you as only having sincere technical
objections to @live after reading this. Either @live does no
good as you say you think, and enthusiasm for alternatives
to it will persist, or @live will have an effect and this
enthusiasm will wane. I don't think there's a future of "@live
is enthusiastically embraced even though it doesn't help at
all."

> but if they
> will, it will inevitably infect libraries and suddenly, yes, I
> will have to deal with it.

How will you have to deal with it? Code can't require that
their callers have @live. It's your preferred alternative that
would necessarily entangle users of libraries.

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by mipri
in reply to mipri

On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
> It's really hard to see you as only having sincere technical
> objections to @live after reading this.

Even after rewriting this so many times, I reckon it still
won't be received well.

So:

I'm actually very interested in criticisms of @live (I hope
more people are testing it than is apparent from the posts
here), and even of alternatives that won't happen. But I don't
have a four-year degree with a major of "the last 300 years of
your bitter disputes about language design", and every single
post of yours has required that. (I still have no idea what you
could possibly mean with a remark like "It doesn't make
@safe code any more expressive.")

I realize it's tiresome to repeat things that you think are
already established, though. Please feel free to disregard my
input.

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Jab
in reply to mipri

On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
> On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
>> If you want @live to mean: "do these additional checks", that
>> is fine, if people indeed want to write @system code with those
>> checks without a guarantee that their code is safe if the
>> checks pass.
>
> It's a compile-time guarantee that a class of error can't occur
> within the code so marked. If a non-@live caller makes use of
> some @live code, and introduces his own errors, they're his
> own error. @live remains in the language as a tool that the
> calling code might use to gain the same protection. As it
> expands in a code base, so shrink the places where these
> errors may still be found.

You can have use-after-free bugs happen in @live code, as a result of what is passed into @live functions.

import core.stdc.stdlib : free, malloc;

void zoo() {
    int* p = cast(int*)malloc( int.sizeof * 2 );
    foo(p, p + 1); // both arguments point into the same allocation
}

@live void foo( int* p, int* q ) {
    free(p);

    *q = 10; // use after free.
}

The very error that @live is supposed to stop happens inside the function on its watch. Sure, you can pass a garbage pointer to @safe code, but the garbage pointer is created in code that isn't marked @safe. Here it happens inside @live itself.

I think the more dangerous thing is that it appears it would complain that "q" is a dangling pointer, which I assume it would want you to free(). But that may not be a pointer you want to free. So the compiler will effectively be forcing you to create an error, or your code won't compile.

> I'm actually very interested in criticisms of @live (I hope more people are testing it than is apparent from the posts here),

I'm not too keen on testing it now. The only reason I'd see to test it is to find flaws in the system, and the doc provided illustrates that there are still a lot of known issues that need to be resolved first. Who knows what kind of changes will be made in that time.

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by mipri
in reply to Jab

On Sunday, 24 November 2019 at 03:42:00 UTC, Jab wrote:
> You can have use-after-free bugs happen in @live code, as a result of what is passed into @live functions.
>
>
> void zoo() {
>     int* p = cast(int*)malloc( int.sizeof * 2 );
>     foo(p, p + 1);
> }
>
> @live void foo( int* p, int* q ) {
>     free(p);
>
>     *q = 10; // use after free.
> }

And this crashes dmd:
---
import core.stdc.stdlib: malloc, free;

@live void zoo() {
    int* p = cast(int*)malloc( int.sizeof * 2 );
    foo(p, p + 1);
}

@live void foo( scope int* p, scope int* q ) {
    *q = 10; // use after free.
}
---
core.exception.RangeError@dmd/ob.d(1526): Range violation

It's zoo's @live that does it.

> The very error that @live is supposed to stop happens inside the function on its watch. Sure, you can pass a garbage pointer to @safe code, but the garbage pointer is created in code that isn't marked @safe. Here it happens inside @live itself.

The use-after-free occurs there, but the caller is still
violating foo()'s signature by saying "here are two pointers
that you are now the owner of" when it is impossible for foo()
to dispose of them safely.

OTOH, I guess it's always the case that @live code can't
safely take ownership of an unknown pointer, because it can't
distinguish between a pointer the GC will clean up vs. a
pointer that a third party allocator must clean up vs. a
pointer to statically allocated memory that doesn't bear
cleaning up.
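
A minimal sketch of that last point, with made-up variable names: all three kinds of pointer have the same static type, so a function that receives a bare `int*` has nothing in the type to tell it how, or whether, to release the memory.

```d
import core.stdc.stdlib : free, malloc;

__gshared int staticInt = 3; // lives in the static data segment

void main()
{
    int* fromGC     = new int(1);                    // the GC reclaims this
    int* fromMalloc = cast(int*) malloc(int.sizeof); // needs free()
    int* fromStatic = &staticInt;                    // must never be freed
    assert(fromMalloc !is null);
    *fromMalloc = 2;

    // All three have exactly the same static type.
    static assert(is(typeof(fromGC) == typeof(fromMalloc)));
    static assert(is(typeof(fromMalloc) == typeof(fromStatic)));

    assert(*fromGC + *fromMalloc + *fromStatic == 6);
    free(fromMalloc); // correct only because we know where it came from
}
```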

November 24, 2019

Re: Prototype of Ownership/Borrowing System for D

Posted by Ola Fosheim Grøstad
in reply to mipri

On Sunday, 24 November 2019 at 02:33:50 UTC, mipri wrote:
> I'm actually very interested in criticisms of @live (I hope
> more people are testing it than is apparent from the posts
> here), and even of alternatives that won't happen.

I enjoyed reading your testing of this experimental feature. :-)

> But I don't
> have a four-year degree with a major of "the last 300 years of
> your bitter disputes about language design", and every single
> post of yours has required that.

There are many angles to verification and type systems; I don't think you will find anyone who has a complete understanding. Even professors have a rather narrow subfield of verifiable programming where they have deep understanding (and then some overview of the rest of the field).

What is certain though, is that there are no easy paths to a workable solution. So being sceptical is healthy...

@live should be seen as baby steps, not as a solution. But you need to take many baby steps to learn how to walk. So in that regard this is an interesting move.

I personally think that it would be better to split the language into two, one library-language and one application language. The application language should be almost as easy to deal with as Python, and then move all the complications and @attributes and what not down into the library-language.

I don't think application programmers want to deal with pure, live etc etc. You have to keep the semantics simple on the higher levels.

Copyright © 1999-2021 by the D Language Foundation