October 20, 2018
On Sat, Oct 20, 2018 at 9:45 AM Stanislav Blinov via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Saturday, 20 October 2018 at 16:18:53 UTC, aliak wrote:
>
> > class C {
> >   void f();
> >   void g() shared;
> > }
> >
> > void t1(shared C c) {
> >   c.g; // ok
> >   c.f; // error
> > }
> >
> > void t2(shared C c) {
> >   c.g; // ok
> >   c.f; // error
> > }
> >
> > auto c = new C();
> > spawn(&t1, c);
> > spawn(&t2, c);
> > c.f; // ok
> > c.g; // ok
>
> Those are not "ok". They're only "ok" under Manu's proposal so long as the author of C promises (via documentation) that that's indeed "ok". There can be no statically-enforced guarantees that those calls are "ok", or that issuing them in that order is "ok". Yet Manu keeps insisting that somehow there is.

I only insist that if you write a shared method, you promise that it
is threadsafe.
If f() undermines g() threadsafety, then **g() is NOT threadsafe**,
and you just write an invalid program.

You can write an invalid program in any imaginable number of ways; that's just not an interesting discussion. An interesting discussion is what we might do to help prevent writing such an invalid program... I don't suggest here what we can do to statically enforce this, but I suspect there do exist *some* options which may help, which can be further developments.

What I also assert is that *this unsafe code is rare*... it exists only at the bottom of the tooling stack, and anyone else using a shared object will not do unsafe, and therefore will not be able to create the problem. If you do unsafety anywhere near `shared`, you should feel nervous. I'm trying to make a world where you aren't *required* to do unsafety at every single interaction.

Understand: f() can only undermine g() promise of threadsafety **if
f() is not @safe**. Users won't create this situation accidentally,
they can only do it deliberately.
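
A minimal sketch of the contract described above, assuming a hypothetical `Counter` type that is not taken from any real codebase. The `shared` methods go through `core.atomic` and are marked `@trusted`, which is the "bottom of the tooling stack" part. The unshared `reset()` plays the role of `f()`: if it used a plain store instead, it could undermine the promise made by `increment()` once a shared reference to the same object existed.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;

// Hypothetical type, for illustration only.
struct Counter
{
    private int count;

    // `shared` method: by writing it, the author promises it is threadsafe.
    void increment() shared @trusted
    {
        atomicOp!"+="(count, 1);
    }

    // Also threadsafe: an atomic load of the shared field.
    int get() shared @trusted
    {
        return atomicLoad(count);
    }

    // Unshared method (the role of f()). A plain `count = 0;` here could
    // race with increment() if another thread held a shared reference,
    // so this sketch uses an atomic store as well.
    void reset() @trusted
    {
        atomicStore(*cast(shared(int)*) &count, 0);
    }
}

unittest
{
    shared Counter c;
    c.increment();
    c.increment();
    assert(c.get() == 2);

    // Current D rejects calling the unshared method through a shared object.
    static assert(!__traits(compiles, c.reset()));
}
```

The user-facing calls need no casts; only the method bodies contain `@trusted` code.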
October 20, 2018
On Saturday, 20 October 2018 at 18:30:59 UTC, Manu wrote:
> On Sat, Oct 20, 2018 at 9:45 AM Stanislav Blinov via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>
>> On Saturday, 20 October 2018 at 16:18:53 UTC, aliak wrote:
>>
>> > class C {
>> >   void f();
>> >   void g() shared;
>> > }
>>
>> Those are not "ok". They're only "ok" under Manu's proposal so long as the author of C promises (via documentation) that that's indeed "ok". There can be no statically-enforced guarantees that those calls are "ok", or that issuing them in that order is "ok". Yet Manu keeps insisting that somehow there is.
>
> I only insist that if you write a shared method, you promise that it is threadsafe.
> If f() undermines g() threadsafety, then **g() is NOT threadsafe**, and you just write an invalid program.
>
> You can write an invalid program in any imaginable number of ways; that's just not an interesting discussion. An interesting discussion is what we might do to help prevent writing such an invalid program... I don't suggest here what we can do to statically enforce this, but I suspect there do exist *some* options which may help, which can be further developments.
>
> What I also assert is that *this unsafe code is rare*... it exists only at the bottom of the tooling stack, and anyone else using a shared object will not do unsafe, and therefore will not be able to create the problem. If you do unsafety anywhere near `shared`, you should feel nervous. I'm trying to make a world where you aren't *required* to do unsafety at every single interaction.
>
> Understand: f() can only undermine g() promise of threadsafety **if f() is not @safe**. Users won't create this situation accidentally, they can only do it deliberately.

---

module expertcode;

@safe:

struct FileHandle {
    @safe:

    void[] read(void[] storage) shared;
    void[] write(const(void)[] buffer) shared;
}

FileHandle openFile(string path);
// only the owner can close
void closeFile(ref FileHandle);

void shareWithThreads(shared FileHandle*); // i.e. generate a number of jobs in some queue
void waitForThreads();                     // waits until all processing is done

module usercode;

import expertcode;

void processHugeFile(string path) {
    FileHandle file = openFile(path);
    shareWithThreads(&file);    // implicit cast
    waitForThreads();
    file.closeFile();
}

---

Per your proposal, everything in 'expertcode' can be written @safe, i.e. not violating any of the points that @safe forbids, or doing so only in a @trusted manner. As far as the language is concerned, this would mean that processHugeFile can be @safe as well.

Remove the call to `waitForThreads()` (assume user just forgot that, i.e. the "accident"). Nothing would change for the compiler: all calls remain @safe. And yet, if we're lucky, we get a consistent instacrash. If we're unlucky, we get memory corruption, or an unsolicited write to another currently open file, either of which can go unnoticed for some time.

Of course the program becomes invalid if you do that, there's no question about it, this goes for all buggy code. The problem is that the definition of "valid" lies beyond the type system: it's an agreement between different parts of code, i.e. between the expert programmers who wrote FileHandle et al., and the users who write processHugeFile(). The main issue is that certain *runtime* conditions can still violate @safe-ty.

Your proposal makes the language more strict with respect to writing a @safe 'expertcode' module, thanks to disallowing reads and writes through `shared`, which is great.
However the implicit conversion to `shared` doesn't in any way improve the situation as far as user code is concerned, unless I'm still missing something.
October 20, 2018
On Sat, Oct 20, 2018 at 10:10 AM Stanislav Blinov via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Saturday, 20 October 2018 at 16:48:05 UTC, Nicholas Wilson wrote:
> > On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright wrote:
> >> On 10/19/2018 11:18 PM, Manu wrote:
> >>> The reason I ask is because, by my definition, if you have:
> >>> int* a;
> >>> shared(int)* b = a;
> >>>
> >>> While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it.
> >>
> >> They are aliased,
> >
> > Quoting Wikipedia:
> >
> >>two pointers A and B which have the same value, then the name A[0] aliases the name B[0]. In this case we say the pointers A and B alias each other. Note that the concept of pointer aliasing is not very well-defined – two pointers A and B may or may not alias each other, depending on what operations are performed in the function using A and B.
> >
> > In this case given the above: `a[0]` does not alias `b[0]` because `b[0]` is ill defined under Manu's proposal, because the memory referenced by `a` is not reachable through `b` because you can't read or write through `b`.
> >
> >> by code that believes it is unshared
> >
> > you cannot `@safe`ly modify the memory through `b`, `a`'s view of the memory is unchanged in @safe code.
>
> And that's already a bug, because the language can't enforce threadsafe access through `a`, regardless of presence of `b`. Only the programmer can.
>
> >> and, code that believes it is shared.
> >
> > you cannot have non-atomic access through `b`, `b` has no @safe view of the memory, unless it is atomic (which by definition is synchronised).
>
> Synchronized with what? You still have `a`, which isn't `shared` and doesn't require any atomic access or synchronization. At this point it doesn't matter if it's an int or a struct. As soon as you share `a`, you can't just pretend that reading or writing `a` is safe.

`b` can't read or write `a`... accessing `a` is absolutely safe.
Someone must do something unsafe to undermine your threadsafety... and
if you write unsafe code and don't know what you're doing, there's
nothing that can help you.
Today, every interaction with shared is unsafe. Creating a safe
interaction with shared will lead to people not doing unsafe things at
every step.

> Encapsulate it all you want, safety only remains a
> contract of convention, the language can't enforce it.

You're talking about @trusted code again. You're fixated on unsafe interactions... my proposal is about SAFE interactions. I'm trying to obliterate unsafe interactions with shared.


> module expertcode;
>
> @safe:
>
> struct FileHandle {
>      @safe:
>
>      void[] read(void[] storage) shared;
>      void[] write(const(void)[] buffer) shared;
> }
>
> FileHandle openFile(string path);
> // only the owner can close
> void closeFile(ref FileHandle);
>
> void shareWithThreads(shared FileHandle*); // i.e. generate a number of jobs in some queue
> void waitForThreads();                     // waits until all processing is done
>
> module usercode;
>
> import expertcode;
>
> void processHugeFile(string path) {
>      FileHandle file = openFile(path);
>      shareWithThreads(&file);    // implicit cast
>      waitForThreads();
>      file.closeFile();
> }

This is a very strange program... I'm dubious it is in fact "expertcode"... but let's look into it.

File handle seems to have just 2 methods... and they are both threadsafe. Open and Close are free-functions. Close does not promise threadsafety itself (but of course, it doesn't violate read/write's promise, or the program is invalid).

I expect the only possible way to achieve this is by an internal mutex to make sure read/write/close calls are serialised. read and write will appropriately check their file-open state each time they perform their actions. What read/write do in the case of being called on a closed file... anyone's guess? I'm gonna say they do a no-op... they return a null pointer to indicate the error state.

Looking at the meat of the program; you open a file, and distribute it
to do accesses (I presume?)....
Naturally, this is a really weird thing to do, because even if the API
is threadsafe such that it doesn't crash and reads/writes are
serialised, the sequencing of reads/writes will be random, so I don't
believe any sane person (let alone an expert) would write this
program... but moving on.
Then you wait for them to finish, and close the file.

Fine. You have a file with randomly interleaved data... for whatever reason.

> Per your proposal, everything in 'expertcode' can be written @safe, i.e. not violating any of the points that @safe forbids, or doing so only in a @trusted manner. As far as the language is concerned, this would mean that processHugeFile can be @safe as well.

This program does appear to be safe (assuming that the implementations aren't invalid), but a very strange program nonetheless.

> Remove the call to `waitForThreads()` (assume user just forgot
> that, i.e. the "accident"). Nothing would change for the
> compiler: all calls remain @safe.

Yup.

> And yet, if we're lucky, we get
> a consistent instacrash. If we're unlucky, we get memory
> corruption, or an unsolicited write to another currently open
> file, either of which can go unnoticed for some time.

Woah! Now this is way off-piste..
Why would you get a crash? Why would you get memory corruption? None of those
things make sense.

So, you call closeFile immediately and read/write start returning null.
I'm going to assume that `shareWithThreads()` was implemented by an
'expert' who checked the function results for errors. It was detected
that the reads/writes failed, and an error "failed to read file" was
emitted, then the function returned promptly.
The only uncertainty about what happens in this program is how
`shareWithThreads()` handles read/write emitting an error.

> Of course the program becomes invalid if you do that, there's no question about it, this goes for all buggy code.

In this case, I wouldn't say the program becomes 'invalid'; it is
valid for filesystem functions to return error states and you should
handle them.
In this case, read/write must return some "file not open" state, and
it should be handled properly.
This problem has nothing to do with threadsafety. It's a logic issue
related to threading, but that's got nothing to do with this.

> The problem is,
> definition of "valid" lies beyond the type system: it's an
> agreement between different parts of code, i.e. between expert
> programmers who wrote FileHandle et al., and users who write
> processHugeFile(). The main issue is that certain *runtime*
> conditions can still violate @safe-ty.

Perhaps you don't understand what @safe-ty means? It's a compiler
assertion that the code is memory-safe. It's not a magic attribute
that tells you that your program is right.
Runtime conditions being in a valid state is a high-level problem for
the program, and doesn't interact with threadsafety in any
fundamental way, and not in any way that @safe has anything to do
with.
You're just describing normal high-level multi-threading logic
problems. `shared` does not and can not help you with that; you need
to look to libraries that offer threading support frameworks for that.
It can help you not write code that does invalid access to memory and
crash. That's the extent of its charter.

If a `shared` API is designed well, it can also offer strong implicit advice about how to correctly interact with APIs. The compiler will coerce you to do the right things with error messages.

> Your proposal makes the language more strict with respect to writing
> a @safe 'expertcode' module, thanks to disallowing reads and writes
> through `shared`, which is great.
> However the implicit conversion to `shared` doesn't in any way
> improve the situation as far as user code is concerned, unless
> I'm still missing something.

It does, it eliminates unsafe user interactions. It must be that way to be safe.
There were no casts above, it's great! And your program is safe!
(although it's wrong)

FWIW, I doubt anybody in their right mind would attempt to write a threadsafe filesystem API this way. Any such API would be structured COMPLETELY differently; it would likely have one `shared` method that would accept requests for deferred fulfillment, and handle unique objects associated with each request.
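
As a rough sketch of that alternative shape, assuming hypothetical `ReadRequest` and `RequestQueue` types (neither comes from the thread or from any real library): the only thing other threads can do is hand over a request through `shared` methods, and the mutex-protected internals are confined to `@trusted` method bodies. It uses `core.sync.mutex.Mutex`, whose `shared` overloads of `lock` and `unlock` are part of druntime.

```d
import core.sync.mutex : Mutex;

// Hypothetical request type, for illustration only.
struct ReadRequest
{
    size_t offset;
    size_t length;
}

// Hypothetical queue: threads submit requests, the owner drains them later.
final class RequestQueue
{
    private Mutex mtx;
    private ReadRequest[] pending;

    this() shared @trusted
    {
        mtx = new shared Mutex();
    }

    // Threadsafe: the append happens only while the mutex is held.
    void submit(ReadRequest req) shared @trusted
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        (cast(RequestQueue) this).pending ~= req;
    }

    // Threadsafe: takes the whole batch under the same lock.
    ReadRequest[] takeAll() shared @trusted
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        auto self = cast(RequestQueue) this;
        auto taken = self.pending;
        self.pending = null;
        return taken;
    }
}

unittest
{
    auto q = new shared RequestQueue();
    q.submit(ReadRequest(0, 4096));
    q.submit(ReadRequest(4096, 4096));

    auto batch = q.takeAll();
    assert(batch.length == 2);
    assert(batch[1].offset == 4096);
    assert(q.takeAll().length == 0);
}
```

The casts away from `shared` appear only inside the locked regions; callers never see them.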

October 21, 2018
On 10/20/2018 11:30 AM, Manu wrote:
> You can write an invalid program in any imaginable number of ways;
> that's just not an interesting discussion.

What we're discussing is not an invalid program, but what guarantees the type system can provide.

D's current type system guarantees that a T* and a shared(T)* do not point to the same memory location in @safe code.

To get them to point to the same memory location, you've got to dip into @system code, where *you* become responsible for maintaining the guarantees.
October 21, 2018
On Sunday, 21 October 2018 at 09:04:34 UTC, Walter Bright wrote:
> On 10/20/2018 11:30 AM, Manu wrote:
>> You can write an invalid program in any imaginable number of ways;
>> that's just not an interesting discussion.
>
> What we're discussing is not an invalid program, but what guarantees the type system can provide.
>
> D's current type system guarantees that a T* and a shared(T)* do not point to the same memory location in @safe code.
>
> To get them to point to the same memory location, you've got to dip into @system code, where *you* become responsible for maintaining the guarantees.

The only difference between this and Manu's proposal is when you need to dip into @system code - in MP it's perfectly fine for the pointers to be equal, but when you want to read from or write to the address, you'll need to use @system. In other words, the dip into @system happens deeper in the codebase, meaning more code can be @safe.

--
  Simen
October 21, 2018
On Sun., 21 Oct. 2018, 2:05 am Walter Bright via Digitalmars-d, <digitalmars-d@puremagic.com> wrote:

> On 10/20/2018 11:30 AM, Manu wrote:
> > You can write an invalid program in any imaginable number of ways; that's just not an interesting discussion.
>
> What we're discussing is not an invalid program, but what guarantees the type
> system can provide.
>
> D's current type system guarantees that a T* and a shared(T)* do not point to
> the same memory location in @safe code.
>

My proposal guarantees that too, but in a more interesting way, because it opens the door to a whole working model. And it's totally @safe.

> To get them to point to the same memory location, you've got to dip into @system
> code, where *you* become responsible for maintaining the guarantees.
>

My model preserves that property. Why do you think I'm ruining that static guarantee?

It's all irrelevant if you don't express any mechanism to *do* anything.
Shared today does not have any use. It simply expresses that data *is*
shared, and says nothing about what you can do with it.
If you don't express a safe mechanism for interacting with shared data,
then simply expressing the distinction of shared data really is completely
uninteresting.
It's just a marker that's mixed up in a bunch of unsafe code. I'm no more
satisfied than I am with C++.

Shared needs to do something; I propose that it strictly models operations that are threadsafe and semantic restrictions required to support that, and then you have a *usage* scheme, which is safe, and API conveys proper interaction.. not just an uninteresting marker.

I'm genuinely amazed that you're not intrigued by a @safe shared proposition. Nobody likes @safe more than you.

I could run our entire SMP stack 100% @safe.

I am going to fork D with this feature one way or another. It's the most
meaningful and compelling opportunity I've ever seen. If there's ever
been a single thing that could truly move a bunch of C++ programmers, this
is it. C++ can do a crappy job of modelling most stuff in D, but it simply
can't go anywhere near this, and I've been working on competing C++ models
for months.
SMP is the future, we're going all-in this generation. Almost every
function in our codebase runs in an SMP environment... And I was staggered
that I was able to work this definition through to such a simple and
elegant set of rules.
I can't get my head around why people aren't more excited about this...
fully @safe SMP is huge!



October 21, 2018
On 10/20/2018 11:24 AM, Manu wrote:
> This is an unfair dismissal.

It has nothing at all to do with fairness. It is about what the type system guarantees in @safe code. To repeat, the current type system guarantees in @safe code that T* and shared(T)* do not point to the same memory location.

Does your proposal maintain that or not? It's a binary question.

> I'm not sure you've understood the proposal.
> This is the reason for the implicit conversion. It provides safe
> transition.

I don't see any way to make an implicit T* to shared(T)* safe, or vice versa. The T* code can create more aliases that the conversion doesn't know about, and the shared(T)* code can hand out aliases to other threads. So it all falls to pieces. Using a 'scope' qualifier won't work, because 'scope' isn't transitive, while shared is, i.e. U** and shared(U*)*.

> I'm not sure how to clarify it, what can I give you?

Write a piece of code that does such an implicit conversion that you argue is @safe. Make the code as small as possible. Your example:

> int* a;
> shared(int)* b = a;

This is not safe.

---- Manu's Proposal ---
@safe:
int i;
int* a = &i;
StartNewThread(a); // Compiles! Coder has no idea!

... in the new thread ...
void StartOfNewThread(shared(int)* b) {

    ... we have two threads accessing 'i',
    one thinks it is shared, the other unshared,
    and StartOfNewThread() has no idea and anyone
    writing code for StartOfNewThread() has no way
    to know anything is wrong ...

    lockedIncrement(b);  // Data Race!
}
--- Current D ---
@safe:
int i;
int* a = &i;
StartNewThread(a);   // Danger, Will Robinson! Does Not Compile!
StartNewThread(cast(shared(int)*) a) // Danger, Will Robinson!
                                     // Unsafe Cast! Does Not Compile!
---

Your proposal means that the person writing the lockedIncrement(), which is a perfectly reasonable thing to do, simply cannot write it in a way that has a @safe interface, because the person writing the lockedIncrement() library function has no way to know that the data it receives is actually unshared data.

I.e. @trusted code is obliged to provide a safe interface. Your proposal makes that impossible because the compiler would allow unshared data to be implicitly typed as shared.
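
A small sketch of the current behaviour, assuming a hypothetical `startNewThread` stub in place of a real thread launcher. It checks at compile time that `int*` does not implicitly convert to `shared(int)*` (or back), in contrast to the implicit conversion to `const(int)*`, and that an explicit cast is what creates the shared alias.

```d
// Hypothetical stub standing in for a real thread launcher.
void startNewThread(shared(int)* p) {}

unittest
{
    int i;
    int* a = &i;

    // Current D: no implicit conversion in either direction.
    static assert(!is(int* : shared(int)*));
    static assert(!is(shared(int)* : int*));
    static assert(!__traits(compiles, startNewThread(a)));

    // For contrast, the implicit conversion to const is allowed.
    static assert(is(int* : const(int)*));

    // Only an explicit cast produces a shared alias of the same address.
    shared(int)* b = cast(shared(int)*) a;
    startNewThread(b);
    assert(cast(void*) a is cast(void*) b);
}
```

Under the proposal being debated, the third `static assert` is the one that would no longer hold.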
October 21, 2018
On 21/10/2018 10:41 PM, Manu wrote:
> On Sun., 21 Oct. 2018, 2:05 am Walter Bright via Digitalmars-d, <digitalmars-d@puremagic.com> wrote:
> 
>     On 10/20/2018 11:30 AM, Manu wrote:
>      > You can write an invalid program in any imaginable number of ways;
>      > that's just not an interesting discussion.
> 
>     What we're discussing is not an invalid program, but what guarantees the type
>     system can provide.
> 
>     D's current type system guarantees that a T* and a shared(T)* do not point to
>     the same memory location in @safe code.
> 
> 
> My proposal guarantees that too, but in a more interesting way, because it opens the door to a whole working model. And it's totally @safe.
> 
>     To get them to point to the same memory location, you've got to dip into @system
>     code, where *you* become responsible for maintaining the guarantees.
> 
> 
> My model preserves that property. Why do you think I'm ruining that static guarantee?
> 
> It's all irrelevant if you don't express any mechanism to *do* anything. Shared today does not have any use. It simply expresses that data *is* shared, and says nothing about what you can do with it.
> If you don't express a safe mechanism for interacting with shared data, then simply expressing the distinction of shared data really is completely uninteresting.
> It's just a marker that's mixed up in a bunch of unsafe code. I'm no more satisfied than I am with C++.
> 
> Shared needs to do something; I propose that it strictly models operations that are threadsafe and semantic restrictions required to support that, and then you have a *usage* scheme, which is safe, and API conveys proper interaction.. not just an uninteresting marker.
> 
> I'm genuinely amazed that you're not intrigued by a @safe shared proposition. Nobody likes @safe more than you.
> 
> I could run our entire SMP stack 100% @safe.
> 
> I am going to fork D with this feature one way or another. It's the most meaningful and compelling opportunity I've ever seen. If there's ever been a single thing that could truly move a bunch of C++ programmers, this is it. C++ can do a crappy job of modelling most stuff in D, but it simply can't go anywhere near this, and I've been working on competing C++ models for months.
> SMP is the future, we're going all-in this generation. Almost every function in our codebase runs in an SMP environment... And I was staggered that I was able to work this definition through to such a simple and elegant set of rules.
> I can't get my head around why people aren't more excited about this... fully @safe SMP is huge!

I'm excited, but you need to write a DIP, even if preliminary, which shows the new semantics and also shows both working code and current code so they can be compared.

October 21, 2018
On 10/20/2018 11:08 AM, Nicholas Wilson wrote:
> You can if no-one else writes to it, which is the whole point of Manu's proposal. Perhaps it should be const shared instead of shared but still.

There is no purpose whatsoever to data that can be neither read nor written. Shared data is only useful if, at some point, it is read/written, presumably by casting it to unshared in @trusted code. As soon as that is done, you've got a data race with the other existing unshared aliases.

October 21, 2018
On Sunday, 21 October 2018 at 09:58:18 UTC, Walter Bright wrote:
> On 10/20/2018 11:08 AM, Nicholas Wilson wrote:
>> You can if no-one else writes to it, which is the whole point of Manu's proposal. Perhaps it should be const shared instead of shared but still.
>
> There is no purpose whatsoever to data that can be neither read nor written. Shared data is only useful if, at some point, it is read/written, presumably by casting it to unshared in @trusted code. As soon as that is done, you've got a data race with the other existing unshared aliases.

Just a thought: if a hard requirement is made on `shared` data to be non-copyable, a @safe conversion could be guaranteed. But it can't be implicit either:

shared(T) share(T)(T value) if (!is(T == shared) && !isCopyable!T) {
    shared(T) result = move(value);
    return result;
}

struct ShareableData {
    @disable <postblit and/or copy ctor>; // Generated by compiler in presence of `shared` members and/or `shared` methods

    /* ... */
}

void sendToThread(T)(shared T* ptr) @safe;

void usage() @safe {
    int x;
    sendToThread(&x); // Error: 'x' is not shared
    shared y = x;     // Ok
    sendToThread(&y); // Ok


    ShareableData data;
    sendToThread(&data); // Error: 'data' is not shared
    auto p = &data;
    sendToThread(p);     // Error: *p is not shared

    auto sharedData = share(move(data));
    sendToThread(&sharedData); // Ok

    auto yCopy = y;   // Error: cannot copy 'shared' y
    auto dataCopy = sharedData; // Error: cannot copy 'shared' sharedData

    ShareableData otherData;
    sendToThread(cast(shared(ShareableData)*) &otherData); // Error: non-@safe cast in @safe code
}

And again, we're back to 'once it's shared, it can't be @safe-ly unshared', which ruins the distinction between owned and shared references, which is one of the nicer properties that Manu's proposal seems to want to achieve :(