February 23, 2022

On Wednesday, 23 February 2022 at 00:14:13 UTC, Stanislav Blinov wrote:

>

On Tuesday, 22 February 2022 at 18:33:58 UTC, Paul Backus wrote:

>

If you believe there is some way to get the above program to produce undefined behavior, or to complete your original example in such a way that it produces undefined behavior without the use of incorrect @trusted code, I'm afraid you will have to spell it out for me.

Not exhaustive:
It may corrupt a given GC implementation's heap, which means what occurs after the } is anyone's guess.
It may mutate data that's supposed to be immutable (i.e. in a parent process, though you could argue that might not be relevant to the DIP).
It may block indefinitely, or crash, or complete with no effect.

If you could demonstrate that it cannot possibly exhibit at least the above, I'll happily accept being mistaken.

Having spent some more time scratching my head over this, I now realize what I was missing: it is indeed possible to open a file descriptor that can corrupt arbitrary memory in a process's address space, using something like /proc/self/mem. Maybe I'm an idiot for missing this the first time around; I can only ask that you take pity on me. :)

This means that calling write on a fd is only memory safe if you have previously verified that the file the fd refers to is "well behaved" (i.e., satisfies a particular invariant). It follows that the fd itself must be stored in a @system variable in order to ensure that the invariant is maintained in @safe code.
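To make the /proc/self/mem hazard concrete, here is a minimal Linux-only sketch. The function name overwriteViaProcMem is made up for illustration, and the whole thing assumes the process is allowed to open its own /proc/self/mem for writing. It changes a variable without ever writing through a pointer to it; the only "write" is the POSIX call on a file descriptor.

```d
import core.stdc.stdio : SEEK_SET;
import core.sys.posix.fcntl : open, O_RDWR;
import core.sys.posix.sys.types : off_t;
import core.sys.posix.unistd : close, lseek, write;

// Hypothetical helper: overwrites `target` by writing to /proc/self/mem
// at the offset equal to the variable's address. Returns true on success.
bool overwriteViaProcMem(ref int target, int newValue) @system
{
    int fd = open("/proc/self/mem", O_RDWR);
    if (fd < 0)
        return false; // not Linux, or access denied
    scope (exit) close(fd);

    // The file offset is interpreted as an address in this process.
    if (lseek(fd, cast(off_t) cast(size_t) &target, SEEK_SET) < 0)
        return false;

    // An ordinary write(2) call now modifies `target` in place.
    return write(fd, &newValue, newValue.sizeof) == newValue.sizeof;
}
```

Nothing here is specific to int: the same descriptor could be pointed at immutable data or at the GC's bookkeeping, which is why the fd needs an invariant that @safe code cannot break.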

I don't think adding scope checking to the fd makes any difference here, though. Reading from /proc/self/mem in @safe code is perfectly fine, even if you are reading from uninitialized or deallocated memory. The reason such reads are UB when done through pointers is that dereferencing an invalid pointer is UB, not because reading from the memory is UB.

(I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that write is always @system, or allow a permanent loophole in @safe. But that's a separate issue.)

February 23, 2022

On Wednesday, 23 February 2022 at 18:16:17 UTC, Paul Backus wrote:

>

(I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that write is always @system, or allow a permanent loophole in @safe. But that's a separate issue.)

By the way, this issue has also come up in Rust:

February 23, 2022

On Wednesday, 23 February 2022 at 16:14:51 UTC, Dennis wrote:

>

On Wednesday, 23 February 2022 at 00:14:13 UTC, Stanislav Blinov wrote:

>

If you're going to go there, then...

...

...that @trusted code is incorrect, at least on some platforms (yes, I can nitpick too).

Why? I don't see it.

Because not all possible values of data.length are valid values for write's third argument.

> >

...but seriously. What is it with all the condescending tone on the forums lately?

I think Paul makes a valid point and uses an appropriate tone. The DIP should not be hand-wavy about how scope checking would help memory safety when using file descriptors. I didn't go into much detail there because I didn't think it would be a contested addition.

I see the problem now, thanks.

February 23, 2022

On Wednesday, 23 February 2022 at 22:01:55 UTC, Stanislav Blinov wrote:

>

Because not all possible values of data.length are valid values for write's third argument.

POSIX says:

>

Before any action described below is taken, and if nbyte is zero and the file is a regular file, the write() function may detect and return errors as described below. In the absence of errors, or if error detection is not performed, the write() function shall return zero and have no other results. If nbyte is zero and the file is not a regular file, the results are unspecified.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

I was unable to find a definition in the standard itself of exactly what "unspecified" means in this context, but I think we can assume that it does not mean the same thing as "undefined", because the POSIX standard uses the actual word "undefined" elsewhere (e.g., in the description of pthread_mutex_destroy).

If we assume that it means the same thing as "unspecified behavior" in C, then it means that there are multiple possible behaviors, and the standard does not require an implementation to commit to any particular one in any given situation.
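For illustration, a wrapper could sidestep both questionable values of the third argument before calling write. This is only a sketch: checkedWrite is a hypothetical name, and the SSIZE_MAX guard relies on the same POSIX page, which says the result is implementation-defined when nbyte exceeds SSIZE_MAX. It is deliberately left @system, since the fd itself may still be something like /proc/self/mem.

```d
import core.sys.posix.unistd : write;

// Hypothetical wrapper: filters out nbyte values whose behavior POSIX
// leaves unspecified or implementation-defined.
ptrdiff_t checkedWrite(int fd, const(void)[] data) @system
{
    // nbyte == 0 on a non-regular file: results are unspecified,
    // so do not call write at all.
    if (data.length == 0)
        return 0;

    // nbyte > SSIZE_MAX: result is implementation-defined.
    if (data.length > cast(size_t) ptrdiff_t.max)
        return -1;

    return write(fd, data.ptr, data.length);
}
```

Whether this is enough to make such a wrapper @trusted is a separate question from the one about which files the fd may refer to.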

February 23, 2022

On Wednesday, 23 February 2022 at 18:16:17 UTC, Paul Backus wrote:

>

Having spent some more time scratching my head over this, I now realize what I was missing: it is indeed possible to open a file descriptor that can corrupt arbitrary memory in a process's address space, using something like /proc/self/mem.

Yes, or you may use e.g. memfd_create. And you can inherit such an fd from a parent process. Or receive a shared memory descriptor from another process.

>

Maybe I'm an idiot for missing this the first time around; I can only ask that you take pity on me. :)

Never! How dare you make me question myself!!! :)

>

This means that calling write on a fd is only memory safe if you have previously verified that the file the fd refers to is "well behaved" (i.e., satisfies a particular invariant). It follows that the fd itself must be stored in a @system variable in order to ensure that the invariant is maintained in @safe code.

Yup.

>

I don't think adding scope checking to the fd makes any difference here, though. Reading from /proc/self/mem in @safe code is perfectly fine, even if you are reading from uninitialized or deallocated memory. The reason such reads are UB when done through pointers is that dereferencing an invalid pointer is UB, not because reading from the memory is UB.

Well, results of reading from some types of fds are also not specified. So, if I'm not mistaken, performing such a read and then using the resulting "data" would be undefined behavior (provided the program even gets there).

As for scope checks themselves - as Dennis mentions, double close looks dissimilar to double free. Yet it is subject to a superset of that - use after free, as are read and write. You may well safely "dangle" an fd and not invoke UB by calling those functions with it, but only up to the point when the program opens another descriptor. Calling close on a dangled fd, which would then succeed, would be a mere bug and not invoke UB, but attempting to write or read+use may.
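The reuse problem can be sketched like this. It assumes a single-threaded program and that /dev/null exists; descriptorNumberIsReused is a made-up name. POSIX requires open to return the lowest-numbered descriptor that is not currently open, so the number of a closed fd is handed out again by the next open.

```d
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.unistd : close;

// Hypothetical demo: returns true if a closed descriptor number is
// handed out again by the next open().
bool descriptorNumberIsReused() @system
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return false;
    close(first); // `first` is now a dangling descriptor number

    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return false;
    scope (exit) close(second);

    // Any later read/write/close through `first` would act on the
    // file behind `second`.
    return first == second;
}
```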

So I do think that fds could still be a good example material for the DIP.

>

(I'm also not sure if it's possible in practice to tell whether a file is "well behaved". If not, that means we have to either accept that write is always @system, or allow a permanent loophole in @safe. But that's a separate issue.)

I don't think that should be necessary in concrete cases, as the onus of ensuring the implicit invariant would lie on the implementation of, in this case, File - e.g. making it non-copyable (or reference-counted), ensuring that the constructor opens an appropriate kind of file, etc. etc. That way the only way to make it unsafe would be to corrupt the given instance of File itself, which means there's a memory safety issue somewhere else in the program (for example, that same void-initialization).
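A rough sketch of what such a wrapper might look like follows. The type OwnedFile and its members are hypothetical, and the idea that O_CREAT | O_EXCL yields an "appropriate kind of file" is an assumption of this sketch, not something established in the thread. It also does nothing about @safe code in the same module touching fd, which is the gap the DIP is about.

```d
import core.sys.posix.fcntl : open, O_CREAT, O_EXCL, O_WRONLY;
import core.sys.posix.unistd : close, write;
import std.conv : octal;
import std.string : toStringz;

// Hypothetical wrapper that owns exactly one descriptor.
struct OwnedFile
{
    private int fd = -1; // the DIP would let this be marked @system

    @disable this(this); // non-copyable: no second owner, no double close

    // O_CREAT | O_EXCL fails if the path already exists, so a successful
    // open refers to a file created by this very call.
    static OwnedFile create(string path) @trusted
    {
        OwnedFile f;
        f.fd = open(path.toStringz, O_WRONLY | O_CREAT | O_EXCL, octal!600);
        return f;
    }

    bool isOpen() const @safe
    {
        return fd >= 0;
    }

    // Returns the number of bytes written, or -1 on error.
    ptrdiff_t put(const(ubyte)[] data) @trusted
    {
        if (fd < 0 || data.length == 0)
            return fd < 0 ? -1 : 0;
        return write(fd, data.ptr, data.length);
    }

    ~this() @trusted
    {
        if (fd >= 0)
            close(fd);
        fd = -1;
    }
}
```

With this shape, the only remaining way to break the invariant from outside the module is to corrupt the OwnedFile instance itself.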

March 04, 2022

On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote in the feedback thread:

>

On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:

>

Wouldn't putting the handle in union with void[1] work?

No, void[1] is not a type with unsafe values.

I was just checking what the language spec says about this, and found an alternative we have all been overlooking.

A type can be declared unsafe in the present language by giving it an invariant.

Yes, I meant that contract programming invariant! The spec says that void-initializing a type with an invariant, or using a union that has a member with an invariant, is @system-only. Thus the invariant effectively declares the type unsafe. It also means that void[1] is an unsafe type, because it can contain a struct with an invariant.

This DIP still has the advantage that @safe functions in the same module as the invariant type do not need any special care. But still, that sounds like a pretty trivial gain to me - in the IntSlice example you can make the members read-only with a bit of union trickery if you want to, or define a string mixin that does the same automatically.

I'm starting to think it's probably not worth it overall. Still, I'm only slightly against, because the proposed rules blend so nicely with the existing language, and it sure is sometimes convenient to have an alternative.

March 04, 2022

This is my reply to this post from the feedback thread:

On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote:

>

On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:

>

Wouldn't putting the handle in union with void[1] work?

No, void[1] is not a type with unsafe values.

void[1] is considered by the compiler to potentially contain pointer data, in accordance with this section of the language spec:

https://dlang.org/spec/arrays.html#void_arrays

Note in particular the paragraph that begins, "Void arrays can also be static".

As a result, the compiler will not allow you to void-initialize a void[1] in @safe code:

void main() @safe {
    void[1] a = void; // error
}

So, the workaround suggested by Dukc would indeed work.

(By the way, I know this because the first thing I did after I read his post in the feedback thread was to actually write out a complete example using the void[1] workaround and check to see if it worked.)

March 04, 2022

On Friday, 4 March 2022 at 13:06:35 UTC, Dukc wrote:

>

On Friday, 4 March 2022 at 09:39:53 UTC, Dennis wrote in the feedback thread:

>

On Friday, 25 February 2022 at 21:46:25 UTC, Dukc wrote:

>

Wouldn't putting the handle in union with void[1] work?

No, void[1] is not a type with unsafe values.

I was just checking what the language spec says about this, and found an alternative we have all been overlooking.

A type can be declared unsafe in the present language by giving it an invariant.

Yes, I meant that contract programming invariant! The spec says that void-initializing a type with an invariant, or using a union that has a member with an invariant, is @system-only. Thus the invariant effectively declares the type unsafe.

First, this was not "overlooked"--it was added to the language spec well after DIP 1035 was written and submitted. Dennis and I have been aware of this spec change since it was first proposed in DMD PR 12326.

Second, this is not a complete alternative to DIP 1035, because it does not solve the __traits(getMember) issue. As long as @safe code is allowed to bypass encapsulation and access the fields of user-defined types directly, it is impossible for @trusted code to rely on the integrity of the data in those fields.

March 04, 2022

On Friday, 4 March 2022 at 13:09:42 UTC, Paul Backus wrote:

>

As a result, the compiler will not allow you to void-initialize a void[1] in @safe code:

void main() @safe {
    void[1] a = void; // error
}

That's new to me, and the error makes no sense considering you can (implicitly) convert any array to a void[] even in @safe code, so you can still do this:

void main() @safe {
    ubyte[1] x = void;
    void[1] y = x;
}
>

So, the workaround suggested by Dukc would indeed work.

It's an interesting alternative if we can nail it down.