July 27, 2017
On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
> --- null.d ---
> version (linux):
> 
> import core.stdc.stdio : FILE;
> import core.sys.linux.sys.mman;
> 
> extern (C) @safe int fgetc(FILE* stream);
> 
> void mmapNull()
> {
>      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
>      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
>      *(cast (char*) null) = 'D';
> }
> 
> void nullDeref() @safe
> {
>      fgetc(null);
> }
> 
> void main(string[] args)
> {
>      mmapNull();
>      nullDeref();
> }
> ---
> 
> For some fun on Linux, try out
> # echo 0 > /proc/sys/vm/mmap_min_addr
> $ rdmd null.d

The gist of this is that Linux can be configured so that null can be a valid pointer. Right?

That seems pretty bad for @safe at large, not only when C functions are involved.
July 27, 2017
On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
> On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
>> I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support.
>>
>> On platforms that may not segfault, you'd be on your own.
>>
>> In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.
>>
>> Likewise, because D depends on hardware flagging of dereferencing null as a segfault, any platforms that *don't* have that for C also won't have it for D. And then @safe doesn't even work in D code either.
>>
>> As we have good support for different prototypes for different platforms, we could potentially unmark those as @trusted in those cases.
> 
> --- null.d ---
> version (linux):
> 
> import core.stdc.stdio : FILE;
> import core.sys.linux.sys.mman;
> 
> extern (C) @safe int fgetc(FILE* stream);
> 
> void mmapNull()
> {
>      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
>      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
>      *(cast (char*) null) = 'D';
> }
> 
> void nullDeref() @safe
> {
>      fgetc(null);
> }
> 
> void main(string[] args)
> {
>      mmapNull();
>      nullDeref();
> }
> ---
> 
> For some fun on Linux, try out
> # echo 0 > /proc/sys/vm/mmap_min_addr
> $ rdmd null.d
> 
> Consider `mmapNull` being run in some third party shared lib you don't control.

Again, all these hacks are just messing with the assumptions D is making. You don't need C functions to trigger such problems. I'm fine with saying libraries or platforms that do not segfault when accessing zero page are incompatible with @safe code. And it's on you not to do this, the compiler will assume the segfault will occur.

-Steve
July 27, 2017
On Thursday, 27 July 2017 at 13:45:21 UTC, ag0aep6g wrote:
> On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
>> --- null.d ---
>> version (linux):
>> 
>> import core.stdc.stdio : FILE;
>> import core.sys.linux.sys.mman;
>> 
>> extern (C) @safe int fgetc(FILE* stream);
>> 
>> void mmapNull()
>> {
>>      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
>>      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
>>      *(cast (char*) null) = 'D';
>> }
>> 
>> void nullDeref() @safe
>> {
>>      fgetc(null);
>> }
>> 
>> void main(string[] args)
>> {
>>      mmapNull();
>>      nullDeref();
>> }
>> ---
>> 
>> For some fun on Linux, try out
>> # echo 0 > /proc/sys/vm/mmap_min_addr
>> $ rdmd null.d
>
> The gist of this is that Linux can be configured so that null can be a valid pointer. Right?

In summation, yes. To be technical about it:
- Linux can be configured so that the bottom page of a process' virtual address space is not protected from being mapped to valid memory (by default, `mmap_min_addr` is 4096, i.e. the bottom page can't be mapped)
- C's `NULL` is in pretty much all implementations (not the C spec) defined as the value `0`, which corresponds to the virtual address `0` in a process, i.e. lies in the bottom page of the process' virtual address space
- The null dereference segmentation fault on Linux stems from the fact that the bottom page (which `NULL` maps to) isn't mapped to valid memory
- If you map the bottom page of a process' virtual address space to valid memory, then accessing it doesn't create a segmentation fault
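
To make the first point checkable, here is a minimal sketch (not part of the original example) that reads the sysctl and reports whether the bottom page could be mapped at all. The helper names `parseMinAddr` and `zeroPageMappable` are made up for illustration, and it only looks at the system-wide setting, not at what the current process has actually mapped.

```d
import std.conv : to;
import std.file : readText;
import std.string : strip;

/// Parses the contents of /proc/sys/vm/mmap_min_addr
/// (a decimal number followed by a newline).
ulong parseMinAddr(string contents)
{
    return contents.strip.to!ulong;
}

/// True if the lowest mappable address is 0, i.e. the page that
/// `null` points into could be mapped to valid memory.
bool zeroPageMappable(ulong minAddr)
{
    return minAddr == 0;
}

void main()
{
    import std.stdio : writeln;

    version (linux)
    {
        immutable minAddr = parseMinAddr(readText("/proc/sys/vm/mmap_min_addr"));
        writeln("mmap_min_addr = ", minAddr,
                zeroPageMappable(minAddr) ? " (page zero can be mapped)"
                                          : " (page zero is protected)");
    }
}
```

A library cannot rely on this alone, since the setting can change after the check, but it shows where the assumption lives.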

>
> That seems pretty bad for @safe at large, not only when C functions are involved.

Yes:
- In C land, since dereferencing `NULL` is UB by definition, this is perfectly valid behaviour
- In D land, because we require `null` dereferences to crash, we break @safe with it
July 27, 2017
On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer wrote:
> On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
>> On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
>>> I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support.
>>>
>>> On platforms that may not segfault, you'd be on your own.
>>>
>>> In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.
>>>
>>> Likewise, because D depends on hardware flagging of dereferencing null as a segfault, any platforms that *don't* have that for C also won't have it for D. And then @safe doesn't even work in D code either.
>>>
>>> As we have good support for different prototypes for different platforms, we could potentially unmark those as @trusted in those cases.
>> 
>> --- null.d ---
>> version (linux):
>> 
>> import core.stdc.stdio : FILE;
>> import core.sys.linux.sys.mman;
>> 
>> extern (C) @safe int fgetc(FILE* stream);
>> 
>> void mmapNull()
>> {
>>      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
>>      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
>>      *(cast (char*) null) = 'D';
>> }
>> 
>> void nullDeref() @safe
>> {
>>      fgetc(null);
>> }
>> 
>> void main(string[] args)
>> {
>>      mmapNull();
>>      nullDeref();
>> }
>> ---
>> 
>> For some fun on Linux, try out
>> # echo 0 > /proc/sys/vm/mmap_min_addr
>> $ rdmd null.d
>> 
>> Consider `mmapNull` being run in some third party shared lib you don't control.
>
> Again, all these hacks are just messing with the assumptions D is making.

Which aren't in the official D spec (or at the very least I can't seem to find them there).

> You don't need C functions to trigger such problems.

Sure, but it was relevant to the previous discussion.

> I'm fine with saying libraries or platforms that do not segfault when accessing zero page are incompatible with @safe code.

So we can't have @safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.

> And it's on you not to do this, the compiler will assume the segfault will occur.

It's not a promise the author of the D code can (always) make.
In any case, the @trusted and @safe spec need to be explicit about the assumptions made.
July 27, 2017
On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
> On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer wrote:
>> On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
>>> On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
>>>> I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support.
>>>>
>>>> On platforms that may not segfault, you'd be on your own.
>>>>
>>>> In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.
>>>>
>>>> Likewise, because D depends on hardware flagging of dereferencing null as a segfault, any platforms that *don't* have that for C also won't have it for D. And then @safe doesn't even work in D code either.
>>>>
>>>> As we have good support for different prototypes for different platforms, we could potentially unmark those as @trusted in those cases.
>>>
>>> --- null.d ---
>>> version (linux):
>>>
>>> import core.stdc.stdio : FILE;
>>> import core.sys.linux.sys.mman;
>>>
>>> extern (C) @safe int fgetc(FILE* stream);
>>>
>>> void mmapNull()
>>> {
>>>      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
>>>      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
>>>      *(cast (char*) null) = 'D';
>>> }
>>>
>>> void nullDeref() @safe
>>> {
>>>      fgetc(null);
>>> }
>>>
>>> void main(string[] args)
>>> {
>>>      mmapNull();
>>>      nullDeref();
>>> }
>>> ---
>>>
>>> For some fun on Linux, try out
>>> # echo 0 > /proc/sys/vm/mmap_min_addr
>>> $ rdmd null.d
>>>
>>> Consider `mmapNull` being run in some third party shared lib you don't control.
>>
>> Again, all these hacks are just messing with the assumptions D is making.
> 
> Which aren't in the official D spec (or at the very least I can't seem to find them there).

You are right. I have asked Walter to add such an update. I should pull that out to its own thread, will do.

>> You don't need C functions to trigger such problems.
> 
> Sure, but it was relevant to the previous discussion.

Right, but what I'm saying is that it's a different argument. We could say "you can't mark fgetc @safe", and still have this situation occur.
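
To illustrate with a sketch (the `Node` type and `valueOf` function are hypothetical, not from the earlier example): pure D code with no C prototypes involved is accepted as @safe, and it relies on exactly the same assumption.

```d
struct Node
{
    int value;
}

// Accepted as @safe: dereferencing a possibly-null pointer is allowed
// because the access is assumed to trap when `p` is null.
int valueOf(const(Node)* p) @safe
{
    return p.value;
}

void main()
{
    auto n = new Node(42);
    assert(valueOf(n) == 42);

    // valueOf(null) normally segfaults. If page zero has been mapped,
    // it would instead read whatever bytes live at address 0.
}
```

So whether or not `fgetc` is marked @safe, the zero-page mapping undermines the guarantee.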

>> I'm fine with saying libraries or platforms that do not segfault when accessing zero page are incompatible with @safe code.
> 
> So we can't have @safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.

You can't guarantee @safe on such processes or systems. The compiler has to assume that code like the example you provided never runs.

It's not that we can't have @safe because of what someone might do, it's that @safe guarantees can only work if you don't do such things.

It is nice to be aware of these possibilities, since they could be an effective attack on D @safe code.

>> And it's on you not to do this, the compiler will assume the segfault will occur.
> 
> It's not a promise the author of the D code can (always) make.
> In any case, the @trusted and @safe spec need to be explicit about the assumptions made.

I agree. The promise only works as well as the environment. @safe is not actually safe if it's based on incorrect assumptions.

-Steve
July 27, 2017
On Thursday, 27 July 2017 at 11:46:24 UTC, Steven Schveighoffer wrote:
> On 7/27/17 2:48 AM, Jacob Carlborg wrote:
>> And then the compiler runs the "Dead Code Elimination" pass and we're left with:
>> 
>> void contains_null_check(int* p)
>> {
>>      *p = 4;
>> }
>
> So the result is that it will segfault. I don't see a problem with this. It's what I would have expected.
>
Except that that code was used in the Linux kernel where page 0 was mapped and thus de-referencing the pointer did not segfault.
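
For context, the earlier post with the unoptimized function is not quoted here. Presumably it looked something like the sketch below (an assumption, reconstructed from the well-known form of this example): the pointer is dereferenced before the null check, so the optimizer may treat the check as always false and remove it.

```d
void contains_null_check(int* p)
{
    int dead = *p;   // dereference happens before the check
    if (p is null)   // an optimizer may assume this is never true
        return;
    *p = 4;
}

void main()
{
    int x;
    contains_null_check(&x);
    assert(x == 4);
}
```

With a valid pointer both versions behave the same. They differ only when `p` is null and the access at address 0 does not trap.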

The issue missed here is the purpose the compiler is used for: will the code always run in a hosted environment, or is it used in a freestanding implementation (kernel and embedded stuff)? The C standard distinguishes between the two, but the compiler gurus apparently do not care.
As for D, Walter's list of constraints for a D compiler makes it, IMHO, impossible to use the language on smaller embedded platforms or in ring 0 mode on x86.
That's why I find calling D a system language somewhat disingenuous. Calling it an application language would be truer.

July 27, 2017
On Thursday, 27 July 2017 at 14:45:03 UTC, Steven Schveighoffer wrote:
> On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
>> On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer wrote:
>
>>> I'm fine with saying libraries or platforms that do not segfault when accessing zero page are incompatible with @safe code.
>> 
>> So we can't have @safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.
>
> You can't guarantee @safe on such processes or systems. The compiler has to assume that code like the example you provided never runs.
>
> It's not that we can't have @safe because of what someone might do, it's that @safe guarantees can only work if you don't do such things.

Which essentially means that any library written in @safe D exposing a C API needs to write in big fat red letters "Don't do this or you break our safety guarantees".


> It is nice to be aware of these possibilities, since they could be an effective attack on D @safe code.

Well, yeah, that's the consequence of @safe correctness depending on UB always resulting in a crash.
July 27, 2017
On 07/25/2017 10:54 PM, Walter Bright wrote:
> On 7/25/2017 8:26 AM, Andrei Alexandrescu wrote:
>> A suite of safe wrappers on OS primitives might be useful.
> 
> The idea of fixing the operating system interface(s) has come up now and then. I've tried to discourage that on the following grounds:
> 
> 
> * We are not in the operating system business.
> 
> * Operating system APIs grow like weeds. We'd set ourselves an impossible task.
> 
> * It's a huge job simply to provide accurate declarations for the APIs.
> 
> * We'd have to write our own documentation for the operating system APIs. It's hard enough writing such for Phobos.
> 
> * A lot are fundamentally unfixable, like free() and strlen().
> 
> * The API import files should be focused solely on direct access to the APIs, not adding a translation layer. The user of them will expect this.
> 
> * We already have safe wrappers for the commonly used APIs. For read(), there is std.stdio.

The standard library would not be in a position to provide such, but the project seems a good fit for a crowdsourced and crowd-maintained library. -- Andrei
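
As a rough sketch of what one entry in such a library might look like (the name `safeRead` is hypothetical, and this is not an existing Phobos or druntime function): a thin @trusted wrapper around POSIX `read()` that takes a slice, so the length always matches the buffer.

```d
version (Posix):

import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : close, pipe, read, write;

/// Hypothetical wrapper: the slice carries its own length, so the OS
/// cannot be asked to write past the end of the buffer.
ssize_t safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}

void main()
{
    int[2] fds;
    assert(pipe(fds) == 0);

    immutable ubyte[3] msg = [1, 2, 3];
    assert(write(fds[1], msg.ptr, msg.length) == 3);

    ubyte[8] buf;
    immutable n = safeRead(fds[0], buf[]);
    assert(n == 3 && buf[0 .. 3] == msg[]);

    close(fds[0]);
    close(fds[1]);
}
```

The wrapper leaves error handling to the caller (a negative return still means "check errno"), which keeps it close to the raw API while removing the pointer/length mismatch.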


July 28, 2017
On 2017-07-27 13:46, Steven Schveighoffer wrote:

> So the result is that it will segfault. I don't see a problem with this. It's what I would have expected.

The problem is that behavior might change depending on which compiler is used because the code is not valid according to the specification.

-- 
/Jacob Carlborg
July 28, 2017
On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
> On 7/26/2017 6:29 AM, Kagamin wrote:
>> Should we still try to mark them safe at all?
>
> Marking ones that are safe with @safe is fine. OS APIs pretty much never change.

New technologies and features are introduced over time (64-bit, IPv6, bitmap_v5, generally bigger data everywhere), and the APIs change accordingly: they incorporate new features and take increasingly bigger arguments.