Memory safe in D (page 10)

March 30

Re: Memory safe in D

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 30/03/2024 4:03 PM, Walter Bright wrote:
> On 3/18/2024 4:19 PM, Richard (Rikki) Andrew Cattermole wrote:
>> On 19/03/2024 11:46 AM, Walter Bright wrote:
>>> If one doesn't do DFA, then I will be subjected to endless bug reports where people find a case that needs DFA to resolve.
>>
>> If anyone wants evidence of this, look no further than @live.
>>
>> https://issues.dlang.org/show_bug.cgi?id=21923
>>
>> https://issues.dlang.org/show_bug.cgi?id=21854
>>
>> A memory analysis technique that requires DFA, but doesn't as it wasn't fully thought out and too specific to it.
> 
> D is a complex language, and @live has bugs in it for some constructs. That doesn't mean DFA is the wrong tool for the job, it is the only tool for it and the problems are routine problems that can be fixed.

Yes, all I am getting at is a dedicated DFA that isn't specific to @live would be a better solution.

If we really really want @live to stick around (I have other ideas on how to replace it while getting guarantees which @live cannot provide), rewriting it on my proposed semantic 4 would be a better solution long term.

March 30

Re: Memory safe in D

Posted by Nick Treleaven
in reply to Walter Bright

On Saturday, 30 March 2024 at 03:00:32 UTC, Walter Bright wrote:
> On 3/22/2024 3:51 AM, Nick Treleaven wrote:
>> I think this is workable without DFA, the compiler just tracks when a variable is initialized. There is never a state where a variable may be both initialized and not initialized
>
> ```
> A a = null;
> if (i)
>     a = new A();
> // a is both initialized and not initialized
> ```

There `a` is always initialized. It's a nullable type. If you remove the `= null` and make `a` a non-nullable type, you would get an error for the `if` statement because it initializes `a` in its branch, and there is no `else` branch which is required to also initialize `a`.

> Now throw in loops and goto's, and DFA is needed. Compiler optimizers use DFA because it works and ad-hoc techniques do not.

This does not need DFA, correct?
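
As a rough illustration of the rule described above (a variable must be initialized on both branches of an `if`, or on neither), here is a minimal D sketch. The `Stmt` mini-AST and the `check` and `analyze` functions are hypothetical names invented for this example; they do not correspond to dmd or cppfront internals. The point is only that structured code can be checked in one top-to-bottom walk, with no fixed-point iteration.

```d
import std.stdio : writeln;

// Hypothetical mini-AST invented for this sketch; it is not dmd's or
// cppfront's internal representation.
enum Kind { assign, use, branch }

struct Stmt
{
    Kind kind;
    string var;       // variable written (assign) or read (use)
    Stmt[] thenBody;  // branch only
    Stmt[] elseBody;  // branch only; empty when there is no else
}

Stmt assign(string v) { return Stmt(Kind.assign, v); }
Stmt use(string v)    { return Stmt(Kind.use, v); }
Stmt branch(Stmt[] t, Stmt[] e = null) { return Stmt(Kind.branch, null, t, e); }

/// Walks `stmts` once. `inited` is the set of variables that are definitely
/// initialized at the current point; diagnostics are appended to `errors`.
void check(Stmt[] stmts, ref bool[string] inited, ref string[] errors)
{
    foreach (s; stmts)
    {
        final switch (s.kind)
        {
        case Kind.assign:
            inited[s.var] = true;
            break;
        case Kind.use:
            if (s.var !in inited)
                errors ~= s.var ~ " used before initialization";
            break;
        case Kind.branch:
            // Analyze each branch starting from a copy of the current state.
            auto thenSet = inited.dup;
            auto elseSet = inited.dup;
            check(s.thenBody, thenSet, errors);
            check(s.elseBody, elseSet, errors);
            foreach (name; thenSet.byKey)
            {
                if (name in elseSet)
                    inited[name] = true; // initialized on both paths
                else
                    errors ~= name ~ " must be initialized on both branches or neither";
            }
            foreach (name; elseSet.byKey)
            {
                if (name !in thenSet)
                    errors ~= name ~ " must be initialized on both branches or neither";
            }
            break;
        }
    }
}

string[] analyze(Stmt[] program)
{
    bool[string] inited;
    string[] errors;
    check(program, inited, errors);
    return errors;
}

void main()
{
    // if (i) p = ...;                      (no else: reported)
    writeln(analyze([branch([assign("p")])]));
    // if (i) p = ...; else p = ...; use p  (accepted)
    writeln(analyze([branch([assign("p")], [assign("p")]), use("p")]));
}
```

A nullable variable declared with `= null` would simply count as an `assign` at its declaration in this model, so the `if` without `else` is accepted for it.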

March 30

Re: Memory safe in D

Posted by Nick Treleaven
in reply to Nick Treleaven

On Saturday, 30 March 2024 at 09:24:12 UTC, Nick Treleaven wrote:

> On Saturday, 30 March 2024 at 03:00:32 UTC, Walter Bright wrote:
>> On 3/22/2024 3:51 AM, Nick Treleaven wrote:
>>> I think this is workable without DFA, the compiler just tracks when a variable is initialized. There is never a state where a variable may be both initialized and not initialized
>>
>>     A a = null;
>>     if (i)
>>         a = new A();
>>     // a is both initialized and not initialized
>
> There `a` is always initialized. It's a nullable type. If you remove the `= null` and make `a` a non-nullable type, you would get an error for the `if` statement because it initializes `a` in its branch, and there is no `else` branch which is required to also initialize `a`.

This is an example for Herb Sutter's cppfront, which enforces that p is non-null:

main: () =
{
    i := 0;
    p: unique_ptr<int>;
    if (i) {
        p = new<int>; // error: p must be initialized on both branches or neither
    }
}

>> Now throw in loops and goto's, and DFA is needed. Compiler optimizers use DFA because it works and ad-hoc techniques do not.

cppfront:

main: () =
{
    i := 0;
    p: unique_ptr<int>;
    while i < 3 next i++ {
        p = new<int>;
        std::cout << p* << "\n"; // ok, p is always initialized
    }
}

Changing the while loop:

    while i < 3 next i++ {
        std::cout << p* << "\n"; // error, p used before it was initialized
        p = new<int>;
    }

Goto skipping initialization is already disallowed in D.

This does not need DFA, correct?
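
The two loop cases above can be modelled in a few lines of D. This is a sketch under the assumption that loops are structured (nothing jumps into the body); `Event` and `checkLoopBody` are hypothetical names, and this is not claimed to be how cppfront implements its check. A single forward pass over the body, starting from the state at loop entry, is enough for this particular question, because later iterations can only have more variables initialized than the first one.

```d
import std.stdio : writeln;

// Hypothetical event model for this sketch: each statement in the loop body
// either initializes a variable or reads it.
struct Event
{
    bool isInit;
    string var;
}

/// Checks one loop body in a single pass, starting from the variables that
/// are already initialized before the loop is entered.
string[] checkLoopBody(Event[] bodyEvents, string[] initedBeforeLoop)
{
    bool[string] inited;
    foreach (name; initedBeforeLoop)
        inited[name] = true;

    string[] errors;
    foreach (e; bodyEvents)
    {
        if (e.isInit)
            inited[e.var] = true;
        else if (e.var !in inited)
            errors ~= e.var ~ " used before it was initialized";
    }
    // Nothing initialized only inside the body is reported back to the
    // caller: the loop may run zero times.
    return errors;
}

void main()
{
    // p = new<int>; then read p   (accepted)
    writeln(checkLoopBody([Event(true, "p"), Event(false, "p")], null));
    // read p; then p = new<int>   (reported)
    writeln(checkLoopBody([Event(false, "p"), Event(true, "p")], null));
}
```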

March 31

Re: Memory safe in D

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

On 3/29/2024 8:07 PM, Richard (Rikki) Andrew Cattermole wrote:
> Yes, all I am getting at is a dedicated DFA that isn't specific to @live would be a better solution.
> 
> If we really really want @live to stick around (I have other ideas on how to replace it while getting guarantees which @live cannot provide), rewriting it on my proposed semantic 4 would be a better solution long term.

We should continue this in the dips.development thread.

March 31

Re: Memory safe in D

Posted by Walter Bright
in reply to Nick Treleaven

On 3/30/2024 2:38 AM, Nick Treleaven wrote:
> cppfront:

Is cppfront using ad-hoc or DFA? You can get a ways with ad-hoc, but there's a reason optimizers use DFA, especially with unstructured code.

> Goto skipping initialization is already disallowed in D.

That restriction is imposed because the front end doesn't do DFA.

April 11

Re: Memory safe in D - cppfront/C++

Posted by Nick Treleaven
in reply to Walter Bright

On Monday, 1 April 2024 at 01:00:31 UTC, Walter Bright wrote:

> On 3/30/2024 2:38 AM, Nick Treleaven wrote:
>> cppfront:
>
> Is cppfront using ad-hoc or DFA? You can get a ways with ad-hoc, but there's a reason optimizers use DFA, especially with unstructured code.

From what I have gathered, cppfront is only doing 2 basic things at the moment:

- require initialization on both branches of an `if` statement or neither
- track at runtime whether a variable has been initialized and abort if it is accessed without initialization

However, proper analysis is planned to detect common cases of invalid pointers and container types. It does not aim to catch all possible errors. See the links here which are for C++1:
https://github.com/hsutter/cppfront?tab=readme-ov-file#2015-lifetime-safety

Each point where a pointer variable is modified, the compiler tracks what possible things it could point to, e.g. local data. For the latter, when the local data goes out of scope, if the pointer hasn't been overwritten, then it is known to be pointing to invalid data and any subsequent dereference is flagged at compile-time. A partial prototype was implemented for Clang which was demo'd in the 2018 youtube video. There is also a formal written proposal P1179.

There is a section in that paper on loops - see 2.4.9:

> A loop is treated as if it were the first two loop iterations unrolled using an if. For example,
> `for(/*init*/;/*cond*/;/*incr*/){/*body*/}` is treated as `if(/*init*/;/*cond*/){/*body*/;/*incr*/} if(/*cond*/){/*body*/}`.

There was a section on null dereference detection in the 2018 video:
https://youtu.be/80BZxujhY38?t=41m22s

At around the 45m mark Herb actually says "there is no DFA going on". The example there has a loop whose number of iterations is only known at runtime.

Just before that null bit there was also an example that detects iterator invalidation.

>> Goto skipping initialization is already disallowed in D.
>
> That restriction is imposed because the front end doesn't do DFA.

D already enforces, for aggregate constructors, that immutable data is initialized only once. AIUI that is not really DFA but something simpler, and the idea could be extended to all functions to detect common cases of uninitialized non-null reference types (if we had them), at some speed cost that is hopefully acceptable.
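
For reference, a small example of that existing constructor rule. `Config` and `level` are made-up names. The comments describe the behavior as documented for field initialization inside constructors; treat the exact diagnostics as an assumption and check them against your compiler version.

```d
import std.stdio : writeln;

struct Config
{
    immutable int level;

    this(bool verbose)
    {
        // Each path through the constructor initializes `level` exactly
        // once, which the compiler accepts.
        if (verbose)
            level = 2;
        else
            level = 1;

        // Assumption: a further assignment on the same path, e.g.
        //     level = 3;
        // would be rejected at compile time as a multiple initialization
        // of an immutable field. Initialization inside a loop or after a
        // label is likewise rejected rather than analyzed.
    }
}

void main()
{
    auto c = Config(true);
    writeln(c.level);
}
```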

April 11

Re: Memory safe in D - cppfront/C++

Posted by Nick Treleaven
in reply to Nick Treleaven

On Thursday, 11 April 2024 at 16:19:52 UTC, Nick Treleaven wrote:

> Each point where a pointer variable is modified, the compiler tracks what possible things it could point to, e.g. local data.

It tracks anything in the scope of the current function that the pointer could point to, at each statement.

> For the latter, when the local data goes out of scope, if the pointer hasn't been overwritten, then it is known to be pointing to invalid data and any subsequent dereference is flagged at compile-time.

What I meant was that if there is a dereference of a pointer that may have been (according to the limited analysis) assigned the address of a local that has gone out of scope, that dereference gets flagged at compile time, even though at runtime it may never actually have that address.
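
A toy model of that "may point to" bookkeeping, written in D. Everything here (`PointsTo`, `assign`, `assignOnSomePath`, `endScope`, `deref`) is a hypothetical name invented for this sketch; it is not the P1179 algorithm or the Clang prototype, only an illustration of why a dereference can be flagged even when the bad path is never taken at runtime.

```d
import std.stdio : writeln;

// Hypothetical model invented for this sketch.
struct PointsTo
{
    enum invalid = "<invalid>";

    bool[string][string] mayPointTo; // pointer name -> set of possible targets
    string[] diagnostics;

    // `p = &target;` replaces everything p could point to before.
    void assign(string ptr, string target)
    {
        mayPointTo[ptr] = [target: true];
    }

    // `if (cond) p = &target;` keeps the old targets and adds the new one.
    void assignOnSomePath(string ptr, string target)
    {
        mayPointTo[ptr][target] = true;
    }

    // `local` goes out of scope: any pointer that may point to it may now
    // point to invalid data.
    void endScope(string local)
    {
        foreach (ptr, ref targets; mayPointTo)
        {
            if (local in targets)
            {
                targets.remove(local);
                targets[invalid] = true;
            }
        }
    }

    // `*p`: flagged if p is untracked or may point to invalid data.
    void deref(string ptr)
    {
        auto targets = ptr in mayPointTo;
        if (targets is null || (invalid in *targets) !is null)
            diagnostics ~= "dereference of " ~ ptr ~ " may access invalid data";
    }
}

void main()
{
    PointsTo st;
    st.assign("p", "global");      // p = &global;
    st.assignOnSomePath("p", "x"); // { int x; if (cond) p = &x; }
    st.endScope("x");              // x goes out of scope
    st.deref("p");                 // flagged, even if cond is false at runtime
    writeln(st.diagnostics);
}
```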

April 12

Re: Memory safe in D

Posted by ShowMeTheWay
in reply to Walter Bright

On Tuesday, 12 March 2024 at 03:55:27 UTC, Walter Bright wrote:
> On 3/11/2024 4:01 AM, Alex wrote:
>> Yes, I got it about compiler, static analyzer can't detect such potential issue for now.
>
> It cannot do it in the general case, that would be the halting problem.
>
>> The instance of class `A` is initialized by the default initializer, correct?
>
> `A a;` will default initialize `a` to `null`.
>
> `A a = new A();` will allocate an instance of `A` where each field is default initialized, and assign the result to `a`.
>
>> But what about variable `a`?
>
> `a` is default initialized to `null`.
>
>> Is it initialized by null or contains reference to the instance initialized by default initializer?
>
> `null`
>
>> What happened when I tried to call method `run()` of `a` at runtime?
>
> `a` is passed as the `this` pointer to the method `run()`. Hence, in this case, `this` will be null. If you attempt to dereference `this`, it will seg fault.
>
>> I see that the application terminated abnormally because `writeln("Hello, world!");` was not called.
>> But I don't see any information in the console about it (backtrace or something else).
>
> To get a backtrace on Windows, run it in the VC debugger.
>
>> Is it an uncaught exception? I tried to catch it, but that did not work.
>
> D's exception catchers do not catch Windows system exceptions for 64 bit code. Microsoft has not seen fit to document their 64 bit EH system.

The problem here is "the D compiler", i.e. it does not warn you (with a compilation error) that you are using variable `a` even though it has not yet been assigned.

Unlike the D compiler, the C# compiler *would* tell you as much.

But the D compiler just leaves it to the hardware to generate a seg fault at runtime?

Surely this can be solved at compile time.

C# ...
------
A a;
a.run(); // C# -> error CS0165: Use of unassigned local variable 'a'
-----

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0165

April 16

Re: Memory safe in D

Posted by ShowMeTheWay
in reply to Steven Schveighoffer

On Monday, 11 March 2024 at 19:43:33 UTC, Steven Schveighoffer wrote:
> On Monday, 11 March 2024 at 08:16:13 UTC, Alex wrote:
>
>> Is it expected behavior?
>> Looks like it is not very safe approach and can lead to very unpleasant memory errors...
>
> So I know there are a lot of responses here, with a lot of discussion. But I don't think anyone has told you *why* D works this way.
>
> The explanation is that D is expecting the memory hardware to fault when you dereference null. We know that this is not the case for all situations, but it is the case for all of D's normal usage modes (e.g. as user-code on standard operating systems).
>
> Since the memory hardware *already supports this*, and is essentially free, D has deferred to that mechanism to guard against dereferencing null pointers. Not assuming this behavior means all dereferences of pointers/classes in `@safe` code would have to be instrumented with a check, slowing down the code significantly.
>
> I consider null pointer faults to be annoying, but not nearly as bad as dangling pointer accesses. At least a null pointer *always* crashes when you access it.
>
> -Steve

The problem is less that the code is dereferencing null and more that "...forgetting to assign a value to a local is probably a bug", to quote Eric Lippert.

When you're dereferencing null in a situation where you almost certainly should NOT be doing that, then it should be considered a likely bug.

To quote him some more: "If it's probably a bug and it is cheap and easy to detect, then there is good incentive to make the behavior either illegal or a warning."

Many of us use compilers that have been around for decades and do just that.

This below is valid C++ code, a bug in C#, but valid code in D (even though it's actually a bug):

A a;
a.run();

This should not be legal D code. It should produce an error if compiled.

It's not difficult for a compiler to work this one out.

April 16

Re: Memory safe in D

Posted by bachmeier
in reply to ShowMeTheWay

On Tuesday, 16 April 2024 at 07:25:21 UTC, ShowMeTheWay wrote:

> This below is valid C++ code, a bug in C#, but valid code in D (even though it's actually a bug):
>
>     A a;
>     a.run();
>
> This should not be legal D code. It should produce an error if compiled.
>
> It's not difficult for a compiler to work this one out.

I'm repeating myself, but there's no good argument in favor of that compiling. All it gives you is bugs, confusion, and a steep learning curve in the name of saving a few keystrokes.

import std;
void main() {
  A a;
  writeln(a is null); // true
  B b = null;
  writeln(b is null); // true
  C c = void;
  writeln(c is null); // usually false: c is left uninitialized, so it holds garbage rather than null
}
class A {}
class B {}
class C {}

a.run() is natural. b.run() wouldn't make sense even to a new programmer. Neither would c.run().
