4 days ago
On Saturday, April 19, 2025 5:26:29 AM MDT kdevel via Digitalmars-d wrote:
> On Saturday, 19 April 2025 at 10:35:01 UTC, Jonathan M Davis wrote:
> > [...] because then it would need to dereference s to access its i member, but until it needs to access a member, there's no reason for any dereferencing to take place.
> >
> > The same happens with C++ classes as long as the function isn't virtual.
>
> That is undefined behavior. In the C++ standard null references have been carefully ruled out [1]. There is no standard conforming C++ program having null references.

My point about non-virtual functions and dereferencing wasn't really about references so much as about the fact that the compiler doesn't necessarily dereference when you think that you're telling it to dereference. It only does so when it actually needs to.

And whatever is supposed to be defined behavior or not, I have seen pointers not be dereferenced when calling non-virtual functions - and when creating references from what they point to.

> > And with
> >
> > int* p = null;
> > ref r = *p;
> >
> > no dereferencing occurs,
>
> In C++ this is a programming error. When creating a reference from a pointer, the null check is necessary in order to uphold C++'s guarantee that references are actually bound to existing objects.
>
> [1] google.com?q="c++ reference from null pointer"
>      - https://old.reddit.com/r/cpp/comments/80zm83/no_references_are_never_null/
>      - https://stackoverflow.com/questions/4364536/is-a-null-reference-possible

Well, if C++ now checks that the pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.

Either way, unless the compiler inserts checks of some kind in order to try to ensure that a reference is never null, there's no reason to dereference a pointer or reference until the data it points to is actually used. And historically, no null checks were done for correctness.

- Jonathan M Davis



4 days ago
On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis wrote:
> [...]
>> > And with
>> >
>> > int* p = null;
>> > ref r = *p;
>> >
>> > no dereferencing occurs,
>>
>> In C++ this is a programming error. When creating a reference from a pointer, the null check is necessary in order to uphold C++'s guarantee that references are actually bound to existing objects.
>> [...]
>
> Well, if C++ now checks that the pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.

Of course it doesn't and I didn't write that. I wrote that it is
a programming error to use a ptr to initialize a reference when
it is possible that the ptr is null. If refs in D were as strong
as in C++ I would write

    [... int *p is potentially null ...]
    enforce (p);
    auto ref r = *p;
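
For comparison, the same check-before-binding idea can be sketched in C++. This is only an illustration: `bind_checked`, `binding_rejected` and `usage_example` are made-up names, and throwing `std::invalid_argument` is just one assumed way of reporting the error.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: refuses to bind a reference through a null pointer,
// so the reference handed back always refers to an existing object.
int& bind_checked(int* p) {
    if (p == nullptr) {
        throw std::invalid_argument("cannot bind a reference through a null pointer");
    }
    return *p; // p is known to be non-null here
}

// Hypothetical helper: reports whether bind_checked rejected the pointer.
bool binding_rejected(int* p) {
    try {
        (void)bind_checked(p);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}

// Usage sketch: call this from main() or from a test.
void usage_example() {
    int value = 7;
    int& r = bind_checked(&value); // fine: value exists
    r = 8;                         // writes through the reference
    assert(value == 8);
    assert(binding_rejected(nullptr)); // the null pointer never reaches a reference
}
```

With a helper like this, the responsibility for the null check sits with whoever holds the pointer, and code that only ever receives the reference does not need to check again.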

4 days ago
On Saturday, April 19, 2025 6:13:36 AM MDT kdevel via Digitalmars-d wrote:
> On Saturday, 19 April 2025 at 11:44:42 UTC, Jonathan M Davis wrote:
> > [...]
> >> > And with
> >> >
> >> > int* p = null;
> >> > ref r = *p;
> >> >
> >> > no dereferencing occurs,
> >>
> >> In C++ this is a programming error. When creating a reference from a pointer, the null check is necessary in order to uphold C++'s guarantee that references are actually bound to existing objects.
> >> [...]
> >
> > Well, if C++ now checks that the pointer is non-null when creating a reference from it, that's new behavior, because it most definitely did not do that before.
>
> Of course it doesn't and I didn't write that. I wrote that it is a programming error to use a ptr to initialize a reference when it is possible that the ptr is null. If refs in D were as strong as in C++ I would write
>
>      [... int *p is potentially null ...]
>      enforce (p);
>      auto ref r = *p;

If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null. It's the same programmer error as any time that you dereference a null pointer except that it doesn't actually dereference the pointer when you create the reference and instead blows up later when you attempt to use what it refers to, because that's when the actual dereferencing takes place. If C++ doesn't have additional checks, then it's not any stronger about guarantees with & than D is with ref.

Meta was asking how it was possible that

int* p = null;
ref r = *p;

would result in a null reference instead of blowing up, and I explained why it didn't blow up and pointed out that C++ has the exact same situation. And unless C++ has added additional checks (and it sounds like they haven't), then there's no real difference here between C++ and D.

- Jonathan M Davis



4 days ago
On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis wrote:
>>
>>      [... int *p is potentially null ...]
>>      enforce (p);
>>      auto ref r = *p;
>
> If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null.

   #include <cstddef> // NULL

   int main ()
   {
      int *p = NULL;
      int &i = *p;
   }

That is an error (mistake) only in C++ because the reference is
not initialized with a valid initializer. In D, however,

   void main ()
   {
      int *p = null;
      ref int i = *p; // DMD v2.111.0
   }

is a valid program [3].

> It's the same programmer error as any time that you dereference a null pointer except that it doesn't actually dereference the pointer when you create the reference and instead blows up later when you attempt to use what it refers to, because that's when the actual dereferencing takes place.

Assume the "dereference" of the pointer and the initialization
of the reference happen in different translation units written
by different programmers. I.e.

tu1.cc

   void foo (int &i)
   {
   }

tu2.cc

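   #include <cstddef> // NULL

   void foo (int &i); // defined in tu1.cc

   int main ()
   {
      int *p = NULL;
      foo (*p); // binds the reference parameter through a null pointer
   }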

versus

tu1.d

   void foo (ref int i)
   {
   }

tu2.d

   import tu1; // provides foo

   void main ()
   {
      int *p = null;
      foo (*p);
   }

Then we have different responsibilities. In the C++ case the
programmer of tu2.cc made a mistake while in the D case the
code of tu2.d is legit. I would not call this situation "the
same programmer error".

> If C++ doesn't have additional checks, then it's not any stronger about guarantees with & than D is with ref.

As programmer of translation unit 1 my job is much easier if I
use C++.

[3] https://dlang.org/spec/type.html#pointers
    "When a pointer to T is dereferenced, it must either contain a null
    value, or point to a valid object of type T."

4 days ago
On Saturday, April 19, 2025 8:23:09 AM MDT kdevel via Digitalmars-d wrote:
> On Saturday, 19 April 2025 at 12:54:27 UTC, Jonathan M Davis wrote:
> >>
> >>      [... int *p is potentially null ...]
> >>      enforce (p);
> >>      auto ref r = *p;
> >
> > If it's not doing any additional checks, then I don't understand your point. Of course it's programmer error to convert a pointer to a reference when that pointer is null.
>
>     int main ()
>     {
>        int *p = NULL;
>        int &i = *p;
>     }
>
> That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,
>
>     void main ()
>     {
>        int *p = null;
>        ref int i = *p; // DMD v2.111.0
>     }
>
> is a valid program [3].

In both cases it's a valid program where the programmer screwed up, and they're going to get a segfault later on if the reference is ever accessed. If it weren't a valid program, it wouldn't compile. If you had a situation where a cast were being used to circumvent compiler checks, it could be argued that it wasn't valid, because the programmer was circumventing the compiler, but nothing is being circumvented here. Neither language has checks - either at compile time or at runtime - to catch this issue, so I don't see how it could be argued that the compiler is providing guarantees about this or that the program is invalid. In both cases, it's an error on the programmer's part, and in neither case is the language providing anything to prevent it or catch it. As far as I can see, the situation in both cases is identical. Maybe there's some difference in how the C++ spec talks about it, but there is no practical difference.

- Jonathan M Davis



4 days ago
On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via Digitalmars-d wrote:
> I'd like to know what those gdc and ldc transformations are, and whether they are controllable with a switch to their optimizers.
>
> I know there's a problem with WASM not faulting on a null dereference, but in another post I suggested a way to deal with it.

Unfortunately, my understanding isn't good enough to explain those details. I discussed it with Johan in the past, but I've never worked on ldc or with llvm (or on gdc/gcc), so I really don't know what is or isn't possible. However, from what I recall of what Johan said, we were kind of stuck, and llvm considered dereferencing null to be undefined behavior.

It may be the case that there's some sort of way to control that (and llvm may have more capabilities in that regard since I last discussed it with Johan), but someone who actually knows llvm is going to have to answer those questions. And I don't know how gdc's situation differs either.

- Jonathan M Davis



4 days ago
On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis wrote:
>>
>>     int main ()
>>     {
>>        int *p = NULL;
>>        int &i = *p;
>>     }
>>
>> That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,
>>
>>     void main ()
>>     {
>>        int *p = null;
>>        ref int i = *p; // DMD v2.111.0
>>     }
>>
>> is a valid program [3].
>
> In both cases it's a valid program

Only the D version is valid. The C++ program violates the std.
From the SO page there is a quote of the C++ 11 std draft which
says in sec. "8.3.2 References":

    "A reference shall be initialized to refer to a valid object
     or function. [ Note: in particular, a null reference cannot
     exist in a well-defined program, because the only way to create
     such a reference would be to bind it to the “object” obtained
     by dereferencing a null pointer, which causes undefined behavior.
     [...] — end note ]"

You find nearly the same wording in sec. 11.3.2 of the C++17 std
draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928)
with the "dereferencing" replaced with "indirection".

> [...] If it weren't a valid program, it wouldn't compile.

That is an interesting opinion.

4 days ago
On Saturday, April 19, 2025 5:51:58 PM MDT kdevel via Digitalmars-d wrote:
> On Saturday, 19 April 2025 at 22:23:54 UTC, Jonathan M Davis wrote:
> >>
> >>     int main ()
> >>     {
> >>        int *p = NULL;
> >>        int &i = *p;
> >>     }
> >>
> >> That is an error (mistake) only in C++ because the reference is not initialized with a valid initializer. In D, however,
> >>
> >>     void main ()
> >>     {
> >>        int *p = null;
> >>        ref int i = *p; // DMD v2.111.0
> >>     }
> >>
> >> is a valid program [3].
> >
> > In both cases it's a valid program
>
> Only the D version is valid. The C++ program violates the std.
> From the SO page there is a quote of the C++ 11 std draft which
> says in sec. "8.3.2 References":
>
>      "A reference shall be initialized to refer to a valid object
>       or function. [ Note: in particular, a null reference cannot
>       exist in a well-defined program, because the only way to create
>       such a reference would be to bind it to the “object” obtained
>       by dereferencing a null pointer, which causes undefined behavior.
>       [...] — end note ]"
>
> You find nearly the same wording in sec. 11.3.2 of the C++17 std
> draft (N4713) and in sec. 9.3.4.3 of the C++23 std draft (N4928)
> with the "dereferencing" replaced with "indirection".

I see no practical difference in general, but I guess that the difference would be that if C++ says that the behavior is undefined, then it can let the optimizer do whatever it wants with it, whereas D has to say at least roughly what would happen, or the behavior would be undefined and thus screw up @safe. Whatever assumptions the optimizer may make about it, they can't be anything that would violate memory safety.

In practice though, particularly with unoptimized code, C++ and D are going to do the same thing here, and in both cases, the programmer screwed up, so their program is going to crash. And realistically, it'll likely do the same thing in most cases even with optimized code.

- Jonathan M Davis




3 days ago
On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:
> I see no practical difference in general [...]

I consider nonconforming programs generally unacceptable.

3 days ago
On Sunday, April 20, 2025 8:13:44 AM MDT kdevel via Digitalmars-d wrote:
> On Sunday, 20 April 2025 at 00:33:52 UTC, Jonathan M Davis wrote:
> > I see no practical difference in general [...]
>
> I consider nonconforming programs generally unacceptable.

Writing a program which doesn't behave properly is always a problem and should be considered unacceptable. And both having a program relying on undefined behavior and having a program which dereferences null are problems. The latter will crash the program. The only real difference there between C++ and D is that if the language states that it's undefined behavior for the reference to be null, then the optimizer can do screwy things in the case when it actually does happen instead of the program being guaranteed to crash when the null reference is dereferenced.

So, that's why I say that I see no practical difference. If you create a reference from a null pointer, you have a bug whether the program is written in C++ or D. And outside of optimized builds (and likely in almost all cases even with optimized builds), what happens when you screw that up will be the same in both languages.

In any case, we clearly both agree that if the programmer does this, they've screwed up, and I think that we're basically arguing over language here rather than an actual technical problem.

Ultimately, the only real difference is what the language's optimizer is allowed to do when the programmer does screw it up, because the C++ spec says that it's undefined behavior, and the D spec can't say that and have references work in @safe code, since @safe code disallows undefined behavior in order to ensure memory safety. The program has a bug either way.
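
As a rough C++ illustration of what that latitude means (the helper names are hypothetical, and whether a given compiler actually folds the check away depends on the compiler and its flags, so treat this as a sketch rather than a statement about any particular toolchain):

```cpp
#include <cassert>

// Hypothetical helper: a pointer can legitimately be null, so this check is
// meaningful and the compiler has to keep it.
bool pointer_is_null(const int* p) {
    return p == nullptr;
}

// Hypothetical helper: in well-defined C++ a reference always refers to an
// object, so the compiler is allowed to assume this comparison is false and
// may remove it entirely. Compilers commonly warn about such a comparison.
bool reference_looks_null(const int& r) {
    return &r == nullptr;
}

// Usage sketch: only well-defined calls are made here.
void usage_example() {
    int value = 42;
    assert(pointer_is_null(nullptr));
    assert(!pointer_is_null(&value));
    assert(!reference_looks_null(value)); // a valid object is never "null"
}
```

So a defensive check written against a reference cannot be relied on in C++, whereas the same check written against the pointer, before the reference is created, can.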

- Jonathan M Davis