Posted by Timon Gehr in reply to Walter Bright
On 12/31/22 07:34, Walter Bright wrote:
> On 12/30/2022 1:07 PM, Timon Gehr wrote:
>>> In your description of pattern matching checks in this thread, the check was at runtime.
>>> ...
>>
>> No, the check was at compile time.
>
> The pattern matching is done at run time.
>
>> The check I care about is the check for _failure_. The check for _null_ may or may not be _necessary_ depending on the type of the reference.
> NonNull pointers:
>
> int* p = ...;
> nonnull int* np = isPtrNull(p) ? fatalError("it's null!") : p;
> *np = 3; // guaranteed not to fail!
>
> Null pointers:
>
> int* p = ...;
> *p = 3; // seg fault!
>
> Which is better? Both cause the program to quit on a null pointer.
> ...
You have deliberately chosen an example where it does not matter because your aim was specifically to dereference a possibly null pointer.
I care about this case:
nonnull int* p = ...; // possibly a compile time error
*p = 3; // no runtime check. no seg fault!
Note that the declaration and dereference can be a few function calls apart. The further away the two are, the more useful tracking it in the type system becomes.
Manual checks can be used to turn possibly null pointers into non-null pointers anywhere in the program where there is a sensible way to handle the null case separately. This is just a special case of sum types, where the compiler checks that you dealt with all cases exhaustively.
The especially efficient tag encoding provided by `null` is just an additional small detail.
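To illustrate both points, here is a sketch in Rust (offered as an analogy, since D has no such checking today): references are non-null, a nullable reference is the sum type `Option<&T>`, the compiler enforces exhaustive handling, and the null bit pattern serves as the `None` tag at zero space cost.

```rust
use std::mem::size_of;

// A nullable reference is just a sum type: Option<&i32>.
fn describe(p: Option<&i32>) -> String {
    // The compiler rejects this `match` unless every case is handled.
    match p {
        Some(x) => format!("value: {}", x), // non-null branch: deref needs no check
        None => String::from("no value"),   // the null case, handled explicitly
    }
}

fn main() {
    let v = 3;
    assert_eq!(describe(Some(&v)), "value: 3");
    assert_eq!(describe(None), "no value");
    // The null bit pattern doubles as the `None` tag, so the sum type
    // adds no space overhead over a raw pointer.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
}
```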
>
>> This technology has a proven track record.
>
> A proven track record of not seg faulting, sure.
Of making people think about, and handle, the null case if it is necessary at all. I have already told you that my main gripe here is not specifically the segfault (though that does not help); it's the fatal and implicit nature of the crash.
> A proven track record of no fatal errors when converting a nullable pointer to nonnull, I'm not so sure.
> ...
Converting a nullable pointer to nonnull without handling the null case is inherently an unsafe operation. D currently does it implicitly. Explicit is better than implicit for fatal runtime errors that will shut down your program completely.
Typically you'd mostly use nonnull pointers and not get any fatal errors. It is true that if you have nontrivial logic determining whether some pointer should be null or not you may have to check that invariant at runtime with the techniques present in popular languages, but at least it's explicit.
My experience has been that null pointer segfaults usually happen in places where either a null pointer is never expected (and a nonnull pointer should have been used, making the type system ensure that the caller provides one) or there should have been a check, with different logic for the null case. I.e., they happen because people failed to think about the null case at all. The language encourages this lack of thinking by treating all references as non-null references during type checking and then crashing at runtime implicitly once the type checker's assumptions are inevitably violated.
Nonnull pointers allow expressing such assumptions in the type system. They are actually more useful than runtime segfaults and assertion failures, because they document expectations and the error will be at the place where the bad null pointer originates instead of at the place where it was not expected to occur.
Runtime segfaults/assertion failures are much more susceptible to being papered over: it is tempting to subtly change a function's interface, making it more complex by checking internally and ignoring null, instead of addressing the underlying issue. This is because the root cause is harder to find, especially in a large undocumented code base. Nonnull is compiler-checked documentation, and by default it directs your attention to the function that is actually wrong.
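A minimal sketch of that workflow in Rust (names illustrative, not from this thread): the nullable-to-nonnull conversion happens explicitly at one place, where the null case originates, and the callee states its expectation in the type so it needs no internal check and no silent null fallback.

```rust
// The callee's expectation is compiler-checked documentation:
// the reference cannot be null, so no internal null check is needed.
fn set_to_three(p: &mut i32) {
    *p = 3; // plain store; the type system already ruled out null
}

fn main() {
    let mut slot: Option<i32> = Some(0);
    // The nullable-to-nonnull conversion is explicit and local:
    // the null case must be handled right here, where it originates,
    // not via an implicit crash at some distant dereference.
    match slot.as_mut() {
        Some(r) => set_to_three(r),
        None => eprintln!("slot was empty; handled explicitly"),
    }
    assert_eq!(slot, Some(3));
}
```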
>
> > Relying on hardware memory protection to catch the null
> > reference is never necessary,
>
> If you manually code in a runtime check, sure, you won't need a builtin check at runtime.
> ...
No, you don't need any runtime check at all to dereference a nonnull pointer.
nonnull x = new A;
x.y = 3; // runtime checks completely redundant here
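The same two lines, sketched in real Rust, where an ordinary reference already plays the role of the hypothetical `nonnull`:

```rust
struct A {
    y: i32,
}

fn main() {
    // A freshly constructed object yields a reference that cannot be null.
    let x: &mut A = &mut A { y: 0 };
    x.y = 3; // compiles to a plain store; no null check is emitted
    assert_eq!(x.y, 3);
}
```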
> > because _valid programs should not even compile if
> > that's the kind of runtime check they would require to ensure type safety_.
>
> Then we don't need sumtypes with pattern matching?
> ...
That's not what I said. I am specifically talking about _implicit_ runtime checks causing a _program panic/segfault_. That is just a bad combination for null handling: bad UX, and hard to defend by appeal to technical limitations.
> > The hardware memory protection can still catch compiler bugs I guess.
>
> Having a hardware check is perfectly valid for checking things.
> ...
Sure, in principle it can still be leveraged for some sort of explicit runtime-checked null pointer dereference syntax. Personally, the convenience of having the assertion failure tell me where it happened (even when I don't happen to be running in a debugger) is probably _by far_ worth the additional runtime check in the couple of places where one would even remain necessary.
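Rust's `Option::expect` is one shape such explicit syntax can take (an analogy, not a proposal for D's exact syntax): the conversion from nullable to non-null is visible in the source, and on failure the panic message records this call site rather than some distant dereference.

```rust
fn main() {
    let p: Option<&i32> = Some(&3);
    // Explicit, checked conversion from nullable to non-null.
    // Had `p` been None, the panic would name this exact source line.
    let r: &i32 = p.expect("p was unexpectedly null");
    assert_eq!(*r, 3);
}
```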
Also, as Sebastiaan points out, there are actually relevant targets that don't give you the check.
> BTW, back in the bad old DOS days, I used to write a lot of:
>
> assert(p != NULL);
>
> It was very effective. But with modern CPUs, this check adds no value, and I removed them.
I don't have much to add to this off-topic point. As I have told you many times by now, I mostly agree here, but I want to be able to move most of this checking to compile time instead.
BTW: I really dislike the terminology "nonnull pointer/reference". It's a weird inversion of defaults; nonnull is the much better default.