23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 5/5/2025 8:38 PM, Timon Gehr wrote:
> On 5/6/25 05:31, Timon Gehr wrote:
>>
>> I just wish everyone would refrain from actively putting invalid instructions and segfaults into druntime in the future. They are not _that_ useful and there are vastly more useful alternatives. x)
>
> And the same is true for segfault-on-null. I don't want this. If a standard null check can be implemented taking advantage of CPU features, fine. But semantics should be the same as if the compiler inserted a branch that throws an error every time a nullable pointer is dereferenced.
That's an awful lot of test and branch code being unnecessarily inserted to test-and-throw instead of segfault-and-throw, because that's what a segfault is - a thrown exception.
> Even better would be the type system just ensuring nullable pointers are never dereferenced, the OP's experience report notwithstanding. But I guess this part is a pipe dream for now.
Have you tried using a template to achieve that?
|
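For readers following along, here is a minimal sketch of the "test-and-throw" semantics under discussion, written as a library helper rather than a compiler feature. The name `deref` is hypothetical and not part of druntime or Phobos; it only illustrates a branch that throws an `Error` when a null pointer is about to be dereferenced.

```d
/// Hypothetical helper (not a druntime/Phobos API): dereference `p`,
/// throwing an Error instead of faulting when `p` is null.
ref T deref(T)(T* p, string file = __FILE__, size_t line = __LINE__)
{
    if (p is null)
        throw new Error("null pointer dereference", file, line);
    return *p;
}

void main()
{
    int value = 7;
    int* good = &value;
    int* bad = null;

    // Non-null pointer: behaves like a plain `*good`.
    deref(good) += 1;
    assert(value == 8);

    // Null pointer: the branch throws before any memory access happens.
    bool caught = false;
    try
    {
        deref(bad) = 1;
    }
    catch (Error e)
    {
        caught = true;
    }
    assert(caught);
}
```

This is the explicit test and branch that the post above argues is unnecessary overhead; the sketch takes no position on that, it only shows what the inserted check would look like.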
23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 5/6/2025 8:34 AM, Timon Gehr wrote:
> - There are `assert(0)` in druntime.
> - Druntime/Phobos ship as `-release` build.
>
> Therefore, setting the assert handler will do nothing, even if you configure checkaction to call it in your own project.
Change the `-release` in building druntime to `-release -checkaction=D`.
|
23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 5/6/2025 8:48 AM, Timon Gehr wrote:
> "The patient has a light cough. The patient has thereby entered an invalid state. The doctors must now do as little as possible before they blow up the hospital in order to euthanize the patient and everyone else that may have been in contact with them."
>
> I want to diagnose and heal the patient!
When the autopilot has entered an invalid state for unknown reasons, you really don't want it to continue to run. Violent maneuvers can rip the airframe apart.
If a hospital blows up because the computer failed, that's a terrible design.
The correct design for a critical system is:
1. detect invalid state
2. if in invalid state, shut down immediately and engage the backup
> Well, I just don't want any hard crashes in production. Druntime throws other kinds of errors besides assert errors, by the way.
Replace the `-release` switch in the build of druntime with `-release -checkaction=D`.
I do understand you need to debug remotely, and presumably your program is not critical to your customer. Doing a custom rebuild of druntime with `-release -checkaction=D` is entirely appropriate for your situation. But I'm hesitant about making it the default build.
|
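The two-step design described above (detect invalid state, then shut down and engage the backup) can be sketched in a few lines of D. Everything here is an assumption made for illustration: the `Failover` struct, the `maxCommand` limit, and the two toy controller functions are hypothetical and do not model any real avionics system.

```d
import std.math : isFinite;

/// Hypothetical command limit, chosen only for this illustration.
enum double maxCommand = 1.0;

/// Step 1: detect invalid state (non-finite or out-of-range output).
bool isValid(double command)
{
    return isFinite(command) && command >= -maxCommand && command <= maxCommand;
}

struct Failover
{
    double function(double) primary;
    double function(double) backup;
    bool primaryEngaged = true;

    double step(double input)
    {
        if (primaryEngaged)
        {
            immutable cmd = primary(input);
            if (isValid(cmd))
                return cmd;
            // Step 2: disengage the primary for good; it is never retried.
            primaryEngaged = false;
        }
        return backup(input);
    }
}

/// Toy primary that over-amplifies and can leave the valid range.
double faultyPrimary(double input) { return input * 4.0; }

/// Toy backup that simply clamps the input to the valid range.
double simpleBackup(double input)
{
    if (input > maxCommand) return maxCommand;
    if (input < -maxCommand) return -maxCommand;
    return input;
}

void main()
{
    auto f = Failover(&faultyPrimary, &simpleBackup);
    assert(f.step(0.125) == 0.5);   // primary output is valid
    assert(f.step(0.5) == 0.5);     // primary yields 2.0, so the backup answers
    assert(!f.primaryEngaged);      // primary stays disconnected
    assert(f.step(0.125) == 0.125); // later steps use the backup only
}
```

The sketch presupposes that a backup exists at all, which is one of the points contested later in the thread.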
23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/7/25 01:28, Walter Bright wrote:
> On 5/5/2025 8:31 PM, Timon Gehr wrote:
>> I am perfectly content with it throwing an assert error and unwinding the stack. This is just not what druntime will do in the default build that ships with the compiler.
>
> If you add the -checkaction=D after the -release switch, then the assert will throw an Error. (Putting it after means it overrides the previous setting made by the -release)
> ...
I would never use `-release` in production for this project, but still note that what you say does not work:
```d
module test;
import std.stdio;
void main(){
    int[2] x=[0,1];
    auto y=x[1..2];
    x[]=y[];
    writeln(x," ",y);
}
```
```
$ ./ldc2-1.41.0-beta1-linux-x86_64/bin/ldmd2 -run test
Error: /tmp/test-5d6670 failed with status: -2
message: Illegal instruction (core dumped)
Error: program received signal 2 (Interrupt)
```
This is something I don't want to see, ever. It will however happen every time code triggers one of the `-release` `assert(0)` hidden in the druntimes that ship with DMD and LDC.
```
$ ./ldc2-1.41.0-beta1-linux-x86_64/bin/ldmd2 -release -checkaction=D -run test
[1, 21640336] [21640336]
```
This is even worse. Silent memory corruption.
druntime and Phobos should get rid of `-release`.
> (Actually, it throws a `staticError!AssertError(msg, file, line)` which is derived from `Error`.)
>
> (I don't know why it has to go through all these layers and layers of templates.)
I guess this is done so it does not use the GC. (Which answers my earlier question what to replace all the `assert(0)` spam with.)
|
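In the test program above, the slice copy has mismatched lengths (`x[]` has 2 elements, `y[]` has 1). As a hedged illustration of the behavior being asked for, the sketch below performs the length check in user code and throws a catchable `Error`, independent of how druntime was built. The helper `copyChecked` is hypothetical and is not how druntime implements array copies; overlap checking is deliberately left out.

```d
import std.stdio;

/// Hypothetical helper: copy `src` into `dst`, throwing an Error on a
/// length mismatch instead of relying on the check inside druntime.
void copyChecked(int[] dst, const(int)[] src)
{
    if (dst.length != src.length)
        throw new Error("array length mismatch in copy");
    dst[] = src[];
}

void main()
{
    int[2] x = [0, 1];
    auto y = x[1 .. 2];
    try
    {
        copyChecked(x[], y); // lengths 2 and 1: throws before copying
    }
    catch (Error e)
    {
        writeln("caught: ", e.msg);
    }
    assert(x == [0, 1]); // x is left untouched
}
```

Catching `Error` is generally discouraged in D; it is done here only to make the failure observable in a short example.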
23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Richard (Rikki) Andrew Cattermole | On 5/6/2025 1:53 AM, Richard (Rikki) Andrew Cattermole wrote:
> This is why its so important to switch over to calling the global functions like assert handler does.
You cannot know if the global function is corrupted or not at that stage. A bad actor can also hijack that global function to facilitate his nefarious schemes.
> People can configure it to do whatever they want, we don't have to have a default that is anything but instant crash.
It's not an instant crash. It generates an invalid instruction fault, which then goes to a handler for it, and the default behavior of that handler is to terminate the process.
I can tell I'm the oldest person here. I programmed for many years on a machine that had no concept of a fault. When your program crashed, it didn't stop. It kept running. It would execute data as instructions. Invalid opcodes would execute random snippets of microcode. It would run wild. Usually the only way to get control back is to do a cold boot. Now *that* is a crash.
Having the program stop when it enters an invalid state is a good thing, not a bad thing.
If you want to keep a program running on your customer's machine after it crashes, that is entirely up to you. But I cannot recommend it.
|
23 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/7/25 01:34, Walter Bright wrote:
> On 5/5/2025 8:38 PM, Timon Gehr wrote:
>> On 5/6/25 05:31, Timon Gehr wrote:
>>>
>>> I just wish everyone would refrain from actively putting invalid instructions and segfaults into druntime in the future. They are not _that_ useful and there are vastly more useful alternatives. x)
>>
>> And the same is true for segfault-on-null. I don't want this. If a standard null check can be implemented taking advantage of CPU features, fine. But semantics should be the same as if the compiler inserted a branch that throws an error every time a nullable pointer is dereferenced.
>
> That's an awful lot of test and branch code being unnecessarily
I don't really care if it is necessary, it is sufficient. If there is a better way to get the same result I am fine with that, but for now it would be good to simply have a way to stop the bleeding.
> inserted to test-and-throw instead of segfault-and-throw, because that's what a segfault is - a thrown exception.
> ...
Clearly something distinct is happening that skips the scope cleanup.
>
>> Even better would be the type system just ensuring nullable pointers are never dereferenced, the OP's experience report notwithstanding. But I guess this part is a pipe dream for now.
>
> Have you tried using a template to achieve that?
No.
a) it's impossible due to `.init` being forced on you even if you disable default construction
b) I am not going to chase down every pointer in every dependency and maintain my custom forks of everything including druntime and Phobos
c) build times would likely skyrocket
|
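To illustrate the `.init` objection, here is a small sketch of a non-null wrapper template. `NonNull` is a hypothetical type written for this example, not a Phobos type. Its constructor rejects null and default construction is disabled, yet `NonNull!int.init` still compiles and yields a wrapper around a null pointer.

```d
/// Hypothetical wrapper (not a Phobos type) that tries to rule out null.
struct NonNull(T)
{
    private T* ptr;

    @disable this(); // forbid `NonNull!T n;`

    this(T* p)
    {
        if (p is null)
            throw new Error("NonNull constructed from null");
        ptr = p;
    }

    ref T get() { return *ptr; }
}

void main()
{
    int value = 42;
    auto ok = NonNull!int(&value);
    assert(ok.get == 42);

    // Still accepted by the compiler despite `@disable this()`:
    auto bad = NonNull!int.init;
    assert(bad.ptr is null); // the invariant is silently broken
}
```

Calling `bad.get` here would dereference null, which is exactly the case the wrapper was meant to exclude.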
22 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/7/25 01:36, Walter Bright wrote:
> On 5/6/2025 8:34 AM, Timon Gehr wrote:
>> - There are `assert(0)` in druntime.
>> - Druntime/Phobos ship as `-release` build.
>>
>> Therefore, setting the assert handler will do nothing, even if you configure checkaction to call it in your own project.
>
> Change the `-release` in building druntime to `-release -checkaction=D`.
>
```d
void main(){
assert(0);
}
```
```
$ dmd -release -checkaction=D -run test.d
Error: program killed by signal 4
```
The invalid instruction is in `_Dmain`, not druntime, so this is a valid demonstration even without building a custom druntime.
If I am going to build a custom druntime, I will simply delete `-release`.
We should aim to do this in general, delete `-release` and still replace all reachable `assert(0)` for good measure.
|
22 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 5/6/2025 9:29 AM, Timon Gehr wrote:
> This only works because the plane keeps its own state independent of the electronics.
I know what I'm talking about on this subject.
There are many ways to detect that an avionics box has gone bad. Some are detected by the avionics box itself, some by external monitoring, some by comparing outputs with outputs from another box that does the same function but uses different algorithms. All result in instant electrical disconnection. Logging is done with the flight data recorder.
There are several aspects of D that are influenced by my experience as an aerospace engineer.
> At some point you'll just have to accept that most use cases are not like this. Then you will maybe also figure out that it is not about what kind of person you are, but about what kind of external factors are relevant to your work. (Hint: I am not currently writing software for avionics.)
It's the same situation if you write stock trading software. You might not die if it goes haywire, but you certainly could go bankrupt.
There's also the situation of minimizing the risk of malware injection. That could certainly ruin your whole week.
> And BTW, it appears an ESA mars mission failed partly because an acceleration sensor actively refused to operate for an extended amount of time after acceleration went out of the range it was rated for for a small amount of time. It did so by sticking to one of the ends of the rated range, making the probe compute that it was underground.
>
> This demonstrates that your tools thinking they know better than you how to react to an error condition is also fatal in "critical" applications.
The anecdote only demonstrates that the design had no backup plan for a failed sensor.
Here's another: the 737MAX MCAS system kept functioning despite receiving bad data from the AOA sensor, and moved the flight controls far outside of the envelope.
There was another incident long ago where the autopilot decided to turn the airplane upside down. That was fun for the crew and passenger.
And another where the stabilizer jammed. The pilot, rather than leaving the jammed thing alone and doing an emergency landing, decided he would keep trying to unjam it. He eventually succeeded so well the nut broke off the end of the jackscrew and the stabilizer then broke free.
Don't keep trying to work broken systems. They get more broken when you do that.
|
22 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/7/25 01:51, Walter Bright wrote:
> On 5/6/2025 8:48 AM, Timon Gehr wrote:
>> "The patient has a light cough. The patient has thereby entered an invalid state. The doctors must now do as little as possible before they blow up the hospital in order to euthanize the patient and everyone else that may have been in contact with them."
>>
>> I want to diagnose and heal the patient!
>
> When the autopilot has entered an invalid state for unknown reasons, you really don't want it to continue to run. Violent maneuvers can rip the airframe apart.
>
> If a hospital blows up because the computer failed, that's a terrible design.
Thanks for agreeing with this at least. It however seems you did not get the point. This was an analogy. The terrible design here is hard-crashing "debug-release" builds with the attitude that it is all the same anyway.
> The correct design for a critical system is:
>
> 1. detect invalid state
> 2. if in invalid state, shut down immediately and engage the backup
> ...
Sure, just let me do these things in the way that is appropriate. I do not have a backup here though. It's not typically viable to have a separate entity code exactly the same application (hopefully with different bugs), while having both of them share all of the important internal state right to the point where one of them decides to go out the window.
>
>> Well, I just don't want any hard crashes in production. Druntime throws other kinds of errors besides assert errors, by the way.
>
> Replace the `-release` switch in the build of druntime with `-release -checkaction=D`.
> ...
This does not work.
> I do understand you need to debug remotely, and presumably your program is not critical to your customer.
The main thing that suffers here is my established reputation of fixing any observed issues within 24 hours.
Your attitude seems to be if you want to maintain such a reputation, either don't use D, or have the foresight to customize your setup to the point where you are effectively maintaining your own fork before you even start using the language.
> Doing a custom rebuild of druntime with `-release -checkaction=D` is entirely appropriate for your situation.
The problem with this is that you don't know you are in this situation until there was at least one unexplained crash, and it complicates build instructions for third parties, as well as the setup of new build machines.
> But I'm hesitant about making it the default build.
I think this would be the wrong call. I guess another valid option is to just ship both builds, so that hobbyists can still enjoy crashing into the wall at 1001mph instead of 1000mph.
|
22 hours ago Re: [OT] OT: Null checks. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/7/25 02:40, Walter Bright wrote:
> On 5/6/2025 9:29 AM, Timon Gehr wrote:
>> This only works because the plane keeps its own state independent of the electronics.
>
> I know what I'm talking about on this subject.
I am aware. I am saying it is an irrelevant subject to my problem.
> ...
>
> There are several aspects of D that are influenced by my experience as an aerospace engineer.
> ...
For better and worse, it seems. Reality check: D is advertised as a general-purpose language that allows you to be productive.
>
>> At some point you'll just have to accept that most use cases are not like this. Then you will maybe also figure out that it is not about what kind of person you are, but about what kind of external factors are relevant to your work. (Hint: I am not currently writing software for avionics.)
>
> It's the same situation if you write stock trading software. You might not die if it goes haywire, but you certainly could go bankrupt.
> ...
Another use case that is not relevant to me now.
> There's also the situation of minimizing the risk of malware injection. That could certainly ruin your whole week.
> ...
Yes, right, because being unable to fix unexplained segfaults is such a great way to avoid malware injection.
Ideally you don't have to run the software again until the bug is fixed. This is not practical if you cannot know what went wrong.
Bonus points, introduce segfaults and invalid instruction errors on otherwise mostly benign bugs that are immediately detected, such as null pointer dereferences, so that people get used to seeing segfaults and are not alarmed once the program starts segfaulting because some intruder is trying to run exploits.
Incentivize people to write overly broad and overcomplicated signal handlers, that will certainly help with security.
>
>> And BTW, it appears an ESA mars mission failed partly because an acceleration sensor actively refused to operate for an extended amount of time after acceleration went out of the range it was rated for for a small amount of time. It did so by sticking to one of the ends of the rated range, making the probe compute that it was underground.
>>
>> This demonstrates that your tools thinking they know better than you how to react to an error condition is also fatal in "critical" applications.
>
> The anecdote only demonstrates that the design had no backup plan for a failed sensor.
> ...
AFAIU one issue was that the engineers did not know the sensor would behave in this stupid fashion by default to indicate failure. Anyway, clearly this is not the only thing that went wrong, but it certainly helped the mission fail.
> Here's another: the 737MAX MCAS system kept functioning despite receiving bad data from the AOA sensor, and moved the flight controls far outside of the envelope.
>
> There was another incident long ago where the autopilot decided to turn the airplane upside down. That was fun for the crew and passenger.
>
> And another where the stabilizer jammed. The pilot, rather than leaving the jammed thing alone and doing an emergency landing, decided he would keep trying to unjam it. He eventually succeeded so well the nut broke off the end of the jackscrew and the stabilizer then broke free.
>
> Don't keep trying to work broken systems. They get more broken when you do that.
>
It seems hypocritical of you to know what went wrong in those circumstances. How was any information allowed to escape? /s
We are looking at a failure case of the language _right now_.
|