June 03, 2017
On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote:
> On 03.06.2017 08:55, Paolo Invernizzi wrote:
>> On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:
>> 
>>> It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
>> 
>> The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset...
>> 
>> I'm really really puzzled by why this topic pops up so often...
>> 
>> 
>> /Paolo
>
> I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.

That's what should be done for mission-critical software, and yet it seems we are relaxing the constraints of mission-critical development [1]

The point is that software, somehow, has to run with its bugs, or sometimes logic flaws: alas, buggy software is running right now [2]...

So, if you have to, you should restart 'not-so-critical' software, and you should write it with the expectation that it will be restarted from time to time.

Choosing the best moment to just restart it is a matter of opinion, and a judgement call between risks and opportunities.

My personal opinion: it should be stopped as soon as a bug is detected.

/Paolo

[1] http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry
[2] https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet
June 03, 2017
On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi wrote:
> It doesn't seem to me that the trend of trying to somehow handle the fact that something, somewhere, who knows when, has gone wild is coherent with the term "robustness".

That all depends. It makes perfect sense in a "strongly pure" function to just throw an exception for basically anything that went wrong in that function.

I use this strategy in other languages for writing validator functions; it is a very useful and time-saving way of writing validators. E.g.:

try {
    …
    validated_field = validate_input(unvalidated_input);
}
catch (…) {
    // catch anything: log, return a failure status, continue with the next request
    …
}

I don't really care why validate_input failed; even if it was a logic flaw in the "validate_input" code itself, it is perfectly fine to just respond to the exception, log the failure, return a failure status code, and continue with the next request.

The idea that programs can do provably full veracity checking of input isn't realistic in evolving code bases that need constant updates.

My "validate_input" only has to be correct for correct input. Whether it fails because the input is wrong or because the validation spec is wrong does not matter, as long as it fails.
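A minimal sketch of this pattern (in Python for illustration; `validate_input`, `handle_request`, and the status codes are hypothetical stand-ins, not anything from the thread's actual code):

```python
def validate_input(raw):
    # Hypothetical validator: returns normally only for input it accepts as legal.
    if not isinstance(raw, str) or not raw.isdigit():
        raise ValueError("not a positive integer literal")
    return int(raw)

def handle_request(raw):
    # Catch everything: a logic flaw inside the validator is treated the same
    # as bad input -- log it, fail this one request, keep serving the rest.
    try:
        validated_field = validate_input(raw)
    except Exception as exc:
        print("validation failed:", repr(exc))  # stand-in for real logging
        return 400  # failure status code; the server moves on to the next request
    return 200
```

The point of the broad `except` is exactly the one argued above: whether the failure came from bad input or from a flaw in the validator itself, the request fails either way, and no illegal input passes.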

> And the fact that the "nice tries" are done at runtime, in production, is the opposite of what I'm thinking is program verification.

Program verification requires a spec.

In the above example the spec could be that it should never allow illegal input to pass, but it could also make room for failing for some legal input.

"False alarm" is a concept that is allowed for in many real-world applications.

In this context it means that you throw too many exceptions, but that does not mean that you don't throw an exception when you should have.

> I'm trying to exactly do that, I like to think myself as a very pragmatic person...

What do you mean by "pragmatic"? Shutting down a B2B website because one insignificant request handler fails on some requests (e.g. requesting a help page) is not very pragmatic.

Pragmatic in this context would be to specify which handlers are critical and which ones are not.
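That split could be sketched like this (Python for illustration; the handler names and registry are hypothetical): handlers are registered with a criticality flag, and only failures in critical handlers escalate.

```python
def help_page(req):
    return "help text"

def checkout(req):
    return 1 / 0  # simulated logic flaw in a critical handler

HANDLERS = {
    # handler name -> (function, is_critical)
    "help": (help_page, False),
    "checkout": (checkout, True),
}

def dispatch(name, req):
    handler, critical = HANDLERS[name]
    try:
        return handler(req)
    except Exception:
        if critical:
            raise   # escalate: let the supervisor stop/restart the process
        return None  # non-critical: degrade gracefully and keep serving
```

A failing help page degrades to an error response; a failing checkout brings the process down, as a critical invariant may be compromised.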

June 03, 2017
Anyway, all of this boils down to the question of whether D really provides a safe programming environment.

If you only write safe code in a safe language, then it should be perfectly ok to trap and deal with a failed lookup, irrespective of what kind of data-structure it is.

So, if this isn't possible in D, then D isn't able to compete with other safe programming languages...

But then maybe one shouldn't try to sell it as a safe programming language either.

You can't really have it both ways.

June 03, 2017
On Saturday, 3 June 2017 at 10:47:36 UTC, Ola Fosheim Grøstad wrote:
> On Saturday, 3 June 2017 at 10:21:03 UTC, Paolo Invernizzi wrote:
>> It doesn't seem to me that the trend of trying to somehow handle the fact that something, somewhere, who knows when, has gone wild is coherent with the term "robustness".
>
> That all depends. It makes perfect sense in a "strongly pure" function to just throw an exception for basically anything that went wrong in that function.
>
> I use this strategy in other languages for writing validator functions; it is a very useful and time-saving way of writing validators. E.g.:
>
> try {
>     validated_field = validate_input(unvalidated_input);
> }
>
> I don't really care why validate_input failed; even if it was a logic flaw in the "validate_input" code itself, it is perfectly fine to just respond to the exception, log the failure, return a failure status code, and continue with the next request.
>
> The idea that programs can do provably full veracity checking of input isn't realistic in evolving code bases that need constant updates.

Sorry Ola, I can't support that way of working.

Don't take it the wrong way; Walter is doing a lot on @safe, but compilers are built from a codebase, and any codebase has bugs.

I can't approve of an "OK, do whatever you want in validate_input and I'll try to *safely* throw" approach.

IMHO you can only do that if the validator is totally segregated, and to me that means in a separate process, not merely in another thread.
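One way to get that kind of segregation can be sketched as follows (Python's `multiprocessing` for illustration; the function names are hypothetical): the validator runs in a child process, so even a crash or wild memory corruption there cannot take down or corrupt the parent.

```python
import multiprocessing

def _validate(raw, result):
    # Runs in a child process: even a hard crash here only kills the child.
    result.put(raw.isdigit())

def validate_isolated(raw, timeout=5):
    result = multiprocessing.Queue()
    child = multiprocessing.Process(target=_validate, args=(raw, result))
    child.start()
    child.join(timeout)
    try:
        # No answer (crash, hang, kill) is treated as "input not valid".
        return result.get(timeout=1)
    except Exception:
        return False
```

The fail-safe default is the important part: if the isolated validator dies for any reason, the input is rejected, never accepted.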

>> I'm trying to exactly do that, I like to think myself as a very pragmatic person...
>
> What do you mean by "pragmatic"? Shutting down a B2B website because one insignificant request handler fails on some requests (e.g. requesting a help page) is not very pragmatic.
>
> Pragmatic in this context would be to specify which handlers are critical and which ones are not.

To me, pragmatic means that the B2B website has to be organised in a way that minimises the impact if one of the processes handling the requests is restarted, whether for a bug or not. See Laeeth [1]. Just hand "insignificant requests" to a cheaper, less robust, less costly web stack.

But we were talking about a different topic...

/Paolo

[1] http://forum.dlang.org/post/uvhlxtolghfydydoxwfg@forum.dlang.org

June 03, 2017
On Saturday, 3 June 2017 at 11:18:16 UTC, Paolo Invernizzi wrote:
> Sorry Ola, I can't support that way of working.
>
> Don't take it the wrong way; Walter is doing a lot on @safe, but compilers are built from a codebase, and any codebase has bugs.
>
> I can't approve of an "OK, do whatever you want in validate_input and I'll try to *safely* throw" approach.

If the compiler is broken then anything could happen, at any time. So that merely suggests that you consider the current version to be of beta-quality.

Would you make the same argument for Python?


> IMHO you can only do that if the validator is totally segregated, and to me that means in a separate process, not merely in another thread.

Well, that would be very tedious.

The crux is:

The best way to write a good validator is to make the code in the validator as simple as possible, so that you can reduce the probability of making mistakes in the implementation of the spec.

If you have to add code for things like division-by-zero logic flaws or out-of-bounds checks, then you make it harder to catch mistakes in the validator and increase the probability of a much worse situation: letting illegal input pass.

So for a validator I want to focus my energy on writing simple, crystal-clear code that only allows legal input to pass. That makes the overall system robust, as long as the language is capable of trapping all the logic flaws and classifying them as a validation error.

So there is a trade-off here. What is more important?

1. Increasing the probability of correctly implementing the validation spec to keep the database consistent.

2. Reducing the chance of the unlikely event that the compiler or unsafe code could cause the validator to pass when it shouldn't.

If the programmer knows that the validator was written in this way, it also isn't a big deal to catch all Errors from it. Probabilistically speaking, the compiler being the cause here would be a highly unlikely event (power failure would be much more likely).

> To me, pragmatic means that the B2B website has to be organised in a way that minimises the impact if one of the processes handling the requests is restarted, whether for a bug or not. See Laeeth [1]. Just hand "insignificant requests" to a cheaper, less robust, less costly web stack.

Then we land on the conclusion that development and running cost would increase by choosing D over some of the competing alternatives for this particular use case.

That's ok.

June 03, 2017
On 03.06.2017 12:44, Paolo Invernizzi wrote:
> On Saturday, 3 June 2017 at 09:48:05 UTC, Timon Gehr wrote:
>> On 03.06.2017 08:55, Paolo Invernizzi wrote:
>>> On Friday, 2 June 2017 at 23:23:45 UTC, nohbdy wrote:
>>>
>>>> It's exacerbated because Walter is in a mindset of writing mission-critical applications where any detectable bug means you need to restart the program. Honestly, if I were writing flight control systems for Airbus, I could modify druntime to raise SIGABRT or call exit(3) when you try to throw an Error. It would be easy, and it would be worthwhile. If you really need cleanup, atexit(3) is available.
>>>
>>> The worst thing that has happened in programming in the last 30 years is that fewer and fewer programmers are adopting Walter's mindset...
>>>
>>> I'm really really puzzled by why this topic pops up so often...
>>>
>>>
>>> /Paolo
>>
>> I don't get why you would /restart/ mission-critical software that has been shown to be buggy. What you need to do instead: Have a few more development teams that create independent implementations of your service. (Completely from scratch, as the available libraries were not developed to the necessary standard.) All of them should run on different hardware produced in different factories by different companies. Furthermore, you need to hire a team of testers and software verification experts vastly exceeding the team of developers in magnitude, etc.
> 
> That's what should be done for mission-critical software, and yet it seems we are relaxing the constraints of mission-critical development [1]
> ...

That document says that the crash was caused by a component going down after an unexpected condition instead of just continuing to operate normally. (Admittedly this is biased reporting, but it is true.)

> The point is that software, somehow, has to run with its bugs, or sometimes logic flaws: alas, buggy software is running right now [2]...
> ...

I.e., a detected bug is not always a sufficient reason to bring down the entire system.

> So, if you have to, you should restart 'not-so-critical' software, and you should write it with the expectation that it will be restarted from time to time.
> ...

I agree. What I don't agree with is the idea that the programmer should have no way to figure out which component failed and to stop or restart only that component when that is the most sensible thing to do under the given circumstances. Ideally, the Mars mission shouldn't need to be restarted just because there is a bug in one component of the probe.

> Choosing the best moment to just restart it is a matter of opinion, and a judgement call between risks and opportunities.
> ...

I.e., the language shouldn't mandate it to be one way or the other.

> My personal opinion: it should be stopped as soon as a bug is detected.
> ...

Which is the right thing to do often enough.

> /Paolo
> 
> [1] http://exploration.esa.int/mars/59176-exomars-2016-schiaparelli-anomaly-inquiry 
> 
> [2] https://motherboard.vice.com/en_us/article/the-f-35s-software-is-so-buggy-it-might-ground-the-whole-fleet 
> 

June 03, 2017
On Wednesday, 31 May 2017 at 21:00:43 UTC, Steven Schveighoffer wrote:
>
> That is my conclusion too. Is your library in a usable state? Perhaps we should not repeat efforts, though I wasn't planning on making a robust public library for it :)

After some consideration you can now find the (dynamic) array implementation here[1].
With regard to (usage) errors: The data structures in libds allow passing an optional function `attest` via the template parameter `Hook` (DbI). `attest` is passed the data structure (by ref, for logging purposes) and a boolean value, and must only return successfully if the value is true; if it is false, `attest` must throw something (e.g. an Exception) or terminate the process.
An example of how to use it is here[2].
If no `attest` is passed, the data structures default to throwing an AssertError.

[1] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/linear/array/dynamic.d
[2] https://github.com/Calrama/libds/blob/fbceda333dbf76697050faeb6e25dbfcc9e3fbc0/src/ds/tree/heap/binary.d#L381
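The hook idea described above can be sketched generically (Python for illustration; `AbortHook` and `DynamicArray` are hypothetical names only loosely modeled on libds's `attest`/`Hook` design, not its actual API):

```python
class AbortHook:
    # Default hook: treat a failed attestation as a programming error.
    @staticmethod
    def attest(container, ok):
        if not ok:
            raise AssertionError("invariant violated in {!r}".format(container))

class DynamicArray:
    def __init__(self, hook=AbortHook):
        self._items = []
        self._hook = hook

    def push(self, value):
        self._items.append(value)

    def pop(self):
        # The hook decides what a usage error means: throw, log, or abort.
        self._hook.attest(self, len(self._items) > 0)
        return self._items.pop()
```

The design point is that the data structure detects the contract violation but delegates the policy (throw an exception, kill the process, log and continue) to whoever instantiated it.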
June 04, 2017
On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote:
> Array bound accesses should be easy to intercept and have them just kill the current thread.

Ideally, the fiber as well.  Probably the real ideal for this sort of problem is to be as close as possible to Erlang, where errors bring down the particular task in progress, but not the application that spawned the task.

Incidentally, I wouldn't limit the area of concern here to array bound access issues.  This is more about the ability of _any_ error to propagate in applications of this nature, where you have many independent tasks being spawned in separate threads or (more often) fibers, and where you absolutely do not want an error in one task preventing you from being able to continue with others.
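That containment property can be sketched as follows (Python threads for illustration; D fibers or Erlang processes would be the real targets, and `task`/`run_all` are hypothetical names): one task fails, its error is recorded, and every other task still completes.

```python
import concurrent.futures

def task(n):
    # One deliberately buggy task among several independent ones.
    if n == 2:
        raise IndexError("simulated out-of-bounds access in task 2")
    return n * n

def run_all(inputs):
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(task, n): n for n in inputs}
        for fut, n in futures.items():
            try:
                results[n] = fut.result()
            except Exception as exc:
                # The failure is contained to this one task; the rest complete.
                results[n] = "failed: " + repr(exc)
    return results
```

Note the caveat being debated in this thread: this containment is only trustworthy to the extent that a failed task cannot have corrupted state shared with the surviving tasks.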
June 04, 2017
On 2017-06-04 20:15, Joseph Rushton Wakeling wrote:
> On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote:
>> Array bound accesses should be easy to intercept and have them just
>> kill the current thread.
>
> Ideally, fiber, as well.  Probably the real ideal for this sort of
> problem is to be able to be as close as possible to Erlang, where errors
> bring down the particular task in progress, but not the application that
> spawned the task.

Erlang has a share-nothing philosophy between processes (green processes), or tasks as you call them here. All allocations are process-local, which makes it easier to know that a failing process doesn't affect any other process.

-- 
/Jacob Carlborg
June 04, 2017
On Sunday, 4 June 2017 at 19:12:42 UTC, Jacob Carlborg wrote:
> On 2017-06-04 20:15, Joseph Rushton Wakeling wrote:
>> On Friday, 2 June 2017 at 15:19:29 UTC, Andrei Alexandrescu wrote:
>>> Array bound accesses should be easy to intercept and have them just
>>> kill the current thread.
>>
>> Ideally, the fiber as well.  Probably the real ideal for this sort of
>> problem is to be as close as possible to Erlang, where errors
>> bring down the particular task in progress, but not the application that
>> spawned the task.
>
> Erlang has a share-nothing philosophy between processes (green processes), or tasks as you call them here. All allocations are process-local, which makes it easier to know that a failing process doesn't affect any other process.

If I'm not mistaken, it also uses a VM, although a native-code compiler is available...
If a VM is involved, it's a different game...

/Paolo