October 04, 2014
On 10/2/14 2:45 AM, Jacob Carlborg wrote:
> On 01/10/14 21:57, Steven Schveighoffer wrote:
>
>> ./progThatExpectsFilename ""
>>
>> -Steve
>
> It's the developer's responsibility to make sure a value like that never
> reaches the "File" constructor. That is, the developer of the
> "progThatExpectsFilename" application that uses "File". Not the
> developer of "File".

Then what is the point of File's constructor throwing an exception? This means File is checking the filename, and I also have to check the filename.

> Although, I don't see why you shouldn't be able to pass an empty string
> to "File". You'll just get an exception, "cannot open file ''".

Right, that is fine. If you catch the exception and handle the result with a nice message to the user, that is exactly what should happen.

If you forget to catch the exception, this is a bug, and the program should crash with an appropriate stack trace.
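
To make the handled case concrete, here is a minimal sketch. The helper name openOrReport and the message wording are made up for illustration; the only library facts assumed are that std.stdio.File's constructor throws ErrnoException when the open fails.

```d
import std.exception : ErrnoException;
import std.stdio : File, stderr;

// Hypothetical helper: turns a failed open into an exit code and a
// user-facing message instead of letting the exception escape.
int openOrReport(string name)
{
    try
    {
        auto f = File(name, "r"); // throws ErrnoException when the open fails
        // ... read from f here ...
        return 0;
    }
    catch (ErrnoException e)
    {
        // The application words its own message; the stack trace is not the UI.
        stderr.writefln("cannot open '%s': %s", name, e.msg);
        return 1;
    }
}

void main()
{
    // Simulates running: ./progThatExpectsFilename ""
    immutable code = openOrReport("");
    assert(code == 1);
}
```

Remove the try/catch and the same call becomes the "forgot to catch" case: the runtime prints the exception and the program terminates.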

Note 2 things:

1. You should NOT depend on the stack trace/Exception to be your error message.
2. File's ctor has NO IDEA whether throwing an exception is going to be a bug or a handled error condition.

I would say, as soon as an exception is thrown and is not caught by user code, for all intents and purposes, it becomes an Error.

-Steve
October 04, 2014
On 10/4/14 4:47 AM, Walter Bright wrote:
> On 9/29/2014 8:13 AM, Steven Schveighoffer wrote:
>> I can think of cases where it's programmer error, and cases where it's
>> user error.
>
> More carefully design the interfaces if programmer error and input error
> are conflated.
>

You mean more carefully design File's ctor? How so?

-Steve
October 04, 2014
On 04/10/14 11:18, Walter Bright via Digitalmars-d wrote:
> What you're doing is attempting to write a program with the requirement that the
> program cannot fail.
>
> It's impossible.

No, I'm attempting to discuss how to approach the problem that the program _can_ fail, and how to isolate that failure appropriately.

I'm asking for discussion of how to handle a use-case, not trying to advocate for particular solutions.

You seem to be convinced that I don't understand the principles you are advocating of isolation, backup, and so forth.  What I've been trying (but obviously failing) to communicate to you is, "OK, I agree on these principles, let's talk about how to achieve them in a practical sense with D."

> If that's your requirement, the system needs to be redesigned so that it can
> accommodate the failure of the program.
>
> (Ignoring bugs in the program is not accommodating failure, it's pretending that
> the program cannot fail.)

Indeed.

>> As I'm sure you realize, I also picked that particular use-case because it's one
>> where there is a well-known technological solution -- Erlang -- which has as a
>> key feature its ability to isolate different parts of the program, and to deal
>> with errors by bringing down the local process where the error occurred, rather
>> than the whole system.  This is an approach which is seriously battle-tested in
>> production.
>
> As I (and Brad) have stated before, process isolation, shutting down the failed
> process, and restarting the process, is acceptable, because processes are
> isolated from each other.
>
> Threads are not isolated from each other. They are not. Not. Not.

I will repeat what I said in my previous email: "Without assuming anything about how the system is architected".

I realize that in my earlier remark:

> However, it's clearly very desirable in this use-case for the application to keep going if at all possible and for any problem, even an Error, to be contained in its local context if we can do so.  (By "local context", in practice this probably means a thread or fiber or some other similar programming construct.)

... I probably conveyed the idea that I was seeking to contain Errors inside threads or fibers.  I was already anticipating that the answer here would be a definitive "You can't under any circumstances", which is why I wrote, "or some other similar programming construct", by which I was thinking of Erlang-style processes.

Actually, a large part of my reason for continuing this discussion is that, where high-connectivity server applications are concerned, I'm keen to ensure that their developers _avoid_ the dangerous solution that is, "Spawn lots of threads and fibers, and localize Errors by catching them and throwing away the thread rather than the application."

However, unless there is an alternative in a practical sense, that is probably what people are going to do, because the trade-offs of their use-case make it seem the least bad option.  I think that's a crying shame and that we can and should do better.

> The only way to have super high uptime is to design the system so that failure
> is isolated, and the failed process can be quickly restarted or replaced.
> Ignoring bugs is not isolation, and hoping that bugs in one thread don't
> affect memory shared by other threads doesn't work.

Right.  Which is why I'd like to move the discussion over to "How can we achieve this in D?"
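
As one possible starting point for that discussion, here is a rough sketch of the "restart the failed process" idea using separate OS processes. The function name supervise, the restart budget of 3 and the usage message are all assumptions made for illustration; spawnProcess and wait are the std.process calls.

```d
import std.process : spawnProcess, wait;
import std.stdio : stderr;

// Hypothetical restart policy: call runOnce until it reports success (0) or
// the restart budget is used up. Returns the last exit status seen.
int supervise(int delegate() runOnce, int maxRestarts)
{
    int status = runOnce();
    foreach (restart; 0 .. maxRestarts)
    {
        if (status == 0)
            break;
        stderr.writefln("worker exited with status %s, restarting (%s/%s)",
                        status, restart + 1, maxRestarts);
        status = runOnce();
    }
    return status;
}

int main(string[] args)
{
    if (args.length < 2)
    {
        stderr.writeln("usage: supervisor <worker-command> [args...]");
        return 1;
    }
    auto cmd = args[1 .. $];
    // Each run is a fresh OS process, so a crashed worker cannot corrupt
    // the supervisor's memory.
    return supervise(() => wait(spawnProcess(cmd)), 3);
}
```

This only gives OS-level isolation, with all the cost that implies; it says nothing yet about lightweight, Erlang-style processes inside one runtime.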
October 04, 2014
On Saturday, 4 October 2014 at 08:39:47 UTC, Walter Bright wrote:
> If someone writes non-robust software, D allows them to do that. However, I won't leave unchallenged attempts to pass such stuff off as robust.
>
> Nor will I accept such practices in Phobos, because, as this thread clearly shows, there are a lot of misunderstandings about what robust software is. Phobos needs to CLEARLY default towards solid, robust practice.

Would it help to clarify my intentions in this discussion if I said that, on this note, I entirely agree -- and nothing I have said in this discussion is intended to be an argument about how Phobos should be designed?
October 04, 2014
On Friday, 3 October 2014 at 19:05:51 UTC, Ola Fosheim Grøstad wrote:
> But it is a business decision whether it is better to take amazon.com off the network for a week or just let their search engine occasionally serve food instead of books as search results. Not an engineering decision.
>
> It is a business decision whether it is better for a game to corrupt 1% of user accounts and let customer support manually build them back up than to take the game off the network until the problem is fixed. You would probably have heavier load on customer support and lose more subscriptions by taking the game off the network than giving those 1% one year of free game play as a compensation.

The thing is, the privilege to make that kind of business decision is wholly dependent on the fact that there are no meaningful safety issues involved.

Compare that to the case of the Ford Pinto.  The allegation made was that Ford had preferred to risk paying out lawsuits to injured drivers over fixing a design flaw responsible for those (serious) injuries, because a cost-benefit analysis had shown the payouts were cheaper than rolling out the fix.  This allegation was rightly met with outrage, and severe punitive damages in court.
October 04, 2014
On 04/10/14 10:39, Walter Bright via Digitalmars-d wrote:
> If someone writes non-robust software, D allows them to do that. However, I
> won't leave unchallenged attempts to pass such stuff off as robust.
>
> Nor will I accept such practices in Phobos, because, as this thread clearly
> shows, there are a lot of misunderstandings about what robust software is.
> Phobos needs to CLEARLY default towards solid, robust practice.

A practical question that occurs to me here.

Suppose that I implement, in D, a framework creating Erlang-style processes (i.e. properly isolated, lightweight processes within a defined runtime environment, with an appropriate error-handling framework that allows those processes to be brought down and restarted without bringing down the entire application).

Is there any reasonable scope for accessing Phobos directly from programs written to operate within that runtime, or is it going to be necessary to wrap all of Phobos in order to ensure that it's accessed in a safe way (e.g. to ensure that the conditions required by in contracts are enforced before the call gets to Phobos, etc.)?
October 04, 2014
On Saturday, 4 October 2014 at 11:19:10 UTC, Joseph Rushton
Wakeling via Digitalmars-d wrote:
> On 04/10/14 11:18, Walter Bright via Digitalmars-d wrote:
>> What you're doing is attempting to write a program with the requirement that the
>> program cannot fail.
>> The only way to have super high uptime is to design the system so that failure
>> is isolated, and the failed process can be quickly restarted or replaced.
>> Ignoring bugs is not isolation, and hoping that bugs in one thread don't
>> affect memory shared by other threads doesn't work.
>
> Right.  Which is why I'd like to move the discussion over to "How can we achieve this in D?"

I see two things that are in the way (aside from the obvious
things like non-@safe code): Casting away shared, and implicitly
shared immutable data. The former can be checked statically, but
the latter is harder to work around in the current language.
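
A small sketch of both points, using std.concurrency. The worker functions are made up for illustration; the spawn behaviour shown (immutable arguments accepted, mutable thread-local references rejected at compile time) is what I would expect from its checks on argument types.

```d
import std.concurrency : Tid, receiveOnly, send, spawn, thisTid;

// Illustrative worker: sums the data and reports back to its owner.
void sumWorker(Tid owner, immutable(int)[] data)
{
    int sum = 0;
    foreach (x; data)
        sum += x;
    owner.send(sum);
}

// Illustrative worker taking mutable, thread-local data.
void mutableWorker(int[] data) {}

void main()
{
    immutable(int)[] table = [1, 2, 3];
    // Accepted: immutable data is implicitly shared between threads.
    spawn(&sumWorker, thisTid, table);
    assert(receiveOnly!int() == 6);

    int[] local = [1, 2, 3];
    // Rejected at compile time: a mutable thread-local reference.
    static assert(!__traits(compiles, spawn(&mutableWorker, local)));

    shared int counter;
    // Compiles in @system code: the cast silently drops the shared qualifier.
    int* p = cast(int*) &counter;
    *p = 1;
    assert(*p == 1);
}
```

The cast is at least visible in the source, so a tool can flag it. The immutable case needs no cast at all, which is what makes it hard to fence off.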
October 04, 2014
On Sat, Oct 04, 2014 at 02:40:28AM -0700, Walter Bright via Digitalmars-d wrote:
> On 10/4/2014 1:40 AM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
[...]
> >Anyway, failure should not be due to "asserts", that should be covered by program verification and formal proofs.
> 
> The assumption that "proof" means the code doesn't have bugs is charming, but still false.
[...]

"Beware -- I've only proven that the code is correct, not tested it." -- Donald Knuth.

:-)


T

-- 
It is not the employer who pays the wages. Employers only handle the money. It is the customer who pays the wages. -- Henry Ford
October 04, 2014
On 2014-10-04 12:29, Steven Schveighoffer wrote:

> Then what is the point of File's constructor throwing an exception? This
> means, File is checking the filename, and I have to also check the file
> name.

"File" should check if the file exists, can be opened and similar things. These are things that can change from outside your application during a function call between your application and the underlying system.

But if "File" for some reason doesn't accept null as a valid value, then that's something the developer of the application that uses "File" is responsible for checking. It's not like the value can suddenly change without you knowing it.
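
Roughly what I mean, as a sketch. The function requireExisting is hypothetical and is not how "File" is implemented; it only shows the split between a caller obligation (an in contract) and a condition of the outside world (an exception).

```d
import std.exception : assertThrown, enforce;
import std.file : exists, thisExePath;

// Hypothetical function, for illustration only.
string requireExisting(string path)
in
{
    // Caller's responsibility: this value cannot change behind your back.
    assert(path !is null, "caller bug: null path");
}
do
{
    // Environment's state: may change between any two calls, so it is
    // checked at run time and reported with an exception.
    enforce(exists(path), "cannot open file '" ~ path ~ "'");
    return path;
}

void main()
{
    // The running executable certainly exists.
    assert(requireExisting(thisExePath()) == thisExePath());
    // A missing file is an expected failure, not a bug.
    assertThrown!Exception(requireExisting("no-such-file-for-this-example"));
}
```

Note that the contract is compiled out in release builds while the enforce check stays, and that older compilers spell "do" as "body".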

> Right, that is fine. If you catch the exception and handle the result
> with a nice message to the user, that is exactly what should happen.
>
> If you forget to catch the exception, this is a bug, and the program
> should crash with an appropriate stack trace.

Yes, I agree.

> Note 2 things:
>
> 1. You should NOT depend on the stack trace/Exception to be your error
> message.

Agree.

> 2. File's ctor has NO IDEA whether throwing an exception is going to be
> a bug or a handled error condition.

Yes, but it's the responsibility of "File" to properly document what exceptions it can throw and under which conditions. Then it's up to the developer that uses "File" to handle these exceptions appropriately.

> I would say, as soon as an exception is thrown and is not caught by user
> code, for all intents and purposes, it becomes an Error.

Sure. In theory you can have another library that handles these exceptions. Think something like a web framework that turns most exceptions into 500 responses.
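
Something along these lines, as a sketch only. Response, Handler and dispatch are invented names, not taken from any real framework.

```d
import std.stdio : writeln;

// Hypothetical response type, for illustration only.
struct Response
{
    int status;
    string text;
}

alias Handler = Response function(string path);

// Hypothetical framework entry point: user code never sees the try/catch.
Response dispatch(Handler handler, string path)
{
    try
    {
        return handler(path);
    }
    catch (Exception e) // deliberately not Throwable: Errors still propagate
    {
        return Response(500, "Internal Server Error");
    }
}

Response ok(string path)
{
    return Response(200, path);
}

Response failing(string path)
{
    throw new Exception("cannot open file '" ~ path ~ "'");
}

void main()
{
    assert(dispatch(&ok, "/index").status == 200);
    auto r = dispatch(&failing, "");
    assert(r.status == 500);
    writeln(r.status, " ", r.text);
}
```

Here the exception is uncaught by the application's own code but still handled, so it never turns into a crash.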

-- 
/Jacob Carlborg
October 04, 2014
On Saturday, 4 October 2014 at 08:15:51 UTC, Walter Bright wrote:
> On 10/3/2014 8:43 AM, Sean Kelly wrote:
>> My point, and I think Kagamin's as well, is that the entire plane is a system
>> and the redundant internals are subsystems.  They may not share memory, but they
>> are wired to the same sensors, servos, displays, etc.
>
> No, they do not share sensors, servos, etc.

Gotcha.  I imagine there are redundant displays in the cockpit as well, which makes sense.  Thus the unifying factor in an airplane is the pilot.  In an unmanned system, it would be a control program (or a series of redundant control programs).  So the system in this case includes the pilot.

>> Thus the point about shutting down the entire plane as a result of a small failure is fair.
>
> That's a complete misunderstanding.

Right.  So the system relies on the intelligence and training of the pilot for proper operation.  Choosing which systems are in error vs. which are correct, etc.  I still think an argument could be made that an entire airplane, pilot included, is analogous to a server infrastructure, or even a memory isolated program (the Erlang example).

My only point in all this is that while choosing the OS process is a good default when considering the potential scope of undefined behavior, it's not the only definition.  The pilot misinterpreting sensor data and making a bad judgement call is equivalent to the failure of distinct subsystems corrupting the state of the entire system to the point where the whole thing fails.  The sensors were communicating confusing information to the pilot, and his programming, as it were, was not up to the task of separating the good information from the bad.

Do you have any thoughts concerning my proposal in the "on errors" thread?