June 01, 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> For example:
>
> int[3] arr;
> arr[3] = 5;
>
>
> Technically this is a programming error, and a bug. But memory hasn't actually been corrupted. The system properly stopped me from corrupting memory. But my reward is that even though this fiber threw an Error, and I get an error message in the log showing me the bug, the web server itself is now out of commission. No other pages can be served.

In this case it is fairly obvious where the bad index is coming from... but in general it is impossible to say.

So how much of your program is mad?

You need to reset to some safe / correct point to continue.

Which point?

It is impossible for the compiler to determine that.

Personally I would say the design fault is trying to build _everything_ into a single OS process.

The mechanism that is guaranteed, enforced by the hardware, to recover all resources and reset to a sane point is OS process exit.

ie. If you need "bug" tolerance, decompose your system into multiple processes. This actually has a large number of other benefits. (eg. Automagically concurrent)

Of course, you then need to encode some common sense in the harness... if something keeps on starting up and dying within a very short period of time.... stop restarting it.

Of course, this is just one (of many) ways that a program bug can screw up a system. For example it can start chewing way too many resources.

So your harness needs to be able to limit that.

And of course if you are going to decompose in processes, a process may spawn many more, so you need to shepherd all the subprocesses sanely.....

...and start the herd of processes in appropriate order, and shut them down appropriately....

Sounds like quite an intelligent harness...

Fortunately one exists and has really carefully thought through all these issues.

It's called systemd and works very well.
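
To make the harness requirements above concrete, here is a minimal sketch of a unit file. The service name, binary path, and every number in it are made up for illustration. It also assumes a reasonably recent systemd (the `StartLimit*` settings live in `[Unit]` since v230) and the unified cgroup hierarchy for `MemoryMax=`.

```ini
# Hypothetical unit: /etc/systemd/system/mywebapp.service
[Unit]
Description=Example web service (illustrative only)
# Stop restarting if it dies 5 times within 60 seconds
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
# Assumed install path of the binary
ExecStart=/usr/local/bin/mywebapp
# Restart on crash or non-zero exit, but not on clean shutdown
Restart=on-failure
RestartSec=2
# Resource ceilings so a runaway bug cannot eat the whole machine
MemoryMax=512M
TasksMax=64
```

Start-up ordering between several such services would be expressed with `After=` and `Requires=` in `[Unit]`; the right values depend entirely on the system being described.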

May 31, 2017
On Wednesday, May 31, 2017 23:20:54 Nick Sabalausky via Digitalmars-d wrote:
> On 05/31/2017 10:50 PM, Jonathan M Davis via Digitalmars-d wrote:
> > Yes, there may be cases where array indices are effectively coming from user input, and you're going to have to check them all rather than the code having been written in a way that guarantees that the indices are valid, and in those cases, wrapping an array to do the checks may make sense, but in the vast majority of programs, invalid indices should simply never happen - just like dereferencing a null pointer should simply never happen - and if it does happen, it's a bug.
>
> Yes, it's a bug. A *localized* bug. NOT RAMPANT MEMORY CORRUPTION.

Indexing an array with an invalid index is the same as violating any contract in D except that you get a RangeError instead of an AssertError, and the check is always in place in @safe code (even with -release) in order to avoid memory corruption. As soon as the contract is violated, the program is in an unknown state. Its logic is clearly wrong, and the assumptions that it's making may or may not be valid. So, continuing may or may not be safe.

Whether memory corruption is involved is irrelevant. The program violated the contract, so the runtime knows that the program is in an invalid state. The cause of that bug may or may not be localized, but it's a guarantee at that point that the program is wrong, so you can't rely on it doing the right thing.

Yes, we _could_ have made it so that the contract of indexing arrays in D was such that passing an invalid index was considered normal and then have it throw an Exception to indicate that bad input had been given. But that means that that code can no longer be nothrow (which does mean that it can't be optimized as well), and programs would then need to deal with the fact that indexing an array could throw and handle that case appropriately. For the vast majority of programs, most array indices do not come from user input, and thus it usually really doesn't make sense to treat passing an invalid index to an array as anything other than a bug. It's reasonable to expect the programmer to get it right and that if they don't, they'll find it during testing.

If you want to wrap indexing arrays so that you get an Exception, then fine. At that point, you're saying that it's not a program bug to be passed an invalid index, and you're writing your programs with the idea that they need to be able to handle and recover from such bad input. But that is not the contract that the language itself uses precisely because indexing an array with an invalid index is usually a bug and not bad program input, and in the case where the array index _does_ somehow come from user input, then the programmer can test it. But having the runtime throw an Exception for what is normally a program bug would harm the programs that actually got their indices right.
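
As a rough sketch of what such a wrapper could look like (the names `at` and `BadIndexException` are invented here, not anything from druntime or Phobos), a free function can do the bounds check itself and throw an Exception before the built-in check ever fires:

```d
import std.format : format;

/// Hypothetical exception type for indices that come from untrusted input.
class BadIndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

/// Returns a reference to arr[i], throwing a recoverable Exception
/// (rather than a RangeError) when i is out of bounds.
/// Because it can throw an Exception, this function cannot be nothrow.
ref T at(T)(T[] arr, size_t i)
{
    if (i >= arr.length)
        throw new BadIndexException(
            format("index %s is out of range for an array of length %s",
                   i, arr.length));
    return arr[i];
}

unittest
{
    import std.exception : assertThrown;

    int[] data = [10, 20, 30];
    assert(data.at(1) == 20);
    data.at(2) = 5;                 // the ref return allows assignment
    assert(data[2] == 5);
    assertThrown!BadIndexException(data.at(3));
}
```

Callers that use `at` are opting in to the "bad index is bad input" contract and have to catch the exception somewhere; plain `arr[i]` keeps the language's contract.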

- Jonathan M Davis

May 31, 2017
On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]
> Personally I would say the design fault is trying to build _everything_ into a single OS process.
> 
> The mechanism that is guaranteed, enforced by the hardware, to recover all resources and reset to a sane point is OS process exit.
> 
> ie. If you need "bug" tolerance, decompose your system into multiple processes. This actually has a large number of other benefits. (eg. Automagically concurrent)
[...]

Again, from an engineering standpoint, this is a tradeoff.

The self-containment of an OS-level process is good for keeping it from affecting other processes, but it comes at a cost.  In the case of vibe.d, while I can't speak for the design rationales because I'm not involved in its development, it does appear to me that fibres were chosen because of their very low context-switch cost and memory requirements.  If you were to turn the fibres into full-blown processes, that means incurring the cost of saving/restoring the full process context, because that's what it takes to achieve independence between processes. You need a bigger memory footprint because each process needs to have its own copy of data in order to ensure independence.

It may very well be that for your particular design, process independence is important, so this price may be well worth paying.

The fibre route chosen by vibe.d comes with the advantage of faster context switches and smaller memory footprint (and probably other perks as well), but the price you pay for that performance boost is that the fibres are not self-contained and isolated from each other.  So if one fibre goes awry, you can no longer guarantee that the other fibres aren't also compromised. Hence if you wish to guarantee safety in case of logic errors like out-of-bounds array accesses, you're forced to reset the entire process before you can be absolutely sure you're back in a sane state.

Which route to choose depends on the particulars of what you're trying to achieve, and how much / whether you're willing to pay the price to achieve what you want.


T

-- 
Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
June 01, 2017
On Thursday, 1 June 2017 at 06:11:43 UTC, H. S. Teoh wrote:
> On Thu, Jun 01, 2017 at 03:24:02AM +0000, John Carter via Digitalmars-d wrote: [...]
>> [...]
> [...]
>
> Again, from an engineering standpoint, this is a tradeoff.
>
> [...]

That's exactly the point: to use the right tool for the requirement of the job to be done.

/P
June 01, 2017
On Wednesday, 31 May 2017 at 21:57:04 UTC, Ali Çehreli wrote:
> of bounds but it's one special case. The actual reason for bounds checking is maintaining an invariant.

That's true, but that could be the case with file system exceptions too. Say, a file is supposed to be of length N, but you get an exception because you are reading past the file end. Same issue.

Should you then wipe the entire file system, because there appears to be a problem with a single file?

> In the case of array indexes, they are in complete control of the program, hence a bug when out of bounds. It's not possible to say "Bad index; let me try 42 less."

Well, it is possible that the bad indexing was because the input was empty and there was a mistake in the program.

One reasonable thing to do is to roll back for that particular input, log it as a problem, then continue processing other input.

Which is often better than shutting down the service, but it really is contextual.
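
A small sketch of that per-input rollback, under the assumption that the handler validates its input and reports problems as Exceptions (so nothing here catches an Error). The record format and the names `secondField` and `processAll` are made up for the example:

```d
import std.array : split;
import std.conv : to;
import std.exception : enforce;

/// Hypothetical handler: returns the second comma-separated field as an int.
/// An empty or malformed record is reported with an Exception instead of
/// being allowed to reach an out-of-bounds index.
int secondField(string record)
{
    auto fields = record.split(",");
    enforce(fields.length >= 2, "record has fewer than two fields");
    return fields[1].to!int;    // throws ConvException on non-numeric text
}

/// Processes every input. A failure discards only that one input,
/// which is recorded in `rejected`; the remaining inputs still run.
int[] processAll(string[] inputs, ref string[] rejected)
{
    int[] results;
    foreach (input; inputs)
    {
        try
        {
            results ~= secondField(input);
        }
        catch (Exception e)
        {
            rejected ~= input;  // stand-in for "log it as a problem"
        }
    }
    return results;
}

unittest
{
    string[] rejected;
    auto ok = processAll(["a,1", "", "b,x", "c,3"], rejected);
    assert(ok == [1, 3]);
    assert(rejected == ["", "b,x"]);
}
```

This only covers the failures the handler anticipates; an index bug that slips past the checks would still surface as a RangeError.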

The real question is, what is the probability of a mismatched index for your application being just an indexing problem. I think it is very high for most "safe" code.

So if D supports "safe" code well, then indexing issues will almost never be due to corruption.

If you only write "unsafe" code, then indexing issues are still most likely not caused by corruption, but the probability that they are is much higher.


June 01, 2017
On Thursday, 1 June 2017 at 01:05:42 UTC, Walter Bright wrote:
>
> This topic comes up regularly in this forum - the idea that a program that entered an unknown, undefined state is actually ok and can continue executing. Maybe that's fine on a system (such as a gaming console) where nobody cares if it goes off the deep end and it is not connected to the internet so it cannot propagate malware infections.

+1

Why are we discussing this topic again at all? Again?

Even with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever.

Actually, leaving checks in is imho perfectly valid for consumer software; if you don't, the next consumers will hit the issues that never got reported.
June 01, 2017
On Thursday, 1 June 2017 at 09:18:24 UTC, Guillaume Piolat wrote:
> Even with consumer software, you may want to crash immediately so that you actually get complaints from testers/buyers instead of having a silent, invisible bug that no one will report ever.

No. You don't want to crash immediately. In fact, you want to save and recover. Preferably without much work lost and without the user being bothered by it.

June 01, 2017
On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
> I have discovered an annoyance in using vibe.d instead of another web framework. Simple errors in indexing crash the entire application.

Since I wrote/run a bunch of websites/network services written in D, here's my experience/advice:

First, this is not something specific to array indexing, but an entire class of logic errors which are sometimes recoverable. Other examples are associative array indexing, division by zero, and out-of-memory errors resulting from underflows. All of these are due to bugs in the program, but could hypothetically be handled without compromising the integrity of the process.

My advice:

1. Let the program crash. Make sure it's restarted afterwards, either via a looping script, or a watchdog.

2. Make sure you are notified of the error. I don't mean just recorded in a log file somewhere, but set it up so you receive an email any time it happens, with the stack trace. I run all my D network services from a cronjob, which automatically sends output by email. If you have the stack trace, most of these bugs take only a few minutes to fix - at the very least, turning the error into an exception is a trivial modification if you don't have time for a full root cause analysis at that moment.

3. Design your program so that it can be terminated at any point without resulting in data corruption. I don't know if Vibe.d can satisfy this constraint, but e.g. the ae.net.http.server workflow is to build/send the entire response atomically, meaning that the Content-Length will always be populated. Wrap your database updates in transactions. Use the "write to temporary file then rename over the original file" pattern when updating files. Etc.
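
For the file-update pattern in point 3, a minimal sketch could look like the following. The helper name `atomicWrite` and the `.tmp` suffix are invented for the example. It assumes the temporary file sits on the same filesystem as the target, and it does not fsync, so it protects against the process dying but not necessarily against power loss.

```d
import std.file : exists, readText, remove, rename, write;

/// Replaces the file at `path` with `contents` so that a restarted
/// process sees either the complete old file or the complete new one.
void atomicWrite(string path, const(void)[] contents)
{
    string tmp = path ~ ".tmp";  // hypothetical naming scheme
    write(tmp, contents);        // a crash here leaves `path` untouched
    rename(tmp, path);           // swaps the finished file into place
}

unittest
{
    import std.file : tempDir;
    import std.path : buildPath;

    string path = buildPath(tempDir, "atomic_write_demo.txt");
    scope (exit)
        if (exists(path))
            remove(path);

    atomicWrite(path, "first");
    atomicWrite(path, "second");
    assert(readText(path) == "second");
    assert(!exists(path ~ ".tmp"));
}
```

If the process is killed between the two calls, the worst outcome is a stray `.tmp` file, which the next successful update overwrites.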

June 01, 2017
On Thursday, 1 June 2017 at 05:03:17 UTC, Jonathan M Davis wrote:
> Whether memory corruption is involved is irrelevant. The program violated the contract, so the runtime knows that the program is in an invalid state. The cause of that bug may or may not be localized, but it's a guarantee at that point that the program is wrong, so you can't rely on it doing the right thing.

Well, if you take this position then you should not only crash the program, but also delete the executable to prevent it from being run again.

Allowing the process to be restarted when you know that it contains logic errors breaks with the principles you are outlining.

> handle that case appropriately. For the vast majority of programs, most array indices do not come from user input, and thus it usually really doesn't make sense to treat passing an invalid index to an array as anything other than a bug. It's reasonable to expect the programmer to get it right and that if they don't, they'll find it during testing.

It is surprisingly common to forget to check for a field/file being empty in a service. So it makes a lot of sense to roll back for such errors and keep the service alive. In my experience this is the common scenario. And indexing an array is no different than asking for a key that doesn't exist in any other data-structure; arrays shouldn't be a special case. Does that mean that other ADTs also should throw Error and not Exception?

For instance, assume you have a chat-server and the supplied clients work fine. Then some guy decides to reverse engineer it and build his own client. You don't want that service to go down all the time. You want to shut out that specific client. You want to identify the client and block it.

June 01, 2017
[Service]
...

Restart=on-failure


On Wed, May 31, 2017 at 11:03 PM, Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On 5/31/17 4:53 PM, Kagamin wrote:
>
>> On Wednesday, 31 May 2017 at 13:04:52 UTC, Steven Schveighoffer wrote:
>>
>>> This seems like a large penalty for "almost" corrupting memory. No other web framework I've used crashes the entire web server for such a simple programming error.
>>>
>>
>> On windows you can set up service restart settings in case it crashes. Useful for services that crash regularly.
>>
>
> That *would* be a feature on Windows ;)
>
> No, this is Linux, so I'll have to research how to properly do it with systemd.
>
> -Steve
>