Concept proposal: Safely catching error (page 2) - D Programming Language Discussion Forum

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by Olivier FAURE
in reply to Moritz Maxeiner

On Monday, 5 June 2017 at 12:59:11 UTC, Moritz Maxeiner wrote:
> On Monday, 5 June 2017 at 12:01:35 UTC, Olivier FAURE wrote:
>> Another problem is that non-gc memory allocated in the try block would be irreversibly leaked when an Error is thrown (though now that I think about it, that would probably count as impure and be impossible anyway).
>
> D considers allocating memory as pure[1].
>
> ...
>
> Sure, but with regards to long running processes that are supposed to handle tens of thousands of requests, leaking memory (and continuing to run) will likely eventually end up brutally shutting down the process on out of memory errors. But yes, that is something that would have to be evaluated on a case by case basis.

Note that in the case you describe, the choice is between "Brutally shut down right now" and "Throw away some data, potentially some memory as well, and maybe brutally shut down later if that happens too often". (Although in the second case, there is also the trade-off that the leaking program "steals" memory from the other routines running on the same computer.)

Anyway, I don't think this would happen. Most forms of memory allocations are impure, and wouldn't be allowed in a try {} catch(Error) block; C's malloc() is pure, but C's free() isn't, so the thrown Error wouldn't be skipping over any calls to free(). Memory allocated by the GC would be reclaimed once the Error is caught and the data thrown away.

>> Arrays aside, I think there's some use in being able to safely recover from (or safely shut down after) the kind of broken contracts that throw Errors.
>
> I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.

I half-agree. There *should not* be a way to say "Okay, the contract is broken, but let's keep going anyway".

There *should* be a way to say "okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going".

The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by Olivier FAURE
in reply to ketmar

On Monday, 5 June 2017 at 13:13:01 UTC, ketmar wrote:
> this still nullifies the sense of Error/Exception differences. not all errors are recoverable, even in @safe code.
>
> ...
>
> using wrappers and carefully checking preconditions looks better to me. after all, if programmer failed to check some preconditions, the worst thing to do is trying to hide that by masking errors. bombing out is *way* better, i believe, 'cause it forcing programmer to really fix the bugs instead of creating hackish workarounds.

I don't think this is a workaround, or that it goes against the purpose of Errors.

The goal would still be to bomb out, cancel whatever you were doing, print a big red error message to the coder / user, and exit.

A program that catches an Error would not try to use the data that broke a contract; in fact, the program would not have access to the invalid data, since it would be thrown away. Its natural progression would be to log the error, and quit whatever it was doing.

The point is, if the program needs to free system resources before shutting down, it could do so; or if the program is a server or a multi-threaded app dealing with multiple clients at the same time, those clients would not be affected by a crash unrelated to their data.

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by Moritz Maxeiner
in reply to Olivier FAURE

On Wednesday, 7 June 2017 at 15:35:56 UTC, Olivier FAURE wrote:
> On Monday, 5 June 2017 at 12:59:11 UTC, Moritz Maxeiner wrote:
>
> Anyway, I don't think this would happen. Most forms of memory allocations are impure,

Not how pure is currently defined in D, see the referred spec; allocating memory is considered pure (even if it is impure with the theoretical pure definition).
This is something that would need to be changed in the spec.
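
To illustrate the point, here is a minimal sketch (the function name `makeSquares` is made up for this example) of a function that allocates GC memory with `new` and is still accepted as `pure` by the compiler:

```d
// Hypothetical example: allocating with `new` inside a `pure` function.
int[] makeSquares(size_t n) pure @safe
{
    auto result = new int[n];      // GC allocation is permitted in pure code
    foreach (i, ref x; result)
        x = cast(int) (i * i);     // only touches the freshly allocated array
    return result;
}

unittest
{
    assert(makeSquares(4) == [0, 1, 4, 9]);
}
```

The allocation itself never has to be marked impure; what `pure` restricts here is access to mutable global state.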

>>
>> I consider there to be value in allowing users to say "this is not a contract, it is a valid use case" (-> wrapper), but a broken contract being recoverable violates the entire concept of DbC.
>
> There *should* be a way to say "okay, the contract is broken, let's get rid of all data associated with it, log an error message to explain what went wrong, then kill *the specific thread/process/task* and let the others keep going".
>
> The goal isn't to ignore or bypass Errors, it's to compartmentalize the damage.

The problem is that in current operating systems the finest scope/context of computation you can (safely) kill in order to compartmentalize the damage, while allowing the rest of the system to proceed, is a process (-> process isolation).
Anything finer than that (threads, fibers, etc.) may or may not work in a particular use case, but you can't guarantee/prove that it works in the majority of use cases (which is what the runtime would have to be able to do if we were to allow that behaviour as the default).
Compartmentalizing like this is your job as the programmer imho, not the job of the runtime.

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by Olivier FAURE
in reply to Steven Schveighoffer

On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer wrote:
>
> I don't think this will work. Only throwing Error makes a function nothrow. A nothrow function may not properly clean up the stack while unwinding. Not because the stack unwinding code skips over it, but because the compiler knows nothing can throw, and so doesn't include the cleanup code.

If the function is `pure`, then the only things it can set up will be stored on local or GC data, and it won't matter if they're not properly cleaned up, since they won't be accessible anymore.

I'm not 100% sure about that, though. Can a pure function do impure things in its scope(exit) / destructor code?

> Not to mention that only doing this for pure code eliminates usages that sparked the original discussion, as my code communicates with a database, and that wouldn't be allowed in pure code.

It would work for sending to a database; but you would need to use the functional programming idiom of "do 99% of the work in pure functions, then send the data to the remaining 1% for impure tasks".

A process's structure would be:
- Read the inputs from the socket (impure, no catching errors)
- Parse them and transform them into database requests (pure)
- Send the requests to the database (impure)
- Parse / analyse / whatever the results (pure)
- Send the results to the socket (impure)

And okay, yeah, that list isn't realistic. Using functional programming idioms in real life programs can be a pain in the ass, and lead to convoluted callback-based scaffolding and weird data structures that you need to pass around a bunch of functions that don't really need them.

The point is, you could isolate the pure data-manipulating parts of the program from the impure IO parts; and encapsulate the former in Error-catching blocks (which is convenient, since those parts are likely to be more convoluted and harder to foolproof than the IO parts, therefore likely to throw more Errors).

Then if an Error occurs, you can close the connection to the client (maybe send them an error packet beforehand), close the database file descriptor, log an error message, etc.
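
A rough sketch of that split, under several assumptions: `Request`, `parseRequest` and `handle` are names invented for this example, the "database request" is just a string, and the catch block lives in ordinary `@system` code because today's D does not allow catching an Error in `@safe` code. Whether the program state can be relied on after such a catch is exactly what this thread is debating, so treat it as an illustration of the idea rather than a recommended pattern.

```d
import core.exception : RangeError;

// Hypothetical request type, for illustration only.
struct Request
{
    string table;
    string key;
}

// Pure data-manipulating step. A malformed line (no space) runs off the
// end of the string and triggers a RangeError from the bounds check.
Request parseRequest(string line) pure @safe
{
    size_t sep = 0;
    while (line[sep] != ' ')
        ++sep;
    return Request(line[0 .. sep], line[sep + 1 .. $]);
}

// Impure wrapper: everything created inside the try block is discarded
// if an Error escapes from the pure step.
string handle(string line)
{
    try
    {
        auto req = parseRequest(line);
        return "GET " ~ req.table ~ "/" ~ req.key;
    }
    catch (RangeError e)
    {
        // Here one would log, close the connection, and so on.
        return "error: malformed request";
    }
}

unittest
{
    assert(handle("users 42") == "GET users/42");
    assert(handle("nospace") == "error: malformed request");
}
```

The sketch relies on bounds checks being enabled, which is why `parseRequest` is marked `@safe`: checks in `@safe` functions are kept unless they are switched off explicitly.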

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by ag0aep6g
in reply to Olivier FAURE

On 06/07/2017 05:19 PM, Olivier FAURE wrote:
>> How does `@trusted` fit into this? The premise is that there's a bug somewhere. You can't assume that the bug is in a `@system` function. It can just as well be in a `@trusted` one. And then `@safe` and `pure` mean nothing.

I think I mistyped there. Makes more sense this way: "You can't assume that the bug is in a **`@safe`** function. It can just as well be in a `@trusted` one."

> The point of this proposal is that catching Errors should be considered @safe under certain conditions; code that catches Errors properly would be considered as safe as any other code, which is, "as safe as the @trusted code it calls".

When no @trusted code is involved, then catching an out-of-bounds error from a @safe function is safe. No additional rules are needed. Assuming no compiler bugs, a @safe function simply cannot corrupt memory without calling @trusted code.

You gave the argument against catching out-of-bounds errors as: "it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted."

That line of reasoning applies to @trusted code. Only @trusted code can lose its trustworthiness. @safe code is guaranteed trustworthy (except for calls to @trusted code).

So the argument against catching out-of-bounds errors is that there might be misbehaving @trusted code. And for misbehaving @trusted code you can't tell the reach of the potential corruption by looking at the function signature.

> I think the issue of @trusted is tangential to this. If you (or the writer of a library you use) are using @trusted to cast away pureness and then have side effects, you're already risking data corruption and undefined behavior, catching Errors or no catching Errors.

It's not about intentional misuse of the @trusted attribute. @trusted functions must be safe.

The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.

June 07, 2017

Re: Concept proposal: Safely catching error

Posted by ag0aep6g
in reply to ag0aep6g

On 06/07/2017 09:45 PM, ag0aep6g wrote:
> When no @trusted code is involved, then catching an out-of-bounds error from a @safe function is safe. No additional rules are needed. Assuming no compiler bugs, a @safe function simply cannot corrupt memory without calling @trusted code.

Thinking a bit more about this, I'm not sure if it's entirely correct. Can a @safe language feature throw an Error *after* corrupting memory? For example, could `a[i] = n;` write the value first and do the bounds check afterwards? There's probably a better example, if this kind of "shoot first, ask questions later" style ever makes sense.

If bounds checking could be implemented like that, you wouldn't be able to ever catch the resulting error safely. Wouldn't matter if it comes from @safe or @trusted code. Purity wouldn't matter either, because an arbitrary write like that doesn't care about purity.
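
For contrast, this is the "ask questions first" order written out by hand. It is a conceptual sketch only: `storeChecked` and `tryStore` are hypothetical helpers, not how the compiler actually lowers `a[i] = n;`.

```d
import core.exception : RangeError;

// Hypothetical helper: check the index first, write second.
void storeChecked(int[] a, size_t i, int n) @safe pure
{
    if (i >= a.length)
        throw new RangeError();   // nothing has been written yet
    a[i] = n;
}

// Returns false when the store was rejected by the check.
bool tryStore(int[] a, size_t i, int n)
{
    try
    {
        storeChecked(a, i, n);
        return true;
    }
    catch (RangeError)
    {
        return false;
    }
}

unittest
{
    auto a = [1, 2, 3];
    assert(tryStore(a, 1, 9));
    assert(!tryStore(a, 3, 7));
    assert(a == [1, 9, 3]);   // the rejected store left the array untouched
}
```

With this order, a caught error cannot have been preceded by the out-of-bounds write it reports. With the reversed order, the write would already have happened by the time the error surfaces.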

June 08, 2017

Re: Concept proposal: Safely catching error

Posted by Olivier FAURE
in reply to ag0aep6g

On Wednesday, 7 June 2017 at 19:45:05 UTC, ag0aep6g wrote:
> You gave the argument against catching out-of-bounds errors as: "it means an invariant is broken, which means the code surrounding it probably makes invalid assumptions and shouldn't be trusted."
>
> That line of reasoning applies to @trusted code. Only @trusted code can lose its trustworthiness. @safe code is guaranteed trustworthy (except for calls to @trusted code).

To clarify, when I said "shouldn't be trusted", I meant in the general sense, not in the memory safety sense.

I think Jonathan M Davis put it nicely:

On Wednesday, 31 May 2017 at 23:51:30 UTC, Jonathan M Davis wrote:
> Honestly, once a memory corruption has occurred, all bets are off anyway. The core thing here is that the contract of indexing arrays was violated, which is a bug. If we're going to argue about whether it makes sense to change that contract, then we have to discuss the consequences of doing so, and I really don't see why whether a memory corruption has occurred previously is relevant. [...] In either case, the runtime has no way of determining the reason for the failure, and I don't see why passing a bad value to index an array is any more indicative of a memory corruption than passing an invalid day of the month to std.datetime's Date when constructing it is indicative of a memory corruption.

The sane way to protect against memory corruption is to write safe code, not code that *might* shut down brutally once memory corruption has already occurred. This is done by using @safe and proofreading all @trusted functions in your libs.

Contracts are made to preempt memory corruption, and to protect against *programming* errors; they're not recoverable because breaking a contract means that from now on the program is in a state that wasn't anticipated by the programmer.

Which means the only way to handle them gracefully is to cancel what you were doing and go back to the pre-contract-breaking state, then produce a big, detailed error message and then exit / remove the thread / etc.

>> I think the issue of @trusted is tangential to this. If you (or the writer of a library you use) are using @trusted to cast away pureness and then have side effects, you're already risking data corruption and undefined behavior, catching Errors or no catching Errors.
>
> The point is that an out-of-bounds error implies a bug somewhere. If the bug is in @safe code, it doesn't affect safety at all. There is no explosion. But if the bug is in @trusted code, you can't determine how large the explosion is by looking at the function signature.

I don't think there is much overlap between the problems that can be caused by faulty @trusted code and the problems that can be caught by Errors.

Note that this is not a philosophical problem. I'm making an empirical claim: "Catching Errors would not open programs to memory safety attacks or accidental memory safety blunders that would not otherwise happen".

For instance, if some poorly-written @trusted function causes the size of an int[10] slice to be registered as 20, then your program becomes vulnerable to buffer overflows when you iterate over it; the buffer overflow will not throw any Error.
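
A minimal sketch of that kind of bug, assuming a made-up helper called `overstateLength` (the storage is heap-allocated here rather than an `int[10]` on the stack, to keep the example small):

```d
// Hypothetical, deliberately buggy @trusted helper: slicing a raw pointer
// is not bounds-checked, so the slice can claim more elements than exist.
int[] overstateLength(int[] a) @trusted pure nothrow @nogc
{
    return a.ptr[0 .. a.length * 2];   // BUG: reports twice the real length
}

unittest
{
    auto storage = new int[10];
    int[] s = overstateLength(storage);
    assert(s.length == 20);
    // From here on, s[15] passes the bounds check (15 < 20) and no Error
    // is thrown, even though it lies outside the 10 elements requested.
    // The access is deliberately not performed in this sketch.
}
```

The bounds check only compares the index against the length stored in the slice, so once that length is wrong the check has nothing to catch.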

I'm not sure what the official stance is on this. As far as I'm aware, contracts and OOB checks are supposed to prevent memory corruption, not detect it. Any security based on detecting potential memory corruption can ultimately be bypassed by a hacker.

June 08, 2017

Re: Concept proposal: Safely catching error

Posted by Steven Schveighoffer
in reply to Olivier FAURE

On 6/7/17 12:20 PM, Olivier FAURE wrote:
> On Monday, 5 June 2017 at 14:05:27 UTC, Steven Schveighoffer wrote:
>>
>> I don't think this will work. Only throwing Error makes a function
>> nothrow. A nothrow function may not properly clean up the stack while
>> unwinding. Not because the stack unwinding code skips over it, but
>> because the compiler knows nothing can throw, and so doesn't include
>> the cleanup code.
>
> If the function is `pure`, then the only things it can set up will be
> stored on local or GC data, and it won't matter if they're not properly
> cleaned up, since they won't be accessible anymore.

Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.

>
> I'm not 100% sure about that, though. Can a pure function do impure
> things in its scope(exit) / destructor code?

Even if it does pure things, that can cause problems.

>> Not to mention that only doing this for pure code eliminates usages
>> that sparked the original discussion, as my code communicates with a
>> database, and that wouldn't be allowed in pure code.
>
> It would work for sending to a database; but you would need to use the
> functional programming idiom of "do 99% of the work in pure functions,
> then send the data to the remaining 1% for impure tasks".

Even this still pushes the handling of the error onto the user. I want vibe.d to handle the error, in case I create a bug. But vibe.d can't possibly know what database things I'm going to do.

And really this isn't possible. 99% of the work is using the database.

> A process's structure would be:
> - Read the inputs from the socket (impure, no catching errors)
> - Parse them and transform them into database requests (pure)
> - Send the requests to the database (impure)
> - Parse / analyse / whatever the results (pure)
> - Send the results to the socket (impure)
>
> And okay, yeah, that list isn't realistic. Using functional programming
> idioms in real life programs can be a pain in the ass, and lead to
> convoluted callback-based scaffolding and weird data structures that you
> need to pass around a bunch of functions that don't really need them.
>
> The point is, you could isolate the pure data-manipulating parts of the
> program from the impure IO parts; and encapsulate the former in
> Error-catching blocks (which is convenient, since those parts are likely
> to be more convoluted and harder to foolproof than the IO parts,
> therefore likely to throw more Errors).

Aside from the point that this still doesn't solve the problem (pure functions do cleanup too), this means a lot of headache for people who just want to write code. I'd much rather just write an array type and be done.

-Steve

June 08, 2017

Re: Concept proposal: Safely catching error

Posted by ag0aep6g
in reply to Olivier FAURE

On 06/08/2017 11:27 AM, Olivier FAURE wrote:
> Contracts are made to preempt memory corruption, and to protect against *programming* errors; they're not recoverable because breaking a contract means that from now on the program is in a state that wasn't anticipated by the programmer.
> 
> Which means the only way to handle them gracefully is to cancel what you were doing and go back to the pre-contract-breaking state, then produce a big, detailed error message and then exit / remove the thread / etc.

I might get the idea now. The throwing code could be in the middle of some unsafe operation when it throws the out-of-bounds error. It would have cleaned up after itself, but it can't because of the (unexpected) error.

Silly example:

----
void f(ref int* p) @trusted
{
    p = cast(int*) 13; /* corrupt stuff or partially initialize
        or whatever */
    int[] a; auto x = a[0]; /* trigger an out-of-bounds error */
    p = new int; /* would have cleaned up */
}
----

Catching the resulting error is @safe when you throw the int* away. So if f is `pure` and you make sure that the arguments don't survive the `try` block, you're good, because f supposedly cannot have reached anything else. This is your proposal, right?

I don't think that's sound. At least, it clashes with another relatively recent development:

https://dlang.org/phobos/core_memory.html#.pureMalloc

That's a wrapper around C's malloc. C's malloc might set the global errno, so it's impure. pureMalloc achieves purity by resetting errno to the value it had before the call.

So a `pure` function may mess with global state, as long as it cleans it up. But when it's interrupted (e.g. by an out-of-bounds error), it may leave globals in an invalid state. So you can't assume that a `pure` function upholds its purity when it throws an error.
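
The save-and-restore pattern looks roughly like the sketch below. This is not druntime's implementation: `mallocPreservingErrno` is a name made up for this example, and the function is left impure on purpose so that it compiles with plain `malloc`.

```d
import core.stdc.errno : errno;
import core.stdc.stdlib : free, malloc;

// Hypothetical wrapper showing the pattern: remember the global, call the
// C function that may clobber it, then put the old value back.
void* mallocPreservingErrno(size_t size) nothrow @nogc
{
    const saved = errno;      // save the global state
    void* p = malloc(size);   // may overwrite errno
    errno = saved;            // restore it before returning
    return p;
}

unittest
{
    errno = 5;
    void* p = mallocPreservingErrno(16);
    assert(errno == 5);       // callers see the same errno as before
    free(p);
}
```

Anything that interrupts the function between the save and the restore leaves the global with whatever value the C call wrote.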

In the end, an error indicates that something is wrong, and probably all guarantees may be compromised.

June 08, 2017

Re: Concept proposal: Safely catching error

Posted by Olivier FAURE
in reply to Steven Schveighoffer

On Thursday, 8 June 2017 at 12:20:19 UTC, Steven Schveighoffer wrote:
> Hm... if you locked an object that was passed in on the stack, for instance, there is no guarantee the object gets unlocked.
>

This wouldn't be allowed unless the object was duplicated / created inside the try block.

> Aside from the point that this still doesn't solve the problem (pure functions do cleanup too), this means a lot of headache for people who just want to write code. I'd much rather just write an array type and be done.
>
> -Steve

Fair enough. There are other advantages to writing with "create data with pure functions then process it" idioms (easier to do unit tests, better for parallelism, etc), though.

Copyright © 1999-2021 by the D Language Foundation