January 05, 2021
On Tuesday, 5 January 2021 at 22:01:08 UTC, Ola Fosheim Grøstad wrote:
> Also, I think this is better determined using whole program optimization; the chosen integer bit pattern used for propagating errors has performance implications. The most frequently thrown/tested value should be the one tested most on performance critical paths.

I messed that sentence up in editing :/...

The most frequently thrown/tested values on performance-critical paths should be represented with the bit pattern that is cheapest to test. (You can test for more than one value using a single bitwise AND, etc.)
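
As a minimal sketch of the single-AND idea in D: assume a hypothetical encoding in which each frequently tested error owns one bit. The names `Err`, `hotMask` and `isHot`, and the choice of which errors count as "hot", are made up for illustration; a compiler doing whole-program optimization might pick a different encoding.

```d
// Hypothetical encoding: each frequently tested error owns one bit,
// so any subset of them can be tested with a single AND.
enum Err : uint
{
    none         = 0,
    wouldBlock   = 1 << 0,   // assumed hot on the critical path
    endOfInput   = 1 << 1,   // assumed hot on the critical path
    fileNotFound = 1 << 2,   // assumed rare
}

// Mask covering the two hot values.
enum uint hotMask = Err.wouldBlock | Err.endOfInput;

// One AND plus one compare against zero tests for both hot values at once.
bool isHot(Err e) pure nothrow @nogc @safe
{
    return (e & hotMask) != 0;
}

void main()
{
    assert(isHot(Err.wouldBlock));
    assert(isHot(Err.endOfInput));
    assert(!isHot(Err.fileNotFound));
    assert(!isHot(Err.none));
}
```

With a dense 0..N numbering the same check would need one comparison per value (or a range check if the hot values happen to be adjacent).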

January 06, 2021
On Tuesday, 5 January 2021 at 21:46:46 UTC, H. S. Teoh wrote:
> 4) The universal error type contains two fields: a type field and a context field.
>
>     a) The type field is an ID unique to every thrown exception --
>     uniqueness can be guaranteed by making this a pointer to some static
>     global object that the compiler implicitly inserts per throw
>     statement, so it will be unique even across shared libraries. The
>     catch block can use this field to determine what the error was, or
>     it can just call some standard function to turn this into a string
>     message, print it and abort.

Why must it be unique? Doesn't it suffice to return the typeid here?

>
>     b) The context field contains exception-specific data that gives
>     more information about the nature of the specific instance of the
>     error that occurred, e.g., an integer value, or a pointer to a
>     string description or block of additional information about the
>     error (set by the thrower), or even a pointer to a
>     dynamically-allocated exception object if the user wishes to use
>     traditional polymorphic exceptions.

Okay, but in 99% of cases you need dynamically allocated objects, because most of the time the context is simply unknown.

But yes, in specific cases a simple error code suffices, but even then it would be better to know that an error code is returned instead of a runtime object. It annoys me to have to unbox the context pointer/value to find out whether it is an error code when all I want is an error code.
>
>     c) The universal error type is constrained to have trivial move
>     semantics, i.e., propagating it up the call stack is as simple as
>     blitting the bytes over. (Any object(s) it points to need not be
>     thus constrained, though.)
>
> The value semantics of the universal error type ensures that there is no overhead in propagating it up the call stack.  The universality of the universal error type allows it to represent errors of any kind without needing runtime polymorphism, thus eliminating the overhead the current exception implementation incurs.

So it seems the universal error type just tells me whether or not there is an error, and checking for it is just a bit test?

> The context field, however, still allows runtime polymorphism to be supported, should the user wish to.

Which will be required in most cases.

> The addition of the universal error type to return value is automated by the compiler, and the user need not worry about it.  The usual try/catch syntax can be built on top of it.
>
> Of course, this was proposed for C++, so a D implementation will probably be somewhat different.  But the underlying thrust is: exceptions become value types by default, thus eliminating most of the overhead associated with the current exception implementation.

I don't know exactly how this is implemented in D, but class objects are passed as simple pointers, and pointers are likewise value types.
Using value types by itself doesn't guarantee anything about performance: because the context field of an exception can be anything, you need some kind of boxing involving runtime polymorphism anyway.

>  Stack unwinding is replaced by normal function return mechanisms, which is much more optimizer-friendly.

I hear that all the time, but why is it true?

> This also lets us support exceptions in @nogc code.

Okay, this would be great as an option. However, if we insert the context pointer into a list, we may get a cyclicity problem.

> There is no need for a cascade of updates if you do it right. As I hinted at above, this enumeration does not have to be a literal enumeration from 0 to N; the only thing required is that it is unique *within the context of a running program*.  A pointer to a static global suffices to serve such a role: it is guaranteed to be unique in the program's address space, and it fits in a size_t.  The actual value may differ across different executions, but that's not a problem: any references to the ID from user code are resolved by the runtime dynamic linker -- as it already does for pointers to global objects.  This also takes care of any shared libraries or dynamically loaded .so's or DLLs.

What does unique mean, and why is it important? Type ids aren't unique to distinguish exceptions, and I don't know why we need this requirement.
The point in Rust or Java was to limit the number of error types a function call can receive, but this is exactly the point where idiomatic and productive development differ. Assumptions change, and there you are.


> I've said this before, that the complaints about the current exception handling mechanism are really an issue of how it's implemented, rather than the concept of exceptions itself.

Okay, I think this is definitely debatable.

>  If we implement Sutter's proposal, or something similar suitably adapted to D, it would eliminate the runtime overhead, solve the @nogc exceptions issue, and still support traditional polymorphic exception objects that some people still want.

If we care about neither the exception type nor the kind of message of an exception, do we have any runtime overhead other than unwinding?
I refer here to the kind of exception as an entity. Does a class object really require more runtime polymorphism than a tagged union?

The other point is how to unify the same frontend (try/catch) with different backends (nonlocal jumps + unwinding vs. value type errors implicitly in return types).
You can use Sutter's proposal in your whole project, but what about libraries expecting the other kind of error handling backend?
Do we provide an implicit conversion from one backend to another, either by turning an error object into an exception or vice versa?

January 06, 2021
Citing Herb Sutter:
>As noted in §1.1, preconditions, postconditions, and assertions are for identifying program bugs, they are never recoverable errors; violating them is always corruption, undefined behavior. Therefore they should never be reported via error reporting channels (regardless of whether exceptions, error codes, or another style is used). Instead, once we have contracts (expected in C++20), users should be taught to prefer expressing these as contracts, and we should consider using those also in the standard library.

Oh man, have you ever heard of non-determinism?
Why not just use compile time contracts and path dependent typing to solve those problems as well?
Because perfectionism is our enemy in productive development.
And terminating the whole program doesn't help either. This is exactly why we have error types or contexts: to know to what degree we are required to terminate, and this should hold even for contracts.
January 06, 2021
On Monday, 4 January 2021 at 15:39:50 UTC, ludo456 wrote:
> Listening to the first videoconference of DConf 2020, titled Destroy All Memory Corruption (https://www.youtube.com/watch?v=XQHAIglE9CU), Walter talks about not using exceptions any more in the future. He says something like "this is where languages are going" [towards not using exceptions any more].

I don't think exceptions are going anywhere. It might be that new libraries tend to avoid them (to work with @nothrow and @live), but there is no reason to banish them from the whole language - that would only result in huge breakage for limited benefit.

And I suspect Walter didn't mean all code, just the relatively low-level stuff that might want to use `@live`. Even if he did, the community will force him to reconsider.


January 06, 2021
On Wed, Jan 06, 2021 at 05:36:07PM +0000, sighoya via Digitalmars-d-learn wrote:
> On Tuesday, 5 January 2021 at 21:46:46 UTC, H. S. Teoh wrote:
> > 4) The universal error type contains two fields: a type field and a context field.
> > 
> >     a) The type field is an ID unique to every thrown exception --
> >     uniqueness can be guaranteed by making this a pointer to some
> >     static global object that the compiler implicitly inserts per
> >     throw statement, so it will be unique even across shared
> >     libraries. The catch block can use this field to determine what
> >     the error was, or it can just call some standard function to
> >     turn this into a string message, print it and abort.
> 
> Why must it be unique? Doesn't it suffice to return the typeid here?

It must be unique because different functions may return different sets of error codes. If these sets overlap, then once the error propagates up the call stack it becomes ambiguous which error it is.

Contrived example:

	enum FuncAError { fileNotFound = 1, ioError = 2 }
	enum FuncBError { outOfMem = 1, networkError = 2 }

	int funcA() { throw FuncAError.fileNotFound; }
	int funcB() { throw FuncBError.outOfMem; }

	void main() {
		try {
			funcA();
			funcB();
		} catch (Error e) {
			// cannot distinguish between FuncAError and
			// FuncBError
		}
	}

Using the typeid is no good because: (1) typeid in D is a gigantic historic hack containing cruft that even Walter doesn't fully understand; (2) when all you want is to return an integer return code, using typeid is overkill.
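
The uniqueness argument can be sketched in today's D by hand. This is only an illustration under assumptions: `ErrorTag`, `UniversalError`, `failA` and `failB` are hypothetical names, and the tags are written out manually here, whereas in the proposal the compiler would insert them.

```d
// Hypothetical stand-in for the per-error object the compiler would emit.
struct ErrorTag { string name; }

// One static global per error kind; only its address matters as an ID.
// __gshared puts them in global storage rather than thread-local storage.
__gshared ErrorTag fileNotFoundTag = ErrorTag("FuncAError.fileNotFound");
__gshared ErrorTag outOfMemTag     = ErrorTag("FuncBError.outOfMem");

// Hand-written version of the two-field universal error value.
struct UniversalError
{
    size_t type;     // address of an ErrorTag, unique within the process
    size_t context;  // error-specific payload
}

UniversalError failA() { return UniversalError(cast(size_t) &fileNotFoundTag, 1); }
UniversalError failB() { return UniversalError(cast(size_t) &outOfMemTag, 1); }

void main()
{
    auto a = failA();
    auto b = failB();

    // Both carry the same integer payload 1 ...
    assert(a.context == b.context);
    // ... but the type fields differ, so a handler can tell them apart.
    assert(a.type != b.type);
    assert(a.type == cast(size_t) &fileNotFoundTag);

    // The tag can also be mapped back to a printable name.
    assert((cast(ErrorTag*) a.type).name == "FuncAError.fileNotFound");
}
```

The overlapping values 1 and 1 from the two enums stay distinguishable because the discriminating information lives in the address, not in the payload.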


> >     b) The context field contains exception-specific data that gives
> >     more information about the nature of the specific instance of
> >     the error that occurred, e.g., an integer value, or a pointer to
> >     a string description or block of additional information about
> >     the error (set by the thrower), or even a pointer to a
> >     dynamically-allocated exception object if the user wishes to use
> >     traditional polymorphic exceptions.
> 
> Okay, but in 99% of cases you need dynamically allocated objects, because most of the time the context is simply unknown.

If the context is sufficiently represented in a pointer-sized integer, there is no need for allocation at all. E.g., if you're returning an integer error code.

If you're in @nogc code, you can point to a statically-allocated block that the throwing code updates with relevant information about the error, e.g., a struct that contains further details about the error.

If you're using traditional polymorphic exceptions, you already have to allocate anyway, so this does not add any overhead.


> But yes, in specific cases a simple error code suffices, but even then it would be better to know that an error code is returned instead of a runtime object. It annoys me to have to unbox the context pointer/value to find out whether it is an error code when all I want is an error code.

You don't need to box anything.  The unique type ID already tells you what type the context is, whether it's integer or pointer and what the type of the latter is.
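
A rough sketch of how a handler could read the context without boxing, written by hand in current D. Everything named here (`ErrorTag`, `UniversalError`, `ParseDetails`, `failIo`, `failParse`, `kindOf`) is a hypothetical illustration, not part of any existing library or of the proposal's wording.

```d
struct ErrorTag { string name; }

// Hypothetical IDs; each one fixes how the context field is to be read.
__gshared ErrorTag ioCodeTag     = ErrorTag("io");     // context: int code
__gshared ErrorTag parseErrorTag = ErrorTag("parse");  // context: ParseDetails*

struct ParseDetails { size_t line; size_t column; }

// Statically allocated block the thrower fills in; no GC allocation needed.
__gshared ParseDetails lastParseError;

struct UniversalError { size_t type; size_t context; }

UniversalError failIo(int code) @nogc nothrow
{
    // A small non-negative code fits directly into the context field.
    return UniversalError(cast(size_t) &ioCodeTag, cast(size_t) code);
}

UniversalError failParse(size_t line, size_t column) @nogc nothrow
{
    lastParseError = ParseDetails(line, column);
    return UniversalError(cast(size_t) &parseErrorTag,
                          cast(size_t) &lastParseError);
}

// The "catch site": the type field alone says how to interpret context.
string kindOf(UniversalError e) @nogc nothrow
{
    if (e.type == cast(size_t) &ioCodeTag)     return "integer code";
    if (e.type == cast(size_t) &parseErrorTag) return "pointer to ParseDetails";
    return "unknown";
}

void main()
{
    auto e1 = failIo(5);
    assert(kindOf(e1) == "integer code");
    assert(cast(int) e1.context == 5);

    auto e2 = failParse(12, 7);
    assert(kindOf(e2) == "pointer to ParseDetails");
    auto d = cast(ParseDetails*) e2.context;
    assert(d.line == 12 && d.column == 7);
}
```

Note the limitation of the single static block in this sketch: a second parse error overwrites the details of the first, so it only works while at most one such error is in flight.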


> >     c) The universal error type is constrained to have trivial move
> >     semantics, i.e., propagating it up the call stack is as simple
> >     as blitting the bytes over. (Any object(s) it points to need not
> >     be thus constrained, though.)
> > 
> > The value semantics of the universal error type ensures that there is no overhead in propagating it up the call stack.  The universality of the universal error type allows it to represent errors of any kind without needing runtime polymorphism, thus eliminating the overhead the current exception implementation incurs.
> 
> So it seems the universal error type just tells me whether or not there is an error, and checking for it is just a bit test?

No, it's a struct that represents the error. Basically:

	struct Error {
		size_t type;
		size_t context;
	}

When you `throw` something, this is what is returned from the function. To propagate it, you just return it, using the usual function return mechanisms.  It's "zero-cost" because the cost is exactly the same as normal returns from a function.


> > The context field, however, still allows runtime polymorphism to be supported, should the user wish to.
> 
> Which will be required in most cases.

Only if you want to use traditional dynamically-allocated exceptions. If you only need error codes, no polymorphism is needed.


[...]
> > Of course, this was proposed for C++, so a D implementation will probably be somewhat different.  But the underlying thrust is: exceptions become value types by default, thus eliminating most of the overhead associated with the current exception implementation.
> 
> I don't know exactly how this is implemented in D, but class objects are passed as simple pointers, and pointers are likewise value types. Using value types by itself doesn't guarantee anything about performance: because the context field of an exception can be anything, you need some kind of boxing involving runtime polymorphism anyway.

You don't need boxing for POD types. Just store the value directly in Error.context.


> >  Stack unwinding is replaced by normal function return mechanisms,
> >  which is much more optimizer-friendly.
> 
> I hear that all the time, but why is it true?

The traditional implementation of stack unwinding bypasses normal function return mechanisms.  It's basically a glorified longjmp() to the catch block, augmented with the automatic destruction of any objects that might need destruction on the way up the call stack.

Turns out, the latter is not quite so simple in practice.  In order to properly destroy objects on the way up to the catch block, you need to store information about what to destroy somewhere.  You also need to know where the catch blocks are so that you know where to land. Once you land, you need to know how to match the exception type to what the catch block expects, etc. To implement this, every function needs to set up standard stack frames so that libunwind knows how to unwind the stack. It also requires exception tables, an LSDA (language-specific data area) for each function, personality functions, etc.  A whole bunch of heavy machinery just to get things to work properly.

By contrast, by returning a POD type like the example Error above, none of the above is necessary. All that's required is:

1) A small ABI addition for an error indicator per function call (to a throwing function). This can either be a single CPU register, or probably better, a 1-bit CPU flag that's either set or cleared by the called function.

2) The addition of a branch in the caller to check this error indicator: if there's no error, continue as usual; if there's an error, propagate it (return it) or branch to the catch block.

The catch block then checks the Error.type field to discriminate between errors if it needs to -- if not, just bail out with a standard error message. If it's catching a specific exception, which will be a unique Error.type value, then it already knows at compile-time how to interpret Error.context, so it can take whatever corresponding action is necessary.

None of the heavy machinery would be needed.
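
To make the two steps concrete, here is a hand-written approximation in current D of what the compiler might generate. It is a sketch under assumptions: the `failed` field stands in for the CPU flag or register, and `Result`, `ok`, `fail`, `divide`, `average` and `averageOrZero` are hypothetical names; a real implementation would live in the ABI, not in user code.

```d
struct UniversalError { size_t type; size_t context; }

// Stand-in for the proposed ABI: `failed` plays the role of the error
// indicator, and the union holds either the normal value or the error.
struct Result(T)
{
    bool failed;
    union
    {
        T value;
        UniversalError error;
    }
}

Result!T ok(T)(T v)
{
    Result!T r;
    r.failed = false;
    r.value = v;
    return r;
}

Result!T fail(T)(UniversalError e)
{
    Result!T r;
    r.failed = true;
    r.error = e;
    return r;
}

__gshared ubyte divByZeroTag;   // hypothetical unique ID (its address)

// What a function containing `throw` might be lowered to.
Result!int divide(int a, int b)
{
    if (b == 0)
        return fail!int(UniversalError(cast(size_t) &divByZeroTag, 0));
    return ok(a / b);
}

// A caller without a catch block: test the indicator, propagate by return.
Result!int average(int sum, int count)
{
    auto r = divide(sum, count);
    if (r.failed)
        return r;           // error: just hand it to our own caller
    return ok(r.value);     // no error: continue as usual
}

// A caller with a "catch block": branch on the indicator, then on type.
int averageOrZero(int sum, int count)
{
    auto r = average(sum, count);
    if (!r.failed)
        return r.value;
    if (r.error.type == cast(size_t) &divByZeroTag)
        return 0;
    assert(0, "unhandled error");
}

void main()
{
    assert(averageOrZero(10, 2) == 5);
    assert(averageOrZero(10, 0) == 0);
}
```

The only per-call overhead in this sketch is the `if (r.failed)` branch; no tables, personality functions, or unwinder are involved.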


> > This also lets us support exceptions in @nogc code.
> 
> Okay, this would be great as an option. However, if we insert the context pointer into a list, we may get a cyclicity problem.

Why would you want to insert it into a list?  The context field is a type-erased pointer-sized value. It may not even be a pointer.


[...]
> > If we implement Sutter's proposal, or something similar suitably adapted to D, it would eliminate the runtime overhead, solve the @nogc exceptions issue, and still support traditional polymorphic exception objects that some people still want.
> 
> If we care about neither the exception type nor the kind of message of an
> exception, do we have any runtime overhead other than unwinding?
> I refer here to the kind of exception as an entity. Does a class object
> really require more runtime polymorphism than a tagged union?

It's not about class vs. non-class (though Error being a struct rather than a class is important for @nogc support). It's about how exception throwing is handled.  The current stack unwinding implementation is too heavyweight for what it does; we want it replaced with something simpler and more pay-as-you-go.


> The other point is how to unify the same frontend (try catch) with
> different backends (nonlocal jumps+unwinding vs value type errors
> implicitly in return types).

That's the whole point of Sutter's proposal: they are all unified with the universal Error struct.  There is only one "backend": normal function return values, augmented as a tagged union to distinguish between normal return and error return.  We are throwing out nonlocal jumps in favor of normal function return mechanisms.  We are throwing out libunwind and all the heavy machinery it entails.

This is about *replacing* the entire exception handling mechanism, not adding another alternative (which would make things even more complicated and heavyweight for no good reason).


> You can use Sutter's proposal in your whole project, but what about libraries expecting the other kind of error handling backend?

We will not support a different "backend".  Having more than one exception-handling mechanism just over-complicates things with no real benefit.


> Do we provide an implicit conversion from one backend to another, either by turning an error object into an exception or vice versa?

No.  Except perhaps for C++ interop, in which case we can confine the heavy machinery to the C++/D boundary. Internally, all D code will use the Sutter mechanism.


T

-- 
There are four kinds of lies: lies, damn lies, and statistics.
January 06, 2021
On Wednesday, 6 January 2021 at 21:27:59 UTC, H. S. Teoh wrote:
> It must be unique because different functions may return different sets of error codes. If these sets overlap, then once the error propagates up the call stack it becomes ambiguous which error it is.

I don't think this is the case. If you analyse the full program then you know the functions that interact. All you need to do is dataflow analysis.

I also don't think there should be a specific error code; I think that should be left implementation-defined. The program should just specify a set of errors. Then it is up to the compiler whether, for a given call, that set can be represented using some free bits in another return value, as a null pointer, or whatever.

If speed is what is sought, well, then design for it. :-)

January 07, 2021
On Wednesday, 6 January 2021 at 21:27:59 UTC, H. S. Teoh wrote:
> It must be unique because different functions may return different sets of error codes. If these sets overlap, then once the error propagates up the call stack it becomes ambiguous which error it is.
>
> Contrived example:
>
> 	enum FuncAError { fileNotFound = 1, ioError = 2 }
> 	enum FuncBError { outOfMem = 1, networkError = 2 }
>
> 	int funcA() { throw FuncAError.fileNotFound; }
> 	int funcB() { throw FuncBError.outOfMem; }
>
> 	void main() {
> 		try {
> 			funcA();
> 			funcB();
> 		} catch (Error e) {
> 			// cannot distinguish between FuncAError and
> 			// FuncBError
> 		}
> 	}
>

Thanks, this reminds me of Swift error types, which are enum cases.
So is the type the pointer to the enum, or something which describes the enum uniquely, with the context being the enum value? Or does the context describe where to find the enum value in the statically allocated object?

> Using the typeid is no good because: (1) typeid in D is a

Sorry, I misspoke: I meant the internal id that the compiler turns a type into, not the RTTI structure of a type at runtime.

> If you're in @nogc code, you can point to a statically-allocated block that the throwing code updates with relevant information about the error, e.g., a struct that contains further details about the error

But the amount of information for an error can't be statically known. So we can't pre-allocate it via a statically allocated block, we need some kind of runtime polymorphism here to know all the fields considered.

> You don't need to box anything.  The unique type ID already tells you what type the context is, whether it's integer or pointer and what the type of the latter is.

The question is how a type id, as an integer value, can do that. Is there a mask to retrieve this kind of information from the type id field, e.g. the first three bits say something about the context data type, or do we use some kind of log2(n) hashing of the typeid to retrieve that kind of information?


> When you `throw` something, this is what is returned from the function. To propagate it, you just return it, using the usual function return mechanisms.  It's "zero-cost" because the cost is exactly the same as normal returns from a function.

Except that a bit check is required after each call. That is negligible for a few function calls, but it adds up rapidly across a heavily modularized program.
Further, the space for the return value in the caller needs to be widened in some cases.

> Only if you want to use traditional dynamically-allocated exceptions. If you only need error codes, no polymorphism is needed.

Checking the bit flag is runtime polymorphism, checking the type field against the catches is runtime polymorphism, checking what the typeid tells about the context type is runtime polymorphism. Checking the type of information behind the context pointer in the case of non-error-codes is runtime polymorphism.
The only difference is that it is coded at a somewhat lower level and is a bit more compact than a class object.
What if we use structs for exceptions where the first field is the type and the second field is the string message pointer or error code?


> The traditional implementation of stack unwinding bypasses normal function return mechanisms.  It's basically a glorified longjmp() to the catch block, augmented with the automatic destruction of any objects that might need destruction on the way up the call stack.

It depends. There are two ways I know of: either jumping, or decrementing the stack pointer and reading out the information in the exception tables.



> Turns out, the latter is not quite so simple in practice.  In order to properly destroy objects on the way up to the catch block, you need to store information about what to destroy somewhere.

I can't imagine why this is different in your case; this is generally the problem of exception handling, independent of the underlying mechanism. Once the pointer to the first landing pad is known, the control flow continues as before until the next error is thrown.

> You also need to know where the catch blocks are so that you know where to land. Once you land, you need to know how to match the exception type to what the catch block expects, etc. To implement this, every function needs to set up standard stack frames so that libunwind knows how to unwind the stack.

Touché, that's better in case of error returns.

> It also requires exception tables, an LSDA (language-specific data area) for each function, personality functions, etc.  A whole bunch of heavy machinery just to get things to work properly.


> Why would you want to insert it into a list?  The context field is a type-erased pointer-sized value. It may not even be a pointer.
>

Good point. I don't know if anyone tries to gather errors in an intermediate list which is passed to certain handlers. Sometimes exceptions are used as control flow elements, though that isn't good practice.


> It's not about class vs. non-class (though Error being a struct rather than a class is important for @nogc support). It's about how exception throwing is handled.  The current stack unwinding implementation is too heavyweight for what it does; we want it replaced with something simpler and more pay-as-you-go.

I agree that fast exceptions are worthwhile as an opt-in for certain areas, but I don't want them to replace non-fast exceptions because of the runtime impact on normally running code.

> That's the whole point of Sutter's proposal: they are all unified with the universal Error struct.  There is only one "backend": normal function return values, augmented as a tagged union to distinguish between normal return and error return.  We are throwing out nonlocal jumps in favor of normal function return mechanisms.  We are throwing out libunwind and all the heavy machinery it entails.
>
> This is about *replacing* the entire exception handling mechanism, not adding another alternative (which would make things even more complicated and heavyweight for no good reason).

Oh no, please don't. Interestingly, we don't use longjmp in default exception handling, but that would be a good alternative to Herb Sutter’s proposal, because exceptions are likewise faster but likewise have an impact on normally running code whenever a new landing pad has to be registered.
But interestingly, this happens much more rarely than checking the return value after each function call.



January 07, 2021
On 2021-01-06 22:27, H. S. Teoh wrote:

> That's the whole point of Sutter's proposal: they are all unified with
> the universal Error struct.  There is only one "backend": normal
> function return values, augmented as a tagged union to distinguish
> between normal return and error return.  We are throwing out nonlocal
> jumps in favor of normal function return mechanisms.  We are throwing
> out libunwind and all the heavy machinery it entails.
This is not what Sutter is proposing. He's proposing to add a new "backend", so you end up with three different types of functions (when it comes to error handling):

* Functions annotated with `throws`. This is the new "backend":

	void foo() throws;

* Functions annotated with `noexcept`. This indicates a function will not throw an exception (of the existing style):

	void foo() noexcept;

* Functions without annotation. This indicates a function that may or may not throw an exception (of the existing style):

	void foo();

From the proposal, paragraph 4.1.7:

"Compatibility: Dynamic exceptions and conditional noexcept still work. You can call a function that throws a dynamic exception from one that throws a static exception (and vice versa); each is translated to the other automatically by default or you can do it explicitly if you prefer."
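
What such a translation at the boundary could look like can be sketched by hand in current D. This is not Sutter's mechanism and not an existing D feature; `UniversalError`, `wrappedExceptionTag`, `parseDigit`, `parseDigitStatic` and `parseDigitDynamic` are hypothetical names chosen for the example.

```d
struct UniversalError { size_t type; size_t context; }

// Hypothetical ID meaning "context holds a reference to an Exception object".
__gshared ubyte wrappedExceptionTag;

// Existing-style function: throws a dynamically allocated exception.
int parseDigit(char c)
{
    if (c < '0' || c > '9')
        throw new Exception("not a digit");
    return c - '0';
}

// Dynamic -> static: catch at the boundary and store the object reference
// in the context field. Returns true on success.
bool parseDigitStatic(char c, out int value, out UniversalError err)
{
    try
    {
        value = parseDigit(c);
        return true;
    }
    catch (Exception e)
    {
        // Assumption: the error value stays on the stack, where D's GC
        // scans conservatively, so the exception object is kept alive.
        err = UniversalError(cast(size_t) &wrappedExceptionTag,
                             cast(size_t) cast(void*) e);
        return false;
    }
}

// Static -> dynamic: unwrap the stored object and rethrow it.
int parseDigitDynamic(char c)
{
    int value;
    UniversalError err;
    if (parseDigitStatic(c, value, err))
        return value;
    if (err.type == cast(size_t) &wrappedExceptionTag)
        throw cast(Exception) cast(void*) err.context;
    throw new Exception("unknown error value");
}

void main()
{
    assert(parseDigitDynamic('7') == 7);

    int v;
    UniversalError err;
    assert(!parseDigitStatic('x', v, err));
    assert(err.type == cast(size_t) &wrappedExceptionTag);

    bool caught = false;
    try
    {
        parseDigitDynamic('x');
    }
    catch (Exception e)
    {
        caught = (e.msg == "not a digit");
    }
    assert(caught);
}
```

The round trip preserves the original exception object, which is the property the quoted compatibility paragraph relies on.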

But perhaps you're proposing something different for D?

-- 
/Jacob Carlborg
January 07, 2021
On 2021-01-07 01:01, sighoya wrote:

> Thanks, this reminds me of Swift error types, which are enum cases.

Swift can throw anything that implements the Error protocol. Classes, structs and enums can implement protocols.

> Oh no, please don't. Interestingly, we don't use longjmp in default exception handling, but that would be a good alternative to Herb Sutter’s proposal,

Some platforms implement C++ exceptions using longjmp, for example, iOS.

> because exceptions are likewise faster but likewise have an impact on normally running code whenever a new landing pad has to be registered.
> But interestingly, this happens much more rarely than checking the return value after each function call.

It's claimed that exceptions are not zero cost, even when an exception is not thrown, because the compiler cannot optimize functions that may throw as well as those that cannot throw.

-- 
/Jacob Carlborg
January 07, 2021
On Thursday, 7 January 2021 at 10:36:39 UTC, Jacob Carlborg wrote:

> Swift can throw anything that implements the Error protocol. Classes, structs and enums can implement protocols.
>

True, Swift can throw anything that implements the Error protocol. It seems the Error protocol itself doesn't define any constraints on what an error has to look like.

I'm still contemplating whether this is a good idea. Maybe; I don't know yet.

> Some platforms implement C++ exceptions using longjmp, for example, iOS.

Interesting. I've heard some OSes don't support exception tables, so an alternative implementation has to be chosen.

> It's claimed that exceptions are not zero cost, even when an exception is not thrown, because the compiler cannot optimize functions that may throw as well as those that cannot throw.

Are you referring to the case where a pure function is inlined into the caller and the machinery of stack pointer decrementation doesn't work anymore?

You may be right about that. However, I think it can be transformed safely in case the source code is still available.

In the case of dynamic libraries, we could perhaps develop machinery to gather exception table information at compile time and manipulate it in order to inline safely, but I don't know how this would work in D.