January 07, 2021
On Thu, Jan 07, 2021 at 12:01:23AM +0000, sighoya via Digitalmars-d-learn wrote:
> On Wednesday, 6 January 2021 at 21:27:59 UTC, H. S. Teoh wrote:
[...]
> > You don't need to box anything.  The unique type ID already tells you what type the context is, whether it's integer or pointer and what the type of the latter is.
> 
> The question is how a type ID stored as an integer value can do that. Is there a mask to retrieve this kind of information from the type ID field (e.g. the first three bits say something about the context data type), or is some kind of log2(n) hashing of the type ID used to retrieve it?

Your catch block either knows exactly what type value(s) it's looking for, or it's just a generic catch for all errors.

In the former case, you already know at compile-time how to interpret the context information, and can cast it directly to the correct type. (This can, of course, be implicitly inserted by the compiler.)

In the latter case, you don't actually care what the interpretation is, so it doesn't matter.  The most you might want to do in this case is to generate some string error message; this could be implemented in various ways. If the type field is a pointer to a static global, it could be a pointer to a function that takes the context argument and returns a string, for example. Of course, it can also be a pointer to a static global struct containing more information, if needed.


> > When you `throw` something, this is what is returned from the function.  To propagate it, you just return it, using the usual function return mechanisms.  It's "zero-cost" because the cost is exactly the same as a normal return from a function.
> 
> Except that a bit check after each call is required, which is negligible for a few function calls, but adds up rapidly across heavily modularized code.

But you already have to do that if you're checking error codes after the function call.  The traditional implementation of exceptions doesn't incur this particular overhead, but introduces (many!) others.

Optimizers are constrained, for example, when a particular function call may throw (under the traditional unwinding implementation): it cannot assume control flow will always return to the caller.  Handling the exception by returning the error using normal function return mechanisms allows the optimizer to assume control always returns to the caller, which enables certain optimizations not possible otherwise.


> Further, the space for the return value in the caller needs to be widened in some cases.

Perhaps. But this should not be a big problem if the error type is at most 2 pointers big. Common architectures like x86-64 have plenty of registers that can be used for this purpose.


> > Only if you want to use traditional dynamically-allocated exceptions. If you only need error codes, no polymorphism is needed.
> 
> Checking the bit flag is runtime polymorphism, checking the type field against the catches is runtime polymorphism, checking what the typeid tells about the context type is runtime polymorphism. Checking the type of information behind the context pointer in case of non error codes is runtime polymorphism.

The catch block either knows exactly what error types it's catching, or it's a generic catch-all.

In the former case, it already knows at compile-time what type the context field is. So no runtime polymorphism there. Unless the error type indicates a traditional exception class hierarchy, in which case the context field can just be a pointer to the exception object and you can use the traditional RTTI mechanisms to get at the information.

In the latter case, you don't care what the context field is anyway, or only want to perform some standard operation like convert to string, as described earlier. I suppose that's runtime polymorphism, but it's optional.


> The only difference is it is coded somewhat more low level and is a
> bit more compact than a class object.
> What if we use structs for exceptions where the first field is the
> type and the second field the string message pointer/or error code?

That's exactly what struct Error is.


[...]
> > Turns out, the latter is not quite so simple in practice.  In order to properly destroy objects on the way up to the catch block, you need to store information about what to destroy somewhere.
> 
> I can't imagine why this would be different in your case; this is a general problem of exception handling, independent of the underlying mechanism. Once the pointer of the first landing pad is known, the control flow continues as before until the next error is thrown.

The difference is that for unwinding you need to duplicate / reflect this information outside the function body, and you're constrained in how you use the runtime stack (it must follow some standard stack frame format so that the unwinder knows how to unwind it).

If exceptions are handled by normal function return mechanisms, the optimizer is more free to change the way it uses the stack -- you can omit stack frames for functions that don't need one, for instance. And you don't need to duplicate dtor knowledge outside of the function body: the function just exits via the usual return mechanism that already handles the destruction of local variables. You don't even need to know where the catch blocks are: this is already encoded into the catching function via the error bit check after the function call. The exception table can be completely elided.


[...]
> > Why would you want to insert it into a list?  The context field is a type-erased pointer-sized value. It may not even be a pointer.
> 
> Good point, I don't know if anyone tries to gather errors in an intermediate list which is passed to certain handlers. Sometimes exceptions are used as control flow elements though that isn't good practice.

Exceptions should never be used as control flow.  That's definitely a code smell. :-D

But anyway, if you ever want to store errors in a list, just store the entire Error struct.  It's only 2 pointers long, and includes all the information necessary to interpret it.


> > It's not about class vs. non-class (though Error being a struct rather than a class is important for @nogc support). It's about how exception throwing is handled.  The current stack unwinding implementation is too heavyweight for what it does; we want it replaced with something simpler and more pay-as-you-go.
> 
> I agree that fast exceptions are worthwhile for certain areas as opt-in, but I don't want them to replace non-fast exceptions because of the runtime impact on normally running code.

It will *improve* normal running code.

Please note that the proposed mechanism does NOT exclude traditional class-based exceptions. All you need is to reserve a specific Error.type value to mean "class-based exception", and store the class reference in Error.context:

	enum classBasedException = ... /* some magic value */;

	// This:
	throw new Exception(...);

	// Gets translated to this:
	Error e;
	e.type = classBasedException;
	e.context = cast(size_t) cast(void*) new Exception(...);
	return e;

	// ... then in the catch block, this:
	catch(MyExceptionSubclass e) {
		handleError(e);
	}

	// gets translated to this:
	catch(Error e) {
		if (e.type == classBasedException) {
			auto ex = cast(Exception) cast(void*) e.context;
			auto mex = cast(MyExceptionSubclass) ex; // query RTTI
			if (mex !is null) {
				handleError(mex);
				goto next;
			}
		}
		... // propagate to next catch block or return e
	}
	next: // continue normal control flow

Nothing breaks in traditional class-based exception code. You get, for free, the elimination of libunwind's external tables, as well as better optimizer friendliness.

And you get a really cheap code path if you opt to use error codes instead of class objects.  *And* it works for @nogc.


[...]
> > This is about *replacing* the entire exception handling mechanism, not adding another alternative (which would make things even more complicated and heavyweight for no good reason).
> 
> Oh no, please not. Interestingly, we don't use longjmp in default exception handling, but it would be a good alternative to Herb Sutter's proposal: exceptions are likewise faster, but likewise have an impact on normally running code whenever a new landing pad has to be registered.  Interestingly, though, that occurs much more seldom than checking the return value after each function call.
[...]

I don't understand why you would need to register a new landing pad. There is no need to register anything; catch blocks become just part of the function body and are automatically handled as part of the function call mechanism.

The reason we generally don't use longjmp is because it doesn't unwind the stack properly (does not destruct local variables that need destruction). You *could* make it work, e.g., each function pushes dtor code onto a global list of dtors, and the setjmp handler just runs all the dtors in this list.  But that just brings us back to the same performance problems that libunwind has, just implemented differently. (Every function has to push/pop dtors to the global list, for instance. That's a LOT of overhead, and is very cache-unfriendly. Even libunwind does better than this.)


T

-- 
VI = Visual Irritation
January 07, 2021
On Thu, Jan 07, 2021 at 11:15:26AM +0000, sighoya via Digitalmars-d-learn wrote:
> On Thursday, 7 January 2021 at 10:36:39 UTC, Jacob Carlborg wrote:
[...]
> > It's claimed that exceptions are not zero cost, even when an exception is not thrown, because the compiler cannot optimize functions that may throw as well as those that cannot throw.
> 
> Did you refer to the case where a pure function is inlined into the caller and the machinery of stack pointer decrementation doesn't work anymore?

This has nothing to do with inlining.  Inlining is done at compile-time, and the inlined function becomes part of the caller. There is no stack pointer decrementing involved anymore because there's no longer a function call in the emitted code.

The optimizer works by transforming the code so that redundant operations are eliminated, and/or expensive operations are replaced with cheaper ones. It does this by relying on certain assumptions about the code that let it replace/rearrange the code in a way that preserves its semantics. One very important assumption is control flow: if you have operations A, B, C in your function and the optimizer can assume that control will always reach all 3 operations, then it can reorder the operations (e.g., to improve instruction cache coherence) without changing the meaning of the code.

The problem with unwinding is that the optimizer can no longer assume, for instance, that every function call will return control to the caller (if the called function throws, control flow will bypass the current function).  So if B is a function call, then the optimizer can no longer assume C is always reached, so it cannot reorder the operations. Maybe there's a better sequence of instructions that does A and C together, but now the optimizer cannot use it because that would change the semantics of the code.

If the exception were propagated via normal return mechanisms, then the optimizer still has a way to optimize it: it can do A and C first, then if B fails it can insert code to undo C, which may still be faster than doing A and C separately.

This is why performance-conscious people prefer nothrow where possible: it lets the optimizer make more assumptions, and thereby, opens the possibility for better optimizations.


[...]
> In the case of dynamic libs, we could develop machinery to gather exception table information at compile time and manipulate it in order to inline safely, but I don't know how this case is handled in D.

This makes no sense. Inlining is done at compile-time; if you are loading the code as a dynamic library, by definition you're not inlining anymore.


T

-- 
He who does not appreciate the beauty of language is not worthy to bemoan its flaws.
January 07, 2021
On Thursday, 7 January 2021 at 14:34:50 UTC, H. S. Teoh wrote:
> This has nothing to do with inlining.  Inlining is done at compile-time, and the inlined function becomes part of the caller.

True

>There is no stack pointer decrementing involved anymore

Also true.

> because there's no longer a function call in the emitted code.

And this is the problem: how do we refer to the original line of the inlined function where the exception was thrown?
We either need some machinery to back-propagate that information, or we shouldn't inline at all in that case.

> One very important assumption is control flow: if you have operations A, B, C in your function and the optimizer can assume that control will always reach all 3 operations, then it can reorder the operations (e.g., to improve instruction cache coherence) without changing the meaning of the code.

Wonderful, we have an example!
Provided all three operations don't depend on each other. Or maybe the compiler executes them in parallel. Are we referring to lazy evaluation or asynchronous code execution here?

> If the exception were propagated via normal return mechanisms, then the optimizer still has a way to optimize it: it can do A and C first, then if B fails it can insert code to undo C, which may still be faster than doing A and C separately.

Puh, that sounds a bit like reordering nondeterministic effectful operations, which definitely aren't rollbackable in general, only in simple cases.
But in general, why not generate a try/catch mechanism at compile time that catches the exception in case B throws and stores it temporarily in an exception variable?

After A has executed successfully, just rethrow the exception from B.
All of this could be generated at compile time -- no runtime cost, but it involves some code duplication.

> This is why performance-conscious people prefer nothrow where possible: it lets the optimizer make more assumptions, and thereby, opens the possibility for better optimizations.

But the assumption is wrong: every function can fail, e.g. due to out-of-memory. Aborting the whole program in that case just to enable better optimizations isn't the fine English way.


> This makes no sense. Inlining is done at compile-time; if you are loading the code as a dynamic library, by definition you're not inlining anymore.

As I said, I don't know how this is handled in D, but in theory you can even inline an already compiled function, though you need meta information to do that. My idea was just to fetch the line number from the metadata of the throw statement in the callee, in order to localize the error correctly in the original source code.


January 07, 2021
On Thu, Jan 07, 2021 at 05:47:37PM +0000, sighoya via Digitalmars-d-learn wrote:
> On Thursday, 7 January 2021 at 14:34:50 UTC, H. S. Teoh wrote:
> > This has nothing to do with inlining.  Inlining is done at compile-time, and the inlined function becomes part of the caller.
> 
> True
> 
> > There is no stack pointer decrementing involved anymore
> 
> Also true.
> 
> > because there's no longer a function call in the emitted code.
> 
> And this is the problem: how do we refer to the original line of the inlined function where the exception was thrown?

The compiler knows exactly which line it is at the point where the exception is created, and can insert it there.


> We either need some machinery to back-propagate that information, or we shouldn't inline at all in that case.

There is no need for any machinery. The information is already statically available at compile-time.


> > One very important assumption is control flow: if you have operations A, B, C in your function and the optimizer can assume that control will always reach all 3 operations, then it can reorder the operations (e.g., to improve instruction cache coherence) without changing the meaning of the code.
> 
> Wonderful, we have an example!
> Provided all three operations don't depend on each other. Or maybe
> the compiler executes them in parallel. Are we referring to lazy
> evaluation or asynchronous code execution here?

This is just an over-simplified example to illustrate the point. Real code obviously isn't this simple, and neither are real optimizers.


> > If the exception were propagated via normal return mechanisms, then the optimizer still has a way to optimize it: it can do A and C first, then if B fails it can insert code to undo C, which may still be faster than doing A and C separately.
> 
> Puh, that sounds a bit like reordering nondeterministic effectful operations, which definitely aren't rollbackable in general, only in simple cases.

Again, this was an over-simplified contrived example just to illustrate the point. Real code and real optimizers are obviously much more complex than this.  The main point here is that being able to assume things about control flow in a function gives the optimizer more tools to produce better code.  This is neither the time nor place to get into the nitty-gritty details of how exactly optimization works. If you're unfamiliar with the subject, I recommend reading a textbook on compiler construction.


> But in general, why not generate a try/catch mechanism at compile time that catches the exception in case B throws and stores it temporarily in an exception variable?

Because every introduced catch block in the libunwind implementation introduces additional overhead.


[...]
> > This is why performance-conscious people prefer nothrow where possible: it lets the optimizer make more assumptions, and thereby, opens the possibility for better optimizations.
> 
> But the assumption is wrong: every function can fail, e.g. due to out-of-memory. Aborting the whole program in that case just to enable better optimizations isn't the fine English way.

Wrong. Out of memory only occurs at specific points in the code (i.e., when you call a memory allocation primitive).


> > This makes no sense. Inlining is done at compile-time; if you are loading the code as a dynamic library, by definition you're not inlining anymore.
> 
> As I said, I don't know how this is handled in D, but in theory you can even inline an already compiled function, though you need meta information to do that.

This tells me that you do not understand how compiled languages work. Again, I recommend reading a textbook on compiler construction. It will help you understand these issues better. (And it will also indirectly help you write better code, once you understand what exactly the compiler does with it, and what the machine actually does.)


> My idea was just to fetch the line number from the metadata of the throw statement in the callee in order to localize the error correctly in the original source code.

All of this information is already available at compile-time. The compiler can easily emit code to write this information into some error-handling area that can be looked up by the catch block.

Also, you are confusing debugging information with the mechanism of try/catch. Any such information is a part of the payload of an exception; this is not the concern of the mechanism of how try/catch are implemented.


T

-- 
Perhaps the most widespread illusion is that if we were in power we would behave very differently from those who now hold it---when, in truth, in order to get power we would have to become very much like them. -- Unknown
January 07, 2021
On Thursday, 7 January 2021 at 18:12:18 UTC, H. S. Teoh wrote:
> If you're unfamiliar with the subject, I recommend reading a textbook on compiler construction.

I already read one.


> Because every introduced catch block in the libunwind implementation introduces additional overhead.

But only when an exception is thrown, right?

> Wrong. Out of memory only occurs at specific points in the code (i.e., when you call a memory allocation primitive).

What about pushing a new stack frame on top/bottom of the stack? This is very implicit. I'm not talking about a theoretical Turing machine with unbounded memory, but rather about a linear bounded automaton with finite memory.
What happens if stack memory isn't available anymore?

>> As I said, I don't know how this is handled in D, but in theory you can even inline an already compiled function though you need meta information to do that.
>
> This tells me that you do not understand how compiled languages work.

Traditionally, inlining means the insertion of code from the callee into the caller, yes.
Imagine now that the source code of the callee isn't available because it was already compiled and wrapped in a dynlib/static lib (and now you link to that lib); then you can't inline the source code, but you can inline the binary code of the callee. For this to be "optimize-safe" regarding exceptions, you need to store some meta information during the compilation of the callee into the lib, e.g. the line numbers of all directly thrown exceptions, for any caller outside the lib.
Theoretically, you can even pass functions as binary code blocks to the callee; this is mostly inefficient, but it is at least possible.

Though I assume that most compilers don't do any sort of this, that doesn't mean it isn't possible.

> Again, I recommend reading a textbook on compiler construction. It will help you understand this issues better. (And it will also indirectly help you write better code, once you understand what exactly the compiler does with it, and what the machine actually does.)

It also depends on the compiler under consideration and how it relates to the design discussed in textbooks.


> All of this information is already available at compile-time. The compiler can be easily emit code to write this information into some error-handling area that can be looked up by the catch block.

Yes, but the line number changes when inlining the code, and we don't want the new line number to be output by the runtime when an exception is thrown, because it points to a line only visible to the optimizer, not to the user.

> Also, you are confusing debugging information with the mechanism of try/catch.

So you only want to output line numbers in stack trace during debugging and not in production code?


January 07, 2021
On Thu, Jan 07, 2021 at 07:00:15PM +0000, sighoya via Digitalmars-d-learn wrote:
> On Thursday, 7 January 2021 at 18:12:18 UTC, H. S. Teoh wrote:
[...]
> > Wrong. Out of memory only occurs at specific points in the code (i.e., when you call a memory allocation primitive).
> 
> What about pushing a new stack frame on top/bottom of the stack? This
> is very implicit. I'm not talking about a theoretical Turing machine
> with unbounded memory, but rather about a linear bounded automaton
> with finite memory.
> What happens if stack memory isn't available anymore?

In all non-trivial OSes that I'm aware of, running out of stack space causes the OS to forcefully terminate the program. No non-toy compiler I know of checks the remaining stack space when making a function call; that would be an unreasonable amount of overhead. No performance-conscious programmer would accept that.


[...]
> > This tells me that you do not understand how compiled languages work.
> 
> Traditionally, inlining means the insertion of code from the callee
> into the caller, yes.
> Imagine now that the source code of the callee isn't available
> because it was already compiled and wrapped in a dynlib/static lib
> (and now you link to that lib); then you can't inline the source
> code, but you can inline the binary code of the callee.

This is not inlining, it's linking.


> For this to be "optimize-safe" regarding exceptions, you need to store some meta information during the compilation of the callee into the dynlib/static lib, e.g. the line numbers of all directly thrown exceptions, for any caller outside the lib.

I don't understand what point you're trying to make here. What has this got to do with how exceptions are thrown?  Any code, exception or not, exports such information to the linker for debugging purposes. It does not directly relate to how exceptions are implemented.


[...]
> > All of this information is already available at compile-time. The compiler can be easily emit code to write this information into some error-handling area that can be looked up by the catch block.
> 
> Yes, but the line number changes when inlining the code,
[...]

???!  How does inlining (or linking) change line numbers?!  Whether or not something is inlined has nothing to do with what line number it was written in.  The compiler does not edit your source code and move lines around, if that's what you're trying to say.  That would be absurd.


> > Also, you are confusing debugging information with the mechanism of try/catch.
> 
> So you only want to output line numbers in stack trace during debugging and not in production code?

"Debugging information" can be included in production code. Nothing stops you from doing that.  And this has nothing to do with how try/catch is implemented.


T

-- 
Живёшь только однажды.
January 07, 2021
On Thursday, 7 January 2021 at 19:35:00 UTC, H. S. Teoh wrote:

> Whether or not something is inlined has nothing to do with what line number it was written in.

Okay, I've tried it out, and it seems it isn't a problem in the binary case, as the code was first compiled and then inlined, so the line number is correct in that case.

For source code inlining, however, the direction is simply the opposite. Therefore, compilation of inlined code has to respect the original line number pointing to the throw statement in the inlined function.

I think D can handle this, I hope so.

>"Debugging information" can be included in production code.

Yes, but exception line numbers aren't debug info; rather, they are passed implicitly as an argument to the exception class constructor.
January 08, 2021
On Tuesday, 5 January 2021 at 18:42:42 UTC, Marvin wrote:
> On Monday, 4 January 2021 at 15:39:50 UTC, ludo456 wrote:
>> Listening to the first visioconf of the Dconf 2020, titled Destroy All Memory Corruption, (https://www.youtube.com/watch?v=XQHAIglE9CU) Walter talks about not using exceptions any more in the future. He says something like "this is where languages are going" [towards no using exceptions any more].
>>
>> Can someone point me to an article or more explanations about that?
>
>
> if Exceptions disappear in the future in Dlang, I will download the last version that support exceptions and never update.

I have a similar feeling. Exceptions were a great addition to programming languages in my opinion.

