February 19, 2012
On Sunday, February 19, 2012 11:13:46 Robert Jacques wrote:
> Most of Phobos is an internal library. I think parsing routines are a bad counter-example to my point; combined validation and conversion is generally part of their mandate and we expect them to fail often. We also don't expect the parsing of user input to live deep inside a code base; it's almost always done as close to the input as possible (i.e. in the input text box). All too often you end up with code like: try { parse(...); } catch {...} Also, although less of a problem in D,

It very much depends on the function. In some cases, it makes the most sense to use assertions; in others, exceptions. You have to examine it on a case-by-case basis. For instance, most of std.algorithm asserts on invalid arguments, whereas the Unicode handling code generally enforces.
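
For illustration, the difference in a nutshell: assert documents a bug in the caller and is compiled out with -release, whereas std.exception.enforce validates input and throws even in release builds (the function names below are made up):

import std.exception : enforce;

// Contract-style check: passing an empty array here is a bug in the
// *caller*, so an assertion (stripped by -release) is appropriate.
int firstElement(int[] arr)
{
    assert(arr.length > 0, "caller must pass a non-empty array");
    return arr[0];
}

// Validation of external input: failure is an expected runtime outcome,
// so throw an exception that the caller can catch.
int firstElementOfUserInput(int[] arr)
{
    enforce(arr.length > 0, "input must contain at least one value");
    return arr[0];
}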

> not ensuring that an
> input string actually is a string was the source of many a C exploit.

That may be, but that really doesn't apply to D aside from how you should be interacting with C functions that involve char*.

- Jonathan M Davis
February 19, 2012
On 2/19/12 4:00 PM, Nick Sabalausky wrote:
> "Andrei Alexandrescu"<SeeWebsiteForEmail@erdani.org>  wrote in message
> news:jhrn3r$fie$3@digitalmars.com...
>> On 2/19/12 12:49 PM, Nick Sabalausky wrote:
>>> No, exceptions need to be based around semantics, not modules.
>>
>> Packages and modules are also organized around specific areas.
>>
>
> And those specific areas do NOT necessarily correspond 1-to-1 to error
> conditions.

They don't; the intent here is, again, to cater to the common case.

> 1. A module can still reasonably generate more than one conceptual type of
> exception. Ex: std.file can be expected to generate both "file not found"
> and "disk full" conditions (those may both be IO exceptions, but they're
> still separate issues that are *not* always to be dealt with the same way).

Sure. That would mean std.file could define

// inside std.file
class FileNotFound : ModuleException!.stringof
{
    ...
}
class DiskFull : ModuleException!.stringof
{
    ...
}

Or could define one:

// inside std.file
class FileException : ModuleException!.stringof
{
    Code code;
    ...
}
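
For concreteness, a minimal sketch of how such a module-keyed exception template might actually be spelled in D - ModuleException and the way the module name is passed are illustrative assumptions, not a finished design:

class ModuleException(string moduleName) : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(moduleName ~ ": " ~ msg, file, line);
    }
}

// inside std.file
class FileNotFound : ModuleException!"std.file"
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}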

> 2. As I *already* explained in further detail, it's perfectly reasonable to
> expect one specific area to be covered by more than one module.
>
> Therefore, "Modules" to "Conceptual types of exceptions" are not 1-to-1.
> They're not even 1-to-many or many-to-1. They're many-to-many. Therefore,
> module is the wrong basis for an exception.

I think that direction is worth exploring.

> Seriously, how is this not *already* crystal-clear? I feel as if every few
> weeks you're just coming up with deliberately random shit to argue so the
> rest of us have to waste our time spelling out the obvious in insanely
> pedantic detail.

It has sometimes happened to me to reach the hypothesis that my interlocutor must be some idiot. Most often, I was missing something.


Andrei
February 19, 2012
On 19/02/2012 21:57, Andrei Alexandrescu wrote:
> On 2/19/12 1:19 PM, Nick Sabalausky wrote:
>> That wouldn't be as useful. What the catcher is typically interested
>> in is
>> *what* happened, not *where* it happened.
>
> But module organization is partitioned by functional areas.
>

Modules are organized by what they do, not by how they fail. This hierarchy makes no sense to me. Worse, it will not be reusable in user code.
February 19, 2012
On Sunday, February 19, 2012 10:19:13 Andrei Alexandrescu wrote:
> On 2/19/12 2:49 AM, H. S. Teoh wrote:
> [snip]
> 
> > To me, this is a giant hint that OOP is a good solution. You want a class hierarchy rooted at Exception. And groups of related exceptions probably should be rooted under their respective base classes under Exception.
> > 
> > If you have a better way to solve this, I'm waiting to hear it.
> 
> I don't. This thread is predicated on the assumption that the current approaches to exception handling are wanting, so it's time to look outside the box. It's also reasonable to assume that people involved in this thread do have an understanding of current mechanisms, so rehashes thereof are unnecessary.

Do you really believe that all of the current approaches are wanting? They may need some tweaking for us, but I don't think that they're necessarily all that broken.

Personally, I'd argue that we should look at how C# and Java have organized their exception hierarchies, figure out which pieces of that make sense for Phobos, and then do something similar. We don't have to copy what they did - we should tweak it according to our needs - but I really think that the basic exception hierarchy that they use tends to work very well. Where you tend to run into problems is where people do stupid things like catching Exception and ignoring it. But you can't stop programmers from being stupid.

Already, we've managed to improve over Java (and probably C# - it's been a while since I used it though) in how we deal with Exception vs Error. So, we're a step ahead in that regard. But we don't have any kind of organized exception hierarchy right now. We simply have modules declaring their own exceptions whether it makes sense or not.

We should be looking at the exceptions that we currently have and figuring out how they should be reorganized into a proper hierarchy, removing them in some cases and adding some in others. Java and C# have done at least a decent job with their exception hierarchies. We should do something similar. Maybe we won't have quite as many exception types, and maybe we'll organize some things quite differently, but I really think that we should be looking at what they did and emulating it at least on some level. When and where we can come up with our own improvements, that's great. Let's use those. But let's not throw the baby out with the bath water.

I don't think that a complete redesign of how exceptions are designed is necessary. We just need to organize them better.

- Jonathan M Davis
February 19, 2012
On 2/19/2012 10:48 AM, address_is@invalid.invalid wrote:

>
> I guess "transient" is more descriptive.
>
> Andrei

I suppose “transient” mingles with recoverability and may get confused with it. But the interrelated issue that comes to mind for me is whether a “failure” is a common and typical result of the particular operation or not.

I'm thinking of the example of acquiring a mutex. Because the nature of a mutex is to prevent two parties from using a resource at the same time, getting a result of “mutex already in use” is a normal and expected result and probably should not throw an exception. It's transient in that you can try again and it might work. But this behavior is already built in because you can provide a timeout argument which means, “keep trying until this much time elapses, then give up”.
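
For illustration, that style maps naturally onto a retry loop around tryLock rather than onto exceptions; a rough sketch (acquireWithin is a made-up name and the back-off is deliberately simplistic):

import core.sync.mutex : Mutex;
import core.thread : Thread;
import core.time : Duration, dur;
import std.datetime : Clock;

// "Already locked" is treated as a normal result, not an exceptional one;
// the caller decides what failing to acquire within the timeout means.
bool acquireWithin(Mutex m, Duration timeout)
{
    auto deadline = Clock.currTime + timeout;
    while (Clock.currTime < deadline)
    {
        if (m.tryLock())
            return true;
        Thread.sleep(dur!"msecs"(1));   // back off briefly, then retry
    }
    return false;
}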

Perhaps a network connection would be different, because when you send a packet, you expect it to get where it's going; that would be normal program flow. Otherwise it's an error and warrants an exception.

Jim
February 19, 2012
On Sunday, February 19, 2012 18:48:02 address_is@invalid.invalid wrote:
> I guess "transient" is more descriptive.

Actually, thinking on it some more, I don't think that transient will work at all, and the reason is simple. _Which_ operation should you retry? You don't even necessarily know which function the exception came from out of the functions that you called within the try block - let alone which function actually threw the exception. Maybe it was thrown 3 functions deep from the function that you called, and while retrying that specific call 3 functions down might have made sense, retrying the function 3 functions up doesn't necessarily make sense at all.

Whether or not you can retry or retrying makes any sense at all is _highly_ dependent on who actually catches the exception. In many cases, it may be a function which could retry it, but in many it won't be, and so having the exception tell the caller that it could retry would just be misleading.

- Jonathan M Davis
February 19, 2012
On 2/19/12 4:20 PM, Nick Sabalausky wrote:
> "Andrei Alexandrescu"<SeeWebsiteForEmail@erdani.org>  wrote in message
>> The wheel is not round. We just got used to thinking it is. Exceptions are
>> wanting and it's possible and desirable to improve them.
>>
>
> They're wanting? What's the problem with them? I see no problem, and I
> haven't seen you state any real problem.

I mentioned the issues here and there, but it's worth collecting them in one place.

1. Type multiplicity. Too many handwritten types are harmful. They exacerbate boilerplate and favor code duplication.

2. Using types and inheritance excessively to represent simple categorical information.

3. Forcing cross-cutting properties into a tree structure. The way I see it, exceptions form a semantic graph, and forcing that graph into a tree leads to things like repeated catch clauses for distinct types coming from distinct parts of the hierarchy.

4. Front-loaded design. Very finely-grained hierarchies are built on the off chance someone may need something AND would want to use a type for that.

5. Unclear on the protocol. The "well-designed" phrase has come about often in this thread, but the specifics are rather vague.

6. Bottom-heavy base interface. Class hierarchies should have significant functionality at upper levels, which allows generic code to do significant reusable work using the base class. Instead, exceptions add functionality toward the bottom of the hierarchy, which encourages non-reusable code that deals with specifics.

There might be a couple more, but I think they can be considered derivative.


Andrei
February 19, 2012
On 2/19/12 5:28 PM, Jonathan M Davis wrote:
> On Sunday, February 19, 2012 18:48:02 address_is@invalid.invalid wrote:
>> I guess "transient" is more descriptive.
>
> Actually, thinking on it some more, I don't think that transient will work at
> all, and the reason is simple. _Which_ operation should you retry?

The application decides.

> You don't
> even necessarily know which function the exception came from out of the
> functions that you called within the try block - let alone which function
> actually threw the exception. Maybe it was thrown 3 functions deep from the
> function that you called, and while retrying that specific call 3 functions
> down might have made sense, retrying the function 3 functions up doesn't
> necessarily make sense at all.
>
> Whether or not you can retry or retrying makes any sense at all is _highly_
> dependent on who actually catches the exception. In many cases, it may be a
> function which could retry it, but in many it won't be, and so having the
> exception tell the caller that it could retry would just be misleading.

No dependence on context. The bit simply tells you "operation has failed, but due to a transitory matter". That is information local to the thrower.
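
For illustration only, a minimal sketch of what carrying such a bit might look like and how a catcher could choose to use it - none of these names exist in Phobos, and a catcher that doesn't care simply ignores the flag:

// Hypothetical: the thrower records purely local knowledge ("this failure
// was transitory"); whether and what to retry remains the catcher's call.
class AppException : Exception
{
    bool transient;

    this(string msg, bool transient = false,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
        this.transient = transient;
    }
}

// One possible catcher: retries a coarse-grained operation of its own
// choosing when the thrower flagged the failure as transitory.
void runWithRetry(void delegate() operation, int maxAttempts = 3)
{
    foreach (attempt; 0 .. maxAttempts)
    {
        try
        {
            operation();
            return;
        }
        catch (AppException e)
        {
            if (!e.transient || attempt + 1 == maxAttempts)
                throw e;   // not transitory, or out of attempts
        }
    }
}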


Andrei
February 20, 2012
On Sat, Feb 18, 2012 at 11:09:23PM -0500, bearophile wrote:
> Sean Cavanaug:
> 
> > In the Von Neumann model this has been made difficult by the stack itself.  Thinking of exceptions as they are currently implemented in Java, C++, D, etc is automatically artificially constraining how they need to work.
> 
> It's interesting to take a look at how "exceptions" are designed in
> Lisp:
> http://www.gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html
[...]

I'm surprised nobody responded to this. I read through the article a bit, and it does present some interesting concepts that we may be able to make use of in D. Here's a brief (possibly incomplete) summary:

One problem with the try-throw-catch paradigm is that whenever an exception is raised, the stack unwinds up some number of levels in the call stack. By the time it gets to the catch{} block, the context in which the problem happened is already long-gone, and there is no other recourse but to abort the operation, or try it again from scratch. There is no way to recover from the problem by, say, trying to fix it *in the context in which it happened* and then continuing with the operation.

Say P calls Q, Q calls R, and R calls S. S finds a problem that prevents it from doing what R expects it to do, so it throws an exception. R doesn't know what to do, so it propagates the exception to Q. Q doesn't know what to do either, so it propagates the exception to P. By the time P gets to know about the problem, the execution context of S is long gone; the operation that Q was trying to perform has already been aborted. There's no way to recover except to repeat a potentially very expensive operation.

The way Lisp handles this is by something called "conditions". I won't get into the definitions and stuff (just read the article), but the idea is this:

- When a function encounters a problem, it signals a "condition".

   - Along with the condition, it may register 0 or more "restarts",
     basically predefined methods of recovering from the condition.

- The runtime then tries to recover from the condition by:

   - Checking to see if there's a handler registered for this condition.
     If there is, invoke the most recently registered one *in the
     context of the function that triggered the condition*.

   - If there's no handler, unwind the stack and propagate the condition
     to the caller.

- There are two kinds of handlers:

   - The equivalent of a "catch": matches some subset of conditions that
     propagated to that point in the code. Some stack unwinding may
     already have taken place, so these are equivalent to catch blocks in
     D.

   - Pre-bound handlers: these are registered with the runtime condition
     handler before the condition is triggered (possibly very high up
     the call stack). They are invoked *in the context of the code that
     triggered the condition*. Their primary use is to decide which of
     the restarts associated with the condition should be used to
     recover from it.

The pre-bound handlers are very interesting. They allow in-place recovery by having high-level callers decide what to do, *without unwinding the stack*. Here's an example:

LoadConfig() is a function that loads an application's configuration
files, parses them, and sets up some runtime objects based on
configuration file settings. LoadConfig calls a bunch of functions to
accomplish what it does, among which is ParseConfig(). ParseConfig() in
turn calls ParseConfigItem() for each configuration item in the config
file, to set up the runtime objects associated with that item.
ParseConfigItem() calls DecodeUTF() to convert the configuration file's
text representation from, say, UTF-8 to dchar. So the call stack looks
like this:

LoadConfig
	ParseConfig
		ParseConfigItem
			DecodeUTF

Now suppose the config file has some UTF encoding errors. This causes DecodeUTF to throw a DecodingError. ParseConfigItem can't go on, since that configuration item is mangled. So it propagates DecodingError to ParseConfig.

Now, ParseConfig could simply abort, but using the idea of prebound handlers, it can actually offer two ways of recovering: (1) SkipConfigItem, to simply skip the mangled config item and process the rest of the config file as usual, or (2) ReparseConfigItem, to allow custom code to manually fix a bad config item and reprocess it.

The problem is, ParseConfig doesn't know which action to take. It's too low-level to make that sort of decision. You need higher-level code, that knows what the application needs to do, to decide that. But ParseConfig can't just propagate the exception to said high-level code, because if it does, parsing of the entire config file is aborted and will have to be restarted from scratch.

The solution is to have the higher-level code register a delegate with the exception system. Something like this:

	// NOTE: not real D code
	void main() {
		registerHandler(delegate(ParseError e) {
			if (can_repair_item(e.item)) {
				return e.ReparseConfigItem(
					repairConfigItem(e.item));
			} else {
				return e.SkipConfigItem();
			}
		});

		ParseConfig(configfile);
	}

Now when ParseConfig encounters a problem, it signals a ParseError object with two options for recovery: ReparseConfigItem and SkipConfigItem. It doesn't try to fix the problem on its own, but it lets the delegate from main() make that decision. The runtime exception system then sees if there's a matching handler, and calls the handler with the ParseError to determine which course of action to take. If no handler is found, or the handler decides to abort, then ParseError is propagated to the caller with stack unwinding.

So ParseConfig might look something like this:

// NOTE: not real D code
auto ParseConfig(...) {
	foreach (item; config_items) {
		try {
			// Note: not real proposed syntax, this is just
			// to show the semantics of the mechanism:
			restart:
			auto objs = ParseConfigItem(item);
			SetupConfigObjects(objs);
		} catch(ParseConfigItemError) {
			// Note: not real proposed syntax, this is just
			// to show the semantics of the mechanism:
			ParseError e;
			e.ReparseConfigItem = delegate void(ConfigItem fixedItem)
			{
				goto restart;
			};
			e.SkipConfigItem = delegate void() {
				continue;
			};

			// This will unwind stack if no handler is
			// found, or handler decides to propagate
			// exception.
			handleError(e);
		}
	}
}

OK, so it looks real ugly right now. But if this mechanism is built into the language, we could have much better syntax, something like this:

auto ParseConfig(...) {
	foreach (item; config_items) {
		try {
			auto objs = ParseConfigItem(item);
			SetupConfigObjects(objs);
		} recoverBy ReparseConfigItem(fixedItem) {
			item = fixedItem;
			restart;	// restarts try{} block
		} recoverBy SkipConfigItem() {
			setDefaultConfigObjs();
			continue;	// continues foreach loop
		}
	}
}

This is just a rough sketch syntax, just to show the idea. It can of course be improved upon.
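
For what it's worth, the pre-bound-handler half of this could probably be approximated as a plain library in today's D with a thread-local stack of handler delegates. A rough sketch, with every name invented for illustration (this is not a worked-out proposal):

// All names here are made up.
enum Recovery { propagate, skipItem, reparseItem }

alias Recovery delegate(Exception) ConditionHandler;

// Module-level variables are thread-local in D, so each thread keeps its
// own stack of registered handlers.
ConditionHandler[] handlerStack;

// RAII guard: registers a handler for the duration of a scope.
struct HandlerScope
{
    this(ConditionHandler h) { handlerStack ~= h; }
    ~this() { handlerStack = handlerStack[0 .. $ - 1]; }
}

// Called at the point of failure, *before* any unwinding has happened.
// Walks the handlers from most recently registered to least.
Recovery signalCondition(Exception cause)
{
    foreach_reverse (h; handlerStack)
    {
        auto r = h(cause);
        if (r != Recovery.propagate)
            return r;          // a high-level handler picked a restart
    }
    return Recovery.propagate; // nobody cared; throw and unwind as usual
}

// High up the call stack, the application states its policy; no stack
// unwinding takes place when a condition is signaled.
void main()
{
    auto guard = HandlerScope(delegate Recovery(Exception c) {
        return Recovery.skipItem;
    });
    // ... ParseConfig(...) would call signalCondition() where it now throws ...
}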


T

-- 
Nobody is perfect.  I am Nobody. -- pepoluan, GKC forum
February 20, 2012
On Sunday, February 19, 2012 17:35:31 Andrei Alexandrescu wrote:
> On 2/19/12 4:20 PM, Nick Sabalausky wrote:
> > "Andrei Alexandrescu"<SeeWebsiteForEmail@erdani.org>  wrote in message
> > 
> >> The wheel is not round. We just got used to thinking it is. Exceptions
> >> are
> >> wanting and it's possible and desirable to improve them.
> > 
> > They're wanting? What's the problem with them? I see no problem, and I haven't seen you state any real problem.
> 
> I mentioned the issues here and there, but it's worth collecting them in one place.
> 
> 1. Type multiplicity. Too many handwritten types are harmful. They exacerbate boilerplate and favor code duplication.

Assuming that the exception provides additional information via member variables, rather than just being a new exception type that is essentially identical to Exception save for its type, you have to write it by hand anyway. It's only the exceptions whose sole point of existence is their type (which indicates the kind of problem that occurred but, for whatever reason, can't give any further useful information) that cause boilerplate problems. And since in general we really should be adding member variables with additional information, the boilerplate should be minimal. And if we _do_ want such classes, we can use a mixin to generate them (the downside being that they don't end up in the ddoc - though that could be fixed, and it probably should be).
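
For what it's worth, a rough sketch of what such a mixin could look like - nothing like this exists in Phobos today, and the name makeException is made up:

// Generates an exception class whose whole purpose is to exist as a
// distinct type, with the usual constructor forwarded to the base.
template makeException(string name, string base = "Exception")
{
    enum makeException =
        "class " ~ name ~ " : " ~ base ~ " {" ~
        " this(string msg, string file = __FILE__, size_t line = __LINE__)" ~
        " { super(msg, file, line); } }";
}

mixin(makeException!("FileNotFoundException", "Exception"));
mixin(makeException!("DiskFullException"));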

We shouldn't go to town and create tons and tons of exception types, but we should also have them where they're useful.

> 2. Using types and inheritance excessively to represent simple categorical information.

try-catch blocks are _designed_ to take advantage of types and inheritance. So, moving away from using derived types for exceptions makes it harder to write exception handling code. It also works very well with inheritance to have someone catch a more general exception and ignore the existence of the more specific ones if they don't care about the specifics of what went wrong. So, the hierarchy ends up being very useful. And how do you intend to _add_ additional information to exceptions if not through derived types which can then hold additional member variables? I don't see how you could have the relevant information without going out of your way to avoid inheritance by using type codes and something equivalent to void* for the data. It would just be cleaner - and better suited to how catch blocks work - to use an exception hierarchy.

It seems to me that the very design of exceptions in the language is geared towards having a class hierarchy of exceptions. And I contend that doing that works quite well. Certainly, in my experience, it does. It's when programmers try to avoid handling exceptions that things go awry (e.g. by catching Exception and then ignoring it), and that can happen regardless of your design. You can't prevent programmers from being stupid.

> 3. Forcing cross-cutting properties into a tree structure. The way I see it, exceptions form a semantic graph, and forcing that graph into a tree leads to things like repeated catch clauses for distinct types coming from distinct parts of the hierarchy.

Except that in most cases, you want to handle exceptions differently if they come from different parts of the hierarchy. And making it so that you could do something like

catch(Ex1, Ex2 : Ex e)

would cover the other cases quite well.
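
For comparison, a minimal sketch of how that case is usually worked around today - factor the shared handling into a helper and write one catch block per type (the exception names are made up):

import std.stdio : writeln;

class Ex1 : Exception { this(string m) { super(m); } }
class Ex2 : Exception { this(string m) { super(m); } }

void handleEither(Exception e)
{
    // Recovery logic shared by both exception types.
    writeln("recovering from: ", e.msg);
}

void main()
{
    try
    {
        throw new Ex2("disk full");
    }
    catch (Ex1 e) { handleEither(e); }
    catch (Ex2 e) { handleEither(e); }
}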

> 4. Front-loaded design. Very finely-grained hierarchies are built on the off chance someone may need something AND would want to use a type for that.

There's definitely some truth to that. However, it's very easy to add more exception types later without disrupting code.

For instance, if you added IOException, and made all of the existing exception types which would make sense with that as their base class then derive from it, then new code could choose to catch either IOException or the more specific exception types. And existing code would _already_ catch the more specific exception types, so nothing would be disrupted.

And if you added more specific exceptions (e.g. creating FileNotFoundException and making it derive from FileException), then new code could choose to catch the more specific exception instead of FileException, but existing code would continue to catch FileException without a problem.
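
A minimal sketch of why inserting a broader base class is non-breaking - the class names below are illustrative, not the actual Phobos types:

class IOException : Exception
{
    this(string msg) { super(msg); }
}

class FileException : IOException   // previously derived directly from Exception
{
    this(string msg) { super(msg); }
}

void existingCode()
{
    try
    {
        throw new FileException("cannot open file");
    }
    catch (FileException e)
    {
        // written before IOException existed; still compiles and still matches
    }
}

void newCode()
{
    try
    {
        throw new FileException("cannot open file");
    }
    catch (IOException e)
    {
        // new code may choose to catch at the broader level
    }
}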

So, we can organize our exception hierarchy on a fairly rough level to begin with and add exceptions to it (both common and derived) as appropriate. And while we'd still end up with some up front design, because we'd have to create at least some of the hierarchy to begin with, the fact that we can add to the hierarchy later without breaking code minimizes the need to add stuff that we think that we _might_ need.

It's removing exception types and moving them from one part of the hierarchy to another which is disruptive. And while avoiding those _does_ mean doing a good job with how you initially put exceptions into a hierarchy, I don't think that it's that hard to avoid once you've got a basic hierarchy, especially if we start with a minimal number of exception types and add more later as appropriate rather than creating a large, complex hierarchy to begin with. The main breakage would be in moving from the module-specific exceptions to a hierarchy, and much of that wouldn't break anything, because as we don't have much of a hierarchy right now, it would only be an issue with the ones that we decided to remove.

> 5. Unclear on the protocol. The "well-designed" phrase has come about often in this thread, but the specifics are rather vague.

As with any design, we'd have to look at concrete examples and examine their pros and cons. I think that looking at C# and Java is a good place to start. That should give us at least a good idea of what _they_ think is a well-designed hierarchy, and we can take the pieces that best apply to us and our situation. We don't need to start with anything as large as they have, but it's something to work from.

> 6. Bottom-heavy base interface. Class hierarchies should have significant functionality at upper levels, which allows generic code to do significant reusable work using the base class. Instead, exceptions add functionality toward the bottom of the hierarchy, which encourages non-reusable code that deals with specifics.

True. But I don't see a way around that. The more specific the problem, the more information that you have. The more general the problem, the less information that you have. So, you naturally end up with more data in the derived classes. And for the most part, it's the data that you want, not polymorphic functions. Polymorphism is of limited usefulness with exceptions. It's the hierarchy which is useful. In fact, you could probably have exceptions be non-polymorphic if we had a construct in the language which had inheritance but not polymorphism. But we don't. So, while I think that you make a valid point, I still think that using classes with an inheritance hierarchy is the best fit.

- Jonathan M Davis