April 03, 2013
On Wednesday, 3 April 2013 at 15:33:53 UTC, deadalnix wrote:
> On Wednesday, 3 April 2013 at 09:44:23 UTC, Lars T. Kyllingstad wrote:
>> I don't see the incompatibility.  This is exactly the purpose of final switch.  If the user didn't want to be forced to handle a new error category, they'd use normal switch instead.
>>
>
> This is a good thing in your own code. But in phobos, this is a guarantee to break a lot of user code when adding to the enum.
>

You're using final switch because you want to make statically sure that you consider every enum member. If you now add a another enum value the code is by definition broken regardless of dmd issues an error or not. The same is true for any kind of virtual dispatch (if a normal switch with default: wasn't enough your base class method, won't be either).

>> You have yet to specify the problems with switch in OOP.  Maybe you meant something else, that final switch doesn't solve?
>>
>
> The idiomatic way to execute code depending on a value in OOP is virtual dispatch.


April 03, 2013
On Wednesday, 3 April 2013 at 15:29:01 UTC, deadalnix wrote:
> On Wednesday, 3 April 2013 at 09:40:27 UTC, Lars T. Kyllingstad wrote:
>> For arrays, RangeError is synonymous with "out of bounds".  I see no reason to invent a new class just for this purpose.
>>
>
> Exactly, no need for RangeError (as out of bound access can be produced in many situation that don't involve ranges).

Well the word "range" has two meanings here:  One is the programming language concept, i.e. a type that defines empty, front, etc.  The other is the English word for which Google offers the following definition: "The area of variation between upper and lower limits on a particular scale."  In light of this, I don't think RangeError is a bad name for an array bounds violation.


>> And note that I'm not saying that ranges should be restricted to *only* throwing RangeErrors.  Generally, it should be used in situations that are analogous to out of bounds for arrays, such as trying to pop past the end of the range.
>>
>
> The data always come from somewhere. That somewhere can't provide anymore data for a reason. Because you reached the end of a file (IOException or something) because the network disconnected (NetworkException) or whatever.
>
> RangeError imply that the data magically appears from a range, without any actual source, which is impossible.

Of course it's not impossible.

std.range.iota
std.range.recurrence
std.range.sequence

>> However, some ranges may want to do something else in this situation.  An output range that writes to a file, for instance, may want to throw a "disk full" exception.  A wrapper range may simply propagate any errors from the underlying range.
>>
>
> Exactly. In general, wrapper should simply forward calls and let the source fail when its own logic isn't involved.

This is not always possible or convenient.  Assume the directory "foo" contains 4 files.

    auto fourFiles = std.file.dirEntries("foo", SpanMode.shallow);
    auto twoFiles = std.range.take(fourFiles, 2);
    twoFiles.popFront();
    twoFiles.popFront();
    twoFiles.popFront();

Which exception should the last popFront() throw?

Lars
April 03, 2013
On Wed, 03 Apr 2013 11:32:07 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> On Wednesday, 3 April 2013 at 14:32:59 UTC, Steven Schveighoffer wrote:

>> I disagree here.  There are two "users" involved, one is the actual user, typing a command on the command line, and then the developer who uses the function.  The developer should be checked with assert, the user should be checked with code like you wrote.
>>
>> The problem becomes apparent when developers don't check user input before passing to your functions.  That is on them, not you.  The library should be able to have all the safety checks removed to improve performance.
>
> Some, yes, but not all.  You always have to weigh the benefit, i.e. the improved performance, against the drawbacks, i.e. reduced safety.  If you are removing trivial safety checks from a function that performs a very expensive and possibly dangerous operation -- a disk operation, say -- you're doing something wrong.

The problem with this is if you "weigh the benefit" in the library, then the user of the library has no choice.

Especially in phobos or druntime, we are disallowing libraries that build on top of them from saying "yes, I know this data is correct".  In essense, he is stuck checking user input, and then passing the data to your function, which checks it again.  There is no way out.

> I agree it should be possible to remove safety checks from functions which are expected to be performant, and where the checks will have an impact (e.g. the range primitives), but it should be done with a less attractive compiler switch than -release.

I think it needs to be on a call-by-call basis, not a compiler-global.  The same function could be called in the program twice, once with user-supplied data, once with static data.  In the former, I want to run the checks, in the latter, I don't.

> I think it's a big mistake to encourage programmers to ship their programs with array bounds checks and the like disabled.  Such programs should be the exception, not the rule.  It's always better to err on the side of safety rather than performance.

No, I completely disagree.  It's very rare that you need bounds checks on data that is generated by the program.

Note that you can enable bounds checks by writing code in @safe D if that's how you wish to operate.

-Steve
April 03, 2013
On 04/02/2013 08:16 PM, Jesse Phillips wrote:> On Tuesday, 2 April 2013 at 19:10:47 UTC, Ali Çehreli wrote:

>> > I've mostly enjoyed having temporary files being cleaned up
>> upon some
>> > range access error which has no effect on my removing files
>> that are no
>> > longer valid.
>>
>> The problem is, the runtime cannot know that it will be doing what you
>> really wanted. The incorrect program state may result in deleting the
>> wrong file.
>
> See above, the state isn't invalid. The error is thrown which is
> stating, "hey, buddy, good thing you didn't flip that release switch as
> I'm about to do something I shouldn't be."

An assert() is for situations that the programmer knows (!) to never exist. If my foo() function called my bar() function with 42, it is an invalid state. That is when assert() failed and threw Error.

The Error is not thrown before entering the invalid state but at the first point that the error has been discovered. In that regard, the program is in a bad state.

> However D does allow nothrow functions to throw Errors. I wouldn't say
> this would cause the program enter into an invalid state (memory
> corruption) but it would be a bad state (contract violations).

There are some issues with contract programming in D. For example, some people think that whether the pre-conditions of a module to be compiled should be determined by the user of that module, not the library (e.g. Phobos).

Because the above is not the case today, if I write a function, I cannot put the function pre-conditions in 'in' blocks because I don't know whether my function is being called as an implementation of my module or as an API function. The API function foo() may also be used as part of the implementation of the same module. (Maybe the same pre-condition checks should be repeated in the 'in' block and in the body; as asserts and corresponding enforces.)

For that reason, in general, I think that pre-conditions should by default be implemented as enforce() calls in the function body, not assert() calls in the 'in' blocks.

With that aside, a failed assert() in an 'in' block should still point at an invalid program state.

> Take the RangeError thrown when you pop an empty range. Under what
> scenario would receiving one of these would indicate that my file isn't
> correct for deletion (any more so than say a ConvException from the same
> range).
>
>      auto myFile = "some.tmp";
>      scope(exit) remove(myFile);
>
>      // setup code here
>      manipulateFileRange(range);

We are in agreement that it would be impossible to prove one way or the other whether removing the file would be the right thing to do or whether it will succeed. The difference in thinking is whether that should be attempted at all when some part of the code has determined that something is wrong with the program.

Ali

April 03, 2013
On Wed, Apr 03, 2013 at 09:19:24AM -0700, Ali Çehreli wrote:
> On 04/02/2013 08:16 PM, Jesse Phillips wrote:> On Tuesday, 2 April
[...]
> > However D does allow nothrow functions to throw Errors. I wouldn't
> > say this would cause the program enter into an invalid state (memory
> > corruption) but it would be a bad state (contract violations).
> 
> There are some issues with contract programming in D. For example, some people think that whether the pre-conditions of a module to be compiled should be determined by the user of that module, not the library (e.g. Phobos).
> 
> Because the above is not the case today, if I write a function, I cannot put the function pre-conditions in 'in' blocks because I don't know whether my function is being called as an implementation of my module or as an API function. The API function foo() may also be used as part of the implementation of the same module. (Maybe the same pre-condition checks should be repeated in the 'in' block and in the body; as asserts and corresponding enforces.)

This is very bad. It makes greatly diminishes the value of DbC in D. What are the obstacles preventing us from fixing DMD so that contracts are compiled with user code instead of library code?


> For that reason, in general, I think that pre-conditions should by
> default be implemented as enforce() calls in the function body, not
> assert() calls in the 'in' blocks.
[...]

Yes, this is the result of the wrong implementation of putting contracts in the called function rather than the caller. It makes 'in' blocks useless when you don't have ready access to library source code.


T

-- 
Computers shouldn't beep through the keyhole.
April 03, 2013
On Wednesday, April 03, 2013 16:23:31 Lars T. Kyllingstad wrote:
> On Wednesday, 3 April 2013 at 14:11:09 UTC, Steven Schveighoffer
> 
> wrote:
> > The problem I have with this whole scheme of one class per error type is, you inevitably cannot cover everyone's use case, so they end up having to catch a base class and then doing the work to figure out what it is manually.
> 
> Precisely. And then, a switch over an enum is both way more efficient and more readable than a bunch of ifs and casts.

Except that the enum solution doesn't work at all when a 3rd party needs to add an error type to the mix, so they'll subclass it, and in the general case, you're forced to deal with the base class regardless.

In general, I think that using error codes in exception types like you're basically suggesting is not properly taking advantage of the language's built- in exception features. catch is _designed_ for differentiating between errors, and using error codes in exceptions circumvents that. I think that the only case where it's cleaner to have the code is when you want to handle two exceptions of differing types with the same handler but don't want to handle all exceptions with their common base type in the same way. Other than that, I think that it's much better to let catch do its job - especially when people _do_ need to handle each exception type differently.

- Jonathan M Davis
April 03, 2013
03-Apr-2013 21:43, Jonathan M Davis пишет:
> On Wednesday, April 03, 2013 16:23:31 Lars T. Kyllingstad wrote:
>> On Wednesday, 3 April 2013 at 14:11:09 UTC, Steven Schveighoffer
>>
>> wrote:
>>> The problem I have with this whole scheme of one class per
>>> error type is, you inevitably cannot cover everyone's use case,
>>> so they end up having to catch a base class and then doing the
>>> work to figure out what it is manually.
>>
>> Precisely. And then, a switch over an enum is both way more
>> efficient and more readable than a bunch of ifs and casts.
>
> Except that the enum solution doesn't work at all when a 3rd party needs to
> add an error type to the mix,

This is the critical point. There shouldn't ever be a need to add a new exception type to differ in a minor way. Mark you wild type as unknown Kind of IOExcpetion is you need it.

If you want error to propagate some extra information that's a whole other story but just subclassing everything and putting you brand name on it doesn't scale. It doesn't help standardized error handling nor with propagating specific info. Having million kinds (each library with its own) to propagate this info is awful and proven to suck (see C++).


>
> In general, I think that using error codes in exception types like you're
> basically suggesting is not properly taking advantage of the language's built-
> in exception features. catch is _designed_ for differentiating between errors

Between classes of Exceptions. Not between individual subtle thingies.


-- 
Dmitry Olshansky
April 03, 2013
On Wednesday, April 03, 2013 10:37:31 deadalnix wrote:
> On Tuesday, 2 April 2013 at 20:11:31 UTC, Lars T. Kyllingstad
> 
> wrote:
> > No. To call front on an empty range is a programming error, plain and simple. It's like trying to access the first element of an empty array. The fact that some ranges may allow it anyway does not change anything.
> 
> It is illegal for a reason. For instance, with an array, it is an out of bound access. I see ne benefice to hide this information in a more generic RangeError. This is hiding information and providing nothing more.

RangeError _is_ out-of-bounds-access. Range in this context has nothing to do with ranges in the D sense (though that would be a good argument for changing it to something more like OutOfBoundsError).

I'm not sure that I'd use RangeError for the popFront case (if I didn't, I'd just a normal assertion), but I don't think that it's necessarily wrong or at all bad to use a RangeError in that case. We've definitely started moving towards using RangeError in version(assert) blocks for opIndex and opSlice in std.range and std.algorithm. And we're doing that precisely to make them act more like arrays.

- Jonathan M Davis
April 03, 2013
On Wednesday, April 03, 2013 17:41:52 Tobias Pankrath wrote:
> On Wednesday, 3 April 2013 at 15:33:53 UTC, deadalnix wrote:
> > On Wednesday, 3 April 2013 at 09:44:23 UTC, Lars T. Kyllingstad
> > 
> > wrote:
> >> I don't see the incompatibility. This is exactly the purpose of final switch. If the user didn't want to be forced to handle a new error category, they'd use normal switch instead.
> > 
> > This is a good thing in your own code. But in phobos, this is a guarantee to break a lot of user code when adding to the enum.
> 
> You're using final switch because you want to make statically sure that you consider every enum member. If you now add a another enum value the code is by definition broken regardless of dmd issues an error or not. The same is true for any kind of virtual dispatch (if a normal switch with default: wasn't enough your base class method, won't be either).

Yeah. I'm very much in favor of using derived classes over putting error types in the exceptions, but I'm not sure that breaking code using final switch is a good argument against the error types. You use final switch _because_ you want your code to break if the list of enum members changes. Then again, given the increased focus on not breaking user code and Walter's general attitude about that, I expect that he'd then be against ever adding or removing members from such an enum, because it _would_ break code.

- Jonathan M Davis
April 03, 2013
On Wednesday, April 03, 2013 10:33:00 Steven Schveighoffer wrote:
> On Wed, 03 Apr 2013 01:59:32 -0400, Lars T. Kyllingstad
> 
> <public@kyllingen.net> wrote:
> > On Wednesday, 3 April 2013 at 05:42:13 UTC, Lars T. Kyllingstad wrote:
> >> The problem with assert is that it gets disabled in release mode.
> >> 
> >> I think it is a bad idea to have this be the "standard" behaviour of
> >> 
> >> parameter validation.
> > 
> > Allow me to clarify my position on assert() a bit. As a general rule, I think assert() should be used to verify internal program flow and program invariants, and nothing more. Specifically, public APIs should *not* change depending on whether asserts are compiled in or not.
> > 
> > Say I am writing a function that you are using. I don't trust you to always give me correct parameters, so I check them. (Maybe my function could even do something dangerous if I didn't.)
> > 
> > public void myFunction(someArgs)
> > {
> > 
> > if (someArgs are invalid)
> > 
> > throw new InvalidArgumentError;
> > 
> > ...
> > 
> > }
> 
> I disagree here. There are two "users" involved, one is the actual user, typing a command on the command line, and then the developer who uses the function. The developer should be checked with assert, the user should be checked with code like you wrote.
> 
> The problem becomes apparent when developers don't check user input before passing to your functions. That is on them, not you. The library should be able to have all the safety checks removed to improve performance.

I completely agree with this. There _are_ Errors which should definitely stay in in release, but in general, anything relating to DbC should be removed in release. It's then up to the caller to decide what testing they do or don't want in release.

> I wish there was a way to say "this data is unchecked" or "this data is checked and certified to be correct" when you call a function. That way you could run the in contracts on user-specified data, even with asserts turned off, and avoid the checks in release code when the data has already proven valid.

That would definitely be cool, though I have no idea how we could ever implement that beyond creating wrapper types which indicated whether the data was checked or not, which would introduce extra overhead - though if you're willing to overload all of your functions, you could do something like

Verify!T verifyFoo(T param)
{
 //do checks...
 return Verify!T(param);
}

void foo(T param)
{
 foo(verifyFoo(param));
}

void foo(Verified!T param)
{
 ...

}

That would be rather intrusive, but you _could_ do it if you wanted to.

However, the way that contracts really _should_ work is that whether they're enabled or not is completely dependent on how the caller is compiler rather than the callee, whereas right now, other than templated code, it's based entirely on how the callee is compiled. That way, it's completely up to the caller whether the checks happen or not. But unfortunately, I don't think that it would ever work very well from an implementation standpoint for it to work that way - not as long as we're using the C linker - since you'd need a way to associate the contracts with the functions in a way that it was up to the caller to run them or not, and I don't think that that's really possible with how C linking works.

- Jonathan M Davis