November 18, 2013
On 11/18/2013 1:23 PM, Jonathan M Davis wrote:
> Then what would you suggest for dealing with cases where a nothrow function
> calls a function which can throw in general but can't throw with the given
> arguments? Completely separate from the issue of whether std.datetime should
> need to be calling a throwing function from a nothrow function as often as it
> does, it's still an issue which must be dealt with in general. At present, any
> nothrow function which wants to call a throwing function needs to use a try-
> catch block to wrap the throwing function, and it makes good sense to use
> assert(0) to catch the case where the caller screwed up and the function can
> indeed throw. However, given that that throwing function should never throw,
> it's desirable to be able to optimize out the try-catch-assert(0). How would
> we do that without taking advantage of the fact that assert(0) indicates that
> that line of code should be unreachable?

I believe this is solving the problem in the wrong place. The function being called should not be throwing, and should be defined as not throwing. See my other posts in this thread on the topic.

November 18, 2013
On 11/18/2013 1:04 PM, Jacob Carlborg wrote:
> On 2013-11-18 21:49, Walter Bright wrote:
>
>> Most architectures I've used supported a HLT, and there's always
>> something the back end could construct if it had to, even:
>>
>>       ...
>>       call halt
>>       ...
>
> In that case, would we want it to be legal to throw AssertError instead of using
> the halt instructions? Because that's what the docs says.

Sure.

November 18, 2013
On 11/18/2013 1:16 PM, Jonathan M Davis wrote:
> On Monday, November 18, 2013 12:18:31 Walter Bright wrote:
>> I agree. But I'll add that it would be even better to redesign FracSec.from
>> so it never throws at all - have it assert(0) for invalid values being
>> passed to it.
>>
>> Exceptions should not be thrown for invalid arguments. Data validation
>> should be handled separately.
>
> I think that that depends. In some cases, it would just mean duplicating work
> if the API expected the caller to verify the arguments' validity first. A prime
> example of this would be functions which parse strings. It would be highly
> inefficient for a function like fromISOExtString to assert instead of enforcing.
> The caller would have to do the exact work that the function has to do in
> order to do the check, so it's just wasteful to expect the caller to ensure
> the validity of the input.

I think this is exactly a case where you want to have data validation separate from operating on that data. Operating on the data should be efficient, and it is not if it must also validate.


> Constructors for types like SysTime or DateTime are in a bit more of a grey
> area in that in most cases what's passed is likely to be valid - e.g.
> hopefully, when someone gives a date, they have some clue that the date is
> valid ahead of time. However, there are cases where the caller really has no
> way of knowing ahead of time whether a date is going to be valid or not - e.g.
> is February 29th a valid date in year X? A function could be provided to check
> the validity ahead of time, but that puts a greater burden on the caller and
> ends up duplicating the checks. It also would have created more code
> duplication in std.datetime in some places, because the checks would have to
> be put in more places rather than having them in a centralized location (e.g.
> Date's constructor can check the validity of the values, and DateTime can just
> take advantage of that rather than having to duplicate the check). I do
> acknowledge however that it's a grey area.

Again, I think data validation must be a separate function. std.datetime is a prime example of why.


> In general, std.datetime takes a defensive programming approach rather than
> the DbC approach with the idea that it's often likely that the input is going
> to be originate from a user or file and that many of the checks have to be done
> regardless, so asking the programmer to do them just adds overhead. In some
> cases, it arguably should have used DbC, but there were enough places that it
> should have been using defensive programming already that it was more
> consistent to use defensive programming in any of the grey areas.

I think the fact that there's a lot of code in std.datetime that asserts if the underlying functions throw demonstrates my point.

Input data validation and operating on the data should be separate operations, not combined.

Functions that operate on data should assert on bad arguments, not throw.

November 18, 2013
On 11/18/2013 1:02 PM, Jacob Carlborg wrote:
> I really don't understand the point in having some functions thrown and others
> assert.

That's a red flag that there is a faulty design somewhere, in this case I think it's not having a separation between input data validation and program bug detection.

November 18, 2013
On Monday, November 18, 2013 13:56:39 Walter Bright wrote:
> On 11/18/2013 1:16 PM, Jonathan M Davis wrote:
> > On Monday, November 18, 2013 12:18:31 Walter Bright wrote:
> >> I agree. But I'll add that it would be even better to redesign
> >> FracSec.from
> >> so it never throws at all - have it assert(0) for invalid values being
> >> passed to it.
> >> 
> >> Exceptions should not be thrown for invalid arguments. Data validation should be handled separately.
> > 
> > I think that that depends. In some cases, it would just mean duplicating work if the API expected the caller to verify the arguments' validity first. A prime example of this would be functions which parse strings. It would be highly inefficient for a function like fromISOExtString to assert instead of enforcing. The caller would have to do the exact work that the function has to do in order to do the check, so it's just wasteful to expect the caller to ensure the validity of the input.
> 
> I think this is exactly a case where you want to have data validation separate from operating on that data. Operating on the data should be efficient, and it is not if it must also validate.

But that would just duplicate the validation. You validate by parsing the string, and you extract the necessary data from it by parsing it. Validating the data first would just double the work - on top of the fact that strings are most likely to have come from outside the program rather than having been generated internally and then parsed internally. This is exactly the sort of case where I think that separate validation makes no sense. Separate validation only makes sense when the result is _less_ overhead, not more.

Separate validation is of benefit when the caller can avoid the validation in most cases, and the function itself doesn't need to do it, so the validition only occurs when the caller needs it. If the validation has to be done anyway, then there's no point in having separate validation. It just increases overhead.

- Jonathan M Davis
November 18, 2013
On Monday, November 18, 2013 13:50:42 Walter Bright wrote:
> On 11/18/2013 1:23 PM, Jonathan M Davis wrote:
> > Then what would you suggest for dealing with cases where a nothrow
> > function
> > calls a function which can throw in general but can't throw with the given
> > arguments? Completely separate from the issue of whether std.datetime
> > should need to be calling a throwing function from a nothrow function as
> > often as it does, it's still an issue which must be dealt with in
> > general. At present, any nothrow function which wants to call a throwing
> > function needs to use a try- catch block to wrap the throwing function,
> > and it makes good sense to use assert(0) to catch the case where the
> > caller screwed up and the function can indeed throw. However, given that
> > that throwing function should never throw, it's desirable to be able to
> > optimize out the try-catch-assert(0). How would we do that without taking
> > advantage of the fact that assert(0) indicates that that line of code
> > should be unreachable?
> 
> I believe this is solving the problem in the wrong place. The function being called should not be throwing, and should be defined as not throwing. See my other posts in this thread on the topic.

Sometimes that's true, but if you're asserting that it's not going to be the case that folks are going to need to calling functions which can throw but won't throw from inside nothrow functions, I think that your dead wrong. If nothing else, you don't always have control over what is and isn't going to throw. Maybe it's uncommon enough that it's not worth trying to optimize out the try-catch blocks, but I think that asserting that you won't ever need to call a throwing function from a nothrow function when you know it won't throw is like asserting that @trusted isn't needed.

- Jonathan M Davis
November 18, 2013
On Monday, November 18, 2013 22:04:59 Jacob Carlborg wrote:
> In that case, would we want it to be legal to throw AssertError instead of using the halt instructions? Because that's what the docs says.

For when you're not in release mode. You want the more informative AssertError in that case. HLT is just for when assertions get compiled out.

- Jonathan M Davis
November 18, 2013
On 11/18/13 12:53 PM, Jacob Carlborg wrote:
> On 2013-11-18 16:56, Andrei Alexandrescu wrote:
>
>> But three newlines for four lines? That's not paragraphs, it's a grocery
>> list.
>
> As Jonathan said, this example is quite bad in the empty lines vs code
> lines ratio.
>
> But I general I do put an empty newline before and after each statement.

So that would be two empty lines between statements :o). Anyhow, assuming you only meant one, that would be excessive if done for _all_ statements (i.e. everything ending with a semicolon). It would make code look like a doc printed by someone who didn't know how to turn double spacing off.


Andrei

November 18, 2013
On 11/18/13 1:50 PM, Walter Bright wrote:
> I believe this is solving the problem in the wrong place. The function
> being called should not be throwing, and should be defined as not
> throwing. See my other posts in this thread on the topic.

I think a stance of unceremoniously halting the program upon passing the wrong parameters to a standard library function is a bit excessive.

Andrei


November 18, 2013
I'm glad we're discussing this. There's an important misunderstanding.

On 11/18/2013 2:16 PM, Jonathan M Davis wrote:
> But that would just duplicate the validation. You validate by parsing the
> string, and you extract the necessary data from it by parsing it. Validating
> the data first would just double the work - on top of the fact that strings are
> most likely to have come from outside the program rather than having been
> generated internally and then parsed internally. This is exactly the sort of
> case where I think that separate validation makes no sense. Separate
> validation only makes sense when the result is _less_ overhead, not more.

The point of asserts is that they are supposed to be redundant. (If they were not redundant, then they would trip and you'd have a program bug.) Asserts are there to detect program bugs, not to validate input data. This is also why the optimizer can remove asserts without affecting the meaning of the code.


> Separate validation is of benefit when the caller can avoid the validation in
> most cases, and the function itself doesn't need to do it, so the validition
> only occurs when the caller needs it. If the validation has to be done anyway,
> then there's no point in having separate validation. It just increases
> overhead.

Consider again that std.datetime is chock full of code that requires the underlying functions to not throw.

Anytime there is code of the pattern:

   try {
        func();
   } catch (Exception e) {
       assert(0, "func() should not have thrown");
   }

then func() has the wrong API or is being misused.

Again, data validation and bug checking are two very different things and should not be combined.

Note that I have screwed this up myself in the std.uni API.