DIP76: Autodecode Should Not Throw

Apr 07, 2015

Walter Bright

On Tuesday, 7 April 2015 at 04:05:38 UTC, Vladimir Panteleev wrote:
> On Tuesday, 7 April 2015 at 03:17:26 UTC, Walter Bright wrote:
>> http://wiki.dlang.org/DIP76
>
> I am against this. It can lead to silent irreversible data corruption.

Instead, I would like to suggest promoting the use of `handle` and the like:

http://dlang.org/phobos/std_exception.html#handle

This way, code that needs to be nothrow can opt in to being nothrow via such composition. That also fits the principle that introducing the risk of silent data corruption should be opt-in.
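As a rough illustration of the composition suggested above, the following sketch wraps a string in `std.exception.handle` so that a `UTFException` raised by autodecoding is replaced with a substitute character. It follows the pattern shown in the Phobos documentation for `handle`; the helper name `spacesForInvalidUTF` and the choice of a space as the substitute are assumptions made for this example only.

```d
import std.algorithm.comparison : equal;
import std.exception : handle, RangePrimitive;
import std.utf : UTFException;

/// Hypothetical helper (not part of Phobos): returns a range over `s`
/// in which every decoding failure yields a space instead of throwing.
auto spacesForInvalidUTF(string s)
{
    // RangePrimitive.access covers front, back and opIndex where available.
    return s.handle!(UTFException, RangePrimitive.access, (e, r) => ' ');
}

unittest
{
    // 0xFF can never appear in well-formed UTF-8.
    assert(spacesForInvalidUTF("hello\xFFworld").equal("hello world"));
}
```

The substitution is visible at the call site, which is the opt-in property being argued for. Whether the resulting range is actually inferred as `nothrow` is not checked here.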

On Tuesday, 7 April 2015 at 04:05:38 UTC, Vladimir Panteleev wrote:
> On Tuesday, 7 April 2015 at 03:17:26 UTC, Walter Bright wrote:
>> http://wiki.dlang.org/DIP76
>
> I am against this. It can lead to silent irreversible data corruption.

I can see the value in both. With something like Objective C on iOS, basically everything is nothrow. They don't do any cleanup for references when exceptions happen, so they don't generate slower reference counting code. Exceptions in Objective C on iOS are not supposed to be caught ever. So you don't use exceptions and garbage collection, your code runs pretty fast, and your applications are smooth.

On the other hand, not throwing the exceptions leads to silent failures, which can lead to creating garbage data. Objective C in particular is designed to tolerate failure, given that messages run on nil objects simply do nothing and return cast(T) 0 for the message's return type. You're in a world of checking return codes, validating data, etc.

Maybe autodecoding could throw an Error (No 'new' allowed) when debug mode is on, and use replacement characters in release mode. I haven't thought it through, but that's an idea.

On Tuesday, 7 April 2015 at 07:42:02 UTC, w0rp wrote:
> Maybe autodecoding could throw an Error (No 'new' allowed) when debug mode is on, and use replacement characters in release mode. I haven't thought it through, but that's an idea.

No no no, terrible idea. This means your program will pass your test suite in debug mode (which, of course, is never going to test behavior with bad UTF in all the relevant places), but silently corrupt real-world data in release mode.

Errors and asserts are for logic errors, not for validating user input!

On Tuesday, 7 April 2015 at 03:17:26 UTC, Walter Bright wrote:
> http://wiki.dlang.org/DIP76

Deprecation can be reported by checking version:

version(EnableNothrowAutodecoding)
    alias autodecode=autodecodeImpl;
else
    deprecated("compile with -version=EnableNothrowAutodecoding")
    alias autodecode=autodecodeImpl;

On Tuesday, 7 April 2015 at 03:17:26 UTC, Walter Bright wrote:
> http://wiki.dlang.org/DIP76

I have doubts about it similar to Vladimir. Main problem is that I have no idea what actually happens if replacement characters appear in some unicode text my program processes. So far I have that calming feeling that if something goes wrong in this regard, exception will slap me right into my face.

Also it is worrying to see so much effort put into `nothrow` in language which endorses exceptions as its main error reporting mechanism.

On 4/7/2015 1:19 AM, Dicebot wrote:
> I have doubts about it similar to Vladimir. Main problem is that I have no idea
> what actually happens if replacement characters appear in some unicode text my
> program processes.

It's much like floating point NaN values, which are 'sticky'.

> So far I have that calming feeling that if something goes
> wrong in this regard, exception will slap me right into my face.

With UTF strings, if you care about invalid UTF (a surprisingly large amount of operations done on strings simply don't care about invalid UTF) the validation can be done as a separate step. Then, the program logic is divided into operating on "validated" and "unvalidated" data.

> Also it is worrying to see so much effort put into `nothrow` in language which
> endorses exceptions as its main error reporting mechanism.

There is definitely a tug of war going on there. Exceptions are great, except they aren't free.

What I've tried to do is design things so that erroneous input is not possible - that all possible input has straightforward output. In other words, try to define the problem out of existence. Then there are no errors.
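To make the "validate as a separate step" idea concrete, here is a minimal sketch using present-day Phobos. It assumes `std.utf.validate` for the explicit check and `std.utf.byDchar` for walking unvalidated data, where ill-formed sequences surface as `replacementDchar` (U+FFFD) rather than as exceptions. The helper `isValidUTF` is a hypothetical name introduced here, not an existing library function.

```d
import std.algorithm.searching : canFind;
import std.utf : byDchar, replacementDchar, validate, UTFException;

/// Hypothetical helper: reports whether `s` is well-formed UTF-8.
bool isValidUTF(string s)
{
    try
    {
        validate(s); // throws UTFException on ill-formed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    // Code that cares about validity checks once, up front.
    assert(isValidUTF("hello"));
    assert(!isValidUTF("hello\xFFworld"));

    // Code that does not care can walk unvalidated data anyway;
    // the bad sequence shows up as U+FFFD in the decoded output.
    assert("hello\xFFworld".byDchar.canFind(replacementDchar));
}
```

Exactly how many code units are folded into each replacement character is left to the library and is deliberately not asserted here.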

April 07, 2015

Re: DIP76: Autodecode Should Not Throw

Posted by Vladimir Panteleev
in reply to Walter Bright

On Tuesday, 7 April 2015 at 09:04:09 UTC, Walter Bright wrote:
> On 4/7/2015 1:19 AM, Dicebot wrote:
>> I have doubts about it similar to Vladimir. Main problem is that I have no idea
>> what actually happens if replacement characters appear in some unicode text my
>> program processes.
>
> It's much like floating point NaN values, which are 'sticky'.

Yes, but std.conv doesn't return NaN if you try to convert "banana" to a double.

> With UTF strings, if you care about invalid UTF (a surprisingly large amount of operations done on strings simply don't care about invalid UTF) the validation can be done as a separate step.

So can converting invalid UTF to replacement characters.

>> Also it is worrying to see so much effort put into `nothrow` in language which
>> endorses exceptions as its main error reporting mechanism.
>
> There is definitely a tug of war going on there. Exceptions are great, except they aren't free.
>
> What I've tried to do is design things so that erroneous input is not possible - that all possible input has straightforward output. In other words, try to define the problem out of existence. Then there are no errors.

I think the correct solution to that is to kill auto-decoding :) Then all decoding is explicit, and since it is explicit, it is trivial to allow specifying the desired behavior upon encountering invalid UTF-8.
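For readers wondering what explicit decoding with a caller-chosen policy might look like, the sketch below uses `std.utf.decode` with its `useReplacementDchar` flag, as found in current Phobos (this flag may not have been available when the thread was written). The input string and index values are illustrative assumptions.

```d
import std.exception : assertThrown;
import std.typecons : No, Yes;
import std.utf : decode, replacementDchar, UTFException;

unittest
{
    string s = "a\xFFb"; // index 1 holds an invalid UTF-8 code unit

    // Policy 1: the caller asks for an exception on invalid input.
    size_t i = 1;
    assertThrown!UTFException(decode!(No.useReplacementDchar)(s, i));

    // Policy 2: the caller asks for U+FFFD instead of an exception.
    size_t j = 1;
    assert(decode!(Yes.useReplacementDchar)(s, j) == replacementDchar);
}
```

Because the policy is a template argument at the call site, neither behavior is applied silently.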

Vladimir Panteleev:

> std.conv doesn't return NaN if you try to convert "banana" to a double.

I have suggested to add a nothrow function like "maybeTo" that returns a Nullable result.

Bye,
bearophile
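A function along the lines of the suggested "maybeTo" does not exist in Phobos as far as this thread indicates, so the following is only a sketch of one possible shape. It wraps `std.conv.to` and reports failure through `std.typecons.Nullable`. The name, signature, and the decision to swallow every `Exception` are assumptions made for illustration.

```d
import std.conv : to;
import std.typecons : Nullable, nullable;

/// Hypothetical sketch of "maybeTo": converts `value` to `T`,
/// returning an empty Nullable instead of throwing on failure.
Nullable!T maybeTo(T, S)(S value)
{
    try
    {
        return nullable(value.to!T);
    }
    catch (Exception)
    {
        // Catching Exception (not only ConvException) is what would let
        // the compiler treat this body as non-throwing.
        return Nullable!T.init;
    }
}

unittest
{
    assert(maybeTo!double("banana").isNull);
    assert(maybeTo!double("1.5").get == 1.5);
    assert(maybeTo!int("42").get == 42);
}
```

The caller then has to test `isNull` before using the result, which keeps the failure visible without relying on exceptions.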
