April 07, 2015
On Tuesday, 7 April 2015 at 04:05:38 UTC, Vladimir Panteleev wrote:
> On Tuesday, 7 April 2015 at 03:17:26 UTC, Walter Bright wrote:
>> http://wiki.dlang.org/DIP76
>
> I am against this. It can lead to silent irreversible data corruption.

Sigh!

99.99% of the time when I'm processing text.... my program didn't create the text.

It was written by an eclectic mob of text editors, driven by a herd of cats, each with a wildly different idea of what the encoding was.

99.999% of the time when I hit one of these cases... the "irreversible data corruption" is _already_ there.

Tough.

It's there, it's irreversible, I have to live with it and make forward progress.

Sure, on some tasks I want to know it is there... but in the vast majority of tasks all I can do is shrug, slap it to something sensible, and carry on.

One of the first things I had to do in D was write code to do this... and it all seemed way harder and slower than it needed to be.
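
Roughly this sort of thing, give or take; a quick sketch using nothing but std.utf.decode and UTFException (the helper name is made up):

import std.utf : decode, UTFException;

// Hypothetical helper, not Phobos: decode `input`, replacing every
// invalid UTF-8 sequence with U+FFFD and carrying on.
dstring slapToSomethingSensible(string input)
{
    dstring result;
    size_t i = 0;
    while (i < input.length)
    {
        immutable start = i;
        try
        {
            result ~= decode(input, i);  // advances i past a valid sequence
        }
        catch (UTFException)
        {
            result ~= '\uFFFD';          // the Unicode replacement character
            i = start + 1;               // skip one bad code unit, carry on
        }
    }
    return result;
}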

(Oh for The Simple Fun Good Bad Old Days of everything is 7 bit ASCII... except for the funny stuff above 127 which you ignored anyway.)
April 08, 2015
On Tuesday, 7 April 2015 at 18:18:55 UTC, H. S. Teoh wrote:
> If somebody were to write a DIP for killing autodecoding, I'd vote in
> favor.
>
> Getting it past Andrei, OTOH, is a different story. ;-)

http://forum.dlang.org/post/luonbfghopyrtcoejjsu@forum.dlang.org
But how can a DIP address a non-technical issue?
April 19, 2015
On Monday, April 06, 2015 20:16:19 Walter Bright via Digitalmars-d wrote:
> http://wiki.dlang.org/DIP76

I am fully in favor of this. Most code really doesn't care about invalid Unicode, and if it does, it can check explicitly. Using the replacement character is much cleaner and follows the Unicode standard.

And in my experience, if I run into invalid Unicode, I generally have to process it regardless, forcing me to do something like use the replacement character anyway. The fact that std.utf.decode throws just becomes an annoyance.
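
For the code that does care, the explicit check only needs to happen once, at the boundary. A sketch using std.encoding (the surrounding function is hypothetical):

import std.encoding : isValid, sanitize;

// Hypothetical example: validate once where the data comes in, instead of
// catching UTFException deep inside every range chain.
void processLine(string line)
{
    if (!isValid(line))
    {
        // Reject the input here, or scrub it once:
        // invalid sequences become U+FFFD.
        line = sanitize(line);
    }

    // From here on, decoding the line cannot throw.
}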

About the only real downside I can think of is this: if you're writing a new string algorithm and you botch it so that it mangles the Unicode, right now you'd quickly get exceptions, whereas with this change you wouldn't. But if you're testing your string-based code with Unicode rather than just ASCII, that should still get caught. Regardless, I think that this is the way to go.
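
For example, a unittest along these lines catches the mangling either way, provided the test data isn't pure ASCII (the byte-slicing "bug" here is contrived for illustration):

import std.exception : assertThrown;
import std.utf : validate, UTFException;

unittest
{
    string s = "héllo";           // 'é' is two code units in UTF-8

    // A botched algorithm that slices on raw byte indices cuts 'é' in half.
    string mangled = s[0 .. 2];

    // Today that surfaces as an exception as soon as anything decodes it;
    // under DIP76 it would decode to U+FFFD instead, but a test that compares
    // against the expected non-ASCII output still fails either way.
    assertThrown!UTFException(validate(mangled));

    // ASCII-only test data would never expose the bug:
    // every byte slice of "hello" is valid UTF-8.
    assert("hello"[0 .. 2] == "he");
}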

- Jonathan M Davis
