Autodecode?

Aug 16, 2020

Related to this thread: https://forum.dlang.org/post/xtjzhkvszdiwvrmryubq@forum.dlang.org I don't want to hijack it with my newbie questions. What is autodecode and why is it such a big deal? From what I've seen it's related to handling Unicode characters? And D has the wrong defaults?

On Sunday, 16 August 2020 at 20:53:41 UTC, JN wrote:
> Related to this thread: https://forum.dlang.org/post/xtjzhkvszdiwvrmryubq@forum.dlang.org
>
> I don't want to hijack it with my newbie questions. What is autodecode and why is it such a big deal? From what I've seen it's related to handling Unicode characters? And D has the wrong defaults?

For built-in arrays, the range primitives (`empty`, `front`, `popFront`, etc.) are implemented as free functions in the standard-library module `std.range.primitives`. [1]

For most arrays, these work the way you'd expect: `empty` checks if the array is empty, `front` returns `array[0]`, and `popFront` does `array = array[1..$]`.

But for `char[]` and `wchar[]` specifically, `front` and `popFront` work differently. They treat the arrays as UTF-8 or UTF-16 encoded Unicode strings, and return/pop the first *code point* instead of the first *code unit*. In other words, they "automatically decode" the array.

This has a number of annoying consequences. New users get mysterious template errors in the middle of range pipelines complaining about a mismatch between `dchar` (the type of a code point) and `char` (the type of a code unit). Generic code that deals with arrays has to add special cases for `char[]` and `wchar[]`. Strings don't work correctly in betterC because Unicode decoding can throw an exception. [2] If you search the forums, you'll find plenty more complaints.

The intent behind autodecoding was to help programmers avoid common Unicode-related errors by doing "the right thing" by default. The problem is that (a) decoding to code points isn't always the right thing, and (b) autodecoding ended up causing a bunch of additional problems of its own.

[1] http://dpldocs.info/experimental-docs/std.range.primitives.html
[2] https://issues.dlang.org/show_bug.cgi?id=20139
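To make the `front`/`popFront` difference described above concrete, here is a minimal sketch. It assumes a reasonably recent Phobos; the variable names are made up for illustration, and `std.utf.byCodeUnit` is not mentioned in the post above. It is shown only as one common way to opt out of decoding.

```d
import std.range.primitives : ElementType, empty, front, popFront;
import std.utf : byCodeUnit;

void main()
{
    // Ordinary array: front is array[0], popFront drops one element.
    int[] nums = [10, 20, 30];
    static assert(is(ElementType!(int[]) == int));
    assert(nums.front == 10);
    nums.popFront();
    assert(nums == [20, 30]);

    // U+00E9 (e with acute accent) takes two UTF-8 code units: 0xC3 0xA9.
    string s = "\u00E9!";
    assert(s.length == 3);    // length counts code units
    assert(s[0] == 0xC3);     // indexing yields a code unit (char)

    // The range primitives decode instead: the element type is dchar,
    // even though indexing the same string yields char.
    static assert(is(ElementType!string == dchar));
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s.front == '\u00E9');  // a whole code point
    s.popFront();                 // drops both code units of that code point
    assert(s == "!");

    // Opting out (assumption: you want code units, not code points).
    auto units = "\u00E9!".byCodeUnit;
    static assert(!is(ElementType!(typeof(units)) == dchar));
    assert(units.front == 0xC3);
    assert(units.length == 3);
    assert(!units.empty);
}
```

The two `static assert`s on `string` show the mismatch behind the template errors: generic code that expects `front` to have the same type as `array[0]` sees `dchar` on one side and `char` on the other.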