Posted by Uknown in reply to auto
On Sunday, 1 April 2018 at 01:19:08 UTC, auto wrote:
> What is auto decoding and why it is a problem?
Auto-decoding is tied to the UTF representation of Unicode strings. In D, `char[]` and `string` hold UTF-8, `wchar[]` and `wstring` hold UTF-16, and `dchar[]` and `dstring` hold UTF-32. You need to know how UTF works in order to understand auto-decoding. Since in practice most code deals with UTF-8, I'll explain with respect to that. Essentially, the problem comes down to the fact that not every Unicode character fits in a single 8-bit `char`. Only ASCII is represented the "normal" way, one `char` per character. UTF-8 uses the fact that the top bit of a byte is never set in ASCII: the leading bits of the first `char` tell you how many `char`s the character is encoded in. You can watch this video for a better understanding[0]. So if you naively traverse a `char[]` one element at a time looking for characters, you get unexpected results as soon as the data contains non-ASCII text.
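To make the "leading bits" part concrete, here is a rough sketch of how the first `char` of a sequence tells you its length. The function name `sequenceLength` is made up for illustration, and it is deliberately simplified (it does not reject overlong or out-of-range lead bytes). Phobos has `std.utf.stride` for the real thing.

```d
/// Hypothetical helper: number of code units in the UTF-8 sequence that
/// starts with `lead`, or 0 if `lead` cannot start a sequence.
/// Simplified sketch: overlong and out-of-range lead bytes are not rejected.
size_t sequenceLength(char lead) pure nothrow @nogc @safe
{
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx: one continuation byte follows
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx: two follow
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx: three follow
    return 0;                            // 10xxxxxx continuation byte, or invalid
}

void main()
{
    string s = "a\u00E9"; // 'a' (ASCII) followed by U+00E9 (e with acute accent)

    assert(s.length == 3);             // .length counts chars (code units), not characters
    assert(sequenceLength(s[0]) == 1); // 'a' stands alone
    assert(sequenceLength(s[1]) == 2); // lead byte of U+00E9
    assert(sequenceLength(s[2]) == 0); // continuation byte, not a character by itself
}
```

Indexing `s[2]` above is exactly the "unexpected result" case: you get half of a character.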
Auto-decoding tries to solve this by automatically applying the decoding algorithm, so that the range functions in Phobos hand you Unicode code points (`dchar`s) instead of individual `char`s. This is where my knowledge ends though. I'll give you the pros and cons of auto-decoding.
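A small sketch of what that looks like in practice, as far as I understand it: indexing and `.length` still work on `char`s, while the range primitives (`front`, `popFront`, and anything built on them such as `walkLength`) decode to `dchar`. The helper name `codePointCount` is just something I made up for the example.

```d
import std.range : ElementType, front, popFront, walkLength;

/// Hypothetical helper: counts code points by walking the string as a range,
/// which is where auto-decoding kicks in.
size_t codePointCount(string s)
{
    return s.walkLength;
}

void main()
{
    string s = "\u00E9a"; // U+00E9 (2 code units) followed by 'a' (1 code unit)

    // Indexing gives code units, but the range element type is dchar.
    static assert(is(typeof(s[0]) == immutable(char)));
    static assert(is(ElementType!string == dchar));

    assert(s.length == 3);          // code units
    assert(codePointCount(s) == 2); // decoded code points

    assert(s.front == '\u00E9');    // front decodes the whole 2-char sequence
    s.popFront();                   // and popFront skips both chars
    assert(s == "a");
}
```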
Pros:
* It makes Unicode string handling much easier for beginners.
* Much less effort in general, it seems to "just work™"
Cons:
* It makes string handling slow by default, since characters get decoded even when the algorithm doesn't need it.
* It may be the wrong thing, since you may not want Unicode code-points, but graphemes instead.
* Auto-decoding throws exceptions on reaching invalid UTF sequences, so string handling code in general can throw and cannot be marked `nothrow`.
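To illustrate the last point, here is a minimal sketch of the exception showing up. The function name `decodesCleanly` is hypothetical; the idea is just that calling `front` on a `char[]` containing invalid UTF-8 throws a `std.utf.UTFException`.

```d
import std.range.primitives : empty, front, popFront;
import std.utf : UTFException;

/// Hypothetical helper: walks `s` the way range algorithms do and reports
/// whether every element could be decoded.
bool decodesCleanly(const(char)[] s)
{
    try
    {
        for (; !s.empty; s.popFront())
        {
            auto c = s.front; // auto-decoding happens here and may throw
        }
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    assert(decodesCleanly("a\u00E9"));

    char[] bad = [cast(char) 0xFF];       // 0xFF never appears in valid UTF-8
    assert(!decodesCleanly(bad));

    char[] truncated = [cast(char) 0xC3]; // lead byte with its continuation missing
    assert(!decodesCleanly(truncated));
}
```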
If you want to stop auto-decoding, you can use std.string.representation like this:
import std.string : representation;
// For a string this yields immutable(ubyte)[]: raw UTF-8 code units, never auto-decoded.
auto no_decode = some_string.representation;
Now no_decode won't be auto-decoded, and you can use it in place of some_string. You can also use std.utf.byCodeUnit to iterate over code units without decoding, or std.uni.byGrapheme to iterate by graphemes instead.
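Here is a rough sketch comparing the different views of the same string. The three helper names are made up for the example; `representation`, `byCodeUnit`, `byGrapheme` and `walkLength` are the actual Phobos functions.

```d
import std.range : walkLength;
import std.string : representation;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// Hypothetical helpers, one per way of looking at the string.
size_t codeUnitCount(string s)  { return s.byCodeUnit.walkLength; } // no decoding
size_t codePointCount(string s) { return s.walkLength; }            // auto-decoded
size_t graphemeCount(string s)  { return s.byGrapheme.walkLength; } // user-perceived characters

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as one character.
    string s = "e\u0301";

    static assert(is(typeof(s.representation) == immutable(ubyte)[]));
    assert(s.representation.length == 3); // raw bytes: 1 for 'e', 2 for U+0301

    assert(codeUnitCount(s) == 3);
    assert(codePointCount(s) == 2);
    assert(graphemeCount(s) == 1);
}
```

Which one you want depends on the job: code units for searching or copying, graphemes when you care about what the user sees on screen.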
You should also read this blog post: https://jackstouffer.com/blog/d_auto_decoding_and_you.html
And this forum post: https://forum.dlang.org/post/eozguhavggchzzruzkwk@forum.dlang.org
[0]: https://www.youtube.com/watch?v=MijmeoH9LT4