September 20, 2017 wstring hex literals
I don't seem to be having any issues making strings or dstrings from hex, but I run into some issues with wstrings. Of course, my knowledge of UTF-16 is limited, but I don't see any issues with the code below and I get some errors on the hex string literal.

    unittest
    {
        wchar data = 0x03C0;
        auto data2 = x"03C0"w;
        static assert(typeof(data2) == wstring);
    }

    testing_utf16.d(5): Error: Truncated UTF-8 sequence
    testing_utf16.d(6): while evaluating: static assert((_error_) == (wstring ))
    Failed: ["dmd", "-unittest", "-v", "-o-", "testing_utf16.d", "-I."]
September 20, 2017 Re: wstring hex literals
Posted in reply to jmh530 | On Wednesday, 20 September 2017 at 15:04:08 UTC, jmh530 wrote:
> testing_utf16.d(5): Error: Truncated UTF-8 sequence
> testing_utf16.d(6): while evaluating: static assert((_error_) == (wstring
> ))
> Failed: ["dmd", "-unittest", "-v", "-o-", "testing_utf16.d", "-I."]

https://dlang.org/spec/lex.html#hex_strings says:

> The string literals are assembled as UTF-8 char arrays, and the postfix is applied to convert to wchar or dchar as necessary as a final step.

This isn't the friendliest thing ever and is contrary to my expectations too. You basically have to encode your string into UTF-8 and then paste the hex of that in.

What should work is escape sequences:

    wstring str = "\u03c0"w;
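To make the difference concrete, here is a minimal sketch of the escape-sequence approach next to the UTF-8 bytes of the same character. The names `piUtf16` and `piUtf8` are made up for illustration and are not from the thread. It deliberately avoids `x"..."` literals, since their handling may differ between compiler versions; it only shows that U+03C0 is one UTF-16 code unit (0x03C0) but two UTF-8 bytes (0xCF 0x80), which would be the bytes to paste under the rule quoted above.

```d
import std.utf : toUTF8, toUTF16;

// Hypothetical names, for illustration only.
enum wstring piUtf16 = "\u03C0"w; // escape sequence with the w postfix
enum string piUtf8 = "\u03C0";    // same character, stored as UTF-8

unittest
{
    // The escape sequence yields a real wstring with one code unit.
    static assert(is(typeof(piUtf16) == wstring));
    assert(piUtf16.length == 1);
    assert(piUtf16[0] == 0x03C0);

    // In UTF-8 the same character takes two bytes: 0xCF 0x80.
    assert(piUtf8.length == 2);
    assert(cast(ubyte) piUtf8[0] == 0xCF);
    assert(cast(ubyte) piUtf8[1] == 0x80);

    // Transcoding in either direction gives the other form.
    assert(toUTF8(piUtf16) == piUtf8);
    assert(toUTF16(piUtf8) == piUtf16);
}
```

Compiled with `dmd -unittest -main`, the assertions should pass. Read as UTF-8, the bytes 0x03 0xC0 from the original post end on a lead byte with no continuation byte after it, which fits the "Truncated UTF-8 sequence" error.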
September 20, 2017 Re: wstring hex literals
Posted in reply to Neia Neutuladh | On Wednesday, 20 September 2017 at 16:26:46 UTC, Neia Neutuladh wrote:
> On Wednesday, 20 September 2017 at 15:04:08 UTC, jmh530 wrote:
>> testing_utf16.d(5): Error: Truncated UTF-8 sequence
>> testing_utf16.d(6): while evaluating: static assert((_error_) == (wstring
>> ))
>> Failed: ["dmd", "-unittest", "-v", "-o-", "testing_utf16.d", "-I."]
>
> https://dlang.org/spec/lex.html#hex_strings says:
>
>> The string literals are assembled as UTF-8 char arrays, and the postfix is applied to convert to wchar or dchar as necessary as a final step.
>
> This isn't the friendliest thing ever and is contrary to my expectations too. You basically have to encode your string into UTF-8 and then paste the hex of that in.
>
> What should work is escape sequences:
>
> wstring str = "\u03c0"w;
I see, thanks. I missed that bit on UTF-8. I was a little confused.