D's Auto Decoding and You (page 5)

On Fri, Jun 3, 2016 at 8:58 AM, tsbockman via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Friday, 3 June 2016 at 06:37:59 UTC, Rory McGuire wrote:
>>
>> This dpaste shows a couple of issues with combining chars in D.
>>
>> https://dpaste.dzfl.pl/4b006959c5c0
>>
>> The compiler actually can't handle a combining character literal either. See line 10.
>
> Your paste behaves as expected: the "character" types in D are defined as single Unicode code units. By definition, the NFD form of "é" is not a single code unit. You would need to use a Grapheme or [w|d]string for that.
>
> (Of course, one might reasonably question how useful our built-in character types actually are compared to ubyte/ushort/uint.)

Hmm, perhaps it behaves as documented; however, I'm not certain that it's expected :).
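
To make the point about the NFD form of "é" concrete, here is a minimal sketch that counts the same text at three levels: UTF-8 code units, autodecoded code points, and graphemes. The helper names (codeUnits, codePoints, graphemes) are hypothetical and exist only for this illustration; the strings are written with \u escapes so that no editor or mail client can silently recompose them.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Hypothetical helpers, named only for this illustration.
size_t codeUnits(string s)  { return s.length; }                // UTF-8 code units
size_t codePoints(string s) { return s.walkLength; }            // autodecoded dchars
size_t graphemes(string s)  { return s.byGrapheme.walkLength; } // user-perceived characters

void main()
{
    string nfd = "e\u0301"; // 'e' + U+0301 COMBINING ACUTE ACCENT (decomposed form)
    string nfc = "\u00E9";  // U+00E9, the precomposed form

    assert(codeUnits(nfd) == 3);  // 1 byte for 'e', 2 bytes for U+0301
    assert(codeUnits(nfc) == 2);  // U+00E9 takes 2 bytes in UTF-8

    assert(codePoints(nfd) == 2); // two code points, so it cannot fit in one dchar
    assert(codePoints(nfc) == 1); // one code point, fits in a single dchar

    assert(graphemes(nfd) == 1);  // both forms are a single grapheme
    assert(graphemes(nfc) == 1);
}
```

Under these assumptions, a character literal can hold the precomposed form but not the decomposed one, which would be consistent with the behaviour reported for the paste.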

June 03, 2016

Re: D's Auto Decoding and You

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

On 6/2/16 5:33 PM, Andrei Alexandrescu wrote:
> On 6/2/16 5:27 PM, Steven Schveighoffer wrote:
>> On 6/2/16 5:21 PM, jmh530 wrote:
>>> On Tuesday, 17 May 2016 at 14:06:37 UTC, Jack Stouffer wrote:
>>>>
>>>> If you think there should be any more information included in the
>>>> article, please let me know so I can add it.
>>>
>>> I was a little confused by something in the main autodecoding thread, so
>>> I read your article again. Unfortunately, I don't think my confusion is
>>> resolved. I was trying one of your examples (full code I used below).
>>> You claim it works, but I keep getting assertion failures. I'm just
>>> running it with rdmd on Windows 7.
>>>
>>>
>>> import std.algorithm : canFind;
>>>
>>> void main()
>>> {
>>>     string s = "cassé";
>>>
>>>     assert(s.canFind!(x => x == 'é'));
>>> }
>>
>> If that é above is an e followed by a combining character, then you will
>> get the error. This is because autodecoding does not auto normalize as
>> well -- the code points have to match exactly.
>
> Indeed. FWIW I just copied OP's code from Thunderbird into Chrome (on
> OSX) and it worked: https://dpaste.dzfl.pl/09b9188d87a5
>
> Should I assume some normalization occurred on the way?

I think it depends on what your browser presents. But it's impossible to tell without being on the OP's machine to see what the string is actually stored as. Thunderbird may have normalized as well!

-Steve
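
One way to take the browser and mail client out of the picture is to spell both forms with escapes and compare them directly. The following is a sketch only, not code from the thread or the article: the helper names hasEAcute and hasEAcuteAfterNFC are hypothetical, and it assumes std.uni.normalize with the NFC form is an acceptable way to bring both inputs to a common representation before searching.

```d
import std.algorithm.searching : canFind;
import std.uni : normalize, NFC;

// Hypothetical helper: the same search as in the quoted example,
// with the needle written as an escape (precomposed U+00E9).
bool hasEAcute(string s)
{
    return s.canFind!(x => x == '\u00E9');
}

// Hypothetical helper: normalize to NFC first, then search.
bool hasEAcuteAfterNFC(string s)
{
    return hasEAcute(normalize!NFC(s));
}

void main()
{
    string decomposed  = "casse\u0301"; // 'e' followed by U+0301
    string precomposed = "cass\u00E9";  // single code point U+00E9

    assert(decomposed != precomposed);     // different code unit sequences
    assert(hasEAcute(precomposed));        // code points match exactly
    assert(!hasEAcute(decomposed));        // autodecoding does not normalize
    assert(hasEAcuteAfterNFC(decomposed)); // matches once composed to NFC
    assert(normalize!NFC(decomposed) == precomposed);
}
```

If the assertion in the quoted program fails on one machine and passes on another, comparing s.length for the two copies of the source (6 code units for the precomposed spelling, 7 for the decomposed one) should show which form was actually stored.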
