Thread overview
[Issue 23341] [std.uni] ZWJ not handled properly
September 17, 2022
https://issues.dlang.org/show_bug.cgi?id=23341

--- Comment #1 from Garrett D'Amore <garrett@damore.org> ---
ZWJ probably requires a level of sophistication to handle properly:

https://en.wikipedia.org/wiki/Zero-width_joiner

In Devanagari, for instance, the handling is a little different, since ZWJ modifies the characters placed before it.

For example:

    s2 = "\u0915\u094d\u200d";
    writefln("s2 is %s\n", s2);
    writefln("graphemes %d (expect 1)\n", wr.walkLength); // this should be "1"

This looks like: क्‍
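
The snippet above does not show how `wr` is defined. A self-contained version might look like the following sketch, which assumes `wr` is the range returned by `std.uni.byGrapheme` over `s2`; that is an assumption, not something stated in this comment. The code unit and code point counts are printed only for comparison with the grapheme count.

```d
import std.range : walkLength;
import std.stdio : writefln;
import std.uni : byGrapheme;

void main()
{
    // DEVANAGARI LETTER KA, DEVANAGARI SIGN VIRAMA, ZERO WIDTH JOINER
    string s2 = "\u0915\u094d\u200d";

    writefln("s2 is %s", s2);
    writefln("code units  %d", s2.length);     // UTF-8 bytes
    writefln("code points %d", s2.walkLength); // strings are decoded to dchar

    // Assumption: this is what `wr.walkLength` measured in the report.
    writefln("graphemes   %d (expect 1)", s2.byGrapheme.walkLength);
}
```

The value printed on the last line depends on the Phobos version in use, so no output is shown here.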

--
September 17, 2022
https://issues.dlang.org/show_bug.cgi?id=23341

--- Comment #2 from Garrett D'Amore <garrett@damore.org> ---
This problem is not limited to ZWJ:

For example:

    s2 = "\U0001F44D\U0001F3fD";
    writefln("s2 is %s\n", s2);
    writefln("graphemes %d (expect 1)\n", wr.walkLength); // this should be "1"

That is a thumbs up with a skin tone modifier.  That should be one grapheme.
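
Both expectations are consistent with rule GB9 of Unicode Standard Annex #29, which forbids a grapheme cluster break before any code point whose Grapheme_Cluster_Break property is Extend or ZWJ. The virama U+094D is Extend, and in recent versions of the standard so are the emoji skin tone modifiers U+1F3FB..U+1F3FF. The following is a toy sketch of that one rule, not the std.uni implementation: `joinsToPrevious` and `countClusters` are hypothetical names, and the classifier covers only the code points that appear in this thread.

```d
import std.stdio : writefln;

// Hypothetical helper: knows only the code points used in this thread,
// not the full Grapheme_Cluster_Break property table.
bool joinsToPrevious(dchar c)
{
    return c == '\u200D'                              // ZERO WIDTH JOINER
        || c == '\u094D'                              // DEVANAGARI SIGN VIRAMA (Extend)
        || (c >= '\U0001F3FB' && c <= '\U0001F3FF');  // emoji modifiers (Extend)
}

// GB9-only approximation: a new cluster starts at every code point
// that does not attach to the one before it.
size_t countClusters(string s)
{
    size_t n = 0;
    foreach (dchar c; s) // decodes UTF-8 into code points
    {
        if (n == 0 || !joinsToPrevious(c))
            ++n;
    }
    return n;
}

void main()
{
    writefln("%d", countClusters("\u0915\u094d\u200d"));   // prints 1
    writefln("%d", countClusters("\U0001F44D\U0001F3fD")); // prints 1
}
```

A real segmenter needs the remaining UAX #29 rules (Hangul syllables, regional indicators, emoji ZWJ sequences and so on) and the full property data.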

--
December 17, 2022
https://issues.dlang.org/show_bug.cgi?id=23341

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4

--