Thread overview
Perl6 Unicode support
May 11, 2016
Guillaume Chatelet
Jun 11, 2016
Andrew Godfrey
Jun 11, 2016
ag0aep6g
Jun 11, 2016
Andrew Godfrey
Jun 11, 2016
ag0aep6g
Jun 12, 2016
Andrew Godfrey
Jun 12, 2016
ag0aep6g
Jun 12, 2016
Andrew Godfrey
May 11, 2016
It looks good:
https://perl6advent.wordpress.com/2015/12/07/day-7-unicode-perl-6-and-you/
June 11, 2016
On Wednesday, 11 May 2016 at 12:11:32 UTC, Guillaume Chatelet wrote:
> It looks good:
> https://perl6advent.wordpress.com/2015/12/07/day-7-unicode-perl-6-and-you/

In particular, it works in graphemes, and ".codes" lets you count code points. The article doesn't even mention "code units".

OTOH, it mentions both graphemes and grapheme clusters, without much distinction. So I'm not exactly sure which is the default focus.
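
For readers who want the code point / code unit difference spelled out, here is a minimal sketch. It is in Python rather than Perl 6, purely for illustration, and the helper name `unit_counts` is made up for this example. It relies on Python 3's `len` counting code points on a `str`, and derives code units by encoding the string.

```python
def unit_counts(s: str) -> dict:
    """Count the same string in three different units."""
    return {
        # Python 3 strings are sequences of code points.
        "code_points": len(s),
        # UTF-8 code units are single bytes.
        "utf8_code_units": len(s.encode("utf-8")),
        # UTF-16 code units are 2 bytes each; "-le" avoids a BOM.
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,
    }


# 'e' + U+0301 COMBINING ACUTE ACCENT + U+1F600 (an emoji outside the BMP)
sample = "e\u0301\U0001F600"
print(unit_counts(sample))
# {'code_points': 3, 'utf8_code_units': 7, 'utf16_code_units': 4}
```

None of these three numbers is the grapheme count, which would be 2 for this string (the accented "e" and the emoji).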
June 11, 2016
On 06/11/2016 06:47 PM, Andrew Godfrey wrote:
> OTOH, it mentions both graphemes and grapheme clusters, without much
> distinction. So I'm not exactly sure which is the default focus.

What distinction is there to be made? As far as I understand, a grapheme cluster is a sequence (or cluster) of code points that together represent one grapheme.
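
To illustrate "a sequence of code points that together represent one grapheme", here is a rough Python sketch. It is a deliberate simplification, not the full Unicode segmentation algorithm: it only attaches combining marks to the preceding code point and ignores everything else (Hangul syllables, emoji joined with ZWJ, "\r\n", and so on). The function name `simple_clusters` is hypothetical.

```python
import unicodedata


def simple_clusters(s: str) -> list:
    """Group code points so each combining mark joins the preceding base.

    Simplified assumption: a code point with a non-zero canonical
    combining class belongs to the previous cluster. Real grapheme
    cluster segmentation has many more rules.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # extend the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


text = "e\u0301a"  # 'e', U+0301 COMBINING ACUTE ACCENT, 'a'
print(len(text))                   # 3 code points
print(len(simple_clusters(text)))  # 2 clusters
```

Each element of the returned list is one cluster, so counting clusters and counting graphemes give the same number here.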
June 11, 2016
On Saturday, 11 June 2016 at 18:33:04 UTC, ag0aep6g wrote:
> On 06/11/2016 06:47 PM, Andrew Godfrey wrote:
>> OTOH, it mentions both graphemes and grapheme clusters, without much
>> distinction. So I'm not exactly sure which is the default focus.
>
> What distinction is there to be made? As far as I understand, a grapheme cluster is a sequence (or cluster) of code points that together represent one grapheme.

That's the distinction, yes. The article mentions both in a way that makes me unsure if Perl 6 confused the terms (or maybe it's just the article that isn't being clear).
June 11, 2016
On 06/11/2016 09:25 PM, Andrew Godfrey wrote:
> That's the distinction, yes. The article mentions both in a way that
> makes me unsure if Perl 6 confused the terms (or maybe it's just the
> article that isn't being clear).

But how would you "focus" on one or the other?
Is there any operation that works differently on graphemes than on grapheme clusters?
Counting/skipping/extracting graphemes is the same as counting grapheme clusters, no?
June 12, 2016
On Saturday, 11 June 2016 at 19:43:45 UTC, ag0aep6g wrote:
> On 06/11/2016 09:25 PM, Andrew Godfrey wrote:
>> That's the distinction, yes. The article mentions both in a way that
>> makes me unsure if Perl 6 confused the terms (or maybe it's just the
>> article that isn't being clear).
>
> But how would you "focus" on one or the other?
> Is there any operation that works differently on graphemes than on grapheme clusters?
> Counting/skipping/extracting graphemes is the same as counting grapheme clusters, no?

E.g., it says ".chars returns the number of characters (aka graphemes)"

Does this count the number of graphemes, or the number of grapheme clusters? Later on with \r\n it pretty much says that it counts grapheme clusters. Here it says it counts graphemes.
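
A small Python sketch of what the \r\n case amounts to, assuming the only rule applied is "CR followed by LF is not split" and every other code point counts on its own. That assumption is a simplification for illustration; `count_with_crlf_rule` is a made-up name, not anything from Perl 6.

```python
import re

# Match "\r\n" as a single unit first; otherwise take any one code point.
# re.DOTALL lets "." match a lone "\n" as well.
_UNIT = re.compile(r"\r\n|.", re.DOTALL)


def count_with_crlf_rule(s: str) -> int:
    """Count units, treating each CR LF pair as one unit."""
    return len(_UNIT.findall(s))


line = "a\r\nb"
print(len(line))                   # 4 code points
print(count_with_crlf_rule(line))  # 3 units: 'a', '\r\n', 'b'
```

Under this rule the reversed sequence "\n\r" still counts as two units, since only CR followed by LF is kept together.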
June 12, 2016
On 06/12/2016 05:16 AM, Andrew Godfrey wrote:
> E.g., it says ".chars returns the number of characters (aka graphemes)"
>
> Does this count the number of graphemes, or the number of grapheme
> clusters? Later on with \r\n it pretty much says that it counts grapheme
> clusters. Here it says it counts graphemes.

Sorry, I still don't get it. Can you give an example string where counting graphemes gives a different result from counting grapheme clusters?
June 12, 2016
On Sunday, 12 June 2016 at 08:15:37 UTC, ag0aep6g wrote:
> On 06/12/2016 05:16 AM, Andrew Godfrey wrote:
>> E.g., it says ".chars returns the number of characters (aka graphemes)"
>>
>> Does this count the number of graphemes, or the number of grapheme
>> clusters? Later on with \r\n it pretty much says that it counts grapheme
>> clusters. Here it says it counts graphemes.
>
> Sorry, I still don't get it. Can you give an example string where counting graphemes gives a different result from counting grapheme clusters?

Huh. On researching "grapheme cluster", I see it is a weird Unicode term that apparently means the same thing as grapheme. Definitely something to avoid in an article (or to explain very carefully).
To the uninitiated, "grapheme cluster" reads as "a cluster of graphemes", which implies a one-to-many mapping.