Thread overview
Phobos uni methods
Aug 22, 2016
Andrew
Aug 22, 2016
Cauterite
Aug 22, 2016
Lodovico Giaretta
Aug 27, 2016
Andrew
Aug 28, 2016
Dmitry Olshansky
Aug 22, 2016
Jack Stouffer
August 22, 2016
Hi,

It appears that the Phobos Unicode methods (such as std.uni.isAlpha()) conform to Unicode 6.2 (i.e. the 2012 standard, which is now nearly 4 years old). Unicode is now up to version 9.0. The versions released since then change the results of std.uni.isAlpha() and other methods.

Is there either an updated version of std.uni, or are there plans to update it?

Thanks
Andrew.

August 22, 2016
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
>

Note that changing isAlpha() could break any D code with Unicode in its identifiers, because the DMD frontend uses isAlpha() to determine which characters are allowed in identifiers.
August 22, 2016
On Monday, 22 August 2016 at 10:26:35 UTC, Cauterite wrote:
> On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
>>
>
> Note that changing isAlpha() could break any D code with Unicode in its identifiers, because the DMD frontend uses isAlpha() to determine which characters are allowed in identifiers.

Well, the Unicode Consortium is famous for treating backward compatibility as a priority (in fact, the Unicode standard keeps many strange things that are conceptually wrong but needed to maintain compatibility).

So updating the std.uni methods should not break anything; at most it would allow more inputs to be accepted. I think the possibility of updating std.uni should be investigated further, to see if it's doable.

By the way, the core team is very busy, so if Andrew (the OP) wants to make a PR himself, it would be welcome.
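
One way to check the "only more inputs accepted" assumption before updating would be to compare the old and new range tables for a property and confirm that the old set is contained in the new one. The following is a minimal sketch in Python, not anything taken from Phobos: the helper names `merge` and `covers` are hypothetical, and the two tables are made-up toy values rather than real Unicode data.

```python
def merge(spans):
    """Sort inclusive (first, last) ranges and merge overlapping or adjacent ones."""
    merged = []
    for lo, hi in sorted(spans):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def covers(new, old):
    """Return True if every code point in `old` is also in `new`."""
    new, old = merge(new), merge(old)
    j = 0
    for lo, hi in old:
        # Skip new ranges that end before this old range starts.
        while j < len(new) and new[j][1] < lo:
            j += 1
        # The old range must fit inside a single merged new range.
        if j == len(new) or not (new[j][0] <= lo and hi <= new[j][1]):
            return False
    return True


# Toy tables for illustration only (NOT real Unicode data).
old_alpha = [(0x41, 0x5A), (0x61, 0x7A)]
new_alpha = [(0x41, 0x5A), (0x61, 0x7A), (0xA7AE, 0xA7AE)]

print(covers(new_alpha, old_alpha))
print(covers(old_alpha, new_alpha))
```

If `covers(new, old)` holds for the real tables of two Unicode versions, every character accepted under the old table is still accepted under the new one.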
August 22, 2016
On Monday, 22 August 2016 at 07:58:50 UTC, Andrew wrote:
> Hi,
>
> It appears that the Phobos Unicode methods (such as std.uni.isAlpha()) conform to Unicode 6.2 (i.e. the 2012 standard, which is now nearly 4 years old). Unicode is now up to version 9.0. The versions released since then change the results of std.uni.isAlpha() and other methods.
>
> Is there either an updated version of std.uni, or are there plans to update it?
>
> Thanks
> Andrew.

Please file a bug report on issues.dlang.org.
August 27, 2016
On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta wrote:
>
> By the way, the core team is very busy so if Andrew (the OP) wants to make a PR himself, it would be welcome.

Is there a tool somewhere that parses the UnicodeData.txt and PropList.txt and generates all the tries?  I took a quick look but didn't see one alongside the std.uni source code.

Andrew.

August 28, 2016
On 8/27/16 9:40 AM, Andrew wrote:
> On Monday, 22 August 2016 at 10:48:14 UTC, Lodovico Giaretta wrote:
>>
>> By the way, the core team is very busy so if Andrew (the OP) wants to
>> make a PR himself, it would be welcome.
>
> Is there a tool somewhere that parses the UnicodeData.txt and
> PropList.txt and generates all the tries?  I took a quick look but
> didn't see one alongside the std.uni source code.
>

An awful oversight. The tool still sits in its GSoC 2012 repo:

https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.d

And the script to run it:
https://github.com/DmitryOlshansky/gsoc-bench-2012/blob/master/gen_uni.sh

> Andrew.
>
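
For readers who have not looked at the input files: PropList.txt lists one code point or one `first..last` range per line, followed by `;`, a property name, and a `#` comment. Below is a minimal sketch of reading that layout, written in Python for illustration. It is not how gen_uni.d works: it stores sorted ranges and uses binary search instead of building tries, and the function names `parse_prop_lines` and `has_property` are hypothetical.

```python
from bisect import bisect_right


def parse_prop_lines(lines):
    """Collect sorted inclusive (first, last) code point ranges per property name."""
    ranges = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank or comment-only line
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2:
            continue
        first, _, last = fields[0].partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo  # single code point: first == last
        ranges.setdefault(fields[1], []).append((lo, hi))
    for spans in ranges.values():
        spans.sort()
    return ranges


def has_property(spans, code_point):
    """Binary search over sorted, non-overlapping inclusive ranges."""
    i = bisect_right(spans, (code_point, 0x110000)) - 1
    return i >= 0 and spans[i][0] <= code_point <= spans[i][1]


# A few lines in the PropList.txt layout, used here as sample input.
SAMPLE = """\
# sample lines in the PropList.txt layout
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
0085          ; White_Space # Cc       <control-0085>
""".splitlines()

table = parse_prop_lines(SAMPLE)
print(has_property(table["White_Space"], 0x0A))
print(has_property(table["White_Space"], 0x41))
```

A real generator would go on to turn these ranges into whatever lookup structure the library uses; the sorted-range search here is only the simplest thing that answers membership queries.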