Thread overview
Submission: updated std.uni module
Feb 24, 2006
Thomas Kühne
Feb 24, 2006
Chris Miller
Feb 24, 2006
Thomas Kuehne
February 24, 2006
Attached is an updated std.uni module.

publicly visible changes:
1) updated casing and isUniAlpha data to Unicode 5.0.0

2) added "dchar[] toUniLower(dchar[])" and "dchar[] toUniUpper(dchar[])"
in order to handle cases like toUniUpper("\u00DF") -> "SS"
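
As a rough illustration of why the string-to-string signatures matter, here is a minimal sketch in Python rather than D (Python's built-in str.upper() applies Unicode full case mappings). The names to_uni_upper and simple_upper are hypothetical stand-ins, not the functions from the attached module, and simple_upper only approximates a one-to-one mapping.

```python
def to_uni_upper(text: str) -> str:
    """String-to-string uppercase; the result may be longer than the input."""
    # str.upper() applies full case mappings, so U+00DF expands to "SS".
    return text.upper()


def simple_upper(ch: str) -> str:
    """Approximate a one-code-point-in, one-code-point-out mapping.

    Assumption for this sketch: when the full mapping expands to several
    code points, the character is returned unchanged.
    """
    mapped = ch.upper()
    return mapped if len(mapped) == 1 else ch


if __name__ == "__main__":
    print(to_uni_upper("stra\u00dfe"))
    print(simple_upper("\u00df"))
```

A function that takes and returns a single character has no way to produce the two-letter result, which is why the array-based variants are needed.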


internal changes:
3) use AAs instead of hardcoded IFs for upper and lower casing
(I might expand the extractor to hardcode IFs, if anybody experiences
serious performance degradation.)
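
The trade-off in 3) can be sketched as follows, again in Python rather than D, with a dict standing in for a D associative array. The table is a tiny hand-picked excerpt and both function names are hypothetical; a real table would be generated by the extractor from the Unicode data files.

```python
# Hypothetical excerpt of an extracted uppercase table (full case mappings).
UPPER_TABLE = {
    "a": "A",
    "\u00df": "SS",   # LATIN SMALL LETTER SHARP S
    "\ufb01": "FI",   # LATIN SMALL LIGATURE FI
}


def upper_via_table(ch: str) -> str:
    """Table lookup: one hash probe, data kept separate from code."""
    return UPPER_TABLE.get(ch, ch)


def upper_via_ifs(ch: str) -> str:
    """Hardcoded comparisons: no hashing, but the code grows with the data."""
    if ch == "a":
        return "A"
    if ch == "\u00df":
        return "SS"
    if ch == "\ufb01":
        return "FI"
    return ch


if __name__ == "__main__":
    for ch in ("a", "\u00df", "\ufb01", "?"):
        print(repr(ch), upper_via_table(ch), upper_via_ifs(ch))
```

Both variants return the same results; which one is faster depends on the compiler and on the table size, so it would have to be measured.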

Thomas


February 24, 2006
On Fri, 24 Feb 2006 05:58:39 -0500, Thomas Kühne <thomas-dloop@kuehne.cn> wrote:

> Attached is an updated std.uni module.
>

I didn't even know std.uni existed; I could use the functions it provides.
February 24, 2006
Thomas Kühne wrote on 2006-02-24:
> Attached is an updated std.uni module.
>
> publicly visible changes:
> 1) updated casing and isUniAlpha data to Unicode 5.0.0
>
> 2) added "dchar[] toUniLower(dchar[])" and "dchar[] toUniUpper(dchar[])"
> in order to handle cases like toUniUpper("\u00DF") -> "SS"
>
>
> internal changes:
> 3) use AAs instead of hardcoded IFs for upper and lower casing
> (I might expand the extractor to hardcode IFs, if anybody experiences
> serious performance degradation.)

Unicode sometimes seems to be a collection of special cases ;)


Forgot to add:

The following characters aren't mapped correctly, because their case
mappings depend on context or locale.
Format: character (condition)


GREEK CAPITAL LETTER SIGMA (Final_Sigma)

==Lithuanian locale==
COMBINING DOT ABOVE (After_Soft_Dotted)
LATIN CAPITAL LETTER I (More_Above)
LATIN CAPITAL LETTER J (More_Above)
LATIN CAPITAL LETTER I WITH OGONEK (More_Above)
LATIN CAPITAL LETTER I WITH GRAVE
LATIN CAPITAL LETTER I WITH ACUTE
LATIN CAPITAL LETTER I WITH TILDE

==Turkish and Azeri locale==
LATIN CAPITAL LETTER I WITH DOT ABOVE
COMBINING DOT ABOVE (After_I)
LATIN CAPITAL LETTER I (Not_Before_Dot)
LATIN SMALL LETTER I
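
To show why a context-free table lookup cannot cover these entries, here is a minimal sketch in Python rather than D. The function names are hypothetical, and the Turkish/Azeri conditions are simplified: the code only checks whether COMBINING DOT ABOVE follows immediately, whereas the Unicode conditions allow certain other combining marks in between. The Lithuanian rules are not covered.

```python
DOT_ABOVE = "\u0307"    # COMBINING DOT ABOVE
CAPITAL_I_DOT = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def lower_per_char(text: str) -> str:
    """Context-free lowercase: each code point is mapped on its own."""
    return "".join(ch.lower() for ch in text)


def lower_tr(text: str) -> str:
    """Simplified Turkish/Azeri lowercase (assumptions noted above)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == CAPITAL_I_DOT:
            out.append("i")
        elif ch == "I":
            if i + 1 < len(text) and text[i + 1] == DOT_ABOVE:
                out.append("i")  # After_I: the combining dot is dropped
                i += 1
            else:
                out.append(DOTLESS_I)  # Not_Before_Dot
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)


def upper_tr(text: str) -> str:
    """Simplified Turkish/Azeri uppercase: 'i' keeps its dot."""
    return "".join(CAPITAL_I_DOT if ch == "i" else ch.upper() for ch in text)


if __name__ == "__main__":
    word = "\u039f\u0394\u039f\u03a3"  # Greek word ending in CAPITAL SIGMA
    print(word.lower())          # whole-string mapping sees the context
    print(lower_per_char(word))  # per-character mapping does not
    print(lower_tr("ISTANBUL"), upper_tr("istanbul"))
```

Python's whole-string lower() applies the Final_Sigma rule, while the per-character version always yields the non-final form; the locale-specific rules have to be selected explicitly by the caller.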


Thomas