November 22, 2005 std.string.toupper/tolower failed with mixture of English and Chinese characters

std.string.toupper() and std.string.tolower() give a wrong result when dealing with a mixture of upper/lower English and Chinese characters. e.g.
char[] a = "AbCdÖÐeFgH";
char[] b = std.string.toupper(a);
char[] c = std.string.tolower(a);

The length of a is 11, but the length of b,c is 18 now.

November 22, 2005 Re: std.string.toupper/tolower failed with mixture of English and Chinese characters

Posted in reply to Shawn Liu
"Shawn Liu" <Shawn_member@pathlink.com> wrote...
> std.string.toupper() and std.string.tolower() give a wrong result when
> dealing with a mixture of upper/lower English and Chinese characters. e.g.
> char[] a = "AbCdÖÐeFgH";
> char[] b = std.string.toupper(a);
> char[] c = std.string.tolower(a);
>
> The length of a is 11, but the length of b,c is 18 now.
Phobos doesn't support non-ASCII conversions/comparisons at this time?

November 22, 2005 Re: std.string.toupper/tolower failed with mixture of English and Chinese characters

Posted in reply to Shawn Liu
On Tue, 22 Nov 2005 02:19:50 +0000 (UTC), Shawn Liu wrote:
> std.string.toupper() and std.string.tolower() give a wrong result when dealing with
> a mixture of upper/lower English and Chinese characters. e.g.
> char[] a = "AbCdÖÐeFgH";
> char[] b = std.string.toupper(a);
> char[] c = std.string.tolower(a);
>
> The length of a is 11, but the length of b,c is 18 now.
If it isn't ASCII then DMD doesn't want to know about it. Try the Mango library for its ICU bindings, I think that might have it.
--
Derek
(skype: derek.j.parnell)
Melbourne, Australia
22/11/2005 1:33:24 PM

November 26, 2005 Re: std.string.toupper/tolower failed with mixture of English and Chinese characters

Posted in reply to Kris
[follow up set to: digitalmars.D.bugs]
Kris wrote on 2005-11-22:
> "Shawn Liu" <Shawn_member@pathlink.com> wrote...
>> std.string.toupper() and std.string.tolower() give a wrong result when
>> dealing with a mixture of upper/lower English and Chinese characters. e.g.
>> char[] a = "AbCdÖÐeFgH";
>> char[] b = std.string.toupper(a);
>> char[] c = std.string.tolower(a);
>>
>> The length of a is 11, but the length of b,c is 18 now.
>
> Phobos doesn't support non-ASCII conversions/comparisons at this time?
>
Phobos does, at least for the simple conversions. No matter which cases are handled, the unhandled data shouldn't get corrupted.
The attached zipped string.d fixes toupper/tolower and extends the unittests. (Yes I know, it isn't the fastest possible algorithm ...)
Thomas
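
For readers following along, here is a minimal sketch of the property Thomas describes: whatever the conversion does not handle must pass through unchanged. It is not the attached string.d (which is not reproduced in this thread), and it is written for a current D compiler rather than the 2005 one. The function name asciiToUpper and the sample character \u4E2D are assumptions picked for illustration only. The idea is to convert only ASCII letters and to leave every byte of a multibyte UTF-8 sequence alone, so the result always has the same length as the input.

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): uppercases ASCII letters only.
/// Every other code unit is copied through untouched.
char[] asciiToUpper(const(char)[] s)
{
    auto result = s.dup; // mutable copy, same length as the input
    foreach (ref c; result)
    {
        // Code units >= 0x80 belong to multibyte UTF-8 sequences and
        // never fall in the range 'a'..'z', so they are left as they are.
        if (c >= 'a' && c <= 'z')
            c = cast(char)(c - ('a' - 'A'));
    }
    return result;
}

void main()
{
    // \u4E2D is an arbitrary code point that takes 3 bytes in UTF-8,
    // chosen for illustration: 4 + 3 + 4 = 11 code units in total.
    string a = "AbCd\u4E2DeFgH";
    auto b = asciiToUpper(a);

    assert(a.length == 11);
    assert(b.length == a.length); // the conversion must not change the length
    writeln(b);
}
```

A full fix would also map non-ASCII letters that have case variants, which requires decoding code points instead of walking code units; the sketch only shows that unhandled data can be kept intact.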