On Monday, October 07, 2013 17:17:57 Andrej Mitrovic wrote:
> If I want to transfer some string to a C function that expects ascii-only string. What can I use to verify there are no non-ascii characters in a D string? I haven't seen anything in Phobos.
>
> I was thinking of using:
>
> bool isAscii = mystring.all!(a => a <= 0xFF);
>
> Is this safe?
>
> I'm thinking of whether a code point can consist of two code units such as [C1][C2], where C2 may be in the range 0 - 0xFF. I don't know if that's possible (not a unicode pro here..).
If you do
bool asciiOnly = !mystring.any!(not!isASCII)(); // isASCII here is std.ascii.isASCII, so the variable needs a different name
or
bool isASCII = !mystring.any!(a => a > 0x7F)();
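then you should be good. Note that the limit is 0x7F, not 0xFF: ASCII stops at 127, and code points 0x80 through 0xFF are not ASCII. In UTF-8, any code unit with a value of 127 or under is a complete, single-code-unit code point and is ASCII. Every code unit of a multi-byte character has a value greater than 127, so a multi-byte sequence can never contain something that looks like ASCII. Just look at the table on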
https://en.wikipedia.org/wiki/Utf-8
It shows what each byte has to look like in a UTF-8-encoded code point, for each possible number of bytes in the encoding.
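If you would rather check the raw code units than the decoded code points, something like the following should work. This is only a sketch: isAsciiOnly is a made-up helper name, not a Phobos function, and it assumes a Phobos recent enough to have std.utf.byCodeUnit.

```d
import std.algorithm : all;
import std.utf : byCodeUnit;

/// Hypothetical helper (not part of Phobos): true if every UTF-8 code unit
/// in `s` is 0x7F or lower, i.e. the string is pure ASCII.
bool isAsciiOnly(const(char)[] s)
{
    // byCodeUnit walks the raw chars instead of decoding them to dchar.
    return s.byCodeUnit.all!(c => c <= 0x7F);
}

void main()
{
    assert(isAsciiOnly("hello, world"));
    assert(isAsciiOnly(""));              // all() on an empty range is true
    assert(!isAsciiOnly("na\u00EFve"));   // U+00EF is encoded as 0xC3 0xAF
    assert(!isAsciiOnly("\u00FF"));       // would pass a <= 0xFF test on code points

    // Every code unit of a multi-byte sequence is above 0x7F (U+20AC is 0xE2 0x82 0xAC).
    assert("\u20AC".byCodeUnit.all!(c => c > 0x7F));
}
```

Since no code unit of a multi-byte sequence can be 0x7F or lower, checking code units and checking code points give the same answer here. The code-unit version just skips the decoding.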
- Jonathan M Davis