March 22, 2005 dchar counting in a char[]
I've seen this sort of code ...
int FooFind(char[] X, dchar D)
{
    foreach(int i, dchar C; X)
    {
        if (C == D) return i;
    }
    return -1;
}
Now I understand that the foreach correctly packages up the utf-8 codepoint fragments to form a valid utf32 character, but when the value of 'i' is returned, is it an index into the original utf-8 string or an index into the equivalent utf32 string? I'm pretty sure it's a utf-8 index, and that is a useful thing, as it tells you where in the original string the set of code fragments that make up the character begins. However, it doesn't tell you how many characters into the utf-8 string the searched-for character was found.
I wrote this routine below, but I'm not sure if I needed to.
int FooFind(dchar[] X, dchar D)
{
    foreach(int i, dchar C; X)
    {
        if (C == D) return i;
    }
    return -1;
}
--
Derek
Melbourne, Australia
22/03/2005 4:24:50 PM

March 22, 2005 Re: dchar counting in a char[]
Posted in reply to Derek Parnell

"Derek Parnell" <derek@psych.ward> wrote in message news:1w6s40so7p838.8yz58w5g6l4q.dlg@40tude.net...
> I've seen this sort of code ...
>
> int FooFind(char[] X, dchar D)
> {
>     foreach(int i, dchar C; X)
>     {
>         if (C == D) return i;
>     }
>     return -1;
> }

The Phobos library routine std.string.find() does the same thing.

> Now I understand that the foreach correctly packages up the utf-8 codepoint
> fragments to form a valid utf32 character, but when the value of 'i' is
> returned, is it an index into the original utf-8 string or an index into
> the equivalent utf32 string?

The former.

> I'm pretty sure it's a utf-8 index, and that is
> a useful thing, as it tells you where in the original string the set of
> code fragments that make up the character begins. However, it doesn't tell
> you how many characters into the utf-8 string the searched-for character
> was found.

That's right. You can feed the result into std.utf.toUCSindex() to get the other index.

> I wrote this routine below, but I'm not sure if I needed to.
>
> int FooFind(dchar[] X, dchar D)
> {
>     foreach(int i, dchar C; X)
>     {
>         if (C == D) return i;
>     }
>     return -1;
> }

I think this will do what you wish as well (return UCS index):

int FooFind(dchar[] X, dchar D)
{
    int i;
    foreach(dchar C; X)
    {
        if (C == D) return i;
        i++;
    }
    return -1;
}
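
To make the difference between the two indices concrete, here is a minimal sketch that finds a character in a UTF-8 string and reports both its code unit offset and its code point (UCS) index. It assumes a current D compiler and Phobos rather than the 2005 toolchain the thread was written against; the helper names findByteIndex and findCodePointIndex are hypothetical and exist only for this illustration. std.utf.toUCSindex is the Phobos routine mentioned in the reply.

```d
import std.utf : toUCSindex;

// Hypothetical helper: offset of the first UTF-8 code unit of `d` in `s`,
// or -1 if `d` does not occur. This is the index that foreach yields.
ptrdiff_t findByteIndex(const(char)[] s, dchar d)
{
    foreach (i, dchar c; s) // foreach decodes UTF-8; i counts code units
    {
        if (c == d) return i;
    }
    return -1;
}

// Hypothetical helper: number of code points that precede `d` in `s`,
// or -1 if `d` does not occur.
ptrdiff_t findCodePointIndex(const(char)[] s, dchar d)
{
    auto byteIdx = findByteIndex(s, d);
    if (byteIdx < 0) return -1;
    // Convert the code unit offset into a code point index.
    return toUCSindex(s, byteIdx);
}

unittest
{
    // U+00E9 takes two UTF-8 code units, so the two indices diverge after it.
    string s = "h\u00E9llo";
    assert(findByteIndex(s, 'l') == 3);
    assert(findCodePointIndex(s, 'l') == 2);
    assert(findByteIndex(s, 'z') == -1);
    assert(findCodePointIndex(s, 'z') == -1);
}
```

The checks can be run with something like dmd -main -unittest on the file. For pure ASCII input the two helpers return the same value; they differ only once a multi-byte character precedes the match.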