May 21, 2022
On Saturday, 21 May 2022 at 15:50:43 UTC, rikki cattermole wrote:
> Unicode.
>
> Multi-byte code points.
>
> UTF-8 and UTF-16 are variable-length encodings: one or more code units combine to produce a single Unicode code point, which fits in 32 bits.
>
> writeln("“".length, " ", "”".length); // 3 3

Thanks, variable-length encoding is the answer to this question.
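For anyone else landing here, a minimal sketch of what "variable length" means in practice. The helper `unitCounts` is a hypothetical name made up for this illustration, not anything from Phobos; it just transcodes the same text and reports how many code units each encoding needs.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: code units needed to store `s`
// as UTF-8 (string), UTF-16 (wstring) and UTF-32 (dstring).
size_t[3] unitCounts(dstring s)
{
    return [s.to!string.length, s.to!wstring.length, s.length];
}

void main()
{
    writeln(unitCounts("“"d));          // [3, 1, 1]
    writeln(unitCounts("\U0001F600"d)); // [4, 2, 1]
}
```

So `.length` on a `string` counts UTF-8 code units (bytes), which is why the curly quote reports 3 rather than 1 or 4.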

May 21, 2022
On Saturday, 21 May 2022 at 16:11:50 UTC, Adam D Ruppe wrote:
> On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
>> Same question:
>> Why 3 bytes? Not 4 bytes?
>
> Same answer: it is pointing at the character in the ORIGINAL ARRAY. The original array is NOT made out of dchars but it reads and converts on the fly while maintaining the correct position.

UTF variable-length encoding is the answer I'm looking for.

> See that's the beauty of it: how it gets there can be pretty complicated (especially when going backwards with foreach_reverse), but it always gives the right answer.

OK, so it's an index into the array, not a "loop counter". Most of the time the two are the same, but in this special case of UTF variable-length encoding they differ.
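To make the difference concrete, here is a small sketch (the helper name `startIndices` is my own, purely illustrative). It records the index that `foreach` reports while decoding a `string` into `dchar`s:

```d
import std.stdio : writeln;

// Hypothetical helper: the index foreach reports for each decoded character.
size_t[] startIndices(string s)
{
    size_t[] result;
    foreach (i, dchar c; s) // i is the position of c's first byte in s
        result ~= i;
    return result;
}

void main()
{
    writeln(startIndices("“a”")); // [0, 3, 4]
}
```

The index jumps from 0 to 3 because the opening quote occupies three bytes in the original array.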

Two comments:

1) Can we also have a true "loop counter"? Coming from a numeric computation background, I find a "loop counter" very useful.

2) Can we also make it work for ranges, i.e. the question in my original first post?
May 21, 2022
On Saturday, 21 May 2022 at 17:07:41 UTC, mw wrote:
> 1) can we also have a true "loop counter"?

That's what the `enumerate` thing does.
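A rough sketch of how that might look, assuming `std.range.enumerate` with its default `size_t` counter. The helper name `counters` is made up for the example. Since `enumerate` takes any input range, the same pattern should also cover the range case:

```d
import std.range : enumerate, iota;
import std.stdio : writeln;

// Hypothetical helper: the counter enumerate yields for each element of s.
size_t[] counters(string s)
{
    size_t[] result;
    foreach (i, c; s.enumerate) // s is walked as a range of dchar
        result ~= i;
    return result;
}

void main()
{
    writeln(counters("“a”")); // [0, 1, 2]

    // Works for ranges too: the counter starts at 0 regardless of the values.
    foreach (i, x; iota(10, 13).enumerate)
        writeln(i, " ", x);
}
```

Here the counter advances by one per element, unlike the byte index you get from a plain `foreach (i, dchar c; str)`.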

Or you can just

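size_t counter;
foreach (dchar c; text) { // `text` stands in for whatever you are iterating over
    scope(exit) counter++; // runs at the end of each iteration, even after `continue`
}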

