May 21, 2022

On Saturday, 21 May 2022 at 02:35:37 UTC, Mike Parker wrote:

> You get indexes in a foreach with arrays and associative arrays because they have indexes. You don't get them with input ranges because they have no indexes (hence, the need for enumerate).

iota with integer arguments produces a random-access range, which is indexable. It would be nice if foreach supported an (index, element) pair for random-access ranges whose front returns just an element, not a tuple.
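
For reference, here is a minimal sketch of the current situation, assuming a recent Phobos. The helper name `countedPairs` is made up for illustration; `iota` and `enumerate` are the real `std.range` functions.

```d
import std.array : array;
import std.range : enumerate, iota;
import std.stdio : writeln;

// foreach cannot take an (index, element) pair from a plain range,
// because iota's front returns a single int rather than a tuple.
static assert(!__traits(compiles, { foreach (i, x; iota(10, 13)) {} }));

// Hypothetical helper: materialize the (index, value) tuples that
// enumerate produces, so they can be inspected afterwards.
auto countedPairs(int lo, int hi)
{
    return iota(lo, hi).enumerate.array;
}

void main()
{
    // enumerate wraps each element in a tuple, which foreach unpacks.
    foreach (i, x; iota(10, 13).enumerate)
        writeln(i, " ", x); // prints "0 10", "1 11", "2 12"
}
```

Note that the first value here really is a counter supplied by enumerate, not an index that foreach derives from the range.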

May 21, 2022
On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
> On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
>> Yes! let's call it "loop counter"
>
> It isn't a loop counter. It is an index back into the original source. Consider:
>
> foreach(INDEX, dchar character; "“”")
>    writeln(INDEX); // 0 then 3

Wow, this surprised me again!

1) First, why not 0 then 4? Since dchar is 32 bits.

2) Second, compare:

import std;
void main()
{
    dstring ds = "“”";
    writeln(ds.length);
    foreach(INDEX, dchar character; ds)
      writeln(INDEX, character); // 0 then 1
}

Output:
2
0“
1”

Explanations?


If it's "loop counter", isn't the behavior more consistent?

May 21, 2022
On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
> On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
>> On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
>>> Yes! let's call it "loop counter"
>>
>> It isn't a loop counter. It is an index back into the original source. Consider:
>>
>> foreach(INDEX, dchar character; "“”")
>>    writeln(INDEX); // 0 then 3
>
> Wow, this surprised me again!
>
> 1) First, why not 0 then 4? Since dchar is 32 bits.
>
> 2) Second, compare:
>
> import std;
> void main()
> {
>     dstring ds = "“”";
>     writeln(ds.length);
>     foreach(INDEX, dchar character; ds)
>       writeln(INDEX, character); // 0 then 1
> }
>
> Output:
> 2
> 0“
> 1”
>
> Explanations?

Adam D Ruppe's example implies automatic (hidden) decoding, which is a special case, so it reads 3 bytes to decode the first glyph. The word "counter" is actually correct if the foreach'd aggregate is truly capable of random access. That is the case for your second version, which iterates over a dstring.

>
> If it's "loop counter", isn't the behavior more consistent?


May 21, 2022
On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
>> It isn't a loop counter. It is an index back into the original source. Consider:
>
> 1) First, why not 0 then 4? Since dchar is 32 bits.

It is an index back into the *original source* given to foreach.

I gave it a char[], not a dchar[]. So it is counting chars in that original char[].


If you do:

auto thing = whatever_you_loop_over;
foreach(index, item; thing) {
    // at this point, everything that came before `item` is:
    //   stuff_before_item == thing[0 .. index]
}


> Explanations?

The index there is the index into a dstring, which is stored differently: one 32-bit code unit per code point, so the index and the count happen to coincide.

> If it's "loop counter", isn't the behavior more consistent?

Also see `foreach_reverse` where it counts backwards because it is an index into the original array, not a counter.
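
A runnable sketch of that invariant, under the assumption that the source is a UTF-8 string iterated as dchar. The helper name `prefixesBeforeEachItem` is invented for this example.

```d
import std.stdio : writeln;

// Hypothetical helper: for each decoded character, return the part of the
// source that foreach has already walked past, i.e. s[0 .. index].
string[] prefixesBeforeEachItem(string s)
{
    string[] prefixes;
    foreach (index, dchar c; s)
        prefixes ~= s[0 .. index]; // index counts chars in the original string
    return prefixes;
}

void main()
{
    // 'a' and 'b' take 1 UTF-8 code unit each, the curly quotes take 3 each.
    foreach (prefix; prefixesBeforeEachItem("a“b”"))
        writeln(prefix.length); // prints 0, 1, 4, 5
}
```

A plain loop counter would have given 0, 1, 2, 3 here, and slicing the string with those values would cut a character in half.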
May 21, 2022
On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
> Adam D Ruppe's example implies automatic (hidden) decoding

Small nitpick: this is not autodecoding, since you have to request it explicitly by specifying dchar.

Autodecoding refers to the Phobos behavior where the range primitives give you dchar even though you didn't ask for it.
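
To make the distinction concrete, here is a small sketch, assuming the current Phobos where autodecoding is still in place. `iterationCounts` is a made-up helper name.

```d
import std.range.primitives : front;
import std.stdio : writeln;

// Phobos range primitives decode a narrow string to dchar without being
// asked: this is what "autodecoding" refers to.
static assert(is(typeof("“”".front) == dchar));

// Hypothetical helper: number of iterations without and with an explicit
// dchar loop variable.
size_t[2] iterationCounts(string s)
{
    size_t units, points;
    foreach (c; s)       // c is inferred as a char: one step per code unit
        ++units;
    foreach (dchar c; s) // decoding requested explicitly by the programmer
        ++points;
    return [units, points];
}

void main()
{
    writeln(iterationCounts("“”")); // [6, 2]
}
```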
May 21, 2022
On Saturday, 21 May 2022 at 15:21:09 UTC, user1234 wrote:
> On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
>> On Saturday, 21 May 2022 at 11:59:42 UTC, Adam D Ruppe wrote:
>>> On Saturday, 21 May 2022 at 03:01:39 UTC, mw wrote:
>>>> Yes! let's call it "loop counter"
>>>
>>> It isn't a loop counter. It is an index back into the original source. Consider:
>>>
>>> foreach(INDEX, dchar character; "“”")
>>>    writeln(INDEX); // 0 then 3
>>
>> Wow, this surprised me again!
>>
>> 1) First, why not 0 then 4? Since dchar is 32 bits.
>>
>> 2) Second, compare:
>>
>> import std;
>> void main()
>> {
>>     dstring ds = "“”";
>>     writeln(ds.length);
>>     foreach(INDEX, dchar character; ds)
>>       writeln(INDEX, character); // 0 then 1
>> }
>>
>> Output:
>> 2
>> 0“
>> 1”
>>
>> Explanations?
>
> Adam D Ruppe's example implies automatic (hidden) decoding, which is a special case, so it reads 3 bytes to decode the first glyph.

Why 3 bytes? Not 4 bytes?

As dchar is specified as 32 bits here?

https://dlang.org/spec/type.html


May 21, 2022
On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
> On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
>>> It isn't a loop counter. It is an index back into the original source. Consider:
>>
>> 1) First, why not 0 then 4? Since dchar is 32 bits.
>
> It is an index back into the *original source* given to foreach.
>
> I gave it a char[], not a dchar[]. So it is counting chars in that original char[].
>

Same question:


Why 3 bytes? Not 4 bytes?

As dchar is specified as 32 bits here?

https://dlang.org/spec/type.html




May 22, 2022
Unicode.

Multi-byte code points.

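UTF-8 and UTF-16 are variable-length encodings: a single Unicode code point, which always fits in a 32-bit dchar, takes one to four 8-bit code units in UTF-8 and one or two 16-bit code units in UTF-16.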

writeln("“".length, " ", "”".length); // 3 3
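
A slightly longer sketch comparing the same two characters in all three string types. The helper name `codeUnitCounts` is an assumption for illustration only.

```d
import std.stdio : writeln;

// Hypothetical helper: .length counts code units, so the same text has a
// different length depending on the encoding of the string type.
size_t[3] codeUnitCounts()
{
    string  utf8  = "“”";  // char:  3 code units per quote
    wstring utf16 = "“”"w; // wchar: 1 code unit per quote
    dstring utf32 = "“”"d; // dchar: 1 code unit per quote
    return [utf8.length, utf16.length, utf32.length];
}

void main()
{
    writeln(codeUnitCounts()); // [6, 2, 2]
}
```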
May 21, 2022
On Sat, May 21, 2022 at 03:43:59PM +0000, mw via Digitalmars-d wrote:
> On Saturday, 21 May 2022 at 15:23:08 UTC, Adam D Ruppe wrote:
> > On Saturday, 21 May 2022 at 15:06:31 UTC, mw wrote:
> > > > It isn't a loop counter. It is an index back into the original source. Consider:
> > > 
> > > 1) First, why not 0 then 4? Since dchar is 32 bits.
> > 
> > It is an index back into the *original source* given to foreach.
> > 
> > I gave it a char[], not a dchar[]. So it is counting chars in that original char[].
> > 
> 
> Same question:
> 
> 
> Why 3 bytes? Not 4 bytes?
[...]

Because UTF-8 is a variable-length encoding.


T

May 21, 2022
On Saturday, 21 May 2022 at 15:43:59 UTC, mw wrote:
> Same question:
> Why 3 bytes? Not 4 bytes?

Same answer: it is pointing at the character in the ORIGINAL ARRAY. The original array is NOT made out of dchars; foreach reads and converts on the fly while maintaining the correct position.

See that's the beauty of it: how it gets there can be pretty complicated (especially when going backwards with foreach_reverse), but it always gives the right answer.
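
A quick sketch of the backwards case, assuming a UTF-8 string iterated as dchar. `reverseIndices` is a hypothetical helper name.

```d
import std.stdio : writeln;

// Hypothetical helper: collect the indexes that foreach_reverse reports
// while decoding a UTF-8 string from the end.
size_t[] reverseIndices(string s)
{
    size_t[] result;
    foreach_reverse (index, dchar c; s)
        result ~= index; // start of the current character in the original string
    return result;
}

void main()
{
    writeln(reverseIndices("“”")); // [3, 0]
}
```

In each iteration the index is where the character starts in the original string, so slicing from it always lands on a character boundary.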