Foreach loops on static arrays error message

Jul 06, 2017

Guillaume Chatelet

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

Jul 06, 2017

I stumbled upon https://issues.dlang.org/show_bug.cgi?id=12685 In essence: > ubyte[256] data; > foreach (ubyte i, x; data) {} > > Error: index type 'ubyte' cannot cover index range 0..256 `ubyte` can clearly hold a value from 0 to 255 so it should be ok. No need for 256 ?! So I decided to fix it https://github.com/dlang/dmd/pull/6973 Unfortunately when you think about how foreach is lowered it makes sense - but the error message is misleading. The previous code is lowered to: > ubyte[256] data; > for (ubyte i = 0 ; i < 256 ; ++i) { > ubyte x = data[i] > } `i < 256` is always true, this would loop forever. Questions: - What would be a better error message? - How about a different lowering in this case? From the programmer's point of view the original code makes sense. A correct lowering would be: > ubyte[256] data; > for(ubyte i = 0;;++i) { > ubyte x = data[i]; > ... > if(i==255) break; > }

July 06, 2017

Re: Foreach loops on static arrays error message

Posted by Stefan Koch
in reply to Guillaume Chatelet

Permalink

Stefan Koch

Posted in reply to Guillaume Chatelet

Permalink

On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote:
> I stumbled upon https://issues.dlang.org/show_bug.cgi?id=12685
>
> In essence:
>
>> [...]
>
> `ubyte` can clearly hold a value from 0 to 255 so it should be ok. No need for 256 ?!
>
> So I decided to fix it https://github.com/dlang/dmd/pull/6973
>
> Unfortunately when you think about how foreach is lowered it makes sense - but the error message is misleading.
>
> The previous code is lowered to:
>
>> [...]
>
> `i < 256` is always true, this would loop forever.
>
> Questions:
> - What would be a better error message?
> - How about a different lowering in this case?
>
> From the programmer's point of view the original code makes sense.
> A correct lowering would be:
>
>> [...]

I'd say this is not often encoutered.
One should avoid using a different type then size_t for the index, as it can have negative performance implications.

On Thursday, 6 July 2017 at 08:49:33 UTC, Stefan Koch wrote: > On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: >>> [...] > > I'd say this is not often encoutered. > One should avoid using a different type then size_t for the index, as it can have negative performance implications. Interesting. What would be the example of negative performance implication? I'm guilty of using the int on occasions.

On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: > From the programmer's point of view the original code makes sense. > A correct lowering would be: > >> ubyte[256] data; >> for(ubyte i = 0;;++i) { >> ubyte x = data[i]; >> ... >> if(i==255) break; >> } or: ubyte[256] data; foreach(ubyte i; 0..256) { ubyte x = data[i]; }

On Thursday, 6 July 2017 at 08:57:42 UTC, Nemanja Boric wrote: > On Thursday, 6 July 2017 at 08:49:33 UTC, Stefan Koch wrote: >> On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: >>>> [...] >> >> I'd say this is not often encoutered. >> One should avoid using a different type then size_t for the index, as it can have negative performance implications. > > Interesting. What would be the example of negative performance implication? I'm guilty of using the int on occasions. on 64bit a downcast can cause the compiler to emit a cqo instruction when the index is used as an index. it's relatively expensive in some circumstances when it messed up predictions.

On Thursday, 6 July 2017 at 09:00:47 UTC, Andrea Fontana wrote: > On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: >> From the programmer's point of view the original code makes sense. >> A correct lowering would be: >> >>> ubyte[256] data; >>> for(ubyte i = 0;;++i) { >>> ubyte x = data[i]; >>> ... >>> if(i==255) break; >>> } > > or: > > ubyte[256] data; > foreach(ubyte i; 0..256) { > ubyte x = data[i]; > } > Yes. Much better. What's the rewrite in this case? Using a size_t internally and casting to ubyte?

On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: > A correct lowering would be: > >> ubyte[256] data; >> for(ubyte i = 0;;++i) { >> ubyte x = data[i]; >> ... >> if(i==255) break; >> } That could lead to two branches in machine language, try to think about it in terms of if and do-while loops to mirror typical machine language. The correct lowering is: ubyte[256] data; if (data.length > 0) { ubyte i = 0; do { writeln(i); } while ((++i) != cast(ubyte)data.length); }

On Thursday, 6 July 2017 at 09:11:44 UTC, Ola Fosheim Grøstad wrote: > ubyte[256] data; > > if (data.length > 0) { > ubyte i = 0; > do { > writeln(i); > } while ((++i) != cast(ubyte)data.length); > } Here is another version that will work ok on CPUs that can issue many instructions in parallel if there are no dependencies between them as you avoid an explicit comparison on the counter (zero tests tend be be free): ubyte[N] data; size_t _counter = data.length; if( _counter !=0){ ubyte i = 0; do { writeln(i); i++; } while (--_counter); }

On Thursday, 6 July 2017 at 09:06:18 UTC, Guillaume Chatelet wrote: >> ubyte[256] data; >> foreach(ubyte i; 0..256) { >> ubyte x = data[i]; >> } >> > > Yes. Much better. What's the rewrite in this case? Using a size_t internally and casting to ubyte? I was just wondering

Forums