November 30, 2023

On Thursday, 30 November 2023 at 10:49:47 UTC, IGotD- wrote:

>

On Wednesday, 29 November 2023 at 21:47:09 UTC, bachmeier wrote:

>

In the case you're talking about, you could do

foreach(_idx, v; arr) {
  int idx = cast(int) _idx;
}

I don't mind ugly and verbose code if there's sufficient benefit. There's no benefit in this case.

I don't understand why this is not an OK solution instead of adding yet another lowering for the special case of using int instead of size_t. Right now in D indexes are size_t, and that should be the default.

This feels more like a D3 discussion if indexes should be int or size_t.

That is because there is no reason to have explicit conversions everywhere; they are overly verbose and ugly. This is one of the reasons I don't use foreach in my code. A good old for loop lets you decide the type. Modern programming languages should reduce friction, not increase it.
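A minimal sketch of the two styles being compared, assuming a small array named `arr` (the name and values are illustrative only): the foreach index is size_t and needs a cast to become an int, while the for loop declares its own index type.

```d
import std.stdio : writeln;

void main()
{
    auto arr = [10, 20, 30];

    // foreach: the index is size_t, so getting an int takes an explicit cast.
    foreach (i, v; arr)
    {
        static assert(is(typeof(i) == size_t));
        int idx = cast(int) i;
        writeln(idx, ": ", v);
    }

    // for: the programmer chooses the index type directly.
    for (int idx = 0; idx < arr.length; idx++)
        writeln(idx, ": ", arr[idx]);
}
```

Note that the for version compiles without complaint even though it has the same range limitation as the cast.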

November 30, 2023

On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:

>

That is because there is no reason to have explicit conversions everywhere; they are overly verbose and ugly. This is one of the reasons I don't use foreach in my code. A good old for loop lets you decide the type. Modern programming languages should reduce friction, not increase it.

One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".

November 30, 2023

On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven Schveighoffer wrote:

>

For those who are unaware, this used to work:

auto arr = [1, 2, 3];
foreach(int idx, v; arr) {
    ...
}

But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

-Steve

What's the problem with adding an optional parameter to enumerate that defaults to size_t?

enumerate!int(xxx)
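For what it's worth, std.range.enumerate in Phobos should already take the counter type as its first template argument, defaulting to size_t. A minimal sketch, assuming an illustrative array `arr`:

```d
import std.range : enumerate;
import std.stdio : writeln;

void main()
{
    auto arr = [10, 20, 30];

    // The counter type is the first template argument of enumerate.
    foreach (idx, v; arr.enumerate!int)
    {
        static assert(is(typeof(idx) == int));
        writeln(idx, ": ", v);
    }
}
```

This gives an int index without a cast in the loop body, at the cost of going through a range instead of the built-in array foreach.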

November 30, 2023

On Thursday, 30 November 2023 at 11:54:01 UTC, IGotD- wrote:

>

On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:

>

That is because there is no reason to have explicit conversions everywhere; they are overly verbose and ugly. This is one of the reasons I don't use foreach in my code. A good old for loop lets you decide the type. Modern programming languages should reduce friction, not increase it.

One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".

Do you have an example where it can cause problems in this case?

November 30, 2023
On Thursday, November 30, 2023 7:31:42 AM MST bachmeier via Digitalmars-d wrote:
> On Thursday, 30 November 2023 at 11:54:01 UTC, IGotD- wrote:
> > On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:
> >> That is because there is no reason to have explicit conversions everywhere; they are overly verbose and ugly. This is one of the reasons I don't use `foreach` in my code. A good old `for` loop lets you decide the type. Modern programming languages should reduce friction, not increase it.
> >
> > One of the cornerstones of D is that there are hardly any implicit conversions. I'm not saying I'm personally against it, but it is one of the main D "features".
>
> Do you have an example where it can cause problems in this case?

The main place that using int with the index for foreach would cause problems is if you're programming on a 32-bit system and then later start compiling that code on a 64-bit system.

Because size_t is uint on 32-bit systems, using int with foreach works just fine aside from the issue of signed vs unsigned (which D doesn't consider to be a narrowing conversion, for better or worse). So, someone could use int with foreach on a 32-bit system and have no problems, but when they move to a 64-bit system, it could become a big problem, because there, size_t is ulong. So, code that worked fine on a 32-bit system could then break on a 64-bit system (assuming that it then starts operating on arrays that are larger than a 32-bit system could handle).

If the implicit narrowing conversion is forbidden with foreach (like it is pretty much everywhere else in the language), and you used int or uint for the index, then when you go to compile that code on a 64-bit system, you'll get a compiler error and can fix the code as appropriate. If we instead allow it to be int or uint and treat it like an explicit cast, then you have a silent bug.
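A rough sketch of the difference described above, using plain variables rather than foreach (the variable names and the value 5_000_000_000 are illustrative assumptions): on a 64-bit target the implicit conversion is rejected at compile time, while an explicit cast silently truncates.

```d
void main()
{
    static if (size_t.sizeof == 8)
    {
        // 64-bit: size_t is ulong, so narrowing to int or uint is rejected.
        static assert(!__traits(compiles, { size_t n; int x = n; }));
        static assert(!__traits(compiles, { size_t n; uint x = n; }));
    }
    else
    {
        // 32-bit: size_t is uint, so both of these are accepted.
        static assert(__traits(compiles, { size_t n; int x = n; }));
        static assert(__traits(compiles, { size_t n; uint x = n; }));
    }

    // An explicit cast compiles everywhere and drops the high bits.
    ulong big = 5_000_000_000;
    int truncated = cast(int) big;
    assert(truncated == 705_032_704); // 5_000_000_000 - 2^32
}
```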

And because there wasn't actually an explicit cast involved, you can't even grep for the cast keyword. There's also no way to tell at a glance whether the programmer purposefully used int knowing that the conversion would take place or whether they just used int because that worked for what they were doing on a 32-bit system but then didn't work later on 64-bit systems (and of course, the fact that the code originally worked with int on a 64-bit system doesn't mean that it will later, since the code that generates the array could later be changed to allow much larger arrays than was originally the case).

Of course, if you're using int instead of uint, then you actually still risk bugs on 32-bit systems if the array is large enough (depending on what exactly is done with the index), but the core idea of disallowing the implicit conversion with foreach is to force the programmer to deal with the issue rather than silently having code that could have an index that's larger than the type being used can actually hold.

Now, the question then is how likely that particular bug is vs someone purposefully using int on a 64-bit system, because they wanted int, and they didn't think that there was any way that they would be operating on an array that's larger than int.max or uint.max in length. Outside of operating on large files and "big data," there probably aren't a lot of programs that are going to have arrays large enough that int won't be enough to index them, so there will be a lot of code which could use int with foreach and have no problems whatsoever. And clearly, there are folks who have used int with foreach in the past and had no problems with it, so they're annoyed at being forced to use size_t instead.

Personally, I tend to lean towards the pedantic approach here and think that anything and everything involving indexing should be using size_t so that it will work across architectures and not have bugs due to casting to smaller types, but obviously, smart people can disagree on the issue. It's not like allowing int in foreach is wrong across the board. It's just that it can cause a particular class of bugs, and arguably, it's an implicit conversion rather than an explicit conversion, which goes against how D's type conversion rules normally work. But it's also the case that using int in foreach was allowed and treated as an explicit cast for years.

So, I think that the change to requiring size_t was a good one, but I can also see why some folks would get annoyed by it, and it's a much bigger deal than it would have been otherwise, because the behavior was changed rather than it always having required a type that size_t can implicitly convert to.

- Jonathan M Davis



November 30, 2023
On Thursday, 30 November 2023 at 15:25:52 UTC, Jonathan M Davis wrote:

> And because there wasn't actually an explicit cast involved, you can't even grep for the cast keyword.

grep 'foreach(int'


November 30, 2023

On Wednesday, 29 November 2023 at 15:48:25 UTC, Steven Schveighoffer wrote:

>

On Wednesday, 29 November 2023 at 14:56:50 UTC, Steven Schveighoffer wrote:

>

I don’t know how many times I get caught with size_t indexes but I want them to be int or uint. It’s especially painful in my class that I’m teaching where I don’t want to yet explain why int doesn’t work there and have to introduce casting or use to!int. All for the possibility that I have an array larger than 2 billion elements.

I am forgetting why we removed this in the first place.

Can we have the compiler insert an assert at the loop start that the bounds are in range when you use a smaller int type? Clearly the common case is that the array is small enough for int indexes.

For those who are unaware, this used to work:

auto arr = [1, 2, 3];
foreach(int idx, v; arr) {
    ...
}

But was removed at some point. I think it should be brought back (we are bringing stuff back now, right? Like hex strings?)

  • What about the other integer types? (uint, short, byte, char?)
  • What if arr.length >= int.max? Does the loop become infinite, or does the loop counter stay size_t (not accessible by the user) and get cast to int (idx) on every iteration?

-Johan

November 30, 2023

On Thursday, 30 November 2023 at 18:26:55 UTC, Johan wrote:

>
  • What about the other integer types? (uint, short, byte, char?)

Yeah, you can use those too. I think the right answer is to ensure the length of the array being iterated can't exceed the value range of the type using an assert.
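A hand-written approximation of that idea, not what the compiler does today: `assertIndexFits` is a hypothetical helper name, and the check runs once before the loop so the cast inside the body is known to be safe.

```d
import std.stdio : writeln;

/// Hypothetical helper: asserts that every index of `arr` fits in `T`.
void assertIndexFits(T, E)(const E[] arr)
{
    assert(arr.length == 0 || arr.length - 1 <= T.max,
           "array is too long for the requested index type");
}

void main()
{
    auto arr = [1, 2, 3];

    assertIndexFits!int(arr);   // checked once, at loop start
    foreach (i, v; arr)
    {
        int idx = cast(int) i;  // cannot truncate, given the assert above
        writeln(idx, ": ", v);
    }
}
```

With ubyte as the index type, an array of 256 elements would pass this check (indexes 0 to 255) and one of 257 would fail it.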

And oh god, I know we have to do char because "it's an integer too!". bool is one as well; you could do foreach(bool idx, v; arr).

>
  • What if arr.length >= int.max? Does the loop become infinite, or does the loop counter stay size_t (not accessible by the user) and get cast to int (idx) on every iteration?

I actually tested this (with ubyte, not int), and what happens is interesting. You only get length % (T.max + 1) (or something like that) elements.

For instance I did:

foreach(ubyte idx, v; iota(270).array)
{
    writeln(idx);
}

and it printed 0 to 13, and was done.

So clearly it uses ubyte as the index for iteration, and also somehow converts the length to ubyte instead of the other way around.
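The 14 iterations (0 to 13) are consistent with the length itself being truncated to ubyte, since 270 modulo 256 is 14. A small check of that arithmetic; it only verifies the truncation and says nothing about how the compiler actually lowers the loop:

```d
void main()
{
    size_t length = 270;

    // Truncating the length to ubyte keeps only the low 8 bits.
    ubyte truncated = cast(ubyte) length;
    assert(truncated == 14);

    // The same value expressed as a modulus.
    assert(length % (ubyte.max + 1) == 14);
}
```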

But it doesn't use it exactly, because modifying idx doesn't change the loop. So I don't see why it uses ubyte for the actual index instead of size_t.

Honestly, maybe the easiest fix here is just to fix the actual lowering to be more sensical (I would have expected 270 iterations with repeated indexes 0 to 13 after going through 255).

-Steve

November 30, 2023

On Thursday, 30 November 2023 at 21:28:51 UTC, Steven Schveighoffer wrote:

>

Honestly, maybe the easiest fix here is just to fix the actual lowering to be more sensical (I would have expected 270 iterations with repeated indexes 0 to 13 after going through 255).

Or maybe a runtime error, similar to how out-of-bounds array accesses are handled, would be more reasonable here?

December 01, 2023

On Thursday, 30 November 2023 at 11:01:15 UTC, Hipreme wrote:

>

That is because there is no reason to have explicit conversions everywhere; they are overly verbose and ugly. This is one of the reasons I don't use foreach in my code. A good old for loop lets you decide the type. Modern programming languages should reduce friction, not increase it.

Modern languages should detect bugs:

for (ubyte i = 0; i != a.length; i++) {}

The above compiles without error, but never terminates if a has more than 255 elements.

foreach (ubyte i; 0 .. a.length) {} // compile error
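Both claims can be checked in a small program. This is a sketch under assumptions: the array size of 300 and the cap of 1000 steps are arbitrary choices, and the cap exists only so the demonstration finishes.

```d
void main()
{
    int[] a = new int[](300);

    // The foreach range form is rejected: a.length does not implicitly
    // convert to ubyte.
    static assert(!__traits(compiles, { foreach (ubyte i; 0 .. a.length) {} }));
    static assert( __traits(compiles, { foreach (size_t i; 0 .. a.length) {} }));

    // The for form compiles, but i wraps from 255 back to 0 and never
    // equals 300. The step cap stands in for "runs forever".
    ubyte i = 0;
    size_t steps = 0;
    for (; i != a.length && steps < 1000; i++)
        steps++;
    assert(steps == 1000); // the cap was hit, not the loop condition
}
```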