View mode: basic / threaded / horizontal-split · Log in · Help
February 26, 2012
foreach by value iteration
I just got bitten by this again. It usually looks something like 
this:

-----------------
struct Vector
{
    ...
    void normalize() { this /= length; }
}

Vector[] vs;
...
foreach (v; vs)
    v.normalize();
-----------------

The bug here is that the loop does nothing because it is 
normalizing a temporary copy of the vector, rather than the 
vectors in the array.

Am I the only person that gets caught by this trap?

Perhaps the loop variable should be implicitly const so that this 
type of thing won't happen?
February 26, 2012
Re: foreach by value iteration
On Sunday, February 26, 2012 09:01:23 Peter Alexander wrote:
> I just got bitten by this again. It usually looks something like
> this:
> 
> -----------------
> struct Vector
> {
>      ...
>      void normalize() { this /= length; }
> }
> 
> Vector[] vs;
> ...
> foreach (v; vs)
>      v.normalize();
> -----------------
> 
> The bug here is that the loop does nothing because it is
> normalizing a temporary copy of the vector, rather than the
> vectors in the array.
> 
> Am I the only person that gets caught by this trap?
> 
> Perhaps the loop variable should be implicitly const so that this
> type of thing won't happen?

That would be annoying. Sometimes, you _want_ to be able to alter the 
variable, even if it _is_ a copy. And if it's a reference type, then you don't 
have the problem at all, and making it const would be completely restrictive.

I really don't know how we could make it easier for you to avoid this error 
without making it impossible to other, useful things. But your code works if 
you use ref. You just have to remember to do that.

- Jonathan M Davis
February 26, 2012
Re: foreach by value iteration
"Peter Alexander" <peter.alexander.au@gmail.com> wrote in message 
news:awuvuwxwgjezcdrfswyn@forum.dlang.org...

> Perhaps the loop variable should be implicitly const so that this type of 
> thing won't happen?

Making it const would break heaps of valid use cases.  It should probably be 
an rvalue, and so should the index.  I think there's a discussion about this 
burried somewhere in bugzilla.
February 26, 2012
Re: foreach by value iteration
On Sunday, February 26, 2012 20:57:10 Daniel Murphy wrote:
> "Peter Alexander" <peter.alexander.au@gmail.com> wrote in message
> news:awuvuwxwgjezcdrfswyn@forum.dlang.org...
> 
> > Perhaps the loop variable should be implicitly const so that this type of
> > thing won't happen?
> 
> Making it const would break heaps of valid use cases.  It should probably be
> an rvalue, and so should the index.  I think there's a discussion about
> this burried somewhere in bugzilla.

The index bit is highly debatable, as the discussion on that shows 
(personally, I tend to favor leaving it as-is, because it's useful that way), 
but regardless making either an rvalue wouldn't make any sense, because they 
have to be variables that you can operate on in the loop. Doing anything else 
would seriously break the semantics of foreach. The question is simply whether 
they should be actual copies so that you don't end up with them altering 
anything outside of the loop unless you mark them as ref or if the type itself 
is a reference type. The element is pretty much its own copy regardless, 
unless you mark it as ref, whereas the situation with the index is more 
complicated (as the discussion on that shows).

- Jonathan M Davis
February 26, 2012
Re: foreach by value iteration
Peter Alexander:

> Am I the only person that gets caught by this trap?

I have being hit by this bug some times. I think this is a common enough bug in D code. But it's not easy to invent a way to solve it.

----------------

Jonathan M Davis:

> I really don't know how we could make it easier for you to avoid this error
> without making it impossible to other, useful things. But your code works if
> you use ref. You just have to remember to do that.

To "just remember" to avoid the bug is an option, but programmers are not perfect.

One possible solution is to generate a warning when a mutable struct copy is done. But I think this causes too many false positives (false alarms). In theory the compiler may avoid showing the warning if the copied mutable struct is _clearly_ not modified inside the loop body. This reduces the number of false positives, but I think not enough.


Another possible solution is to require something in the foreach, like "ref" or "copy" (or "dup" instead of "copy"). Here I think it's clear that 'v' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is not mutable):

struct Vector {
    void normalize() { this /= length; }
}
Vector[] vs;
foreach (copy v; vs)
    v.normalize();


A mix of the two ideas is to generate the warning unless a "copy" or "ref" is present when there is a mutable copy. So "copy" is used just to silence the warning.


Or maybe just an inlined comment /*copy*/ is needed to silence the warning/error:

foreach (/*copy*/ v; vs)
    v.normalize();

Or a different foreach:

foreach_copy (v; vs)
    v.normalize();

I think none of the ideas I've shown is good enough.

Bye,
bearophile
February 26, 2012
Re: foreach by value iteration
On 27 February 2012 10:31, bearophile <bearophileHUGS@lycos.com> wrote:
> Peter Alexander:
>
>> Am I the only person that gets caught by this trap?
>
> I have being hit by this bug some times. I think this is a common enough bug in D code. But it's not easy to invent a way to solve it.
>
> ----------------
>
> Jonathan M Davis:
>
>> I really don't know how we could make it easier for you to avoid this error
>> without making it impossible to other, useful things. But your code works if
>> you use ref. You just have to remember to do that.
>
> To "just remember" to avoid the bug is an option, but programmers are not perfect.
>
> One possible solution is to generate a warning when a mutable struct copy is done. But I think this causes too many false positives (false alarms). In theory the compiler may avoid showing the warning if the copied mutable struct is _clearly_ not modified inside the loop body. This reduces the number of false positives, but I think not enough.
>
>
> Another possible solution is to require something in the foreach, like "ref" or "copy" (or "dup" instead of "copy"). Here I think it's clear that 'v' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is not mutable):
>
> struct Vector {
>     void normalize() { this /= length; }
> }
> Vector[] vs;
> foreach (copy v; vs)
>     v.normalize();
>
>
> A mix of the two ideas is to generate the warning unless a "copy" or "ref" is present when there is a mutable copy. So "copy" is used just to silence the warning.
>
>
> Or maybe just an inlined comment /*copy*/ is needed to silence the warning/error:
>
> foreach (/*copy*/ v; vs)
>     v.normalize();
>
> Or a different foreach:
>
> foreach_copy (v; vs)
>     v.normalize();
>
> I think none of the ideas I've shown is good enough.
>
> Bye,
> bearophile

I'm with Jonathon, using `ref` is the best way, it is ridiculous to
think that the compiler should be able to figure out what's going on
in every case. After getting hit by it a couple times, then you'll
remember to use ref if you want to alter a copy.

By the way, its the same in other languages, you need to use
"reference" syntax to alter values in a loop. This is one case where
its not /immediately/ obvious what the problem is, "fixing" it would
inevitably cause other code to because "unituitive" so its pretty much
a lose-lose scenario. Even in D, we are going to have code that
compiles, runs, and doesn't do what you expect for some small reason.
At least in this case its the simple matter of adding `ref` to the
loop.

--
James Miller
February 27, 2012
Re: foreach by value iteration
On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis 
wrote:
>
> The index bit is highly debatable, as the discussion on that 
> shows
> (personally, I tend to favor leaving it as-is, because it's 
> useful that way),
> but regardless making either an rvalue wouldn't make any sense, 
> because they
> have to be variables that you can operate on in the loop. Doing 
> anything else
> would seriously break the semantics of foreach. The question is 
> simply whether
> they should be actual copies so that you don't end up with them 
> altering
> anything outside of the loop unless you mark them as ref or if 
> the type itself
> is a reference type. The element is pretty much its own copy 
> regardless,
> unless you mark it as ref, whereas the situation with the index 
> is more
> complicated (as the discussion on that shows).
>
> - Jonathan M Davis

Why does making either an rvalue not make any sense?

foreach(<modifiers> var; agg)
   var = 7;

Then if var is not ref, the compiler marks it as an rvalue and 
gives an error if you try to modify or take the address of it.

It's pretty unlikely to happen, but it fixes the problem of 
accidentally modifying a foreach index or variable.  Just because 
it's in a variable doesn't mean the compiler can't treat it as an 
rvalue.
February 27, 2012
Re: foreach by value iteration
On Monday, February 27, 2012 05:51:37 Daniel Murphy wrote:
> On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis
> 
> wrote:
> > The index bit is highly debatable, as the discussion on that
> > shows
> > (personally, I tend to favor leaving it as-is, because it's
> > useful that way),
> > but regardless making either an rvalue wouldn't make any sense,
> > because they
> > have to be variables that you can operate on in the loop. Doing
> > anything else
> > would seriously break the semantics of foreach. The question is
> > simply whether
> > they should be actual copies so that you don't end up with them
> > altering
> > anything outside of the loop unless you mark them as ref or if
> > the type itself
> > is a reference type. The element is pretty much its own copy
> > regardless,
> > unless you mark it as ref, whereas the situation with the index
> > is more
> > complicated (as the discussion on that shows).
> > 
> > - Jonathan M Davis
> 
> Why does making either an rvalue not make any sense?
> 
> foreach(<modifiers> var; agg)
>     var = 7;
> 
> Then if var is not ref, the compiler marks it as an rvalue and
> gives an error if you try to modify or take the address of it.
> 
> It's pretty unlikely to happen, but it fixes the problem of
> accidentally modifying a foreach index or variable.  Just because
> it's in a variable doesn't mean the compiler can't treat it as an
> rvalue.

That would be _completely_ inconsistent with the rest of the language. 
Variables are _always_ lvalues. The closest that you can get to them being 
rvalues is if they're const or immutable, but even then, they're still 
lvalues, and you can take their address. It's just that the type system 
protects them from being altered.

But regardless, being unable to alter a foreach loop variable would be like 
making it so that you can no longer alter function parameters. Sure, it might 
fix some bugs, but it would be horribly restrictive a lot of the time.

- Jonathan M Davis
February 27, 2012
Re: foreach by value iteration
> I really don't know how we could make it easier for you to avoid this  
> error
> without making it impossible to other, useful things. But your code  
> works if
> you use ref. You just have to remember to do that.
>
They mostly behave like function parameters.
February 28, 2012
Re: foreach by value iteration
On Sunday, 26 February 2012 at 23:20:24 UTC, James Miller wrote:
> I'm with Jonathon, using `ref` is the best way, it is 
> ridiculous to
> think that the compiler should be able to figure out what's 
> going on
> in every case. After getting hit by it a couple times, then 
> you'll
> remember to use ref if you want to alter a copy.

There's no need for the compiler to figure anything out. I was 
curious if there were any syntactic ways of solving this problem. 
Changing the existing syntax is not really an option, but 
brainstorming ideas can't hurt.

I agree that I can "just remember", but that's not a good way to 
go about creating robust software. People make mistakes. D 
already has lots of errors to help prevent against these kinds of 
common mistakes and making the obvious thing correct and safe is 
one of D's design goals.


> By the way, its the same in other languages, you need to use
> "reference" syntax to alter values in a loop.

C# isn't the same. If you iterate over structs you cannot modify 
the iteration variable. You have to explicitly copy.
« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home