February 26, 2012
I just got bitten by this again. It usually looks something like this:

-----------------
struct Vector
{
    ...
    void normalize() { this /= length; }
}

Vector[] vs;
...
foreach (v; vs)
    v.normalize();
-----------------

The bug here is that the loop does nothing because it is normalizing a temporary copy of the vector, rather than the vectors in the array.

Am I the only person that gets caught by this trap?

Perhaps the loop variable should be implicitly const so that this type of thing won't happen?
February 26, 2012
On Sunday, February 26, 2012 09:01:23 Peter Alexander wrote:
> I just got bitten by this again. It usually looks something like this:
> 
> -----------------
> struct Vector
> {
>      ...
>      void normalize() { this /= length; }
> }
> 
> Vector[] vs;
> ...
> foreach (v; vs)
>      v.normalize();
> -----------------
> 
> The bug here is that the loop does nothing because it is normalizing a temporary copy of the vector, rather than the vectors in the array.
> 
> Am I the only person that gets caught by this trap?
> 
> Perhaps the loop variable should be implicitly const so that this type of thing won't happen?

That would be annoying. Sometimes, you _want_ to be able to alter the variable, even if it _is_ a copy. And if it's a reference type, then you don't have the problem at all, and making it const would be completely restrictive.

I really don't know how we could make it easier for you to avoid this error without making it impossible to other, useful things. But your code works if you use ref. You just have to remember to do that.

- Jonathan M Davis
February 26, 2012
"Peter Alexander" <peter.alexander.au@gmail.com> wrote in message news:awuvuwxwgjezcdrfswyn@forum.dlang.org...

> Perhaps the loop variable should be implicitly const so that this type of thing won't happen?

Making it const would break heaps of valid use cases.  It should probably be an rvalue, and so should the index.  I think there's a discussion about this burried somewhere in bugzilla.


February 26, 2012
On Sunday, February 26, 2012 20:57:10 Daniel Murphy wrote:
> "Peter Alexander" <peter.alexander.au@gmail.com> wrote in message news:awuvuwxwgjezcdrfswyn@forum.dlang.org...
> 
> > Perhaps the loop variable should be implicitly const so that this type of thing won't happen?
> 
> Making it const would break heaps of valid use cases.  It should probably be an rvalue, and so should the index.  I think there's a discussion about this burried somewhere in bugzilla.

The index bit is highly debatable, as the discussion on that shows (personally, I tend to favor leaving it as-is, because it's useful that way), but regardless making either an rvalue wouldn't make any sense, because they have to be variables that you can operate on in the loop. Doing anything else would seriously break the semantics of foreach. The question is simply whether they should be actual copies so that you don't end up with them altering anything outside of the loop unless you mark them as ref or if the type itself is a reference type. The element is pretty much its own copy regardless, unless you mark it as ref, whereas the situation with the index is more complicated (as the discussion on that shows).

- Jonathan M Davis
February 26, 2012
Peter Alexander:

> Am I the only person that gets caught by this trap?

I have being hit by this bug some times. I think this is a common enough bug in D code. But it's not easy to invent a way to solve it.

----------------

Jonathan M Davis:

> I really don't know how we could make it easier for you to avoid this error without making it impossible to other, useful things. But your code works if you use ref. You just have to remember to do that.

To "just remember" to avoid the bug is an option, but programmers are not perfect.

One possible solution is to generate a warning when a mutable struct copy is done. But I think this causes too many false positives (false alarms). In theory the compiler may avoid showing the warning if the copied mutable struct is _clearly_ not modified inside the loop body. This reduces the number of false positives, but I think not enough.


Another possible solution is to require something in the foreach, like "ref" or "copy" (or "dup" instead of "copy"). Here I think it's clear that 'v' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is not mutable):

struct Vector {
     void normalize() { this /= length; }
}
Vector[] vs;
foreach (copy v; vs)
     v.normalize();


A mix of the two ideas is to generate the warning unless a "copy" or "ref" is present when there is a mutable copy. So "copy" is used just to silence the warning.


Or maybe just an inlined comment /*copy*/ is needed to silence the warning/error:

foreach (/*copy*/ v; vs)
     v.normalize();

Or a different foreach:

foreach_copy (v; vs)
     v.normalize();

I think none of the ideas I've shown is good enough.

Bye,
bearophile
February 26, 2012
On 27 February 2012 10:31, bearophile <bearophileHUGS@lycos.com> wrote:
> Peter Alexander:
>
>> Am I the only person that gets caught by this trap?
>
> I have being hit by this bug some times. I think this is a common enough bug in D code. But it's not easy to invent a way to solve it.
>
> ----------------
>
> Jonathan M Davis:
>
>> I really don't know how we could make it easier for you to avoid this error without making it impossible to other, useful things. But your code works if you use ref. You just have to remember to do that.
>
> To "just remember" to avoid the bug is an option, but programmers are not perfect.
>
> One possible solution is to generate a warning when a mutable struct copy is done. But I think this causes too many false positives (false alarms). In theory the compiler may avoid showing the warning if the copied mutable struct is _clearly_ not modified inside the loop body. This reduces the number of false positives, but I think not enough.
>
>
> Another possible solution is to require something in the foreach, like "ref" or "copy" (or "dup" instead of "copy"). Here I think it's clear that 'v' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is not mutable):
>
> struct Vector {
>     void normalize() { this /= length; }
> }
> Vector[] vs;
> foreach (copy v; vs)
>     v.normalize();
>
>
> A mix of the two ideas is to generate the warning unless a "copy" or "ref" is present when there is a mutable copy. So "copy" is used just to silence the warning.
>
>
> Or maybe just an inlined comment /*copy*/ is needed to silence the warning/error:
>
> foreach (/*copy*/ v; vs)
>     v.normalize();
>
> Or a different foreach:
>
> foreach_copy (v; vs)
>     v.normalize();
>
> I think none of the ideas I've shown is good enough.
>
> Bye,
> bearophile

I'm with Jonathon, using `ref` is the best way, it is ridiculous to think that the compiler should be able to figure out what's going on in every case. After getting hit by it a couple times, then you'll remember to use ref if you want to alter a copy.

By the way, its the same in other languages, you need to use "reference" syntax to alter values in a loop. This is one case where its not /immediately/ obvious what the problem is, "fixing" it would inevitably cause other code to because "unituitive" so its pretty much a lose-lose scenario. Even in D, we are going to have code that compiles, runs, and doesn't do what you expect for some small reason. At least in this case its the simple matter of adding `ref` to the loop.

--
James Miller
February 27, 2012
On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis wrote:
>
> The index bit is highly debatable, as the discussion on that shows
> (personally, I tend to favor leaving it as-is, because it's useful that way),
> but regardless making either an rvalue wouldn't make any sense, because they
> have to be variables that you can operate on in the loop. Doing anything else
> would seriously break the semantics of foreach. The question is simply whether
> they should be actual copies so that you don't end up with them altering
> anything outside of the loop unless you mark them as ref or if the type itself
> is a reference type. The element is pretty much its own copy regardless,
> unless you mark it as ref, whereas the situation with the index is more
> complicated (as the discussion on that shows).
>
> - Jonathan M Davis

Why does making either an rvalue not make any sense?

foreach(<modifiers> var; agg)
   var = 7;

Then if var is not ref, the compiler marks it as an rvalue and gives an error if you try to modify or take the address of it.

It's pretty unlikely to happen, but it fixes the problem of accidentally modifying a foreach index or variable.  Just because it's in a variable doesn't mean the compiler can't treat it as an rvalue.

February 27, 2012
On Monday, February 27, 2012 05:51:37 Daniel Murphy wrote:
> On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis
> 
> wrote:
> > The index bit is highly debatable, as the discussion on that
> > shows
> > (personally, I tend to favor leaving it as-is, because it's
> > useful that way),
> > but regardless making either an rvalue wouldn't make any sense,
> > because they
> > have to be variables that you can operate on in the loop. Doing
> > anything else
> > would seriously break the semantics of foreach. The question is
> > simply whether
> > they should be actual copies so that you don't end up with them
> > altering
> > anything outside of the loop unless you mark them as ref or if
> > the type itself
> > is a reference type. The element is pretty much its own copy
> > regardless,
> > unless you mark it as ref, whereas the situation with the index
> > is more
> > complicated (as the discussion on that shows).
> > 
> > - Jonathan M Davis
> 
> Why does making either an rvalue not make any sense?
> 
> foreach(<modifiers> var; agg)
>     var = 7;
> 
> Then if var is not ref, the compiler marks it as an rvalue and gives an error if you try to modify or take the address of it.
> 
> It's pretty unlikely to happen, but it fixes the problem of accidentally modifying a foreach index or variable.  Just because it's in a variable doesn't mean the compiler can't treat it as an rvalue.

That would be _completely_ inconsistent with the rest of the language. Variables are _always_ lvalues. The closest that you can get to them being rvalues is if they're const or immutable, but even then, they're still lvalues, and you can take their address. It's just that the type system protects them from being altered.

But regardless, being unable to alter a foreach loop variable would be like making it so that you can no longer alter function parameters. Sure, it might fix some bugs, but it would be horribly restrictive a lot of the time.

- Jonathan M Davis
February 27, 2012
> I really don't know how we could make it easier for you to avoid this error
> without making it impossible to other, useful things. But your code works if
> you use ref. You just have to remember to do that.
>
They mostly behave like function parameters.
February 28, 2012
On Sunday, 26 February 2012 at 23:20:24 UTC, James Miller wrote:
> I'm with Jonathon, using `ref` is the best way, it is ridiculous to
> think that the compiler should be able to figure out what's going on
> in every case. After getting hit by it a couple times, then you'll
> remember to use ref if you want to alter a copy.

There's no need for the compiler to figure anything out. I was curious if there were any syntactic ways of solving this problem. Changing the existing syntax is not really an option, but brainstorming ideas can't hurt.

I agree that I can "just remember", but that's not a good way to go about creating robust software. People make mistakes. D already has lots of errors to help prevent against these kinds of common mistakes and making the obvious thing correct and safe is one of D's design goals.


> By the way, its the same in other languages, you need to use
> "reference" syntax to alter values in a loop.

C# isn't the same. If you iterate over structs you cannot modify the iteration variable. You have to explicitly copy.


« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home