Interesting user mistake (page 4)

On 09/03/2015 01:15 PM, "Luís Marques <luis@luismarques.eu>" wrote:
> On Thursday, 3 September 2015 at 17:12:31 UTC, H. S. Teoh wrote:
>> Is there a way for the lexer to check for the specific character sequence '=', '+', whitespace and not others (e.g. '=', whitespace, '+')? IOW, "a =+ b" will be prohibited, but "a = + b" will be allowed. If so, I agree with this.
>>
>> On that note, though, the unary + operator is totally useless in D... maybe we should get rid of that instead? (Then "=+" will automatically be an error.)
>
> What about the generalization? E.g., '=', '-', whitespace?

Yah, '-' with the wrong spacing around it also makes sense. -- Andrei

September 04, 2015

Re: Interesting user mistake

Posted by Observer
in reply to Enamex

On Thursday, 3 September 2015 at 17:15:25 UTC, Enamex wrote:
> On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
>> we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
>
> Wasn't the original design for these operators in whichever language started them actually the swapped form? In any case, I'd personally choose to warn against any operator sequence without separating whitespaces if the sequence isn't in fact one operator (like "=!" (non-warn: "= !") vs "!=", and "=&", and a lot others, given operator overloading).

The += operator originally appeared in C, as the =+ operator (K&R, 1/e, p212, "Anachronisms").  Not long afterward, the ambiguity of a=+b became apparent, along with the obvious need to change the language to resolve the problem.  The issue was dealt with over several successive releases of the compiler:

(1) =+ operator is available (original compiler)
(2) =+ and += are both available (I'm not sure whether this step existed)
(3) =+ and += are both available; =+ produces a deprecation warning
(4) += is available; =+ now produces either a warning/error or just changes meaning, not sure which

I don't recall now where I read about that sequence of steps, and a quick, incomplete scan of my library doesn't yield it up so I could be more precise.  Nonetheless, that's how the transition happened.  The Rationale for the original ANSI C (X3.159-1989) mentions that =op forms have been dropped, and that in a Quiet Change, "expressions of the form x=-3 change meaning with the loss of the old-style assignment operators".  Which I suppose implies that the Standard itself doesn't require a warning message, though presumably high-quality compilers would be free to implement one.
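As a minimal sketch of that quiet change, the following C functions show how a current compiler reads the spellings in question. The function names are hypothetical and chosen only for illustration; the behavior shown is that of ANSI C and later, not of the pre-ANSI compilers described above.

```c
/* Modern C tokenizes "x=-3" as x = (-3). Pre-ANSI compilers read "=-"
   as the old spelling of "-=". */
int assign_negative(int x)
{
    x=-3;       /* assigns -3, regardless of the previous value of x */
    return x;
}

int subtract_in_place(int x)
{
    x -= 3;     /* what the old "=-" spelling used to mean */
    return x;
}

int assign_unary_plus(int a, int b)
{
    a =+ b;     /* legal today: a = (+b), not a += b */
    return a;
}
```

With these definitions, `assign_negative(10)` yields -3, `subtract_in_place(10)` yields 7, and `assign_unary_plus(10, 5)` yields 5 rather than 15, which is exactly the kind of silent surprise the thread is about.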

K&R C did not contain a unary + operator (K&R, 1/e, p. 37, Section 2.5).  It was added by the first ANSI C, "for symmetry with unary -" (K&R, 2/e, p204, Section A7.4.4).  "An integral operand undergoes integral promotion."
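The integral promotion can be observed in C with `sizeof`, as in this small sketch (the function names are hypothetical). It assumes nothing beyond standard C: a `char` operand of unary + is promoted, so the result has the size of the promoted type.

```c
#include <stddef.h>

/* Size of a plain char object: always 1 by definition. */
size_t size_of_plain_char(void)
{
    char c = 'x';
    return sizeof(c);
}

/* Size of the same object after unary +: the operand is promoted,
   so this is the size of the promoted type (int on typical platforms). */
size_t size_of_promoted_char(void)
{
    char c = 'x';
    return sizeof(+c);
}
```

So in C unary + is not entirely a no-op: it can change the type of a narrow integer expression even though the value is unchanged.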

In terms of compiler quality, we have a long history of compilers generating warning messages for legal but questionable constructions.  The first one that comes quickly to mind is GCC complaining about "if(a=b)":

    warning: suggest parentheses around assignment used as truth value [-Wparentheses]

The notion here is that a common mistake is handled by:

(a) being warned about, when warnings are enabled (at least, by -Wall in GCC)
(b) having an alternate construction suggested (e.g., "if((a=b))")
(c) having a specific compiler flag to control generation of each such warning
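A short C sketch of the alternate construction in (b), with a hypothetical function name: the doubled parentheses tell GCC that the assignment inside the condition is intentional, which suppresses the -Wparentheses warning.

```c
/* Copies b into a local and reports whether the copied value is nonzero. */
int copy_is_nonzero(int b)
{
    int a;
    if ((a = b))    /* doubled parentheses: assignment is intended */
        return 1;
    return 0;
}
```

Writing `if (a = b)` instead compiles to the same code but draws the warning quoted above when -Wall or -Wparentheses is enabled.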

On Friday, 4 September 2015 at 17:17:26 UTC, Andrei Alexandrescu wrote:
> On 09/03/2015 01:08 PM, H. S. Teoh via Digitalmars-d wrote:
>> [...]
>>
>> Is there a way for the lexer to check for the specific character sequence '=', '+', whitespace and not others (e.g. '=', whitespace, '+')? IOW, "a =+ b" will be prohibited, but "a = + b" will be allowed. If so, I agree with this.
>
> Yah, space is relevant there. That's why the check is easiest done during tokenization. -- Andrei

Given ` a =+ b `, I see no issue with the statement assuming 'b' is of some type T that overloads the unary + operator. ie. ` a = b.opUnary!"+" `

And while the expression could also be written as ` a = +b `, there's a number of situations where it's hard to control the formatting (ie. generated mixin code). That, and I can't think of any other C-like language where ` =+ ` would produce an error.

A simple solution would be to just have unary + perform integer promotion, as it does in C.

On 9/3/15, Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>
> The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.

The gist of it is that it's interesting because it's such a rare occurrence. That's why we probably shouldn't even think about it. I've never seen such mistakes in OSS code before, and I've never seen it in production code either. It's so rare that we shouldn't spend any time thinking about it.

Sure it's interesting, but why bother with this special case when there's bigger fish to fry?

On Fri, Sep 04, 2015 at 10:41:05PM +0200, Andrej Mitrovic via Digitalmars-d wrote:
> On 9/3/15, Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> > http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
> >
> > The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
>
> The gist of it is that it's interesting because it's such a rare occurrence. That's why we probably shouldn't even think about it. I've never seen such mistakes in OSS code before, and I've never seen it in production code either. It's so rare that we shouldn't spend any time thinking about it.
>
> Sure it's interesting, but why bother with this special case when there's bigger fish to fry?

https://en.wikipedia.org/wiki/Parkinson%27s_law_of_triviality

:-)

T

--
"I'm not childish; I'm just in touch with the child within!" - RL

On 9/4/15, H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> https://en.wikipedia.org/wiki/Parkinson%27s_law_of_triviality

The irony is that I now spend most of my time thinking about really pressing issues, I can finally see the forest for the trees. But I'm no longer an OSS dev so all of my effort goes into my company (which I love). I wish I had this kind of insight back in the day. :)

On 09/04/2015 08:02 AM, Steven Schveighoffer wrote:
> On 9/3/15 5:59 PM, Brian Schott wrote:
>> On Thursday, 3 September 2015 at 17:17:26 UTC, Steven Schveighoffer wrote:
>>> What about all other operations that may be typos from op= where op is also a unary operator? e.g. =-
>>
>> We'd have to special-case '*':
>>
>> a=*b;
>
> You could say the same thing for =-:
>
> a=-b;
>
> seems reasonable for someone who doesn't like whitespace. I think Andrei's rule was the token sequence must have whitespace after the operator in order to be rejected. So the above would be fine.

Yah, that's what I was thinking. -- Andrei
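A minimal sketch of the rule as discussed here, written in C and not taken from any real D lexer. The function name `find_swapped_op` is hypothetical. As simplifying assumptions, it scans raw characters rather than tokens, ignores string literals and comments, and only covers '+' and '-' (not '*').

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first '=' that is immediately followed by
   '+' or '-' and then whitespace, or -1 if there is none. */
long find_swapped_op(const char *src)
{
    size_t n = strlen(src);
    for (size_t i = 0; i + 2 < n; i++) {
        if (src[i] != '=')
            continue;
        /* Skip an '=' that ends a longer operator such as ==, +=, !=, <=. */
        if (i > 0 && strchr("=!<>+-*/%&|^~", src[i - 1]) != NULL)
            continue;
        /* '=' then '+' or '-' then whitespace: likely a swapped += or -=. */
        if ((src[i + 1] == '+' || src[i + 1] == '-') &&
            isspace((unsigned char)src[i + 2]))
            return (long)i;
    }
    return -1;
}
```

Under this sketch, "a =+ b" and "a =- b" are flagged at offset 2, while "a = + b", "a=+b;", "a=-b;" and "a += b" are all accepted, which matches the whitespace-sensitive rule described in the post.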

On 09/04/2015 08:47 AM, immu wrote:
> On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
>> I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
>>
>> Andrei
>
> Please don't. That would feel like a completely arbitrary exception in the grammar.

Not a good counterargument! -- Andrei

On 09/04/2015 12:39 PM, skoppe wrote:
> On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
>> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>>
>> The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
>>
>> Andrei
>
> Seems like a really, really small fish to catch. I wouldn't want to litter my codebase with those kind of rules.

I don't see what litter one would need to add to one's codebase?

> Besides, isn't stackoverflow about the answers and opinions, rather than about the questions?

That's not a criterion to judge any discussion that originates there.

Andrei

On 09/04/2015 02:55 PM, Mint wrote:
> On Friday, 4 September 2015 at 17:17:26 UTC, Andrei Alexandrescu wrote:
>> On 09/03/2015 01:08 PM, H. S. Teoh via Digitalmars-d wrote:
>>> [...]
>>>
>>> Is there a way for the lexer to check for the specific character sequence '=', '+', whitespace and not others (e.g. '=', whitespace, '+')? IOW, "a =+ b" will be prohibited, but "a = + b" will be allowed. If so, I agree with this.
>>
>> Yah, space is relevant there. That's why the check is easiest done during tokenization. -- Andrei
>
> Given ` a =+ b `, I see no issue with the statement assuming 'b' is of some type T that overloads the unary + operator. ie. ` a = b.opUnary!"+" `
>
> And while the expression could also be written as ` a = +b `, there's a number of situations where it's hard to control the formatting (ie. generated mixin code). That, and I can't think of any other C-like language where ` =+ ` would produce an error.
>
> A simple solution would be to just have unary + perform integer promotion, as it does in C.

I am now sorry I started this. -- Andrei
