September 04, 2015
On Friday, 4 September 2015 at 12:02:26 UTC, Steven Schveighoffer wrote:
> On 9/3/15 5:59 PM, Brian Schott wrote:
>> On Thursday, 3 September 2015 at 17:17:26 UTC, Steven Schveighoffer wrote:
>>> What about all other operations that may be typos from op= where op is
>>> also a unary operator? e.g. =-
>>
>> We'd have to special-case '*':
>>
>> a=*b;
>>
>
> You could say the same thing for =-:
>
> a=-b;
>
> seems reasonable for someone who doesn't like whitespace. I think Andrei's rule was the token sequence must have whitespace after the operator in order to be rejected. So the above would be fine.

    Vector!double p;
    p.x=+ 0.27;
    p.y=-11.91;
    p.z=- 8.24;

September 04, 2015
On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>
> The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.

Well, in this particular case, wouldn't it make more sense to simply make + as a unary operator illegal? It doesn't even _do_ anything, and I, for one, didn't even know that such an operator existed before this thread.

In general though, I think that warning based on whitespace is a slippery slope, especially when the language is designed such that whitespace doesn't matter unless it's needed to keep adjacent tokens from being read as a single compound token. And this particular case isn't even generalizable, because there are other operators which _would_ legitimately appear in that order, and if you don't put whitespace around your operators, then you'll end up with stuff like =-a and =*a, which are perfectly legitimate. So, we're forced to either special-case =+, or we're essentially going to require that you put a space after =, and while I think that everyone should, that's basically requiring folks to format their code in a particular way, which isn't in line with D or its C/C++ ancestry or with the expectations of the developers who use such languages.

So, honestly, I'm more inclined to tell folks that maybe if they want their code to be clear and avoid mistakes like this, they should put whitespace around operators, but it's a free world, and they're free to format their code in a way that's harder to read and more likely to help them shoot themselves in the foot if they really want to.

But regardless, I question whether allowing + as a unary operator even makes sense in the first place. I guess that some folks might use it to try and make the difference between a positive and a negative number more obvious? But I would have thought that it would make it _harder_ to distinguish them rather than easier. So, maybe we can just get rid of the unary + operator. But still, that would only help this particular case, and other operators which _do_ make sense as unary operators to the right of an assignment still have the potential of being mistyped with painful results if programmers don't actually put whitespace around = like most of us would.

- Jonathan M Davis
September 04, 2015
On Friday, 4 September 2015 at 13:55:03 UTC, Jonathan M Davis wrote:
> On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
>> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>>
>> The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
[snip]

Actually, I may have misunderstood the suggestion. I do _not_ think that we should require that someone who writes code like

a=+b;

should be forced to put whitespace in their code, as ugly as it arguably is that they don't (which is what I thought was being suggested). However, if they've written their code like

a =+ b;

then it would make sense to warn about it, since the odds of that being legitimate are nearly zero, and the same goes for any other unary operator. Someone might be weird and choose to put whitespace after the unary operator, but I don't think that that's very common, and if someone is doing that, they're not likely to then _not_ put a space before the unary operator. So, I don't think that we'd really be cramping anyone's style (be it ugly or otherwise) if we warned about =+ when there's whitespace on both sides of =+ but not between the = and the +.

Now, I tend to think that anything should either be an error or not and that everything else should be left to a linter, since it's subjective, so I'm still not a big fan of having something like this be a warning, but I very much doubt that it'll really cause any problems if it is, since I have a hard time believing that anyone is even going to _want_ to write

a =+ b;
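
To make the failure mode concrete, here's a minimal sketch of how the typo silently changes behavior. The function names are made up for illustration, and it assumes a compiler that accepts =+ without complaint, as was the case when this thread was written:

```d
import std.stdio : writeln;

// Hypothetical helper: the author meant +=, but typed =+.
int accumulateTypo(int[] values)
{
    int total = 0;
    foreach (v; values)
        total =+ v; // parsed as total = (+v), so total is overwritten
    return total;
}

// Hypothetical helper: the intended compound assignment.
int accumulate(int[] values)
{
    int total = 0;
    foreach (v; values)
        total += v; // adds v to the running sum
    return total;
}

void main()
{
    writeln(accumulateTypo([1, 2, 3])); // prints 3 (only the last element survives)
    writeln(accumulate([1, 2, 3]));     // prints 6
}
```

Both versions compile and run, which is why the mistake is easy to miss.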

- Jonathan M Davis
September 04, 2015
On Friday, 4 September 2015 at 14:05:09 UTC, Jonathan M Davis wrote:
> On Friday, 4 September 2015 at 13:55:03 UTC, Jonathan M Davis wrote:
>>> [...]
> [snip]
>
> [...]

Isn't it called Maximal Munch...

https://en.wikipedia.org/wiki/Maximal_munch

Regards, -<mike>-
September 04, 2015
On Friday, 4 September 2015 at 14:14:43 UTC, Mike James wrote:
> On Friday, 4 September 2015 at 14:05:09 UTC, Jonathan M Davis wrote:
>> On Friday, 4 September 2015 at 13:55:03 UTC, Jonathan M Davis wrote:
>>>> [...]
>> [snip]
>>
>> [...]
>
> Isn't it called Maximal Munch...
>
> https://en.wikipedia.org/wiki/Maximal_munch
>
> Regards, -<mike>-

Yes. That's how most languages typically split their source into tokens, but some programming languages are more willing to force formatting on you than others, even if they use maximal munch. You _can_ choose to make certain uses of whitespace illegal while still using maximal munch, since all that maximal munch is doing is deciding how you're going to know whether a sequence of characters is one token or several when it's ambiguous. It's why C++98 parsers treat the >> at the end of vector<pair<int, int>> as a shift operator rather than as the closing halves of the two templates, and C++11, Java, and C# have all had to _not_ use maximal munch in that particular case to make it so that it's not treated as the shift operator. It makes their grammars that much less context-free and is part of why D uses !() for template instantiations.

In any case, I didn't use the term maximal munch, because that indicates how tokens are separated and says nothing about how you format your code (aside from the fact that you sometimes have to add whitespace to disambiguate if the grammar isn't clean enough), whereas this discussion really has to do with making formatting your code in a particular instance illegal (or at least that the compiler would warn about it, which is essentially equivalent to making it illegal, since no one should leave warnings in their code, and -w literally turns all warnings into errors anyway). There is no ambiguity as to whether =+ is the same as = + as far as the compiler is concerned, because there is no =+ token, and so maximal munch doesn't really even come into play here.
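
As a rough illustration of that last point, here's a toy longest-match tokenizer. The operator table is a small made-up subset, not D's actual token list, and anything that isn't a known operator or whitespace is simply treated as a one-character token (ASCII input assumed) to keep the sketch short:

```d
import std.ascii : isWhite;
import std.stdio : writeln;

// Toy operator table for illustration only. Note that it has no "=+" entry.
immutable string[] operators =
    [">>=", ">>", "+=", "-=", "*=", "=", "+", "-", "*", ">"];

// Splits src into tokens, always taking the longest operator that matches
// at the current position (maximal munch).
string[] munchOperators(string src)
{
    string[] tokens;
    while (src.length)
    {
        if (isWhite(src[0]))
        {
            src = src[1 .. $]; // whitespace only separates tokens
            continue;
        }
        string best;
        foreach (op; operators)
        {
            immutable matches = src.length >= op.length
                && src[0 .. op.length] == op;
            if (matches && op.length > best.length)
                best = op;
        }
        if (best.length == 0)
            best = src[0 .. 1]; // not an operator: one-character token
        tokens ~= best;
        src = src[best.length .. $];
    }
    return tokens;
}

void main()
{
    writeln(munchOperators("a =+ b")); // four tokens: a, =, +, b
    writeln(munchOperators("a += b")); // three tokens: a, +=, b
    writeln(munchOperators("v>>1"));   // three tokens: v, >>, 1
}
```

Since the table has no =+ entry, the longest match at = is just =, with or without whitespace after it, so rejecting =+ would have to be a separate rule layered on top.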

- Jonathan M Davis
September 04, 2015
On 9/4/15 9:38 AM, Marc Schütz <schuetzm@gmx.net> wrote:
> On Friday, 4 September 2015 at 12:02:26 UTC, Steven Schveighoffer wrote:
>> On 9/3/15 5:59 PM, Brian Schott wrote:
>>> On Thursday, 3 September 2015 at 17:17:26 UTC, Steven Schveighoffer
>>> wrote:
>>>> What about all other operations that may be typos from op= where op is
>>>> also a unary operator? e.g. =-
>>>
>>> We'd have to special-case '*':
>>>
>>> a=*b;
>>>
>>
>> You could say the same thing for =-:
>>
>> a=-b;
>>
>> seems reasonable for someone who doesn't like whitespace. I think
>> Andrei's rule was the token sequence must have whitespace after the
>> operator in order to be rejected. So the above would be fine.
>
>      Vector!double p;
>      p.x=+ 0.27;
>      p.y=-11.91;
>      p.z=- 8.24;
>

p.x= + 0.27;
p.y= -11.91;
p.z= - 8.24;

This really isn't a difficult thing to fix, nor do I see this likely being a common issue.

-Steve
September 04, 2015
On 9/4/15 10:05 AM, Jonathan M Davis wrote:
> On Friday, 4 September 2015 at 13:55:03 UTC, Jonathan M Davis wrote:
>> On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
>>> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>>>
>>>
>>> The gist of it is the user wrote =+ instead of +=. I wonder if we
>>> should disallow during tokenization the sequence "=", "+",
>>> whitespace. Surely it's not a formatting anyone would aim for, but
>>> instead a misspelling of +=.
> [snip]
>
> Actually, I may have misunderstood the suggestion. I do _not_ think that
> we should require that someone who writes code like
>
> a=+b;
>
> should be forced to put whitespace in their code, as ugly as it arguably
> is that they don't (which is what I thought was being suggested).
> However, if they've written their code like
>
> a =+ b;
>
> then it would make sense to warn about it, since the odds of that being
> legitimate are nearly zero, and the same goes for any other unary
> operator.

I think that is Andrei's original suggestion:

the sequence "=", "+", whitespace should be rejected.

He says nothing about "=","+" without the whitespace.

-Steve

September 04, 2015
On Friday, 4 September 2015 at 15:13:08 UTC, Steven Schveighoffer wrote:
> I think that is Andrei's original suggestion:
>
> the sequence "=", "+", whitespace should be rejected.
>
> He says nothing about "=","+" without the whitespace.

Yup.
September 04, 2015
On Thursday, 3 September 2015 at 16:46:30 UTC, Andrei Alexandrescu wrote:
> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>
> The gist of it is the user wrote =+ instead of +=. I wonder if we should disallow during tokenization the sequence "=", "+", whitespace. Surely it's not a formatting anyone would aim for, but instead a misspelling of +=.
>
>
> Andrei

Seems like a really, really small fish to catch. I wouldn't want to litter my codebase with those kinds of rules.

Besides, isn't stackoverflow about the answers and opinions, rather than about the questions?
September 04, 2015
On 09/03/2015 01:08 PM, H. S. Teoh via Digitalmars-d wrote:
> On Thu, Sep 03, 2015 at 12:46:29PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> http://stackoverflow.com/questions/32369114/leap-years-not-working-in-date-and-time-program-in-dlang
>>
>> The gist of it is the user wrote =+ instead of +=. I wonder if we
>> should disallow during tokenization the sequence "=", "+", whitespace.
>> Surely it's not a formatting anyone would aim for, but instead a
>> misspelling of +=.
> [...]
>
> Is there a way for the lexer to check for the specific character
> sequence '=', '+', whitespace and not others (e.g. '=', whitespace,
> '+')?  IOW, "a =+ b" will be prohibited, but "a = + b" will be allowed.
> If so, I agree with this.

Yah, space is relevant there. That's why the check is easiest done during tokenization. -- Andrei
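
A rough sketch of what such a check might look like, working on raw source text rather than inside dmd's actual lexer. The function names are hypothetical, and string literals and comments are not handled, which a real implementation would need to do:

```d
import std.ascii : isWhite;
import std.stdio : writeln;

// Hypothetical helper: true for characters that could form a longer
// operator ending in '=' (==, +=, <=, != and so on).
bool isOperatorChar(char c)
{
    foreach (o; "=!<>+-*/%&|^~")
        if (c == o)
            return true;
    return false;
}

// Returns the offset of every '=' that starts the sequence
// '=', '+', whitespace. Illustration only, not compiler code.
size_t[] findSuspiciousAssignPlus(string src)
{
    size_t[] hits;
    foreach (i; 0 .. src.length)
    {
        if (src[i] != '=' || i + 2 >= src.length)
            continue;
        // Skip an '=' that is the tail of a longer operator.
        if (i > 0 && isOperatorChar(src[i - 1]))
            continue;
        if (src[i + 1] == '+' && isWhite(src[i + 2]))
            hits ~= i;
    }
    return hits;
}

void main()
{
    writeln(findSuspiciousAssignPlus("a =+ b"));  // one hit, at offset 2
    writeln(findSuspiciousAssignPlus("a=+b"));    // no hits
    writeln(findSuspiciousAssignPlus("a = + b")); // no hits
    writeln(findSuspiciousAssignPlus("a += b"));  // no hits
}
```

Note that under this rule `p.x=+ 0.27;` from earlier in the thread would also be flagged, since it matches '=', '+', whitespace.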