February 05, 2018
On Monday, 5 February 2018 at 22:52:41 UTC, Steven Schveighoffer wrote:
> But I can't see why there is controversy over negation of byte turning into an int. I can't see why anyone would expect:
>
> int x = -b;
>
> when b is -128, to set x to -128. The integer promotion makes complete sense to me.

Do you feel the same way about

float x = 1/2;

?
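
For anyone who wants to check what that line does in D today, here is a minimal sketch. `1 / 2` is evaluated as an integer division first, and only the result is converted to float. The function names are made up for illustration.

```d
// Integer division happens before the conversion to float.
float intDivThenConvert()
{
    float x = 1 / 2;             // int division gives 0, then 0 becomes 0.0f
    return x;
}

// Casting one operand first forces a floating-point division.
float floatDiv()
{
    float x = cast(float) 1 / 2; // parsed as (cast(float) 1) / 2
    return x;
}

void main()
{
    assert(intDivThenConvert() == 0.0f);
    assert(floatDiv() == 0.5f);
}
```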
February 06, 2018
On 05.02.2018 22:56, Walter Bright wrote:
> On 2/5/2018 12:45 PM, H. S. Teoh wrote:
>> Sticking to C promotion rules is one of the scourges that continue to
>> plague D;
> 
> It's necessary. Working C expressions cannot be converted to D while introducing subtle changes in behavior.
> ...

Neither byte nor dchar is a C type.

>> another example is char -> byte confusion no thanks to C
>> traditions:
>>
>>     int f(dchar ch) { return 1; }
>>     int f(byte i) { return 2; }
>>     void main() {
>>         pragma(msg, f('a'));
>>         pragma(msg, f(1));
>>     }
>>
>> Exercise for reader: guess compiler output.
> 
> 'a' and 1 do not match dchar or byte exactly, and require implicit conversions. D doesn't have the C++ notion of "better" implicit conversions for function arguments, instead it uses the "leastAsSpecialized" C++ notion used for template matching, which is better.
> 
> The idea is a byte can be implicitly converted to a dchar, but not the other way around. Hence, f(byte) is selected as being the "most specialized" match.

The overloading rules are fine, but byte should not implicitly convert to char/dchar, and char should not implicitly convert to byte.
February 05, 2018
On Mon, Feb 05, 2018 at 01:56:33PM -0800, Walter Bright via Digitalmars-d wrote:
> On 2/5/2018 12:45 PM, H. S. Teoh wrote:
> > Sticking to C promotion rules is one of the scourges that continue to plague D;
> 
> It's necessary. Working C expressions cannot be converted to D while introducing subtle changes in behavior.

Exactly; maintaining compatibility with C, the ruiner of all good things, is a scourge to D that prevents D from adopting saner rules.


> > another example is char -> byte confusion no thanks to C traditions:
> > 
> > 	int f(dchar ch) { return 1; }
> > 	int f(byte i) { return 2; }
> > 	void main() {
> > 		pragma(msg, f('a'));
> > 		pragma(msg, f(1));
> > 	}
> > 
> > Exercise for reader: guess compiler output.
> 
> 'a' and 1 do not match dchar or byte exactly, and require implicit conversions. D doesn't have the C++ notion of "better" implicit conversions for function arguments, instead it uses the "leastAsSpecialized" C++ notion used for template matching, which is better.
> 
> The idea is a byte can be implicitly converted to a dchar, [...]

This is the root of the problem.  Character types should never have been implicitly convertible to/from arithmetic integral types in the first place.

Again, compatibility with C pessimizes D semantics.  Since D deliberately defines byte/ubyte as different from char, and char is defined explicitly to be a UTF-8 code unit, there is really no good reason to allow implicit conversions between them.  That just undermines the raison d'être for having separate ubyte and char types, and is a stink we inherited from C that really ought to be shed already.

It should be C code ported to D that is forced to use casts to explicate intent, rather than forcing D code to jump through cast hoops just so we remain "compatible" with C.
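
As a rough illustration of the conversions being complained about, the sketch below checks them with `is(T : U)`, which tests implicit convertibility. The byte-to-dchar direction is taken from the explanation quoted above; `codeUnitPlusOne` is a hypothetical helper.

```d
// Conversions that are accepted without a cast.
static assert(is(char : int));     // code unit -> integer
static assert(is(byte : dchar));   // integer -> character, as described in the quote
static assert(!is(dchar : byte));  // but not the other way around

// A char silently takes part in integer arithmetic.
int codeUnitPlusOne(char c)
{
    return c + 1;
}

void main()
{
    assert(codeUnitPlusOne('a') == 98); // 'a' is 97
}
```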


T

-- 
Why did the mathematician reinvent the square wheel?  Because he wanted to drive smoothly over an inverted catenary road.
February 06, 2018
On 05.02.2018 22:20, Nick Sabalausky wrote:
> Ouch. I guess "the real WTF" is that 2's complement leads to supporting one value that cannot be negated. ...

Actually, it's not fully supported at this time, but it soon will be.

import std.exception;

void main(){
    auto i = int.min;
    enforce(i > 0); // pass :)
}

https://issues.dlang.org/show_bug.cgi?id=18315
https://github.com/dlang/dmd/pull/7841
February 05, 2018
On 2/5/18 6:09 PM, Adam D. Ruppe wrote:
> On Monday, 5 February 2018 at 22:52:41 UTC, Steven Schveighoffer wrote:
>> But I can't see why there is controversy over negation of byte turning into an int. I can't see why anyone would expect:
>>
>> int x = -b;
>>
>> when b is -128, to set x to -128. The integer promotion makes complete sense to me.
> 
> Do you feel the same way about
> 
> float x = 1/2;
> 
> ?

Not really. But it is a good counter-argument.

In a way, it's the fact that it's a corner case which makes this more sinister. For all bytes *except* byte.min, the behavior is the same regardless of whether integer promotion is used or not. So when this bug actually occurs, it's going to be in code that "worked for years". And in some cases was ported from C and worked there.

But for integer division and assignment to float, it's quite obvious that it doesn't work with almost all combinations.
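
To make the corner case concrete, here is a small sketch. It spells out both behaviors with explicit casts, so the result does not depend on which rule a given compiler version applies to `-b`. The function names are hypothetical.

```d
// Negate after widening: what integer promotion gives.
int negatePromoted(byte b)
{
    return -cast(int) b;
}

// Negate, then keep only the low 8 bits: what a byte-typed negation gives.
int negateAsByte(byte b)
{
    byte r = cast(byte)(-cast(int) b);
    return r;
}

void main()
{
    // Every value except byte.min agrees under both rules.
    foreach (int v; byte.min + 1 .. byte.max + 1)
    {
        byte b = cast(byte) v;
        assert(negatePromoted(b) == negateAsByte(b));
    }

    // byte.min is the one value where they differ.
    assert(negatePromoted(byte.min) == 128);
    assert(negateAsByte(byte.min) == -128);
}
```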

-Steve
February 05, 2018
On Monday, 5 February 2018 at 23:34:59 UTC, Steven Schveighoffer wrote:
> But for integer division and assignment to float, it's quite obvious that it doesn't work with almost all combinations.

So let me take two steps there to get to my point:

1) Given:
    byte a, b;
    byte c = a + b;

The cast seems a bit silly: you are already explicitly using `byte` everywhere, so your intention is pretty clear: you only want to use the bytes and are ok with the rest of it being discarded. Therefore, I find the cast to be an unnecessary addition.

2) Change it to:
    byte a, b;
    int c = a + b;

This is directly analogous to the float example. Which, I agree, reasonable people can disagree on... but I'd be perfectly ok if the `a + b` was still typed `byte` just like above... and the data got discarded when assigned to the `int`.

Just like how you'd write cast(float) 1 / 2, here you can do cast(int) a + b to forcibly promote it.



And if we did this consistently across all things, it'd be pretty clear you should cast - 127 + 50 here would have discarded info so it would stand out a lot faster than just the -a edge case. So you'd learn it pretty quickly.
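
A quick sketch of the 127 + 50 case, comparing the current promoted result with what a byte-typed sum would hold. The byte-typed version is emulated with a cast, since that is the only way to write it today; the function names are made up.

```d
// Current rule: both operands are promoted, so the sum is an int.
int addPromoted(byte a, byte b)
{
    static assert(is(typeof(a + b) == int));
    return a + b;
}

// A byte-typed sum, spelled with the cast D requires today.
byte addTruncated(byte a, byte b)
{
    // Without the cast this is rejected: int does not implicitly narrow to byte.
    static assert(!__traits(compiles, { byte x, y; byte z = x + y; }));
    return cast(byte)(a + b);
}

void main()
{
    assert(addPromoted(127, 50) == 177);   // nothing lost
    assert(addTruncated(127, 50) == -79);  // 177 - 256: the high bits are gone
}
```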


But I will grant it is going to be different than the C rules... so it would make copy/pasted code a pain. tho copy/pasted code is already a pain cuz you gotta insert casts to avoid the errors... but yeah an error is less of a hassle than silent change.

So I get why things are the way they are. I just still don't like it :P
February 05, 2018
On 2/5/18 6:56 PM, Adam D. Ruppe wrote:
> On Monday, 5 February 2018 at 23:34:59 UTC, Steven Schveighoffer wrote:
>> But for integer division and assignment to float, it's quite obvious that it doesn't work with almost all combinations.
> 
> So let me take two steps there to get to my point:
> 
> 1) Given:
>      byte a, b;
>      byte c = a + b;
> 
> The cast seems a bit silly: you are already explicitly using `byte` everywhere, so your intention is pretty clear: you only want to use the bytes and are ok with the rest of it being discarded. Therefore, I find the cast to be an unnecessary addition.

It could be done that way. In fact, this works in C:

char a = 1;
char b = 2;
char c = a + b;

I would actually have no problem if it were this way, as you are clear in your intention. I'm also OK with the way it is now, where it requires the cast. The cast is generally added "because the compiler told me to", but it does make you think about what you really want.

> 2) Change it to:
>      byte a, b;
>      int c = a + b;
> 
> This is directly analogous to the float example. Which, I agree, reasonable people can disagree on... but I'd be perfectly ok if the `a + b` was still typed `byte` just like above... and the data got discarded when assigned to the `int`.

I think the CPU has to do extra work to throw away that high bit, no?

But actually, I would be OK with this as well, as long as it was consistent across all operations. Right now, it's one way for unary operations and another way for binary operations. Just pick a way and make it consistent.

I still don't love the idea that:

a = a + 1;

fails, but

++a; a += 1;

works just fine.
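
Here is a small sketch of that inconsistency, using `__traits(compiles, ...)` to record which forms are accepted. `bump` is a hypothetical helper.

```d
// Plain assignment of a + 1 to a byte is rejected, because a + 1 is an int.
static assert(!__traits(compiles, { byte t; t = t + 1; }));

// Increment and op-assign are accepted and narrow implicitly.
static assert(__traits(compiles, { byte t; ++t; t += 1; }));

byte bump(byte a)
{
    ++a;
    a += 1;
    return a;
}

void main()
{
    assert(bump(1) == 3);

    // The implicit narrowing wraps silently at the top of the range.
    byte m = byte.max;
    m += 1;
    assert(m == byte.min);
}
```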

> But I will grant it is going to be different than the C rules... so it would make copy/pasted code a pain. tho copy/pasted code is already a pain cuz you gotta insert casts to avoid the errors... but yeah an error is less of a hassle than silent change.
> 
> So I get why things are the way they are. I just still don't like it :P

Yeah, there's a lot of stuff that's like that ;)

-Steve
February 05, 2018
On Monday, February 05, 2018 15:27:45 H. S. Teoh via Digitalmars-d wrote:
> On Mon, Feb 05, 2018 at 01:56:33PM -0800, Walter Bright via Digitalmars-d wrote:
> > The idea is a byte can be implicitly converted to a dchar, [...]
>
> This is the root of the problem.  Character types should never have been implicitly convertible to/from arithmetic integral types in the first place.

+1

Occasionally, it's useful, but in most cases, it just causes bugs - especially when you consider stuff like appending to a string.

- Jonathan M Davis

February 06, 2018
On Tuesday, 6 February 2018 at 00:08:12 UTC, Steven Schveighoffer wrote:
> I think the CPU has to do extra work to throw away that high bit, no?

No, the x86 has never had any trouble with this, and I don't think ARM does either (worst case you load it as int, then save it as byte).

> I still don't love the idea that:
>
> a = a + 1;
>
> fails, but
>
> ++a; a += 1;
>
> works just fine.

Indeed. I used to defend that, but really the logic can be argued either way.


February 05, 2018
On Mon, Feb 05, 2018 at 07:08:12PM -0500, Steven Schveighoffer via Digitalmars-d wrote:
> On 2/5/18 6:56 PM, Adam D. Ruppe wrote:
[...]
> > 2) Change it to:
> >      byte a, b;
> >      int c = a + b;
> > 
> > This is directly analogous to the float example. Which, I agree, reasonable people can disagree on... but I'd be perfectly ok if the `a + b` was still typed `byte` just like above... and the data got discarded when assigned to the `int`.
> 
> I think the CPU has to do extra work to throw away that high bit, no?
[...]

Not sure what you mean by "extra work to throw away that high bit".

If a + b is typed as `byte`, then if an underflow happens the upper bits
will just get thrown away, then the result will be sign-extended to int,
losing the original sign. E.g., 0x80 (-128) + 0xFF (-1) = 0x(1)7F
(underflow, the upper (1) gets discarded), sign-extended to 0x0000007F
(+127).

If a + b was typed as `int` instead, e.g. both a and b were promoted to
int, then the CPU would sign-extend a, sign-extend b, and *then* add the
two together. So 0x80 gets sign-extended to 0xFFFFFF80, and 0xFF gets
sign-extended to 0xFFFFFFFF, and adding the two produces 0xFFFFFF7F
(-129). (There's a carry of 1 into the 33rd bit, but that gets
discarded in 32-bit arithmetic.)
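
The two cases can be checked with a short sketch like the one below. The byte-typed sum is emulated with an explicit cast, and the function names are made up for illustration.

```d
// Sum typed as byte: keep the low 8 bits, then sign-extend to int.
int sumAsByte(byte a, byte b)
{
    byte r = cast(byte)(a + b);
    return r;
}

// Sum typed as int: both operands are sign-extended before the add.
int sumAsInt(byte a, byte b)
{
    return a + b;
}

void main()
{
    byte a = cast(byte) 0x80; // -128
    byte b = cast(byte) 0xFF; // -1

    assert(sumAsByte(a, b) == 0x7F);                  // +127
    assert(sumAsInt(a, b) == -129);
    assert(cast(uint) sumAsInt(a, b) == 0xFFFF_FF7F); // same value, seen as bits
}
```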


T

-- 
May you live all the days of your life. -- Jonathan Swift