VRP and signed <-> unsigned conversion

Dec 15, 2021

Steven Schveighoffer

Dec 15, 2021

Commander Zot

Dec 15, 2021

Dec 15, 2021

Dec 16, 2021

Dec 16, 2021

Dec 16, 2021

On Wednesday, 15 December 2021 at 14:39:09 UTC, Steven Schveighoffer wrote:

[…]

I'm wondering:

Does it make sense for this to be valid? Should we reexamine unsigned <-> signed implicit casting?

I don't understand why byte should be implicitly convertible to ubyte. Seeing this as is, I think this is a bug.

If the above rewrite is possible, shouldn't VRP just allow this conversion? i.e. a type that has an unsigned/signed counterpart should be assignable if the signed/unsigned can accept the range.

To me, it seems that VRP is designed around mathematical intuition in which the integer types are seen as {−2ⁿ⁻¹, …, 2ⁿ⁻¹−1} and {0, …, 2ⁿ−1} for n appropriate. (I hope the Unicode superscripts render properly.)
The problem starts with subtraction. If x, y ∈ {0, …, 15}, then x − y ∈ {−15, …, 15}. A lot of professional people know that (at least the unsigned types) implement arithmetic modulo 2ⁿ, so x − y is well-defined. However, you can see unsigned types as positive types (e.g. when taking the length of an array); in this case, subtraction x − y makes no sense when y > x.
I guess the whole problem comes from the double-role unsigned types play: positive numbers vs. mod-2ⁿ numbers; a triple-role with bit-operations.

This is a fundamental design problem, and VRP cannot fix it. I don't know of a good solution, I've yet to see one. What I've never seen is splitting integers into three types: signed ones for {−2ⁿ⁻¹, …, 2ⁿ⁻¹−1}, unsigned ones for {0, …, 2ⁿ−1}, and bit-vectors for {0, 1}ⁿ. Bit operators would only be available for the latter, arithmetic for all of them, with the intuition that bit-vectors are mod-2ⁿ meaning that all operations are well-defined except division by zero, but for signed and unsigned types, all operations are partial (except unary plus). Even unary minus is partial for signed types because −(−2ⁿ⁻¹) ∉ {−2ⁿ⁻¹, …, 2ⁿ⁻¹−1}. What (ideal) VRP can do is keep you from code that might leave the domain.
Another thing to note is that division in modular arithmetic is a bit weird: In mod-8 arithmetic, 7 ÷ 3 = 5 (as 3 × 5 = 15 ≡ 7), but virtually nobody wants that.

December 15, 2021

Re: VRP and signed <-> unsigned conversion

Posted by Steven Schveighoffer
in reply to Quirin Schroll

Permalink

Steven Schveighoffer

Posted in reply to Quirin Schroll

Permalink

On 12/15/21 1:04 PM, Quirin Schroll wrote:

On Wednesday, 15 December 2021 at 14:39:09 UTC, Steven Schveighoffer wrote:

[…]

I'm wondering:

Does it make sense for this to be valid? Should we reexamine unsigned <-> signed implicit casting?

I don't understand why byte should be implicitly convertible to ubyte. Seeing this as is, I think this is a bug.

If it's a bug, it's a bug in design. From C, that D inherited.

> >

If the above rewrite is possible, shouldn't VRP just allow this conversion? i.e. a type that has an unsigned/signed counterpart should be assignable if the signed/unsigned can accept the range.

(they did)

The problem starts with subtraction. If x, y ∈ {0, …, 15}, then x − y ∈ {−15, …, 15}. A lot of professional people know that (at least the unsigned types) implement arithmetic modulo 2ⁿ, so x − y is well-defined. However, you can see unsigned types as positive types (e.g. when taking the length of an array); in this case, subtraction x − y makes no sense when y > x.
I guess the whole problem comes from the double-role unsigned types play: positive numbers vs. mod-2ⁿ numbers; a triple-role with bit-operations.

The impetus of VRP was simple -- C allows implicit casting between all integer types. D did not want to do that due to the errors it causes. However, since integer promotion is mimicked from C, all math is done at integer level (this makes C code that's compiled with D code behave similarly). However, it's quite painful to have to cast everything with integers that are smaller than 32-bit ints, so VRP is used to assist in making sure you don't have to cast if the compiler can prove the value fits.

But this quirk of allowing implicit casting between unsigned and signed violates that spirit. To me it feels like a bad code smell. I also have issues with the character types being considered as integers. D could do a lot better in this area I think.

-Steve

Forums