February 04, 2022

On Friday, 4 February 2022 at 09:45:31 UTC, Paulo Pinto wrote:

> Then again, maybe Sun lacked enough people with decades of C and C++ experience, and someone with the track record of Gosling across the computing industry doesn't have any clue about what he was talking about.

I learned about 1s- and 2s-complement at high school in the context of digital circuits, but I guess that is unusual.

Regardless, it might take a decade of system-level programming to get a good intuition for C semantics; on that we can agree. Maybe we can also agree that most D programmers have no use for that.

And well, why should they? The details are primarily useful for very low level trickery and error-prone bit manipulation. With a good standard library this should not be needed often. Also, with the availability of SIMD I find bithacks to be of very low utility. Prior to SIMD I sometimes used unsigned bit hacks to emulate SIMD (for image processing), but that is arcane at this point in time. I only do such things on the rare occasion where I want to create a high precision phasor (oscillator) or treat floats as bit-vectors. Most programmers don't need this knowledge, they just need a good library.

Anyways, it is a poor strategy to require C-like proficiency as that actually makes it easier for D programmers to transition to C++!

D needs to evolve towards simplicity, that is the main advantage it can obtain over C++ and Rust.

February 04, 2022
On Friday, 4 February 2022 at 09:45:31 UTC, Paulo Pinto wrote:
> One of the little experiments I tried was asking people about the rules for unsigned arithmetic in C.

2's complement is just one part of C's rules. The 2's complement part is relatively easy, but knowing what is promoted to what in C is a bit more involved and easy to get wrong.

(and what C does doesn't make a whole lot of sense.)
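
A minimal sketch of the promotion rules in question, written in D on the assumption that D follows C's integer promotions and usual arithmetic conversions here. The function names `addBytes` and `lessMixed` are made up for illustration.

```d
// Both operands are promoted to int before the addition,
// so the result does not wrap at 8 bits.
int addBytes(ubyte a, ubyte b) {
    auto sum = a + b;
    static assert(is(typeof(sum) == int));
    return sum;
}

// Mixing int and uint converts the int operand to uint first,
// so -1 becomes 4294967295 before the comparison.
bool lessMixed(int i, uint u) {
    return i < u;
}

void main() {
    assert(addBytes(200, 100) == 300);
    assert(!lessMixed(-1, 1)); // -1 is not "less than" 1 here
}
```

The second assertion is the kind of result that tends to surprise people who know 2's complement perfectly well.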
February 04, 2022
On Friday, 4 February 2022 at 04:28:37 UTC, Walter Bright wrote:
> On 2/3/2022 8:25 AM, Paul Backus wrote:
>> The inconsistency is the problem here. Having integer types behave differently depending on their width makes the language harder to learn,
>
> It's not really that hard - it's about two or three sentences.

Two or three sentences here, two or three sentences there--it's not much on its own, I agree, but all these little things add up.

And the fact is, C and C++ programmers *do* find these rules difficult to learn and remember in practice. That's why articles like the one that started this discussion are written in the first place.

> There's really no fix for that other than making the effort to understand 2s-complement.
[...]
> Trying to hide the reality of how computer integer arithmetic works, and how integral promotions work, is a prescription for endless frustration and inevitable failure.

2s-complement is "the reality of how computer integer arithmetic works," but there is nothing fundamental or necessary about C's integer promotion rules, and plenty of system-level languages get by without them.
February 04, 2022
On Friday, 4 February 2022 at 04:29:21 UTC, Walter Bright wrote:
> No, then the VRP will emit an error.

No, because you cast it away.

Consider the old code being:

---
struct Thing {
  short a;
}

// somewhere very different

Thing calculate(int a, int b) {
    return Thing(a + b);
}
---


The current rules would require that you put an explicit cast in that constructor call. Then, later, `Thing.a` gets refactored to `int`. It will still compile, with the explicit cast still there, now chopping off bits.
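
A sketch of the code after that hypothetical refactoring, assuming the cast was written as `cast(short)` while the field was still a `short`:

```d
struct Thing {
    int a; // was `short a;` before the refactoring
}

Thing calculate(int a, int b) {
    // The cast was required while the field was a short.
    // It is no longer needed, but it still compiles without complaint.
    return Thing(cast(short)(a + b));
}

void main() {
    // 40000 fits in the new int field, yet the stale cast wraps it to 16 bits.
    assert(calculate(40_000, 0).a == -25_536);
}
```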

The problem with anything requiring explicit casts is once they're written, they rarely get unwritten. I tell new users that `cast` is a code smell - sometimes you need it, but it is usually an indication that you're doing something wrong.

But then you do:

short a;
short b = a + 1;

And suddenly the language requires one.
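
To make the current behavior concrete, here is a small sketch. The names `plainAddCompiles` and `increment` are invented for the example.

```d
// `x + 1` has type int, and the compiler cannot prove the result fits
// in a short, so the plain initialization is rejected today.
enum bool plainAddCompiles = __traits(compiles, { short x; short y = x + 1; });

short increment(short a) {
    return cast(short)(a + 1); // the cast the language asks for
}

void main() {
    static assert(!plainAddCompiles);
    assert(increment(1) == 2);
    assert(increment(short.max) == short.min); // the carry is dropped

    short c = 1;
    c += 1; // compound assignment narrows implicitly, no cast needed
    assert(c == 2);
}
```

Note that the compound assignment form already truncates back to `short` without a cast, which is close to what is being asked for in the plain form.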

Yes, I know, there's a carry bit that might get truncated. But when you're using all `short`, there's probably an understanding that this is how it works. It's not really that hard - it's about two or three sentences. As long as one understands 2s-complement arithmetic.

On the other hand, there might be loss if there's an `int` in there in some kinds of generic code.

I think a reasonable compromise would be to allow implicit conversions down to the biggest type of the input. The VRP can apply here on any literals present. Meaning:


short a;
short b = a + 1;

It checks the input:

a = type short
1 = VRP'd down to byte (or bool even)

Biggest type there? short. So it allows implicit conversion down to short. Then VRP can run to further make it smaller:


byte c = (a&0x7e) + 1; // ok the VRP can see it still fits there, so it goes even smaller.


But since the biggest original input fits in a `short`, it allows the output to go to `short`, even if there's a carry bit it might lose.
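
For comparison, a sketch of what VRP accepts and rejects as I understand the current rules (the proposal above is not implemented). `lowBitsPlusOne` and `unmaskedCompiles` are names made up for the example.

```d
// (a & 0x7e) lies in 0 .. 126, so the sum lies in 1 .. 127.
// VRP can prove that fits in a byte, so no cast is needed.
byte lowBitsPlusOne(short a) {
    return (a & 0x7e) + 1;
}

// Without the mask the possible range is that of a short plus one,
// which does not fit in a byte, so this is rejected.
enum bool unmaskedCompiles = __traits(compiles, { short a; byte c = a + 1; });

void main() {
    assert(lowBitsPlusOne(0) == 1);
    assert(lowBitsPlusOne(0x7fff) == 127);
    static assert(!unmaskedCompiles);
}
```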

On the other hand:


ushort b = a + 65535 + 3;


Nope, the compiler can constant fold that literal and VRP will size it to `int` given its value, so explicit cast required there to ensure none of the *actual* input is lost.


short a;
short b;
short c = a * b;


I'd allow that. The input is a and b, they're both short, so let the output truncate back to short implicitly too. Just like with int, there's some understanding that yes, there is a high word produced by the multiply, but it might not fit and I don't need the compiler nagging me like I'm some kind of ignoramus.
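
A sketch of the asymmetry, with the hypothetical helpers `squareInt` and `mulShort`: the `int` version drops the high word silently, while the `short` version needs a cast today to do the same thing.

```d
// Accepted as is: the high 32 bits of the product are silently discarded.
int squareInt(int x) {
    return x * x;
}

// The operands are promoted to int, so the cast is required today
// to truncate the product back to a short.
short mulShort(short a, short b) {
    return cast(short)(a * b);
}

void main() {
    assert(squareInt(100_000) == 1_410_065_408); // 10^10 modulo 2^32
    assert(mulShort(300, 300) == 24_464);        // 90000 modulo 2^16
}
```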


This compromise, I think, would balance the legitimate safety concerns about accidental loss or refactoring changing things (if you refactor to ints, now the input type grows and the compiler can issue an error again) against the annoying casts almost everywhere.

And by removing most of the casts, it makes the ones that remain stand out more as the potential problems they are.
February 04, 2022
On Wednesday, 2 February 2022 at 23:27:05 UTC, Walter Bright wrote:
> It also *causes* bugs. When code gets refactored, and the types change, those forced casts may not be doing what is desired, and can do things like unexpectedly truncating integer values.
>
> One of the (largely hidden because it works so well) advances D has over C is Value Range Propagation, where automatic conversions of integers to smaller integers is only done if no bits are lost.

+1

The cure would probably be worse than the "problem".
We should be careful what we wish for.
D does exactly what was so successful in C, integer promotion and proper casts.
It causes zero surprise to a native programmer.

There is a difference.
C does the conversion to shorter integer implicitly, D does not and if you translate you have to cast().
Yet, it isn't clear that the D code with the cast is less brittle than the C code in that case, as when the type changes you get neither a warning in C nor in D.

Example from qoi.d (translation of qoi.h)
----
byte vr = cast(byte)(px.rgba.r - px_prev.rgba.r);
byte vg = cast(byte)(px.rgba.g - px_prev.rgba.g);
byte vb = cast(byte)(px.rgba.b - px_prev.rgba.b);
----

When px.rgba.r changes its type, the D code will have no more warning than the C code, arguably less.
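
A sketch of that situation, using a hypothetical generic helper `delta` in place of the qoi.d lines and assuming the channels are `ubyte` today:

```d
// Same shape as the qoi.d lines: subtract, then cast down to byte.
// T stands in for whatever type the channel currently has.
byte delta(T)(T cur, T prev) {
    return cast(byte)(cur - prev);
}

void main() {
    // With ubyte channels the wrap-around modulo 256 is the intended result.
    assert(delta!ubyte(10, 250) == 16);

    // If the channel type were widened to ushort, the same line still
    // compiles and quietly throws away the upper bits.
    assert(delta!ushort(1000, 0) == -24);
}
```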


Thus the top methods for detecting integer problems are in that case:
1. No casts with VRP
2. ex aequo: D with cast, or C with implicit cast


It would be nice to "get over" the D integers, it's not like there is a magic design that makes all problems go away, also as of today people still translate from C to D all the time. It is the state we are in today, where compatibility with C semantics helps immensely.
February 04, 2022

On Friday, 4 February 2022 at 15:00:18 UTC, Guillaume Piolat wrote:

> C does the conversion to shorter integer implicitly, D does not and if you translate you have to cast().

Let us not forget that D does not compete with C (nobody can). D competes with C++, Nim, and so on. Even C++ keeps introducing new basic types to get better type safety with every new edition of the language. For instance, in C++ std::byte is not an arithmetic type.

This is an improvement, also for system level native programming.

February 04, 2022
On Fri, Feb 04, 2022 at 02:01:43PM +0000, Paul Backus via Digitalmars-d wrote:
> On Friday, 4 February 2022 at 04:28:37 UTC, Walter Bright wrote:
[...]
> > There's really no fix for that other than making the effort to understand 2s-complement.
> [...]
> > Trying to hide the reality of how computer integer arithmetic works, and how integral promotions work, is a prescription for endless frustration and inevitable failure.
> 
> 2s-complement is "the reality of how computer integer arithmetic works," but there is nothing fundamental or necessary about C's integer promotion rules, and plenty of system-level languages get by without them.

+1.


T

-- 
There are 10 kinds of people in the world: those who can count in binary, and those who can't.
February 04, 2022
On 2/4/2022 5:18 AM, Adam D Ruppe wrote:
> (and what C does doesn't make a whole lot of sense.)

C was developed on a PDP-11 and the integral promotions rules come about because of the way the -11 instructions work. It's the same for the float=>double promotion rules.
February 04, 2022
On Fri, Feb 04, 2022 at 12:24:37PM -0800, Walter Bright via Digitalmars-d wrote:
> On 2/4/2022 5:18 AM, Adam D Ruppe wrote:
> > (and what C does doesn't make a whole lot of sense.)
> 
> C was developed on a PDP-11 and the integral promotions rules come about because of the way the -11 instructions work. It's the same for the float=>double promotion rules.

PDP-11 instructions no longer resemble how modern machines work, though. What made sense back then may not necessarily make sense anymore today.


T

-- 
To err is human; to forgive is not our policy. -- Samuel Adler
February 04, 2022
On Friday, 4 February 2022 at 04:28:37 UTC, Walter Bright wrote:
> There's really no fix for that other than making the effort to understand 2s-complement. Some noble attempts:
>
> Java: disallowed all unsigned types. Wound up having to add that back in as a hack.

How many people actually use (and need) unsigned integers? If 99% of users don't need them, that's a good case for relegating them to a library type. This wasn't possible in Java because it doesn't support operator overloading, without which dealing with such types would have been quite annoying.