August 13, 2020
On 8/13/2020 3:10 AM, H. S. Teoh wrote:
> You're mixing signed and unsigned values. That's generally dangerous
> territory where integer promotion rules inherited from C/C++ take over
> and cause sometimes weird effects, like here. Changing integer promotion
> rules will probably never happen now, because it will cause massive
> *silent* breakage of existing code. So, in the spirit of defensive
> programming, I recommend avoiding mixing signed/unsigned values in this
> way.

There are some things everyone simply needs to know to use a systems programming language successfully:

1. How 2's complement arithmetic works, especially in regard to how negative numbers are handled.

2. The range of values (signed and unsigned) in 2's complement values.

3. Overflow is not detected.

4. 2's complement arithmetic wraps around.

5. The integral promotion rules (the issue in this thread).
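
A minimal sketch of items 1 through 5 in D, using only the built-in integer types. The helper names `wrapIncrement` and `mixedDivide` are made up for illustration; the last assertion is the long/ulong division this thread is about.

```d
// Items 3 and 4: incrementing past int.max wraps to int.min, and nothing reports it.
int wrapIncrement(int x)
{
    return x + 1;
}

// Item 5: with one long and one ulong operand, the long is converted to
// ulong before the division, so the result type is ulong.
ulong mixedDivide(long a, ulong b)
{
    return a / b;
}

void main()
{
    assert(wrapIncrement(int.max) == int.min);

    // Items 1 and 2: -1 is the all-ones bit pattern, which reads as uint.max.
    int minusOne = -1;
    assert(cast(uint) minusOne == uint.max);

    // -5000 reinterpreted as ulong is 2^64 - 5000; halved, that is 2^63 - 2500.
    assert(mixedDivide(-5000, 2) == 9_223_372_036_854_773_308UL);
}
```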

I say "systems programming language" because although other languages (like Python) take care of these issues, that comes at a high cost in terms of performance. Some languages (like Java) get rid of the signed/unsigned issue by getting rid of unsigned integer types. This choice makes it very hard to do some sorts of operations.

The choice Java made to remove unsigned integers is an indication that "just add a warning" is not as workable as it sounds.

The integral promotions rule is in D because:

1. C/C++ programmers are very used to it. Subtly changing the rules will make transfer of code and skills C <=> D a much riskier proposition, especially if you're not the person who wrote that code.

2. Interoperability of C <=> D and even machine translation is far more pragmatic if these rules are followed.

Historical Note: Before 1990, half of the C compilers used "value preserving" integral promotions, half used "sign preserving". C was undergoing standardization, and a great debate raged about which one was better. Eventually, one was picked, and the other compiler vendors had to suck it up and change, and the newly broken C code had to be fixed. ("Value preserving" was picked, which is why ubyte promotes to int, not uint.)

These rules often do cause some difficulty for people new to C/C++/D. I know the rules seem insane to them. But they aren't hard to learn, and it's well worth the few minutes it takes to do it.

To check for overflows, etc., use core.checkedint:

https://dlang.org/phobos/core_checkedint.html
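
A short sketch of how the `core.checkedint` functions are typically called, assuming the `adds`/`addu` overloads that take a `ref bool` overflow flag. The helper `checkedTotal` is hypothetical and only wraps `addu` for two lengths.

```d
import core.checkedint : adds, addu;

// Hypothetical helper: add two lengths and report overflow through `overflowed`.
size_t checkedTotal(size_t a, size_t b, out bool overflowed)
{
    // `out` resets the flag to false on entry; addu only ever sets it to true.
    return addu(a, b, overflowed);
}

void main()
{
    bool overflowed;
    size_t total = checkedTotal(3, 4, overflowed);
    assert(total == 7 && !overflowed);

    total = checkedTotal(size_t.max, 1, overflowed);
    assert(overflowed);

    // The signed variant works the same way. The flag is sticky, so reset
    // it yourself between independent computations.
    bool flag = false;
    int sum = adds(int.max, 1, flag);
    assert(flag);
}
```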

If you're willing to accept some performance reduction, std.experimental.checkedint provides integral types that protect against all kinds of integer arithmetic issues, including "unexpected change of sign":

https://dlang.org/phobos/std_experimental_checkedint.html
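
A rough sketch of two of the hooks in that module, assuming the `checked!Hook(value)` factory and the `Saturate` and `Throw` hooks as documented at the link above (newer Phobos releases also expose the module as `std.checkedint`). The helper names are invented for the example.

```d
import std.experimental.checkedint : checked, Saturate, Throw;

// Hypothetical helper: addition that clamps at the limits instead of wrapping.
int saturatingAdd(int a, int b)
{
    return (checked!Saturate(a) + b).get;
}

// Hypothetical helper: true if adding under the Throw hook raised an exception.
bool addThrows(int a, int b)
{
    try
    {
        auto r = checked!Throw(a) + b;
        return false;
    }
    catch (Exception e)
    {
        return true;
    }
}

void main()
{
    assert(saturatingAdd(int.max, 1) == int.max);
    assert(saturatingAdd(1, 2) == 3);
    assert(addThrows(int.max, 1));
    assert(!addThrows(1, 2));
}
```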
August 13, 2020
On 8/13/2020 12:03 PM, jmh530 wrote:
> In other words, it was not added until 2014, and even then done in a backwards compatible way that doesn't let you actually declare unsigned ints, just to call some methods on them assuming they are unsigned.

I view it as an admission of failure at doing away with unsigned integers.
August 13, 2020
On 8/13/2020 12:11 PM, mw wrote:
> But for practical purpose: half that space is large/good enough, 2^63 = 9,223,372,036,854,775,808, you sure your machine have that much memory installed? (very roughly, 9G of GB?)

Now think about 32 bit address spaces, where arrays larger than 2 GB do happen.

For example, I noticed that many 32 bit programs that operated on files, such as compressors, would corrupt files and do mysterious ugly things when given files larger than 2 GB.
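
A small illustration of the likely failure mode, assuming the program stores a file size or offset in a 32-bit signed integer. The helper names are hypothetical.

```d
// Hypothetical helpers: what a 64-bit size becomes once it is squeezed
// into a 32-bit signed or unsigned offset.
int asSignedOffset(ulong size)
{
    return cast(int) size;
}

uint asUnsignedOffset(ulong size)
{
    return cast(uint) size;
}

void main()
{
    enum ulong twoGiB = 1UL << 31;

    // Just below 2 GiB everything still fits in a signed 32-bit value.
    assert(asSignedOffset(twoGiB - 1) == int.max);

    // At exactly 2 GiB the signed offset goes negative.
    assert(asSignedOffset(twoGiB) == int.min);

    // An unsigned 32-bit offset still represents it correctly.
    assert(asUnsignedOffset(twoGiB) == 2_147_483_648U);
}
```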

August 13, 2020
On Thu, Aug 13, 2020 at 02:40:46PM -0700, Walter Bright via Digitalmars-d wrote:
> On 8/13/2020 12:03 PM, jmh530 wrote:
> > In other words, it was not added until 2014, and even then done in a backwards compatible way that doesn't let you actually declare unsigned ints, just to call some methods on them assuming they are unsigned.
> 
> I view it as an admission of failure at doing away with unsigned integers.

Yeah, in spite of all the problems, unsigned values *are* needed for certain things.  Java not having it was a *big* turnoff for me, because certain things that ought to be simple become needlessly convoluted (like parsing unsigned output from a C program, for example -- to prevent silent data corruption you had to treat everything as strings, which is a royal pain in Java).

Then again, a lot of things are needlessly convoluted in Java, so it's not saying very much. :-P


T

-- 
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
August 13, 2020
On Thursday, 13 August 2020 at 21:33:11 UTC, Walter Bright wrote:

> To check for overflows, etc., use core.checkedint:
>
> https://dlang.org/phobos/core_checkedint.html
>
> If you're willing to accept some performance reduction, std.experimental.checkedint provides integral types that protect against all kinds of integer arithmetic issues, including "unexpected change of sign":
>
> https://dlang.org/phobos/std_experimental_checkedint.html


So instead of making users change their existing code *manually* all over the place, e.g.

    auto r = new int[a.length + b.length];

==>

    auto r = new int[(checked(a.length) + b.length).get];


For users / applications that value correctness more than performance, can we have a compiler switch that turns all the types & operations (e.g. in modules that users also specify on the command line) into core.checkedint or std.experimental.checkedint *automatically*?
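
For comparison, something close to this can be approximated today with a version identifier, although only for code that opts in through an alias. This is a sketch under that assumption, not the requested switch: `CheckedMath`, `Int` and `sumOf` are made-up names, and the checked branch is enabled with dmd's `-version=CheckedMath`.

```d
import std.experimental.checkedint : Checked, Throw;

// Pick the integer type once; every use of Int follows the build setting.
version (CheckedMath)
    alias Int = Checked!(int, Throw);
else
    alias Int = int;

// Hypothetical example function written against the alias.
Int sumOf(const int[] values)
{
    auto total = Int(0);
    foreach (int v; values)
        total += v; // throws on overflow when built with -version=CheckedMath
    return total;
}

void main()
{
    assert(sumOf([1, 2, 3]) == 6);
}
```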

August 13, 2020
On Thursday, 13 August 2020 at 22:07:11 UTC, H. S. Teoh wrote:
> Then again, a lot of things are needlessly convoluted in Java, so it's not saying very much. :-P

Some of D's competitors:

C#
----------------------------------------------------------------------------
using System;
class test{
  public static void Main(string[] args) {
    long a = -5000;
    ulong b = 2;
    long c = a / b; // Operator '/' is ambiguous on operands of type 'long' and 'ulong'
  }
}
----------------------------------------------------------------------------
$ mcs div.cs
div.cs(6,14): error CS0019: Operator `/' cannot be applied to operands of type `long' and `ulong'
Compilation failed: 1 error(s), 0 warnings


Rust:
----------------------------------------------------------------------------
fn main() {
  let a: i64 = -5000;
  let b: u64 = 2;
  let c: i64 = a / b;
}
----------------------------------------------------------------------------
$ rustc  div.rs
error[E0308]: mismatched types
 --> div.rs:4:20
  |
4 |   let c: i64 = a / b;
  |                    ^ expected `i64`, found `u64`

error[E0277]: cannot divide `i64` by `u64`
 --> div.rs:4:18
  |
4 |   let c: i64 = a / b;
  |                  ^ no implementation for `i64 / u64`
  |
  = help: the trait `std::ops::Div<u64>` is not implemented for `i64`

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.

August 14, 2020
On 13.08.20 21:40, mw wrote:
> On Thursday, 13 August 2020 at 19:24:11 UTC, Tove wrote:
>> One should always use unsigned whenever possible as it generates better code, many believe factor 2 is simply a shift, but not so on signed.
> 
> I'm fine with that. In many area of the language design, we need to make a choice between:  correctness v.s. raw performance.

This is not such a case. There is no good reason to round signed integer division towards zero.
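
To make the difference concrete: in D, signed `/` truncates towards zero, while an arithmetic right shift rounds towards negative infinity, which is why a signed division by 2 is not a plain shift. The `floorDiv` helper below is a hypothetical sketch of the rounding a shift would give.

```d
// Hypothetical helper: signed division that rounds towards negative infinity.
long floorDiv(long a, long b)
{
    long q = a / b; // truncates towards zero
    // Step down by one when there is a remainder and the signs differ.
    if (a % b != 0 && (a < 0) != (b < 0))
        --q;
    return q;
}

void main()
{
    assert(-7 / 2 == -3);          // truncating division
    assert((-7 >> 1) == -4);       // arithmetic shift rounds down
    assert(floorDiv(-7, 2) == -4); // matches the shift
    assert(floorDiv(7, 2) == 3);   // same as `/` for non-negative operands
}
```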
August 13, 2020
On Thu, Aug 13, 2020 at 11:07:23PM +0000, mw via Digitalmars-d wrote: [...]
> C#
> ----------------------------------------------------------------------------
[...]
> div.cs(6,14): error CS0019: Operator `/' cannot be applied to operands of
> type `long' and `ulong'
[...]
> 
> Rust:
[...]
> error[E0277]: cannot divide `i64` by `u64`
[...]

Honestly, I'd be happy if we turned these implicit sign conversions to errors.  The cases where you *want* a/b to convert to unsigned are limited; if you really want to do it, you could just write a cast.  It does make the code much clearer:

	ulong x = ...;
	long y = ...;
	auto z = x / cast(ulong) y; // see? now it's completely clear
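
As a variation on the cast, a sketch that also rejects negative values at run time, assuming `std.conv.to` and its `ConvOverflowException`. The helper `divideBy` is a made-up name.

```d
import std.conv : to, ConvOverflowException;

// Hypothetical helper: the conversion is explicit, and a negative divisor
// raises ConvOverflowException instead of silently becoming a huge ulong.
ulong divideBy(ulong x, long y)
{
    return x / y.to!ulong;
}

void main()
{
    assert(divideBy(10, 2) == 5);

    bool rejected = false;
    try
    {
        divideBy(10, -2);
    }
    catch (ConvOverflowException e)
    {
        rejected = true;
    }
    assert(rejected);
}
```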

And before somebody tells me this is too verbose: we already have to do this for short ints, no thanks to the recent change that arithmetic involving anything smaller than int will implicitly promote to int first:

	ubyte x;
	ubyte y;
	//ubyte z = x + y;	// NG
	ubyte z = cast(ubyte)(x + y); // OK

Yes, it's *that* ugly.  Welcome to the Dungeon of D's Dark Corners, where you see the ugly side of D that people don't want to talk about. We hope you enjoy your stay. (Or not.) :-D


T

-- 
Caffeine underflow. Brain dumped.
August 13, 2020
On Thursday, 13 August 2020 at 21:09:41 UTC, Guillaume Piolat wrote:
> [snip]
>
> Feels correct to me !
>
> When you have an unsigned and signed integer mixed with a binary operator, the operands are converted to unsigned.
>
> This is how it works in C and C++ and we wouldn't be able to port C code to D if this were to be changed.

One way to look at it is that a design goal of D is that a C user should be able to copy and paste code into D with minimal changes. From that perspective, the integer promotion rules make sense. However, if the design goal were instead based upon automatic conversion of C code to D, particularly given the (mostly) automatic conversion of dmd from C++ to D, then different integer promotion rules would not have been as significant a blocker for people coming to D from C. At this point, it would be a big breaking change.

Also, we have templates and operator overloading. Nothing stops people from making their own Int type that has different semantics for division.

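import std.stdio : writeln;
import std.traits : isIntegral;

// Thin wrapper around a built-in integer with its own division semantics.
struct Integer(T)
    if (isIntegral!T)
{
    T x;
    alias x this;

    // Divide by an unsigned value without first converting the signed
    // left-hand side to unsigned.
    Integer!T opBinary(string op)(size_t rhs)
    {
        // Guard the cast below: rhs must fit in a signed value.
        assert(rhs < int.max);
        static if (op == "/")
            return Integer!T(x / cast(T) rhs);
        else
            static assert(0, "Operator " ~ op ~ " not implemented");
    }
}

void main() {
    auto x = Integer!long(-5000L);
    size_t y = 2;
    auto z = x / y;
    assert(z == -2500);
    writeln(z.x);
}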
August 13, 2020
On 8/13/20 5:33 PM, Walter Bright wrote:
> There are some things everyone simply needs to know to use a systems programming language successfully:
> ...

Walter,

You make a very good case for the choice overall but not the lack of warnings. Part of the issue (IMO) is that D is /more/ than just a systems language (side note, it sometimes seems it tries to be everything to everyone), hence the original complaint of "silent" or surprising errors cropping up in the context of e.g.

Array.length

returning ulong. Especially in cases like this, compiler warnings would be helpful.
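
A typical instance of the surprise, sketched with hypothetical helpers: `length` is a `size_t` (ulong on 64-bit targets), so subtracting from the length of an empty array wraps around instead of going negative.

```d
// Hypothetical helper: index of the last element, computed in size_t.
size_t lastIndexUnsigned(const int[] arr)
{
    return arr.length - 1; // wraps to size_t.max when arr is empty
}

// Same computation after an explicit conversion to a signed type.
long lastIndexSigned(const int[] arr)
{
    return cast(long) arr.length - 1;
}

void main()
{
    int[] empty;
    assert(lastIndexUnsigned(empty) == size_t.max);
    assert(lastIndexSigned(empty) == -1);
    assert(lastIndexSigned([10, 20, 30]) == 2);
}
```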

James