February 04, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam Ruppe | On 2/4/2022 3:51 PM, Adam Ruppe wrote:
> On Friday, 4 February 2022 at 23:36:11 UTC, Adam Ruppe wrote:
>> I don't think you understand my proposal, which is closer to C's existing rules than D is now.
>
> To reiterate:
>
> C's rule: int promote, DO allow narrowing implicit conversion.
>
> D's rule: int promote, do NOT allow narrowing implicit conversion unless VRP passes.
>
> My proposed rule: int promote, do NOT allow narrowing implicit conversion unless VRP passes OR the requested conversion is the same as the largest input type (with literals excluded unless their value is obviously out of range).
We considered that and chose not to go that route, on the grounds that we were trying to minimize invisible truncation.
P.S. as a pragmatic programmer, I find very little use for shorts other than saving some space in a data structure. Using shorts as temporaries is a code smell.
|
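For readers comparing the three rules listed in the post above, here is a minimal D sketch of what the current rule accepts and rejects. The function and variable names are made up for illustration. Under the proposed rule, the rejected `short c = a + b;` would presumably compile, because `short` is the largest input type.

```d
// Current D rule: operands are promoted to int, and narrowing the result
// back needs either a VRP proof or an explicit cast.

/// VRP passes: the mask bounds the value to 0 .. 255, so no cast is needed.
ubyte lowByte(short a)
{
    return a & 0xFF;
}

/// VRP cannot prove that a + b fits in a short, so the cast is mandatory.
short addShorts(short a, short b)
{
    return cast(short)(a + b);
}

void main()
{
    short a = 1000, b = 2000;

    // Integer promotion: the sum of two shorts has type int.
    static assert(is(typeof(a + b) == int));

    // Implicit narrowing of that int back to short is rejected.
    static assert(!__traits(compiles, { short c = a + b; }));

    assert(addShorts(a, b) == 3000);
    assert(lowByte(a) == 0xE8); // 1000 == 0x3E8

    // The cast also silences the compiler when truncation really happens.
    assert(addShorts(30_000, 30_000) == -5536);
}
```

The last assertion shows the concern raised on both sides: once the cast is written, the compiler no longer distinguishes a harmless narrowing from a truncating one.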
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 5 February 2022 at 02:43:27 UTC, Walter Bright wrote:
> P.S. as a pragmatic programmer, I find very little use for shorts other than saving some space in a data structure. Using shorts as temporaries is a code smell.
As a pragmatic programmer with hand-coded assembly optimizations experience and also familiar with SIMD compiler intrinsics, using shorts as temporaries in C code actually works great for prototyping/testing the behavior of a single 16-bit lane. As a bonus, autovectorizers in compilers may pick up something too. But tons of forced type casts is the actual code smell.
|
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 5 February 2022 at 02:43:27 UTC, Walter Bright wrote:
> We considered that and chose not to go that route, on the grounds that we were trying to minimize invisible truncation.
I know how D works. I know why it works that way. Hell, I implemented part of the VRP code in dmd myself and have explained it to who knows how many new users over the last 15 years.
What I'm telling you is *it doesn't actually work*.
These forced explicit casts rarely prevent real bugs and in exchange, they make the language significantly harder to use and create their own problems down the line.
Loosening the rules would reduce the burden of the many, many, many false positives forcing harmful casts while keeping the spirit of the rule. It isn't just *invisible* truncation you want to minimize - it is *buggy* invisible truncation.
You want the compiler (and the casts) to call out potentially buggy areas so when it cries wolf, you actually look for a wolf that's probably there.
|
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D Ruppe | On Saturday, 5 February 2022 at 03:48:15 UTC, Adam D Ruppe wrote:
> You want the compiler (and the casts) to call out potentially buggy areas so when it cries wolf, you actually look for a wolf that's probably there.
Well written code would use a narrowing cast with checks for debugging, but the type itself is less interesting, so it would be better with overloading on return type. But it could be the default if overflow checks were implemented.
byte x = narrow(expression);
If it was the default, you could disable it instead:
byte x = uncheck(expression);
|
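The `narrow` and `uncheck` calls in the post above are not existing D functions. A rough sketch of the idea, assuming explicit template arguments instead of overloading on return type (which D does not offer), could build on `std.conv.to`, which throws `ConvOverflowException` when a value does not fit:

```d
import std.conv : ConvOverflowException, to;
import std.exception : assertThrown;

/// Hypothetical checked narrowing: throws if `value` does not fit in T.
T narrow(T, S)(S value)
{
    return to!T(value);
}

/// Hypothetical unchecked counterpart: a plain truncating cast.
T uncheck(T, S)(S value)
{
    return cast(T) value;
}

void main()
{
    int big = 300;

    byte ok = narrow!byte(100);
    assert(ok == 100);

    // 300 does not fit in a byte, so the checked version refuses it.
    assertThrown!ConvOverflowException(narrow!byte(big));

    // The unchecked version keeps only the low 8 bits: 0x12C -> 0x2C.
    byte wrapped = uncheck!byte(big);
    assert(wrapped == 44);
}
```

Unlike the debug-only checks the post describes, this sketch checks in every build; swapping the body of `narrow` for an `assert` plus a cast would give the debug-only variant.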
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D Ruppe | On 2/4/2022 7:48 PM, Adam D Ruppe wrote:
> On Saturday, 5 February 2022 at 02:43:27 UTC, Walter Bright wrote:
>> We considered that and chose not to go that route, on the grounds that we were trying to minimize invisible truncation.
>
> I know how D works. I know why it works that way. Hell, I implemented part of the VRP code in dmd myself and have explained it to who knows how many new users over the last 15 years.
>
> What I'm telling you is *it doesn't actually work*.
>
> These forced explicit casts rarely prevent real bugs and in exchange, they make the language significantly harder to use and create their own problems down the line.
>
> Loosening the rules would reduce the burden of the many, many, many false positives forcing harmful casts while keeping the spirit of the rule. It isn't just *invisible* truncation you want to minimize - it is *buggy* invisible truncation.
>
> You want the compiler (and the casts) to call out potentially buggy areas so when it cries wolf, you actually look for a wolf that's probably there.
I use D all day, every day, and I don't seem to be having these problems. I did a:
grep -w cast *.d
across src/dmd/*.d, and found hardly any casts to short/ushort that would fall under the forced cast category you mentioned. Granted, maybe your style of coding is different.
Doing the same grep across phobos/std/*.d, rather little of which I have written, I found zero instances of forced cast to short/ushort.
As for "rarely", these kinds of bugs are indeed rare, but can be invisible yet significant. It's just the sort of thing we want to catch.
|
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Siarhei Siamashka | On 2/4/2022 6:35 PM, Siarhei Siamashka wrote:
> On Friday, 4 February 2022 at 23:45:31 UTC, Walter Bright wrote:
>> On 2/4/2022 2:18 PM, Siarhei Siamashka wrote:
>>> If we want D language to be SIMD friendly, then discouraging the use of `short` and `byte` types for local variables isn't the best idea.
>>
>> SIMD is its own world, and why D has vector types as a core language feature. I never had much faith in autovectorization.
>
> I don't have much faith in autovectorization quality either, but this feature is provided for free by GCC and LLVM backends. And right now excessively paranoid errors about byte/short variables coerce the users into one of these two unattractive alternatives:
>
> * litter the code with ugly casts
> * change types of temporary variables to ints and waste some vectorization opportunities
Generally one should use the vector types rather than relying on autovectorization. One of the problems with autovectorization is never knowing that some minor change you made prevented vectorizing.
> When the signal/noise ratio is bad, then it's natural that the users start ignoring error messages. Beginners are effectively trained to apply casts without thinking just to shut up the annoying compiler and it leads to situations like this: https://forum.dlang.org/thread/uqeobimtzhuyhvjpvkvz@forum.dlang.org
That has nothing to do with integers.
> I see VRP as just a band-aid, which helps very little, but causes a lot of inconveniences.
Certainly allowing implicit conversions of ints to shorts is *convenient*. But you cannot have that *and* safe integer math. As I mentioned repeatedly, there is no solution that is fast, convenient, and doesn't hide mistakes.
> My suggestion:
>
> 1. Implement `wrapping_add`, `wrapping_sub`, `wrapping_mul` intrinsics similar to Rust, this is easy and costs nothing.
> 2. Implement an experimental `-ftrapv` option in one of the D compilers (most likely GDC or LDC) to catch both signed and unsigned overflows at runtime. Or maybe add function attributes to enable/disable this functionality with a more fine grained control. Yes, I know that this violates the current D language spec, which requires two's complement wraparound for everything, but it doesn't matter for a fancy experimental option.
> 3. Run some tests with `-ftrapv` and check how many arithmetic overflows are actually triggered in Phobos. Replace the affected arithmetic operators with intrinsics if the wrapping behavior is actually intended.
> 4. In the long run consider updating the language spec.
>
> Benefits: even if `-ftrapv` turns out to have a high overhead, this would still become a useful tool for testing arithmetic overflows safety in applications. Having something is better than having nothing.
|
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Siarhei Siamashka | On 2/4/2022 6:35 PM, Siarhei Siamashka wrote:
> My suggestion:
>
> 1. Implement `wrapping_add`, `wrapping_sub`, `wrapping_mul` intrinsics similar to Rust, this is easy and costs nothing.
> 2. Implement an experimental `-ftrapv` option in one of the D compilers (most likely GDC or LDC) to catch both signed and unsigned overflows at runtime. Or maybe add function attributes to enable/disable this functionality with a more fine grained control. Yes, I know that this violates the current D language spec, which requires two's complement wraparound for everything, but it doesn't matter for a fancy experimental option.
> 3. Run some tests with `-ftrapv` and check how many arithmetic overflows are actually triggered in Phobos. Replace the affected arithmetic operators with intrinsics if the wrapping behavior is actually intended.
> 4. In the long run consider updating the language spec.
>
> Benefits: even if `-ftrapv` turns out to have a high overhead, this would still become a useful tool for testing arithmetic overflows safety in applications. Having something is better than having nothing.
I recommend creating a DIP for it.
|
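The distinction drawn in the quoted suggestion, between arithmetic that is meant to wrap and arithmetic that should trap, can be approximated in library code today. The sketch below is only an illustration: `wrappingAdd` and `trappingAdd` are hypothetical names, while `adds` and `muls` are the existing overflow-reporting helpers in druntime's `core.checkedint`.

```d
import core.checkedint : adds, muls;
import std.exception : assertThrown;

/// Hypothetical wrapping add: D already defines int overflow as
/// two's complement wraparound, so the plain operator is enough.
int wrappingAdd(int a, int b)
{
    return a + b;
}

/// Hypothetical trapping add: reports overflow instead of wrapping.
int trappingAdd(int a, int b)
{
    bool overflow;
    immutable result = adds(a, b, overflow);
    if (overflow)
        throw new Exception("signed overflow in trappingAdd");
    return result;
}

void main()
{
    assert(wrappingAdd(int.max, 1) == int.min);
    assert(trappingAdd(2, 3) == 5);
    assertThrown!Exception(trappingAdd(int.max, 1));

    // The same flag-based check exists for multiplication.
    bool overflow;
    immutable product = muls(int.max, 2, overflow);
    assert(overflow);
}
```

A compiler-level `-ftrapv` would apply the trapping behavior to every operator without source changes, which is what the library approach cannot do.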
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Elronnd | On Friday, 4 February 2022 at 22:15:37 UTC, Elronnd wrote:
> On Friday, 4 February 2022 at 21:13:10 UTC, Walter Bright wrote:
>> It's slower, too.
>
> Not anymore. And div can be faster on smaller integers.
>
>
>> You're paying a 3 byte size penalty for using short arithmetic rather than int arithmetic.
>
> 1. You are very careful to demonstrate short arithmetic, not byte arithmetic, which is the same size as int arithmetic on x86.
Interestingly, for bytes, the code is even smaller (ldc):
0000000000000000 <_D4main5testbFPhQcQeZv>:
0: 8a 02 mov (%rdx),%al
2: 41 f6 20 mulb (%r8)
5: 88 01 mov %al,(%rcx)
7: c3 ret
0000000000000000 <_D4main5testiFPiQcQeZv>:
0: 8b 02 mov (%rdx),%eax
2: 41 0f af 00 imul (%r8),%eax
6: 89 01 mov %eax,(%rcx)
8: c3 ret
Also, there is no difference in size for ARM64:
testb:
ldrb w0, [x0]
ldrb w1, [x1]
mul w0, w0, w1
strb w0, [x2]
ret
tests:
ldrh w0, [x0]
ldrh w1, [x1]
mul w0, w0, w1
strh w0, [x2]
ret
testi:
ldr w0, [x0]
ldr w1, [x1]
mul w0, w0, w1
str w0, [x2]
ret
|
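The source for the listings above is not shown in the post. Judging only by the mangled names, which decode to functions taking three `ubyte*` or three `int*` parameters and returning `void`, it was probably something like the following. Which parameter is the destination is an assumption here, as are the parameter names.

```d
// Assumed reconstruction of the benchmarked functions; the original
// source is not part of the post.

void testb(ubyte* dst, ubyte* a, ubyte* b)
{
    // The operands are promoted to int, so storing the product back
    // into a ubyte requires an explicit cast in D.
    *dst = cast(ubyte)((*a) * (*b));
}

void testi(int* dst, int* a, int* b)
{
    // No promotion and no cast: the product is already an int.
    *dst = (*a) * (*b);
}

void main()
{
    ubyte x = 20, y = 13, r;
    testb(&r, &x, &y);
    assert(r == 4); // 260 truncated to 8 bits

    int i = 20, j = 13, k;
    testi(&k, &i, &j);
    assert(k == 260);
}
```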
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Friday, 4 February 2022 at 21:27:44 UTC, H. S. Teoh wrote:
> On Fri, Feb 04, 2022 at 08:50:35PM +0000, Mark via Digitalmars-d wrote:
>> On Friday, 4 February 2022 at 04:28:37 UTC, Walter Bright wrote:
>> > There's really no fix for that other than making the effort to understand 2s-complement. Some noble attempts:
>> >
>> > Java: disallowed all unsigned types. Wound up having to add that back in as a hack.
>>
>> How many people actually use (and need) unsigned integers?
>
> I do. They are very useful in APIs where I expect only positive values. Marking the parameter type as uint makes it clear exactly what's expected, instead of using circumlocutions like taking int with an in-contract that x>=0. Also, when you're dealing with bitmasks, you WANT unsigned types. Using signed types for that will cause values to get munged by unwanted sign extensions, and in general just cause grief and needless complexity where an unsigned type would be completely straightforward.
>
> Also, for a systems programming language unsigned types are necessary, because they are a closer reflection of the reality at the hardware level.
>
>
>> If 99% of users don't need them, that's a good case for relegating them to a library type. This wasn't possible in Java because it doesn't support operator overloading, without which dealing with such types would have been quite annoying.
>
> Needing a library type for manipulating bitmasks would make D an utter joke of a systems programming language.
>
>
> T
I should have phrased my question as "how many people outside systems programming...", as this is what I had in mind (I mostly write high-level code, though I don't know if I'm the typical D user). But since D is proudly general-purpose I admit that this question is moot.
Regarding positive values, AFAIK unsigned ints aren't suitable for this because you still want to do ordinary arithmetic on positive integers, not modular arithmetic. Runtime checks are unavoidable because even mundane operations such as `--x` can potentially escape the domain of positive integers.
Also, I don't think being a library type is a mark of shame. Depending on the language, they can be just as useful and almost as convenient as built-in types. C++'s std::byte was mentioned on this thread - it's a library type.
|
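The point above about `--x` leaving the domain of positive integers is easy to reproduce. This small sketch uses `subu` from druntime's `core.checkedint` to show what a runtime check would have to detect; the helper name `decrementWraps` is made up for the example.

```d
import core.checkedint : subu;

/// Returns true if decrementing `x` would wrap around below zero.
bool decrementWraps(uint x)
{
    bool overflow;
    immutable ignored = subu(x, 1u, overflow);
    return overflow;
}

void main()
{
    uint x = 0;
    --x; // modular arithmetic: no error, the value wraps
    assert(x == uint.max);

    assert(decrementWraps(0));
    assert(!decrementWraps(1));
}
```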
February 05, 2022 Re: [OT] The Usual Arithmetic Confusions | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 05.02.22 00:43, Walter Bright wrote:
>
> That's fine unless you're using a systems programming language, where the customers expect performance.
>
> Remember the recent deal with the x87 where dmd would keep the extra precision around, to avoid the double rounding problem?
That does not avoid problems, it's just confusing to users and it will introduce new bugs. It's not even a cure for double-rounding issues! It may even have the opposite effect!
> I propagated this to dmc, and it cost me a design win. The customer benchmarked it on 'float' arithmetic, and pronounced dmc 10% slower. The double rounding issue did not interest him.
Sure, it stands to reason that people who are not careful with their floating-point implementations actually do not care. And the weird extra precision is extremely annoying for those that are careful. Less performance, less reproducibility, randomly introducing double-rounding issues in code that would be correct if it did not insist on keeping around more precision in hard-to-predict, implementation defined cases that are not even properly specced out, in exchange for sometimes randomly hiding issues in badly written code. No, thanks. This is terrible!
I get that the entire x87 design is pretty bad and so there are trade-offs, but as it has now been deprecated, I hope this kind of second-guessing will become a thing of the past entirely. In the meantime, I will avoid using DMD for anything that requires floating-point arithmetic.
|