March 30, 2021
On Tuesday, 30 March 2021 at 08:48:04 UTC, Max Haughton wrote:
> On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:
>> On 3/29/2021 10:53 PM, Max Samukha wrote:
>>> On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:
>>> 
>>>> 
[...]
>
> On the subject of run-time performance, checkedint can also do things like saturation arithmetic, which can be accelerated using increasingly common native instructions (e.g. AVX on Intel, AMD, and presumably VIA also).
[...]
>
> (AVX instructions are also quite big, so there is the usual I$ hit here too).

Some micro-architectures employ an L0/uOp cache, which can significantly alter the I$ performance calculus within loops.  To confidently identify an I$ performance bottleneck I think you'd need to use perf analysis tools. IIRC Max recommended this at Beerconf.

Side note: the checkedint code sure looks nice.  It's a very readable example of the leverage D affords.
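
As a concrete illustration of the saturating arithmetic mentioned above, here is a minimal sketch in D built on druntime's core.checkedint (my own example, not taken from checkedint itself; whether a compiler lowers this pattern to native saturating instructions is not guaranteed):

    import core.checkedint : adds;

    // Saturating signed addition: clamp to int.max/int.min on overflow
    // instead of wrapping around.
    int satAdd(int a, int b) pure nothrow @nogc @safe
    {
        bool overflow = false;
        immutable sum = adds(a, b, overflow); // sets overflow on signed wrap
        if (!overflow)
            return sum;
        return b > 0 ? int.max : int.min;
    }

    unittest
    {
        assert(satAdd(int.max, 1) == int.max);
        assert(satAdd(int.min, -1) == int.min);
        assert(satAdd(2, 3) == 5);
    }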

March 30, 2021
On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:

> Note that Zig has a very different idea of integers than D does. It has arbitrary bit width integers, up to 65535. This seems odd, as what are you going to do with a 6 bit integer? There aren't machine instructions to support it. It'd be better off with a ranged integer, say:
>
>    i : int 0..64
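
As an aside, a ranged integer along those lines can be approximated today as a D library type. A minimal, hypothetical sketch (assuming the range runs from 0 inclusive to 64 exclusive, enforced by a run-time check):

    // Hypothetical ranged integer: holds an int restricted to [min, max).
    struct Ranged(int min, int max)
    {
        private int payload = min;

        this(int v) { opAssign(v); }

        void opAssign(int v)
        {
            assert(v >= min && v < max, "value out of range");
            payload = v;
        }

        alias payload this; // reads convert back to int implicitly
    }

    unittest
    {
        Ranged!(0, 64) i = 63;    // OK
        // Ranged!(0, 64) j = 64; // would fail the range assert
    }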

The question then is: does that mean Zig has over 131070 keywords (65535 each for signed and unsigned)? :D Or does it reserve anything that starts with i/u followed by digits? Kind of like how D reserves identifiers starting with two underscores.

--
/Jacob Carlborg
March 30, 2021
On Tuesday, 30 March 2021 at 15:28:04 UTC, Jacob Carlborg wrote:
> On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:
>
>> Note that Zig has a very different idea of integers than D does. It has arbitrary bit width integers, up to 65535. This seems odd, as what are you going to do with a 6 bit integer? There aren't machine instructions to support it. It'd be better off with a ranged integer, say:
>>
>>    i : int 0..64
>
> The question then is: does that mean Zig has over 131070 keywords (65535 each for signed and unsigned)? :D Or does it reserve anything that starts with i/u followed by digits? Kind of like how D reserves identifiers starting with two underscores.
>
> --
> /Jacob Carlborg

In Zig, integer type names are not considered keywords; e.g. you can use i7 as a variable name or i666 as a function name.

But you cannot define new types with this pattern; you get an error message stating that "Type 'i?' is shadowing primitive type 'i?'".

So it's more like a contextual keyword in C#.
March 30, 2021
On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:
> On 3/29/2021 6:29 PM, tsbockman wrote:
>> On Tuesday, 30 March 2021 at 00:33:13 UTC, Walter Bright wrote:
>>> Having D generate overflow checks on all adds and multiples will immediately make D uncompetitive with C, C++, Rust, Zig, Nim, etc.
>> 
>> As someone else shared earlier in this thread, Zig already handles this in pretty much exactly the way I argue for:
>>     https://ziglang.org/documentation/master/#Integer-Overflow
>
> I amend my statement to "immediately make D as uncompetitive as Zig is"

So you're now dismissing Zig as slow because its feature set surprised you? No real-world data is necessary? No need to understand any of Zig's relevant optimizations or options?
March 30, 2021
On 3/30/2021 10:09 AM, tsbockman wrote:
> So you're now dismissing Zig as slow because its feature set surprised you?

Because it surprised me? No. Because if someone had figured out a way to do overflow checks at no runtime cost, it would be in every language. I know Rust tried pretty hard to do it.


> No real-world data is necessary? No need to understand any of Zig's relevant optimizations or options?

I don't have to test a brick to assume it won't fly. But I could be wrong, definitely. If you can prove me wrong in my presumption, I'm listening.

P.S. Yes, I know anything will "fly" if you attach enough horsepower to it. But there's a reason airplanes don't look like bricks.
March 30, 2021
On 3/30/2021 6:33 AM, Bruce Carneal wrote:
> Side note: the checkedint code sure looks nice.  It's a very readable example of the leverage D affords.

Yes, that's also why I want it to have more visibility by being in Phobos.
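
For readers who want a taste of the general idea right now: as of this writing Phobos also carries an experimental checked-integer wrapper, std.experimental.checkedint. A minimal usage sketch (shown only as an illustration, independent of which checkedint implementation ends up in Phobos):

    import std.experimental.checkedint : checked;

    void main()
    {
        auto x = checked(int.max); // Checked!int with the default Abort hook
        x += 1;                    // aborts at run time instead of silently wrapping
    }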

March 30, 2021
On Tuesday, 30 March 2021 at 17:53:37 UTC, Walter Bright wrote:
> On 3/30/2021 10:09 AM, tsbockman wrote:
>> So you're now dismissing Zig as slow because its feature set surprised you?
>
> Because it surprised me? No. Because if someone had figured out a way to do overflow checks at no runtime cost, it would be in every language. I know Rust tried pretty hard to do it.

Zero runtime cost is not a reasonable standard unless the feature is completely worthless and it cannot be turned off.

>> No real-world data is necessary? No need to understand any of Zig's relevant optimizations or options?
>
> I don't have to test a brick to assume it won't fly. But I could be wrong, definitely. If you can prove me wrong in my presumption, I'm listening.

Since I have already been criticized for the use of micro-benchmarks, I assume that only data from complete practical applications will satisfy.

Unfortunately, idiomatic C, C++, D, and Rust source code omits the information required to perform such tests. Simply flipping compiler switches (the -ftrapv and -fwrapv flags in GCC that Andrei mentioned earlier) won't work, because most high-performance code contains some deliberate and correct uses of wrapping overflow, signed-unsigned reinterpretation, etc.
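
For example (a hypothetical illustration, not taken from any of the code bases mentioned): FNV-1a hashing deliberately relies on wrap-around uint multiplication, so a blanket trap-on-overflow build flag would flag it as a false positive:

    // Deliberate, correct wrapping overflow: FNV-1a depends on modular
    // (wrap-around) uint multiplication. Trapping here would be wrong.
    uint fnv1a(scope const(ubyte)[] data) pure nothrow @nogc @safe
    {
        uint h = 2_166_136_261u;  // FNV offset basis
        foreach (b; data)
        {
            h ^= b;
            h *= 16_777_619u;     // FNV prime; wraps mod 2^32 by design
        }
        return h;
    }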

Idiomatic Zig code (probably Ada, too) does contain this information. But the selection of "real world" open source Zig code available for testing is limited right now, since Zig hasn't stabilized the language or the standard library yet.

The best test subject I have found, compiled, and run successfully is this:
    https://github.com/Vexu/arocc
It's an incomplete C compiler: "Right now preprocessing and parsing is mostly done but anything beyond that is missing." I believe compilation is a fairly integer-intensive workload, so the results should be meaningful.

To test, I took the C source code of gzip and duplicated its contents many times until I got the arocc wall time up to about 1 second. (The final input file is 37.5 MiB.) arocc outputs a long stream of error messages to stderr, whose contents aren't important for our purposes.

In order to minimize the time consumed by I/O, I run each test several times in a row and ignore the early runs, to ensure that the input file is cached in RAM by the OS, and pipe the output of arocc (both stdout and stderr) to /dev/null.

Results with -O ReleaseSafe (optimizations on, with checked integer arithmetic, bounds checks, null checks, etc.):
    Binary size: 2.0 MiB
    Wall clock time: 1.31s
    System time: 0.71s
    User time: 0.60s
    CPU usage: 99% of a single core

Results with -O ReleaseFast (optimizations on, with safety checks off):
    Binary size: 2.3 MiB
    Wall clock time: 1.15s
    System time: 0.68s
    User time: 0.46s
    CPU usage: 99% of a single core

So, in this particular task, ReleaseSafe (which checks for a lot of other things, not just integer overflow) takes 14% longer than ReleaseFast (1.31s vs. 1.15s wall clock). If you only care about user time, that is about 30% longer (0.60s vs. 0.46s).

Last time I checked, these numbers are similar to the performance difference between optimized builds by DMD and LDC/GDC. They are also similar to the performance differences within related language pairs like C/C++, Java/C#, Ada/C in language comparison benchmarks like:
    https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/cpp.html

Note also that with Zig's approach, paying the modest performance penalty for the various safety checks is *completely optional* in release builds (just like D's bounds checking). Even for applications where that last increment of speed is considered essential in the production binary, Zig's approach still leads to clearer, easier-to-debug code.

So, unless DMD (or C itself!) is "a brick" that "won't fly", your claim that this is something that a high performance systems programming language just cannot do is not grounded in reality.
March 31, 2021
On Tuesday, 30 March 2021 at 01:09:12 UTC, Andrei Alexandrescu wrote:
> * https://dl.acm.org/doi/abs/10.1145/2743019 - relatively recent, quotes a lot of other work. A good starting point.

I skimmed the paper, and from what I have seen so far it supports my understanding of the facts in every way. I intend to read it more carefully later this week and post a summary here of the most relevant bits, for the benefit of anyone who doesn't want to pay for it.

Of course, there is a subjective aspect to all of this as well; even with numbers in hand, reasonable people may disagree as to what should be done about them.
March 30, 2021
On 3/30/2021 4:01 PM, tsbockman wrote:
> So, in this particular task, ReleaseSafe (which checks for a lot of other things, not just integer overflow) takes 14% longer than ReleaseFast (1.31s vs. 1.15s wall clock). If you only care about user time, that is about 30% longer (0.60s vs. 0.46s).

Thank you for running benchmarks.

14% is a big deal.

> Last time I checked, these numbers are similar to the performance difference between optimized builds by DMD and LDC/GDC. They are also similar to the performance differences within related language pairs like C/C++, Java/C#, Ada/C in language comparison benchmarks like:
> https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/cpp.html
> 
> Note also that with Zig's approach, paying the modest performance penalty for the various safety checks is *completely optional* in release builds (just like D's bounds checking). Even for applications where that final binary order of magnitude of speed is considered essential in production, Zig's approach still leads to clearer, easier to debug code.

The problem with turning it off for production code is that the overflows tend to be rare and not encountered during testing. When you need it, it is disabled.

Essentially, turning it off for release code is an admission that it is too expensive.

Note that D's bounds checking is *not* turned off in release mode. It has a separate switch (-boundscheck=off in DMD) to turn that off, and I recommend only using it to see how much performance it'll cost for a particular application.

> So, unless DMD (or C itself!) is "a brick" that "won't fly", your claim that this is something that a high performance systems programming language just cannot do is not grounded in reality.

I didn't say cannot. I said it would make it uncompetitive.

Overflow checking would be nice to have. But it is not worth the cost for D. I also claim that D code is much less likely to suffer from overflows because of the implicit integer promotion rules. Adding two shorts is never going to overflow, for example, and D won't let you naively assign the resulting int back to a short.

One could legitimately claim that D *does* have a form of integer overflow protection in the form of Value Range Propagation (VRP). Best of all, VRP comes for free at zero runtime cost!
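
A minimal example of VRP at work:

    void example(int x)
    {
        ubyte lo = x & 0xFF;  // OK: VRP proves the result fits in 0..255
        //ubyte bad = x;      // Error: cannot implicitly convert int to ubyte
    }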

P.S. I know you know this, due to your good work on VRP :-) but I mention it for the other readers.

P.P.S. So why is this claim not made for C? Because:

    short s, t, u;
    s = t + u;   // t + u is promoted to int; C truncates silently, D rejects the narrowing

compiles without complaint in C, but will fail to compile in D. C doesn't have VRP.


March 31, 2021
On Wednesday, 31 March 2021 at 01:43:50 UTC, Walter Bright wrote:
> Thank you for running benchmarks.
>
> 14% is a big deal.

Note that I deliberately chose an integer-intensive workload, and artificially sped up the I/O to highlight the performance cost. For most real-world applications, the cost is actually *much* lower. The paper Andrei linked earlier has a couple of examples:

    Checked Apache httpd is less than 0.1% slower than unchecked.
    Checked OpenSSH file copy is about 7% slower than unchecked.

https://dl.acm.org/doi/abs/10.1145/2743019

> The problem with turning it off for production code is that the overflows tend to be rare and not encountered during testing. When you need it, it is disabled.

Only if you choose to disable it. Just because you think it's not worth the cost doesn't mean everyone, or even most people, would turn it off.

> Essentially, turning it off for release code is an admission that it is too expensive.

It's an admission that it's too expensive *for some applications*, not in general. D's garbage collector is too expensive for some applications, but that doesn't mean it should be removed from the language, nor even disabled by default.

> Note that D's bounds checking is *not* turned off in release mode. It has a separate switch to turn that off, and I recommend only using it to see how much performance it'll cost for a particular application.

That's exactly how checked arithmetic, bounds checking, etc. work in Zig. What do you think the difference is, other than your arbitrary assertion that checked arithmetic costs more than it's worth?

> I said it would make it uncompetitive.

The mean performance difference between C and C++ in the (admittedly casual) comparative benchmarks I cited is 36%. Is C uncompetitive with C++? What definition of "uncompetitive" are you using?

> Overflow checking would be nice to have. But it is not worth the cost for D. I also claim that D code is much less likely to suffer from overflows...

Yes, D is better than C in this respect (among many others).