March 31, 2021
On Saturday, 27 March 2021 at 03:25:04 UTC, Walter Bright wrote:
> 4. fast integer arithmetic is fundamental to fast code, not a mere micro-optimization. Who wants an overflow check on every pointer increment?

Dan Luu measured overflow checks as having an overall performance impact of about 1% for numeric-heavy C code (https://danluu.com/integer-overflow/). The code size impact is also very small, around 3%.

This isn't 'speculation'; it's actual measurement. 'lea' is a micro-optimization; it doesn't 'significantly' improve performance. Yes, mul is slow, but lea can be trivially replaced by an equivalent sequence of shifts and adds with very little penalty.
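
To make the lea point concrete, here is a rough illustration (hypothetical function names, not from Luu's article): a multiply by a small constant that compilers like to fold into a single lea has an obvious shift-and-add spelling.

    /* Two spellings of the same computation.  Compilers commonly fold the
       first into a single lea (something like lea eax, [rdi + rdi*8 + 4]);
       the second is the shift-and-add form, typically only an instruction
       or two longer. */
    unsigned scaled_lea_friendly(unsigned x)
    {
        return x * 9u + 4u;
    }

    unsigned scaled_shift_add(unsigned x)
    {
        return (x << 3) + x + 4u;
    }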

Why is this being seriously discussed as a performance pitfall?
March 31, 2021
On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu wrote:
> On 3/30/21 7:01 PM, tsbockman wrote:
>> Simply flipping compiler switches (the -ftrapv and -fwrapv flags in gcc Andrei mentioned earlier) won't work, because most high performance code contains some deliberate and correct examples of wrapping overflow, signed-unsigned reinterpretation, etc.
>> 
>> Idiomatic Zig code (probably Ada, too) does contain this information. But, the selection of "real world" open source Zig code available for testing is limited right now, since Zig hasn't stabilized the language or the standard library yet.
>
> That's awfully close to "No true Scotsman".

Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
March 31, 2021
On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu wrote:
> Not much to write home about. The jumps scale linearly with the number of primitive operations:
>
> https://godbolt.org/z/r3sj1T4hc
>
> That's not going to be a speed demon.

Ideally, in release builds the compiler could loosen up the precision of the traps a bit and combine the overflow checks for short sequences of side-effect-free operations.
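
A minimal sketch of what that could look like, assuming the intermediates of a short expression are simply widened and range-checked once at the end (hypothetical - no compiler flag mentioned in this thread does this today):

    #include <limits.h>
    #include <stdlib.h>

    /* One combined check for a short side-effect-free expression,
       instead of a separate trap after each multiply and add. */
    int checked_square_plus_x_plus_1(int x)
    {
        long long wide = (long long)x * x + x + 1;  /* cannot overflow in 64 bits for any 32-bit x */
        if (wide < INT_MIN || wide > INT_MAX)
            abort();                                /* single trap for the whole expression */
        return (int)wide;
    }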
March 31, 2021
On Wednesday, 31 March 2021 at 04:26:28 UTC, Andrei Alexandrescu wrote:
> FWIW I just tested -fwrapv and -ftrapv. The former does nothing discernible:

-fwrapv isn't supposed to do anything discernible; it just prevents the compiler from taking advantage of otherwise undefined behavior:

"Instructs the compiler to assume that signed arithmetic overflow of addition, subtraction, and multiplication, wraps using two's-complement representation."

https://www.keil.com/support/man/docs/armclang_ref/armclang_ref_sam1465487496421.htm
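
For example (a small C sketch, not taken from those docs): without -fwrapv the compiler is allowed to assume signed overflow never happens and can fold a wrap-reliant check away entirely; with -fwrapv the check must be kept.

    /* Without -fwrapv, x + 1 > x is "always true" for signed x, so an
       optimizing compiler may reduce this function to return 0.
       With -fwrapv, INT_MAX + 1 wraps to INT_MIN and the test survives. */
    int increment_would_overflow(int x)
    {
        return !(x + 1 > x);
    }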
March 31, 2021
On 3/31/21 12:32 AM, tsbockman wrote:
> On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu wrote:
>> On 3/30/21 7:01 PM, tsbockman wrote:
>>> Simply flipping compiler switches (the -ftrapv and -fwrapv flags in gcc Andrei mentioned earlier) won't work, because most high performance code contains some deliberate and correct examples of wrapping overflow, signed-unsigned reinterpretation, etc.
>>>
>>> Idiomatic Zig code (probably Ada, too) does contain this information. But, the selection of "real world" open source Zig code available for testing is limited right now, since Zig hasn't stabilized the language or the standard library yet.
>>
>> That's awfully close to "No true Scotsman".
> 
> Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.

I thought it was fairly clear - the claim is non-falsifiable: if code is faster without checks, it is deemed so on account of tricks. Code without checks could benefit from other, better tricks, but their absence is explained by the small size of the available corpus.
March 31, 2021
On 3/31/21 12:47 AM, Andrei Alexandrescu wrote:
> On 3/31/21 12:32 AM, tsbockman wrote:
>> On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu wrote:
>>> On 3/30/21 7:01 PM, tsbockman wrote:
>>>> Simply flipping compiler switches (the -ftrapv and -fwrapv flags in gcc Andrei mentioned earlier) won't work, because most high performance code contains some deliberate and correct examples of wrapping overflow, signed-unsigned reinterpretation, etc.
>>>>
>>>> Idiomatic Zig code (probably Ada, too) does contain this information. But, the selection of "real world" open source Zig code available for testing is limited right now, since Zig hasn't stabilized the language or the standard library yet.
>>>
>>> That's awfully close to "No true Scotsman".
>>
>> Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
> 
> I thought it was fairly clear - the claim is non-falsifiable: if code is faster without checks, it is deemed so on account of tricks. Code without checks could benefit from other, better tricks, but their absence is explained by the small size of the available corpus.


s/Code without checks could benefit from other/Code with checks could benefit from other/
March 31, 2021
On 3/31/21 12:37 AM, tsbockman wrote:
> On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu wrote:
>> Not much to write home about. The jumps scale linearly with the number of primitive operations:
>>
>> https://godbolt.org/z/r3sj1T4hc
>>
>> That's not going to be a speed demon.
> 
> Ideally, in release builds the compiler could loosen up the precision of the traps a bit and combine the overflow checks for short sequences of side-effect-free operations.

Yah, was hoping I'd find something like that. Was disappointed. That makes their umbrella claim "Zig is faster than C" quite specious.
March 31, 2021
On Wednesday, 31 March 2021 at 04:26:28 UTC, Andrei Alexandrescu wrote:
> On 3/31/21 12:11 AM, tsbockman wrote:
>> Also, the Zig checked binaries are actually slightly smaller than the unchecked binaries for some reason.
>
> That's surprising, so some investigation would be in order. From what I tried on godbolt, the generated code is strictly larger if it uses checks.

Perhaps the additional runtime validation is causing reduced inlining in some cases? The test program I used has almost 300 KiB of source code, so it may be hard to reproduce the effect with toy programs on godbolt.
March 30, 2021
On 3/30/2021 9:08 PM, Andrei Alexandrescu wrote:
> Not much to write home about. The jumps scale linearly with the number of primitive operations:
> 
> https://godbolt.org/z/r3sj1T4hc
> 
> That's not going to be a speed demon.

The ldc:
        mov     eax, edi
        imul    eax, eax
        add     eax, edi    *
        add     eax, 1      *
        ret

* should be:

	lea    eax,1[eax + edi]

Let's try dmd -O:

__D3lea6squareFiZi:
	mov	EDX,EAX
	imul	EAX,EAX
	lea	EAX,1[EAX][EDX]
	ret

Woo-hoo!
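
(For reference, judging from the assembly, the function being compiled is equivalent to the following; shown in C here since the original source isn't pasted in the thread.)

    /* Reconstructed from the listings above: result = x*x + x + 1. */
    int square(int x)
    {
        return x * x + x + 1;
    }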
March 31, 2021
On 3/31/21 12:59 AM, Walter Bright wrote:
> On 3/30/2021 9:08 PM, Andrei Alexandrescu wrote:
>> Not much to write home about. The jumps scale linearly with the number of primitive operations:
>>
>> https://godbolt.org/z/r3sj1T4hc
>>
>> That's not going to be a speed demon.
> 
> The ldc:
>          mov     eax, edi
>          imul    eax, eax
>          add     eax, edi    *
>          add     eax, 1      *
>      ret
> 
> * should be:
> 
>      lea    eax,1[eax + edi]
> 
> Let's try dmd -O:
> 
> __D3lea6squareFiZi:
>      mov    EDX,EAX
>      imul    EAX,EAX
>      lea    EAX,1[EAX][EDX]
>      ret
> 
> Woo-hoo!

Yah, actually gdc uses lea as well: https://godbolt.org/z/Gb6416EKe