May 22, 2017
On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
> http://dlang.org/blog/2017/05/22/introspection-introspection-everywhere/ -- Andrei

A fun read!

"(Late at night, I double checked. Mozilla’s CheckedInt is just as bad as I remembered. They do a division to test for multiplication overflow. Come on, put a line of assembler in there! Portability is worth a price, just not any price.)"

Shocked: do you use assembly in Checked and cripple the optimizer?!?!
Luckily, no. But LDC and GDC do create the `seto` instruction I think you were hinting at:
https://godbolt.org/g/0jUhgs

(LDC doesn't do as well as it could, https://github.com/ldc-developers/ldc/issues/2131)

cheers,
  Johan

May 22, 2017
On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
> http://dlang.org/blog/2017/05/22/introspection-introspection-everywhere/ -- Andrei

Now that you are back and have had some time to think this over, would you say your trip will influence how you see the evolution of D and the D community? In what way?

May 23, 2017
On 22 May 2017 at 22:51, Johan Engelen via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
>>
>> http://dlang.org/blog/2017/05/22/introspection-introspection-everywhere/ -- Andrei
>
>
> A fun read!
>
> "(Late at night, I double checked. Mozilla’s CheckedInt is just as bad as I remembered. They do a division to test for multiplication overflow. Come on, put a line of assembler in there! Portability is worth a price, just not any price.)"
>
> Shocked: do you use assembly in Checked and cripple the optimizer?!?!
> Luckily, no. But LDC and GDC do create the `seto` instruction I think you
> were hinting at:
> https://godbolt.org/g/0jUhgs
>

So for LDC to be as good as GDC, you need to compile with -enable-ldc-amazing-feature-cross-module-inlining?

> (LDC doesn't do as well as it could,
> https://github.com/ldc-developers/ldc/issues/2131)
>

If you want a hint (though it's not my place to say), LLVM I'm told is a reasonably OK compiler, and any reasonably OK compiler should come with overflow intrinsics - try using them directly.

(Turning off mild morning sarcasm).

Iain.

May 23, 2017
On Tuesday, 23 May 2017 at 08:18:26 UTC, Iain Buclaw wrote:
> If you want a hint (though it's not my place to say), LLVM I'm told is a reasonably OK compiler, and any reasonably OK compiler should come with overflow intrinsics - try using them directly.

The intrinsics are exposed by LDC's druntime; they just need to be put to use:
https://github.com/ldc-developers/druntime/blob/ldc/src/ldc/intrinsics.di#L446-L468
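As a rough illustration of the suggestion above, here is a minimal sketch of calling the overflow intrinsic directly under LDC while falling back to core.checkedint elsewhere. The wrapper name `checkedMul` is hypothetical, and the intrinsic name `llvm_umul_with_overflow` and its `result`/`overflow` fields are as I understand them from the linked `ldc/intrinsics.di`; verify them against your LDC version before relying on this.

```d
import core.checkedint : mulu;

version (LDC)
{
    import ldc.intrinsics : llvm_umul_with_overflow;

    // LDC only: forwards to LLVM's llvm.umul.with.overflow intrinsic.
    // `checkedMul` is a hypothetical wrapper, not part of druntime.
    ulong checkedMul(ulong x, ulong y, ref bool overflow)
    {
        auto r = llvm_umul_with_overflow(x, y);
        if (r.overflow)
            overflow = true; // only ever set, never cleared
        return r.result;
    }
}
else
{
    // Other compilers: fall back to the portable druntime routine.
    ulong checkedMul(ulong x, ulong y, ref bool overflow)
    {
        return mulu(x, y, overflow);
    }
}

void main()
{
    bool overflow = false;
    assert(checkedMul(6, 7, overflow) == 42 && !overflow);

    checkedMul(ulong.max, 2, overflow);
    assert(overflow);
}
```

The `version (LDC)` split keeps the machine-specific path in one place, so callers see the same signature regardless of compiler.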

May 23, 2017
On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
> http://dlang.org/blog/2017/05/22/introspection-introspection-everywhere/ -- Andrei

Interesting read. You're my brother from another mother. :)

May 23, 2017
On 5/22/17 4:51 PM, Johan Engelen wrote:
> On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
>> http://dlang.org/blog/2017/05/22/introspection-introspection-everywhere/ -- Andrei
> 
> A fun read!
> 
> "(Late at night, I double checked. Mozilla’s CheckedInt is just as bad as I remembered. They do a division to test for multiplication overflow. Come on, put a line of assembler in there! Portability is worth a price, just not any price.)"
> 
> Shocked: do you use assembly in Checked and cripple the optimizer?!?!
> Luckily, no. But LDC and GDC do create the `seto` instruction I think you were hinting at:
> https://godbolt.org/g/0jUhgs
> 
> (LDC doesn't do as well as it could, https://github.com/ldc-developers/ldc/issues/2131)

Thanks! Yes, seto is what I thought of - one way or another, it gets down to using a bit of machine-specific code to get there. I'll note that dmd does not generate seto (why?): https://goo.gl/nRjNMy. -- Andrei

May 23, 2017
On Tuesday, 23 May 2017 at 13:27:42 UTC, Andrei Alexandrescu wrote:
> On 5/22/17 4:51 PM, Johan Engelen wrote:
>> On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
>>> [...]
>> 
>> A fun read!
>> 
>> "(Late at night, I double checked. Mozilla’s CheckedInt is just as bad as I remembered. They do a division to test for multiplication overflow. Come on, put a line of assembler in there! Portability is worth a price, just not any price.)"
>> 
>> Shocked: do you use assembly in Checked and cripple the optimizer?!?!
>> Luckily, no. But LDC and GDC do create the `seto` instruction I think you were hinting at:
>> https://godbolt.org/g/0jUhgs
>> 
>> (LDC doesn't do as well as it could, https://github.com/ldc-developers/ldc/issues/2131)
>
> Thanks! Yes, seto is what I thought of - one way or another, it gets down to using a bit of machine-specific code to get there. I'll note that dmd does not generate seto (why?): https://goo.gl/nRjNMy. -- Andrei

it does this
overflow_flag = 0
op
if (overflowed)
{
  overflow_flag = 1;
}

this can in some circumstances be faster than using seto!
If the inliner does a good enough job :)

May 23, 2017
On 05/23/2017 09:42 AM, Stefan Koch wrote:
> On Tuesday, 23 May 2017 at 13:27:42 UTC, Andrei Alexandrescu wrote:
>> On 5/22/17 4:51 PM, Johan Engelen wrote:
>>> On Monday, 22 May 2017 at 15:05:24 UTC, Andrei Alexandrescu wrote:
>>>> [...]
>>>
>>> A fun read!
>>>
>>> "(Late at night, I double checked. Mozilla’s CheckedInt is just as bad as I remembered. They do a division to test for multiplication overflow. Come on, put a line of assembler in there! Portability is worth a price, just not any price.)"
>>>
>>> Shocked: do you use assembly in Checked and cripple the optimizer?!?!
>>> Luckily, no. But LDC and GDC do create the `seto` instruction I think you were hinting at:
>>> https://godbolt.org/g/0jUhgs
>>>
>>> (LDC doesn't do as well as it could, https://github.com/ldc-developers/ldc/issues/2131)
>>
>> Thanks! Yes, seto is what I thought of - one way or another, it gets down to using a bit of machine-specific code to get there. I'll note that dmd does not generate seto (why?): https://goo.gl/nRjNMy. -- Andrei
> 
> it does this
> overflow_flag = 0
> op
> if (overflowed)
> {
>    overflow_flag = 1;
> }

Where did you see this pattern? Couldn't find it anywhere in core.checkedint. And how is "overflowed" tested?

> this can in some circumstances be faster than using seto!
> If the inliner does a good enough job :)

The code in core.checkedint is conservative:

pragma(inline, true)
ulong mulu(ulong x, ulong y, ref bool overflow)
{
    ulong r = x * y;
    if (x && (r / x) != y)
        overflow = true;
    return r;
}

The compiler is supposed to detect the pattern and generate optimal code.
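For readers unfamiliar with the calling convention, here is a small sketch of how the flag is typically used: `mulu` only ever sets `overflow` and never clears it, so one flag can be threaded through a chain of operations and checked once at the end. The helper `product` is hypothetical and not part of druntime.

```d
import core.checkedint : mulu;

// Multiply a sequence of factors and report overflow once at the end.
// `product` is a hypothetical helper used only for illustration.
ulong product(const(ulong)[] factors, out bool overflow)
{
    ulong acc = 1;
    foreach (f; factors)
        acc = mulu(acc, f, overflow); // sets the flag on overflow, never resets it
    return acc;
}

void main()
{
    bool overflow;
    assert(product([2UL, 3, 7], overflow) == 42 && !overflow);

    // The second multiplication wraps; the third does not, yet the
    // flag stays set because it is sticky.
    product([ulong.max, 2, 1], overflow);
    assert(overflow);
}
```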


Andrei

May 23, 2017
On Tuesday, 23 May 2017 at 15:19:39 UTC, Andrei Alexandrescu wrote:
> On 05/23/2017 09:42 AM, Stefan Koch wrote:
>> On Tuesday, 23 May 2017 at 13:27:42 UTC, Andrei Alexandrescu wrote:
>>> On 5/22/17 4:51 PM, Johan Engelen wrote:
>>>> [...]
>>>
>>> Thanks! Yes, seto is what I thought of - one way or another, it gets down to using a bit of machine-specific code to get there. I'll note that dmd does not generate seto (why?): https://goo.gl/nRjNMy. -- Andrei
>> 
>> it does this
>> overflow_flag = 0
>> op
>> if (overflowed)
>> {
>>    overflow_flag = 1;
>> }
>
> Where did you see this pattern? Couldn't find it anywhere in core.checkedint. And how is "overflowed" tested?
>
>> this can in some circumstances be faster than using seto!
>> If the inliner does a good enough job :)
>
> The code in core.checkedint is conservative:
>
> pragma(inline, true)
> ulong mulu(ulong x, ulong y, ref bool overflow)
> {
>     ulong r = x * y;
>     if (x && (r / x) != y)
>         overflow = true;
>     return r;
> }
>
> The compiler is supposed to detect the pattern and generate optimal code.
>
>
> Andrei

That code is written nowhere.
It was my hand translation of the asm.
(And it was wrong)

The compiler does indeed seem to optimize the code somewhat.
Although the generated asm still looks weird.
http://asm.dlang.org/#compilers:!((compiler:dmd_nightly,options:'-dip25+-O+-release+-inline+-m32',source:'import+core.checkedint%3B%0A%0Aalias+T+%3D+ulong%3B%0Aextern+(C)+T+foo(uint+x,+uint+y,+ref+bool+overflow)%0A%7B%0A+++return+mulu(x,+y,+overflow)%3B%0A%7D%0A')),filterAsm:(binary:!t,intel:!t),version:3

May 23, 2017
On 05/23/2017 11:37 AM, Stefan Koch wrote:
> 
> The compiler does indeed seem to optimize the code somewhat.
> Although the generated asm still looks weird.
> http://asm.dlang.org/#compilers:!((compiler:dmd_nightly,options:'-dip25+-O+-release+-inline+-m32',source:'import+core.checkedint%3B%0A%0Aalias+T+%3D+ulong%3B%0Aextern+(C)+T+foo(uint+x,+uint+y,+ref+bool+overflow)%0A%7B%0A+++return+mulu(x,+y,+overflow)%3B%0A%7D%0A')),filterAsm:(binary:!t,intel:!t),version:3 

That call enters a different overload:

pragma(inline, true)
uint mulu(uint x, uint y, ref bool overflow)
{
    ulong r = ulong(x) * ulong(y);
    if (r > uint.max)
        overflow = true;
    return cast(uint)r;
}

which is comparable in efficiency to code using seto. I'm not too worried about that. https://goo.gl/eRXUpr is of interest.
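To make the overload point concrete, here is a minimal sketch contrasting the two paths. With two `uint` arguments the call resolves to the `uint` overload, even when the result is stored in a `ulong`; widening the arguments first selects the `ulong` overload instead. The wrapper names `mulNarrow` and `mulWide` are hypothetical.

```d
import core.checkedint : mulu;

// Both arguments are uint, so this resolves to the uint overload and
// flags overflow as soon as the product exceeds uint.max.
ulong mulNarrow(uint x, uint y, ref bool overflow)
{
    return mulu(x, y, overflow);
}

// Widening first selects the ulong overload. The product of two uint
// values always fits in 64 bits, so the flag is never set here.
ulong mulWide(uint x, uint y, ref bool overflow)
{
    return mulu(ulong(x), ulong(y), overflow);
}

void main()
{
    bool narrow, wide;

    // 65536 * 65536 == 2^32, one past uint.max.
    assert(mulNarrow(0x1_0000, 0x1_0000, narrow) == 0 && narrow);
    assert(mulWide(0x1_0000, 0x1_0000, wide) == 0x1_0000_0000 && !wide);
}
```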


Andrei