April 23, 2015 [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Why can no compiler I try optimise this toy example as I would expect? // uncomment if using a C compiler // typedef unsigned int uint; uint foo(uint a) { if (a < 5) return (a * 3) / 3; else return 0; } So, I would expect the compiler to be able to see that it is equivalent to uint foo(uint a) { return (a < 5) ? a : 0; } But apparently not a single modern compiler I tried can do this optimisation, unless it's hidden in some obscure flag I'm not aware of. An even more striking example can be found if you replace the / with %, where the result of the function is then unconditionally zero, but every compiler i tried still spat out multiplication instructions. Is there a good reason for this, or is it just " * and / aren't always inverses, so never mind all the cases where they are"? Now I know that this seems like a unrealistic example, but when you're in complicated meta-programming situations code like this can and will appear. |
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
> Why can no compiler I try optimise this toy example as I would expect?
>
> // uncomment if using a C compiler
> // typedef unsigned int uint;
> uint foo(uint a)
> {
> if (a < 5)
> return (a * 3) / 3;
> else
> return 0;
> }
>
> So, I would expect the compiler to be able to see that it is equivalent to
>
> uint foo(uint a)
> {
> return (a < 5) ? a : 0;
> }
>
> But apparently not a single modern compiler I tried can do this optimisation, unless it's hidden in some obscure flag I'm not aware of.
>
I think because of the potential overflow in a * 3 (if we ignore the a < 5 condition). To optimize this, a compiler must figure out that there is no overflow for any a < 5.
If you change the condition to a > 5 and call foo(uint.max), the two expression above are not equivalent.
int foo(uint a)
{
if (a > 5)
return (a * 3) / 3;
else
return 0;
}
int foo_optimized(uint a)
{
return (a > 5) ? a : 0;
}
assert(foo(uint.max) == foo_optimized(uint.max)) // -> fail.
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to rumbu | On Thursday, 23 April 2015 at 08:56:38 UTC, rumbu wrote:
> On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
>> Why can no compiler I try optimise this toy example as I would expect?
>>
>> // uncomment if using a C compiler
>> // typedef unsigned int uint;
>> uint foo(uint a)
>> {
>> if (a < 5)
>> return (a * 3) / 3;
>> else
>> return 0;
>> }
>>
>> So, I would expect the compiler to be able to see that it is equivalent to
>>
>> uint foo(uint a)
>> {
>> return (a < 5) ? a : 0;
>> }
>>
>> But apparently not a single modern compiler I tried can do this optimisation, unless it's hidden in some obscure flag I'm not aware of.
>>
>
> I think because of the potential overflow in a * 3 (if we ignore the a < 5 condition). To optimize this, a compiler must figure out that there is no overflow for any a < 5.
>
> If you change the condition to a > 5 and call foo(uint.max), the two expression above are not equivalent.
>
> int foo(uint a)
> {
> if (a > 5)
> return (a * 3) / 3;
> else
> return 0;
> }
>
> int foo_optimized(uint a)
> {
> return (a > 5) ? a : 0;
> }
>
> assert(foo(uint.max) == foo_optimized(uint.max)) // -> fail.
Yes, the three things making * and / not inverses of each other on unsigned integer types are:
truncation of integer division
div by 0
overflow
But it still amazes me that the compiler can't use the a < 5 condition here. I regularly watch compilers do what appear to be magical transformations and simplifications to my code, but here they are tripping up on some very simple maths.
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
> Why can no compiler I try optimise this toy example as I would expect?
>
> // uncomment if using a C compiler
> // typedef unsigned int uint;
> uint foo(uint a)
> {
> if (a < 5)
> return (a * 3) / 3;
> else
> return 0;
> }
>
> So, I would expect the compiler to be able to see that it is equivalent to
>
> uint foo(uint a)
> {
> return (a < 5) ? a : 0;
> }
>
> But apparently not a single modern compiler I tried can do this optimisation, unless it's hidden in some obscure flag I'm not aware of.
>
> An even more striking example can be found if you replace the / with %, where the result of the function is then unconditionally zero, but every compiler i tried still spat out multiplication instructions.
>
> Is there a good reason for this, or is it just " * and / aren't always inverses, so never mind all the cases where they are"?
>
> Now I know that this seems like a unrealistic example, but when you're in complicated meta-programming situations code like this can and will appear.
If I'm right, there's a website where I can see assembly generated by d compiler. And it's not dpaste... any hint?
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrea Fontana | On 23/04/2015 10:02 p.m., Andrea Fontana wrote:
> On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
>> Why can no compiler I try optimise this toy example as I would expect?
>>
>> // uncomment if using a C compiler
>> // typedef unsigned int uint;
>> uint foo(uint a)
>> {
>> if (a < 5)
>> return (a * 3) / 3;
>> else
>> return 0;
>> }
>>
>> So, I would expect the compiler to be able to see that it is
>> equivalent to
>>
>> uint foo(uint a)
>> {
>> return (a < 5) ? a : 0;
>> }
>>
>> But apparently not a single modern compiler I tried can do this
>> optimisation, unless it's hidden in some obscure flag I'm not aware of.
>>
>> An even more striking example can be found if you replace the / with
>> %, where the result of the function is then unconditionally zero, but
>> every compiler i tried still spat out multiplication instructions.
>>
>> Is there a good reason for this, or is it just " * and / aren't always
>> inverses, so never mind all the cases where they are"?
>>
>> Now I know that this seems like a unrealistic example, but when you're
>> in complicated meta-programming situations code like this can and will
>> appear.
>
> If I'm right, there's a website where I can see assembly generated by d
> compiler. And it's not dpaste... any hint?
asm.dlang.org
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rikki Cattermole | On Thursday, 23 April 2015 at 10:04:47 UTC, Rikki Cattermole wrote:
> On 23/04/2015 10:02 p.m., Andrea Fontana wrote:
>> On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
>>> Why can no compiler I try optimise this toy example as I would expect?
>>>
>>> // uncomment if using a C compiler
>>> // typedef unsigned int uint;
>>> uint foo(uint a)
>>> {
>>> if (a < 5)
>>> return (a * 3) / 3;
>>> else
>>> return 0;
>>> }
>>>
>>> So, I would expect the compiler to be able to see that it is
>>> equivalent to
>>>
>>> uint foo(uint a)
>>> {
>>> return (a < 5) ? a : 0;
>>> }
>>>
>>> But apparently not a single modern compiler I tried can do this
>>> optimisation, unless it's hidden in some obscure flag I'm not aware of.
>>>
>>> An even more striking example can be found if you replace the / with
>>> %, where the result of the function is then unconditionally zero, but
>>> every compiler i tried still spat out multiplication instructions.
>>>
>>> Is there a good reason for this, or is it just " * and / aren't always
>>> inverses, so never mind all the cases where they are"?
>>>
>>> Now I know that this seems like a unrealistic example, but when you're
>>> in complicated meta-programming situations code like this can and will
>>> appear.
>>
>> If I'm right, there's a website where I can see assembly generated by d
>> compiler. And it's not dpaste... any hint?
>
> asm.dlang.org
and d.godbolt.org
This isn't a D-specific question though, so gcc.godbolt.org would allow you to test a wider range of backends.
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Thursday, 23 April 2015 at 10:08:24 UTC, John Colvin wrote:
>> asm.dlang.org
>
> and d.godbolt.org
>
> This isn't a D-specific question though, so gcc.godbolt.org would allow you to test a wider range of backends.
I was wondering if compilers can optimize this:
uint foo3(uint a)
{
return a*!(a/5);
}
That actually gives the same results.
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to John Colvin | On Thursday, 23 April 2015 at 08:33:56 UTC, John Colvin wrote:
> Why can no compiler I try optimise this toy example as I would expect?
>
> // uncomment if using a C compiler
> // typedef unsigned int uint;
> uint foo(uint a)
> {
> if (a < 5)
> return (a * 3) / 3;
> else
> return 0;
> }
It is interesting to see how things do get optimized when e.g.
changing the multiply * into a plus +.
|
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to rumbu | On Thursday, 23 April 2015 at 08:56:38 UTC, rumbu wrote: > I think because of the potential overflow in a * 3 (if we ignore the a < 5 condition). To optimize this, a compiler must figure out that there is no overflow for any a < 5. Yes, it is because of modular artithmetics which is a D design flaw. In C++ this only applies to unsigned integers, signed integers are monothonic in C++. I think Rust uses non-modular for both and Ada allows you to specify it. Compiled using ICC: int foo(int a) { if (a > 5) return (a * 3) / 3; else return 0; } yields: xorl %edx, %edx cmpl $5, %edi cmovle %edx, %edi movl %edi, %eax ret --------------------------------- int foo(unsigned int a) { if (a > 5) return (a * 3) / 3; else return 0; } yields: cmpl $5, %edi jbe ..B1.3 movl $-1431655765, %eax lea (%rdi,%rdi,2), %ecx mull %ecx shrl $1, %edx movl %edx, %eax ret ..B1.3: xorl %eax, %eax ret |
April 23, 2015 Re: [OT] compiler optimisations | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ola Fosheim Grøstad | Interestingly only clang understands that an upcast will prevent the overflow: int foo(unsigned int a) { if (a > 5) return ((unsigned long long)a * 3) / 3; else return 0; } Is compiled to compact code in clang: xorl %eax, %eax cmpl $5, %edi cmoval %edi, %eax retq |
Copyright © 1999-2021 by the D Language Foundation