Compiler optimizations
December 19, 2008
Does anyone know of a decent guide that has information on what types of optimizations compilers typically perform and what they aren't capable of performing automatically?  I figure it would be useful to know something like this because, when micro-optimizing performance-critical code, it's silly to do a low-level optimization that makes the code less readable if the compiler's already probably doing it.  On the other hand, it would be nice to know exactly what optimizations (besides the obvious stuff like high-level design and algorithm optimizations) the compiler can't possibly be sufficiently smart to do, so I can spend time looking for opportunities to do those.
December 19, 2008
I'd also like to know, just because I like to know :)


December 19, 2008
dsimcha wrote:
> Does anyone know of a decent guide that has information on what types of
> optimizations compilers typically perform and what they aren't capable of
> performing automatically?  I figure it would be useful to know something like
> this because, when micro-optimizing performance-critical code, it's silly to
> do a low-level optimization that makes the code less readable if the
> compiler's already probably doing it.  On the other hand, it would be nice to
> know exactly what optimizations (besides the obvious stuff like high-level
> design and algorithm optimizations) the compiler can't possibly be
> sufficiently smart to do, so I can spend time looking for opportunities to do
> those.

As always -- Agner Fog. www.agner.org. Chapter 1 of his optimisation manual lists the optimisations performed for Microsoft, Borland, Intel, GCC, DMC, Watcom, and Codeplay C++ compilers. DMD is basically the same as DMC, except that it's better at constant propagation.
December 19, 2008
Fri, 19 Dec 2008 03:56:37 +0000 (UTC), dsimcha wrote:

> Does anyone know of a decent guide that has information on what types of optimizations compilers typically perform and what they aren't capable of performing automatically?  I figure it would be useful to know something like this because, when micro-optimizing performance-critical code, it's silly to do a low-level optimization that makes the code less readable if the compiler's already probably doing it.  On the other hand, it would be nice to know exactly what optimizations (besides the obvious stuff like high-level design and algorithm optimizations) the compiler can't possibly be sufficiently smart to do, so I can spend time looking for opportunities to do those.

I'm afraid that your only option is to look at the assembly output and then trick the compiler into making it better.  Very subtle changes sometimes make a difference, and I doubt you can pick up this knowledge from any manual.
December 19, 2008
On Fri, Dec 19, 2008 at 8:16 PM, Sergey Gromov <snake.scaly@gmail.com> wrote:
> Fri, 19 Dec 2008 03:56:37 +0000 (UTC), dsimcha wrote:
>
>> Does anyone know of a decent guide that has information on what types of optimizations compilers typically perform and what they aren't capable of performing automatically?  I figure it would be useful to know something like this because, when micro-optimizing performance-critical code, it's silly to do a low-level optimization that makes the code less readable if the compiler's already probably doing it.  On the other hand, it would be nice to know exactly what optimizations (besides the obvious stuff like high-level design and algorithm optimizations) the compiler can't possibly be sufficiently smart to do, so I can spend time looking for opportunities to do those.
>
> I'm afraid that your only option is to look at the assembly output and then trick the compiler into making it better.  Very subtle changes sometimes make a difference, and I doubt you can pick up this knowledge from any manual.

Sure, but there are some things that aren't so subtle: things the optimizer probably gets right all the time, or others that it gets right none of the time.

For instance if I need to divide more than one number by the same
thing I usually do
   auto invx = 1.0/x;
   first *= invx;
   second *= invx;

It would be nice to know that I was wasting my time and could just write  first /= x; second /= x;  and it would be optimized to one divide.

Another thing I do a lot is worry about whether to introduce a new variable just to make the naming of a variable more accurate.  Like doing this:

    float sum = 0; foreach(x; nums) sum+=x;  float avg = sum / nums.length;

Instead of the slightly less explicit version, which potentially uses less stack:

    float avg = 0; foreach(x; nums) avg+=x;  avg/=nums.length;

I bet the compiler is smart enough to eliminate the extra variable if I don't use 'sum' afterwards, but I'm not sure.  Anyway, I don't think it makes enough difference to my program's speed to be worth investigating.  But it would still be nice to know whether, *in general*, that little micro-optimization is useless.

--bb
December 20, 2008
On Fri, 19 Dec 2008 03:48:44 -0800, Bill Baxter <wbaxter@gmail.com> wrote:

> On Fri, Dec 19, 2008 at 8:16 PM, Sergey Gromov <snake.scaly@gmail.com> wrote:
>> Fri, 19 Dec 2008 03:56:37 +0000 (UTC), dsimcha wrote:
>>
>>> Does anyone know of a decent guide that has information on what types of
>>> optimizations compilers typically perform and what they aren't capable of
>>> performing automatically?  I figure it would be useful to know something like
>>> this because, when micro-optimizing performance-critical code, it's silly to
>>> do a low-level optimization that makes the code less readable if the
>>> compiler's already probably doing it.  On the other hand, it would be nice to
>>> know exactly what optimizations (besides the obvious stuff like high-level
>>> design and algorithm optimizations) the compiler can't possibly be
>>> sufficiently smart to do, so I can spend time looking for opportunities to do
>>> those.
>>
>> I'm afraid that your only option is to look at the assembly output and
>> then trick the compiler into making it better.  Very subtle changes
>> sometimes make a difference, and I doubt you can pick up this knowledge
>> from any manual.
>
> Sure but there are some things that aren't so subtle that the
> optimizer probably gets right all the time, or some that it gets right
> none of the time.
>
> For instance if I need to divide more than one number by the same
> thing I usually do
>    auto invx = 1.0/x;
>    first *= invx;
>    second *= invx;
>
> It would be nice to know that I was wasting my time and could just
> write  first /= x; second /= x;  and it would be optimized to one
> divide.

I haven't seen a compiler do that optimization yet, and DMD 1.038 doesn't:
foreach(ref val; array)  val /= x;
is ten times slower than
foreach(ref val; array)  val *= invx;
on a 100,000 element array.
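
A minimal sketch of one way to time the two loops, written against present-day D (std.datetime.stopwatch) rather than the D1 Phobos of 2008; the fill value and repeat count are arbitrary choices for illustration:

    import std.datetime.stopwatch : AutoStart, StopWatch;
    import std.stdio : writefln;

    void main()
    {
        enum N = 100_000;
        auto array = new double[N];
        array[] = 3.14159;
        immutable double x = 7.0;
        immutable double invx = 1.0 / x;

        auto sw = StopWatch(AutoStart.yes);
        foreach (_; 0 .. 1_000)
            foreach (ref val; array) val /= x;     // divide inside the loop
        immutable divUsecs = sw.peek.total!"usecs";

        array[] = 3.14159;                         // same starting data for both runs
        sw.reset();
        foreach (_; 0 .. 1_000)
            foreach (ref val; array) val *= invx;  // multiply by the precomputed reciprocal
        immutable mulUsecs = sw.peek.total!"usecs";

        writefln("divide: %s us, multiply: %s us", divUsecs, mulUsecs);
    }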

> Another thing I do a lot is worry about whether to introduce a new
> variable just to make the naming of a variable more accurate.  Like
> doing this:
>
>     float sum = 0; foreach(x; nums) sum+=x;  float avg = sum / nums.length;
>
> Instead of the slightly less explicit, but potentially less stack using:
>
>     float avg = 0; foreach(x; nums) avg+=x;  avg/=nums.length;

> I bet the compiler is smart enough to eliminate 'avg' if I don't use
> 'sum' afterwards, but I'm not sure.

Most compilers do this with single variables, but if avg were a struct then the latter is better than the former on at least one compiler I've used (nvcc, an Open64 derivative), and the difference matters on GPUs at least.

> Anyway I don't think it makes
> enough difference to my program's speed to be worth investigating.
> But still it would be nice to know whether or not *in general* that
> little micro-optimization is useless or not.
>
> --bb

My general rule of thumb is that the optimizer is pretty good within a single code block and less so across blocks. And there's really no substitute for profiling and testing the hand-optimization to see if it's actually worthwhile.

December 20, 2008
Robert Jacques wrote:
> I haven't seen a compiler do that optimization yet, and DMD 1.038 doesn't:
> foreach(ref val; array)  val /= x;
> is ten times slower than
> foreach(ref val; array)  val *= invx;
> on a 100,000 element array.

The reason the optimizer doesn't replace floating divides with floating multiplies of the reciprocal is that one gets slightly different answers (due to roundoff).
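
A tiny sketch (not from the thread) that makes the rounding difference visible: it compares v / x against v * (1.0 / x) bit-for-bit over a range of values; the particular divisor and range are arbitrary.

    import std.stdio : writefln;

    void main()
    {
        immutable double x = 49.0;
        immutable double invx = 1.0 / x;   // already rounded, so not exactly 1/49
        int mismatches = 0;
        foreach (i; 1 .. 1_000)
        {
            immutable double v = i;
            if (v / x != v * invx)         // do the two strategies agree to the last bit?
                ++mismatches;
        }
        writefln("%s of 999 values round differently", mismatches);
    }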

> My general rule of thumb is than the optimizer is pretty good within a single code block and less so across blocks. And there's really no substitute for profiling and testing the user-optimization to see if it's actually worthwhile.

It's also straightforward to run obj2asm on the output and see what it's doing.
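
For instance, on Windows with the Digital Mars tools on the path (foo.d is just a placeholder name), something along these lines:

    dmd -c -O -release -inline foo.d
    obj2asm foo.obj > foo.asm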
December 20, 2008
Walter Bright:
> The reason the optimizer doesn't replace floating divides with floating multiplies of the reciprocal is that one gets slightly different answers (due to roundoff).

Right, because FP numbers are just approximations of the field of real numbers. GCC has command-line flags (like -ffast-math, -funsafe-math-optimizations and others) that enable similar unsafe optimizations.

Bye,
bearophile
December 20, 2008
Don wrote:
> dsimcha wrote:
>> Does anyone know of a decent guide that has information on what types of
>> optimizations compilers typically perform and what they aren't capable of
>> performing automatically?  I figure it would be useful to know something like
>> this because, when micro-optimizing performance-critical code, it's silly to
>> do a low-level optimization that makes the code less readable if the
>> compiler's already probably doing it.  On the other hand, it would be nice to
>> know exactly what optimizations (besides the obvious stuff like high-level
>> design and algorithm optimizations) the compiler can't possibly be
>> sufficiently smart to do, so I can spend time looking for opportunities to do
>> those.
> 
> As always -- Agner Fog. www.agner.org. Chapter 1 of his optimisation manual lists the optimisations performed for Microsoft, Borland, Intel, GCC, DMC, Watcom, and Codeplay C++ compilers. DMD is basically the same as DMC, except that it's better at constant propagation.

Don

Thanks for the great link. "Optimizing software in C++" is an interesting read. It even mentions D.

Nick B.
December 21, 2008
Nick B wrote:
> Don wrote:
>> dsimcha wrote:
>>> Does anyone know of a decent guide that has information on what types of
>>> optimizations compilers typically perform and what they aren't capable of
>>> performing automatically?
>>
>> As always -- Agner Fog. www.agner.org. Chapter 1 of his optimisation manual lists the optimisations performed for Microsoft, Borland, Intel, GCC, DMC, Watcom, and Codeplay C++ compilers. DMD is basically the same as DMC, except that it's better at constant propagation.
> 
> Don
> 
> Thanks for the great link. "Optimizing software in C++" is an interesting read. It even mentions D.

>
> Nick B.

I didn't know that. I've been trying to get Agner interested in D for a while now. Not a bad quote:

"Another alternative worth considering is the D language. D has many of the features of Java and C# and avoids many of the drawbacks of C++. Yet, D is compiled to binary code and can be linked together with C or C++ code. Compilers and IDE’s for D are not yet as well developed as C++ compilers."


Don.