Compiler optimizations - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Compiler optimizations

Thread overview

Compiler optimizations

Apr 30, 2006

Apr 30, 2006

dennis luehring

Apr 30, 2006

Apr 30, 2006

dennis luehring

Apr 30, 2006

Apr 30, 2006

Apr 30, 2006

May 02, 2006

Re: Compiler optimizations (I'm baffled)
May 03, 2006 Bruno Medeiros
May 03, 2006 Walter Bright
May 03, 2006 Thomas Kuehne
May 04, 2006 Bruno Medeiros
May 04, 2006 Don Clugston
May 04, 2006 pmoore
May 04, 2006 pmoore
May 04, 2006 Don Clugston
May 03, 2006 Dave
May 03, 2006 Dave

Apr 30, 2006

dennis luehring

Apr 30, 2006

Apr 30, 2006

dennis luehring

Apr 30, 2006

dennis luehring

Apr 30, 2006

Apr 30, 2006

Apr 30, 2006

Apr 30, 2006

Apr 30, 2006

Apr 30, 2006

dennis luehring

Apr 30, 2006

May 01, 2006

dennis luehring

May 01, 2006

May 03, 2006

May 04, 2006

May 03, 2006

May 02, 2006

April 30, 2006

Compiler optimizations

Posted by Craig Black

Craig Black

So I've been playing around with the pi program.  I ported it to C++ and I've been doing some benchmarks with MSVC.  It seems that MSVC is a little faster when SSE2 instructions are enabled.  It is slower when they are disabled.  Which leads me to a question.  Does the D backend make use of SSE 2 intstructions?  I think the answer is yes but perhaps the optimizer is not quite as sophisticated as MSVC yet.

Another question for someone who knows a lot about backend optimizations. I've noticed that integer division is faster when the the divisor is a constant.  How does the compiler optimize this?

Another thing.  I know that integer division is essentially a floating point operation because there is no integer division instruction.  I also read some on the internet and learned that Intel has a library that provides fixed-point math routines, which provides a division operation that outperforms floating-point division.  Then I thought to myself, "Why don't compilers just use fixed-point division for dividing integers since it's faster?"

I know some of you out there know way more than me so please all you smart people give me some input.

-Craig

April 30, 2006

Re: Compiler optimizations

Posted by dennis luehring
in reply to Craig Black

dennis luehring

Posted in reply to Craig Black

> Another thing.  I know that integer division is essentially a floating point
> operation because there is no integer division instruction.

and what is "idiv" for you?
(http://www.jegerlehner.ch/intel/opcode.html)

> I also read
> some on the internet and learned that Intel has a library that provides
> fixed-point math routines, which provides a division operation that
> outperforms floating-point division.  Then I thought to myself, "Why don't
> compilers just use fixed-point division for dividing integers since it's
> faster?"

a fixpoint floating somethimes outperforms an floating-point operating (with loss of precession)

but it never outperforms an true integer division

ciao dennis

April 30, 2006

Re: Compiler optimizations

Posted by Craig Black
in reply to dennis luehring

Craig Black

Posted in reply to dennis luehring

"dennis luehring" <dl.soluz@gmx.net> wrote in message news:e32c0m$1dsu$1@digitaldaemon.com...
> > Another thing.  I know that integer division is essentially a floating
point
> > operation because there is no integer division instruction.
>
> and what is "idiv" for you?
> (http://www.jegerlehner.ch/intel/opcode.html)

Please anyone, correct me if I'm wrong, but I think IDIV is a pseudo-instruction.  80x86 is notorious for those (as opposed to RISC).  I read on the Intel website that intel processors do not have a hardware integer division instruction.  Apparently it is somehow better to use the floating point instruction instead.

The benchmarks that I've done are constistent with this.  I found it was faster to perform a floating-point multiplication on an integer and then convert it back to an integer than just a single integer division.

>
> > I also read
> > some on the internet and learned that Intel has a library that provides
> > fixed-point math routines, which provides a division operation that
> > outperforms floating-point division.  Then I thought to myself, "Why
don't
> > compilers just use fixed-point division for dividing integers since it's faster?"
>
> a fixpoint floating somethimes outperforms an floating-point operating
> (with loss of precession)
>
> but it never outperforms an true integer division
>
> ciao dennis

Have you run benchmarks?

-Craig

April 30, 2006

Re: Compiler optimizations

Posted by dennis luehring
in reply to Craig Black

dennis luehring

Posted in reply to Craig Black

> Please anyone, correct me if I'm wrong, but I think IDIV is a
> pseudo-instruction.  80x86 is notorious for those (as opposed to RISC).  I
> read on the Intel website that intel processors do not have a hardware
> integer division instruction.  Apparently it is somehow better to use the
> floating point instruction instead.

i don't know the truth - but why should all the compiler vendors around us (microsoft, intel, ...) produce the slower code if there is an faster solution? what is the reason (maybe it produce lager code - it this will produce more cache problems...)

> The benchmarks that I've done are constistent with this.  I found it was
> faster to perform a floating-point multiplication on an integer and then
> convert it back to an integer than just a single integer division.

can you post your benchmark source?

> Have you run benchmarks?

no - but i can run your benchmark...

April 30, 2006

Re: Compiler optimizations

Posted by Craig Black
in reply to dennis luehring

Craig Black

Posted in reply to dennis luehring

"dennis luehring" <dl.soluz@gmx.net> wrote in message news:e32dfl$1f53$1@digitaldaemon.com...
> > Please anyone, correct me if I'm wrong, but I think IDIV is a pseudo-instruction.  80x86 is notorious for those (as opposed to RISC).
I
> > read on the Intel website that intel processors do not have a hardware integer division instruction.  Apparently it is somehow better to use
the
> > floating point instruction instead.
>
> i don't know the truth - but why should all the compiler vendors around us (microsoft, intel, ...) produce the slower code if there is an faster solution? what is the reason (maybe it produce lager code - it this will produce more cache problems...)

I'm not saying that there is a faster solution, but there might be.  Mainly, I'm just trying to learn more so that I will be better at optimizing my code.

> > The benchmarks that I've done are constistent with this.  I found it was faster to perform a floating-point multiplication on an integer and then convert it back to an integer than just a single integer division.
>
> can you post your benchmark source?
>
> > Have you run benchmarks?
>
> no - but i can run your benchmark...

These two functions show the difference that I'm talking about.  When benchmarking, pass in the same values for the divisor (pick a value between 2 and 30), and pass in a large enough value for total so that it will take some time for the CPU to finish.  Suprisingly, the second one is faster.

-Craig

int divTest1(int divisor, int total)
{
  int sum = 0;
  for(int i = 0; i < total; i++)
  {
    int quotient = i / divisor;
    sum += quotient;
  }
}

int divTest2(int divisor, int total)
{
  int sum = 0;
  double idiv = 1.0 / divisor;
  for(int i = 0; i < total; i++)
  {
    int quotient = i * idiv;
    sum += quotient;
  }
}

April 30, 2006

Re: Compiler optimizations

Posted by James Pelcis
in reply to Craig Black

James Pelcis

Posted in reply to Craig Black

This doesn't really surprise me.  If I am reading it correctly, the major difference comes from only doing the division once.  If you did the division every time, the results might not be the same.

Craig Black wrote:
> These two functions show the difference that I'm talking about.  When benchmarking, pass in the same values for the divisor (pick a value between 2 and 30), and pass in a large enough value for total so that it will take some time for the CPU to finish.  Suprisingly, the second one is faster.
> 
> -Craig
> 
> int divTest1(int divisor, int total)
> {
>   int sum = 0;
>   for(int i = 0; i < total; i)
>   {
>     int quotient = i / divisor;
>     sum = quotient;
>   }
> }
> 
> int divTest2(int divisor, int total)
> {
>   int sum = 0;
>   double idiv = 1.0 / divisor;
>   for(int i = 0; i < total; i)
>   {
>     int quotient = i * idiv;
>     sum = quotient;
>   }
> }

April 30, 2006

Re: Compiler optimizations

Posted by dennis luehring
in reply to Craig Black

dennis luehring

Posted in reply to Craig Black

small change

> int divTest2(int divisor, int total)
> {
>   int sum = 0;
>   for(int i = 0; i < total; i++)
>   {
>     int quotient = i * ( 1.0 / divisor ); // !!!!!!!!
>     sum += quotient;
>   }
> }

April 30, 2006

Re: Compiler optimizations

Posted by Craig Black
in reply to dennis luehring

Craig Black

Posted in reply to dennis luehring

>
> > int divTest2(int divisor, int total)
> > {
> >   int sum = 0;
> >   for(int i = 0; i < total; i++)
> >   {
> >     int quotient = i * ( 1.0 / divisor ); // !!!!!!!!
> >     sum += quotient;
> >   }
> > }

I don't see what you are trying to say here.

-Craig

April 30, 2006

Re: Compiler optimizations

Posted by Craig Black
in reply to James Pelcis

Craig Black

Posted in reply to James Pelcis

> This doesn't really surprise me.  If I am reading it correctly, the major difference comes from only doing the division once.  If you did the division every time, the results might not be the same.

I don't think you realize what is happening here ... we are dealing with integer division vs. floating point multiplication.  With the floating point multiplication we have

1) conversion from int to a floating point
2) the floating point multiplication
3) convertion from floating point back to int

Although CPUs have gotten faster at converting to and from floating points, the conversion is not trivial to performance.  What's suprising is that these three operations are faster than a singe integer division.  This is because integer division is essentially floating point division under the hood.

-Craig

April 30, 2006

Re: Compiler optimizations

Posted by Craig Black
in reply to dennis luehring

Craig Black

Posted in reply to dennis luehring

If you want to get a direct comparison between floating point and integer division, then use

> > int divTest2(double divisor, int total)
> > {
> >   int sum = 0;
> >   for(int i = 0; i < total; i++)
> >   {
> >     int quotient = i / divisor; // !!!!!!!!
> >     sum += quotient;
> >   }
> > }

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation