February 08, 2002
"Juan Carlos Arevalo Baeza" <jcab@roningames.com> wrote in message news:a419vp$2kol$1@digitaldaemon.com...

>    The suffix specifies the precision of the return type, which cannot be
> overloaded on. Or am I wrong?

Return type can be determined by the argument:

    float sqrt(float);
    double sqrt(double);
    extended sqrt(extended);
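
As a rough illustration of the declarations above, here is the same idea sketched in C++, which resolves overloads by argument type in a similar way. This is only a sketch: `long double` stands in for D's `extended`, and the name `root` is hypothetical, picked so it does not clash with the standard library.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical overload set; `long double` stands in for D's `extended`.
inline float       root(float x)       { return std::sqrt(x); }
inline double      root(double x)      { return std::sqrt(x); }
inline long double root(long double x) { return std::sqrt(x); }

// The argument type selects the overload, and with it the return type.
static_assert(std::is_same<decltype(root(2.0f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(root(2.0)), double>::value,
              "double in, double out");
static_assert(std::is_same<decltype(root(2.0L)), long double>::value,
              "long double in, long double out");

// Runtime sanity check: the square root of 4 is exact in every precision.
inline void check_roots() {
    assert(root(4.0f) == 2.0f);
    assert(root(4.0) == 2.0);
    assert(root(4.0L) == 2.0L);
}
```

The caller never names a precision; writing `root(x)` with a `float` argument yields a `float` result, so no suffix such as `sqrtf` is needed.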


February 08, 2002
On Intel processors, the float and double math computations are not one iota faster than the extended ones. The ONLY reasons to use float and double are:

1) compatibility with C
2) large arrays will use less space

"Sean L. Palmer" <spalmer@iname.com> wrote in message news:a40c1s$1lfi$1@digitaldaemon.com...
> Actually can we have some functions like sin, cos, tan, and sqrt that deal with float instead of double?  In the world of games, speed is usually more important than accuracy and I hate having to explicitly typecast back to float to avoid warnings.
>
> Another nice thing to have is reciprocal square root (most processors have
> this nowadays...) usually it's cheaper (and less accurate) than 1/sqrt(x)
>
> Sean
>
> "John Fletcher" <J.P.Fletcher@aston.ac.uk> wrote in message news:3C6268D1.C1E23080@aston.ac.uk...
> > At the moment functions like sqrt() use the underlying C functions in double precision. Is there any way to have versions which work to extended precision?
> >
> > John
>
>
>


February 08, 2002
"Pavel Minayev" <evilone@omen.ru> wrote in message news:a40jcl$1oso$1@digitaldaemon.com...
> Yes, AFAIK Intel FPUs do calculations in full precision anyhow. However, extended arguments have to be passed on the stack, and since they're 10 bytes long, you get three PUSHes (while float would only take one).

It usually just subtracts 12 from ESP and does an FST. With scheduling and pipelining, the extra instruction frequently takes no extra time.


February 08, 2002
"Sean L. Palmer" <spalmer@iname.com> wrote in message news:a4192g$2ihc$1@digitaldaemon.com...
> I believe the common form of this stuff is to add "f" to the end of the name
>
> sqrtf
> fabsf
> fmodf


Since D supports overloading by argument type, that is not necessary.


February 09, 2002
True true.

Sean

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a419ev$2kdg$1@digitaldaemon.com...
> "Sean L. Palmer" <spalmer@iname.com> wrote in message news:a4192g$2ihc$1@digitaldaemon.com...
>
> > I believe the common form of this stuff is to add "f" to the end of the name
> >
> > sqrtf
> > fabsf
> > fmodf
>
> Why, if we have function overloading?



February 09, 2002
That's not true... but you have to set the CPU into low precision mode to see the speed advantages.  Otherwise it internally works with double precision by default.

In game scenarios, we can't just go around wasting 8 bytes per number when 4 bytes will do.  And it depends on the processor, as well.

Floats are still definitely faster.  For instance the P4 can handle 2 doubles per instruction, but can do 4 floats in the same amount of time.
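
To make the "2 doubles or 4 floats" point concrete, here is a minimal sketch using the SSE/SSE2 intrinsics from `<emmintrin.h>`. It only shows how many values fit in one 128-bit operation; it makes no claim about timing on any particular CPU. The helper names `add4` and `add2` and the macro `SKETCH_HAVE_SSE2` are hypothetical, and a plain loop is used where SSE2 is not available.

```cpp
#include <cassert>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE ones)
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Adds four floats at once: one 128-bit register holds 4 x 32 bits.
inline void add4(const float* a, const float* b, float* out) {
#if SKETCH_HAVE_SSE2
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];  // scalar fallback
#endif
}

// Adds two doubles at once: the same register holds only 2 x 64 bits.
inline void add2(const double* a, const double* b, double* out) {
#if SKETCH_HAVE_SSE2
    _mm_storeu_pd(out, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
#else
    for (int i = 0; i < 2; ++i) out[i] = a[i] + b[i];  // scalar fallback
#endif
}

// Runtime sanity check on small integers, which add exactly.
inline void check_packed_adds() {
    const float  fa[4] = {1.0f, 2.0f, 3.0f, 4.0f}, fb[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    const double da[2] = {1.0, 2.0}, db[2] = {1.0, 1.0};
    float fr[4];
    double dr[2];
    add4(fa, fb, fr);
    add2(da, db, dr);
    assert(fr[0] == 2.0f && fr[3] == 5.0f);
    assert(dr[0] == 2.0 && dr[1] == 3.0);
}
```

Processing eight values takes two calls to `add4` but four calls to `add2`, which is the width difference being described.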

Sean

"Walter" <walter@digitalmars.com> wrote in message news:a41oen$2se5$4@digitaldaemon.com...
> On Intel processors, the float and double math computations are not one iota faster than the extended ones. The ONLY reasons to use float and double are:
>
> 1) compatibility with C
> 2) large arrays will use less space



February 09, 2002
Hmm. I didn't know that. -Walter

"Sean L. Palmer" <spalmer@iname.com> wrote in message news:a42k1f$6m1$1@digitaldaemon.com...
> That's not true... but you have to set the CPU into low precision mode to see the speed advantages.  Otherwise it internally works with double precision by default.
>
> In game scenarios, we can't just go around wasting 8 bytes per number when 4 bytes will do.  And it depends on the processor, as well.
>
> Floats are still definitely faster.  For instance the P4 can handle 2 doubles per instruction, but can do 4 floats in the same amount of time.
>
> Sean
>
> "Walter" <walter@digitalmars.com> wrote in message news:a41oen$2se5$4@digitaldaemon.com...
> > On Intel processors, the float and double math computations are not one iota faster than the extended ones. The ONLY reasons to use float and double are:
> >
> > 1) compatibility with C
> > 2) large arrays will use less space
>
>
>


February 09, 2002
Walter wrote:

> On Intel processors, the float and double math computations are not one iota
> faster than the extended ones. The ONLY reasons to use float and double are:
> 
> 1) compatibility with C
> 2) large arrays will use less space
> 


As an extension of item 2, note that in the FPU, they're not
one iota faster, but getting thousands of floats into and out
of level-1 cache is much faster than doubles or extendeds.

That's the main reason that 3D graphics and high-end audio
applications, today, use floats instead of the fatter
formats.
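
The cache argument comes down to element size, which can be sketched as follows. The 16 KB cache size is an assumption made purely for the arithmetic (real L1 data caches differ from part to part), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed L1 data cache size, for illustration only; real CPUs differ.
constexpr std::size_t kAssumedL1Bytes = 16 * 1024;

// Bytes occupied by `count` elements of type T.
template <typename T>
constexpr std::size_t footprint(std::size_t count) {
    return count * sizeof(T);
}

// How many elements of type T fit in the assumed cache.
template <typename T>
constexpr std::size_t elements_in_l1() {
    return kAssumedL1Bytes / sizeof(T);
}

// Runtime sanity check: the narrower type never fits fewer elements.
inline void check_footprints() {
    assert(elements_in_l1<float>() >= elements_in_l1<double>());
    assert(elements_in_l1<double>() >= elements_in_l1<long double>());
}
```

With a 4-byte `float` and an 8-byte `double`, twice as many floats as doubles stay resident in the same cache, so a loop over a large array touches half as many cache lines.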


-RB

February 09, 2002
Here is a sample from the MSDN docs for VC++ 6.0 that illustrates this. You can do timings yourself if you wish. It only affects the x87 FPU, and I haven't tried this in a few years, so newer Pentium 4 processors may not see much advantage from it. With SSE2, however, it is still true that one instruction can process either 2 doubles or 4 floats.

I believe the main advantage this provides is sparing the FPU some of the work in complex calculations like division, square root, and trig. Fewer bits of precision need to be computed, so it can get away with fewer iterations, cheaper approximations, fewer terms in the Taylor series, etc.

In a lot of cases 5 or 6 digits of precision is all we need.  So don't get rid of the float type yet.  ;)

Sean

/* CNTRL87.C: This program uses _control87 to output the control
 * word, set the precision to 24 bits, and reset the status to
 * the default.
 */

#include <stdio.h>
#include <float.h>

int main( void )
{
   double a = 0.1;

   /* Show original control word and do calculation. */
   printf( "Original: 0x%.4x\n", _control87( 0, 0 ) );
   printf( "%1.1f * %1.1f = %.15e\n", a, a, a * a );

   /* Set precision to 24 bits and recalculate. */
   printf( "24-bit:   0x%.4x\n", _control87( _PC_24, _MCW_PC ) );
   printf( "%1.1f * %1.1f = %.15e\n", a, a, a * a );

   /* Restore to default and recalculate. */
   printf( "Default:  0x%.4x\n",
          _control87( _CW_DEFAULT, 0xfffff ) );
   printf( "%1.1f * %1.1f = %.15e\n", a, a, a * a );
   return 0;
}



Output

Original: 0x9001f
0.1 * 0.1 = 1.000000000000000e-002
24-bit:   0xa001f
0.1 * 0.1 = 9.999999776482582e-003
Default:  0x001f
0.1 * 0.1 = 1.000000000000000e-002
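
The 24-bit line in that output can be reproduced without touching the control word. The sketch below does not change any FPU mode; it simply rounds the product to `float`, which has the same 24-bit significand, and widens it back. The helper names `round_to_24_bits` and `show` are hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Rounds a double to the nearest float (24-bit significand), then widens it
// back to double so it can be printed with full digits.
inline double round_to_24_bits(double x) {
    return static_cast<double>(static_cast<float>(x));
}

// Prints a value in the same "%.15e" format the sample uses.
inline void show(double x) {
    std::printf("%.15e\n", x);
}

// Runtime sanity check: rounding to 24 bits changes the product of 0.1 * 0.1.
inline void check_rounding() {
    const double product = 0.1 * 0.1;
    assert(round_to_24_bits(product) != product);
    assert(round_to_24_bits(product) < product);
}
```

Calling `show(round_to_24_bits(0.1 * 0.1))` should print 9.999999776482582e-03, the same digits as the 24-bit line above (the three-digit exponent there is a formatting habit of older Microsoft runtimes).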


"Walter" <walter@digitalmars.com> wrote in message news:a42tca$hrc$2@digitaldaemon.com...
> Hmm. I didn't know that. -Walter


February 09, 2002
I know that you can reset the internal calculation precision. I did not know this affected execution time; I've not seen any hint of that in the Intel CPU documentation, though I could have just missed it.

"Sean L. Palmer" <spalmer@iname.com> wrote in message news:a444db$14h4$1@digitaldaemon.com...
> Here is a sample from the MSDN docs for VC++ 6.0 which illustrates this: (you can do timings yourself if you wish...  It only affects the FPU x87 coprocessor.  I haven't tried this in a few years so newer Pentium 4 processors may not see much advantage from this)  However using SSE2 it is still true that with one instruction you can either process 2 doubles or 4 floats.