May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | Ali Çehreli:
> I don't see it in the spec. Is that an old or an unintended feature?
It's a compiler bug; don't use that bracket-less syntax in your programs.
Don is fighting to fix such problems (and I have written several posts and bug reports on that stuff).
Bye,
bearophile
|
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to strtr | strtr wrote:
> == Quote from Don (nospam@nospam.com)'s article
>> strtr wrote:
>>> == Quote from bearophile (bearophileHUGS@lycos.com)'s article
>>>> But the bigger problem in your code is that you are performing operations on
>>> NaNs (that's the default initialization of FP values in D), and operations on NaNs
>>> are usually quite slower.
>>>
>>> I didn't know that. Is it the same for inf?
>> Yes, nan and inf are usually the same speed. However, it's very CPU
>> dependent, and even *within* a CPU! On Pentium 4, for example, for x87,
>> nan is 200 times slower than a normal value (!), but on Pentium 4 SSE
>> there's no speed difference at all between nan and normal. I think
>> there's no speed difference on AMD, but I'm not sure.
>> There's almost no documentation on it at all.
>
> Thanks!
> NaNs being slower I can understand but inf might well be a value you want to use.
Yes. What's happened is that none of the popular programming languages support special IEEE values, so they're given very low priority by chip designers. In the Pentium 4 case, they're implemented entirely in microcode. A 200X slowdown is really significant.
However, an all-ones bit pattern (0xFFFF...) is a NaN, and it is also the bit pattern of a small negative integer, so an uninitialized floating-point variable has quite a high probability of being a NaN. I'm certain there are a lot of C programs out there that are inadvertently using NaNs.
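The overlap between negative integers and NaNs can be checked with a small C sketch. It assumes IEEE 754 doubles that are 64 bits wide; the helper name `bits_are_nan` is made up for this illustration. It copies the bits of a 64-bit integer into a double and asks whether the result is a NaN.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 type on this platform. */
_Static_assert(sizeof(double) == sizeof(int64_t), "expects 64-bit double");

/* Hypothetical helper: reinterpret the bits of an integer as a double
   and report whether that double is a NaN (1) or not (0). */
int bits_are_nan(int64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);  /* copy the bits without aliasing problems */
    return isnan(d) != 0;
}
```

With these assumptions, `bits_are_nan(-1)` reports a NaN because all exponent bits are set and the mantissa is nonzero, while `bits_are_nan(0)` and `bits_are_nan(1)` do not (they are zero and a tiny denormal). Note that NaN is a whole family of bit patterns, of which all-ones is only one member.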
|
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jérôme M. Berger | Jérôme M. Berger wrote:
> div0 wrote:
>> Jérôme M. Berger wrote:
>>> That depends. In C/C++, the default value for any global variable
>>> is to have all bits set to 0 whatever that means for the actual data
>>> type.
>> No it's not, it's always uninitialized.
>>
> According to the C89 standard and onwards it *must* be initialized
> to 0. If it isn't then your implementation isn't standard compliant
> (needless to say, gcc, Visual, llvm, icc and dmc are all standard
> compliant, so you won't have any difficulty checking).
Ah, I only do C++, where the standard is to not initialise. I didn't know the two specs had diverged like that.
--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
|
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to div0 | "div0" <div0@users.sourceforge.net> wrote:
>>> Jérôme M. Berger wrote:
>>>> That depends. In C/C++, the default value for any global variable
>>>> is to have all bits set to 0 whatever that means for the actual data
>>>> type.
> Ah, I only do C++, where the standard is to not initialise.
No, in C++ all *global or static* variables are zero-initialized. By default, stack variables are default-initialized, which means that doubles on the stack can have any value (they are uninitialized).
The C function calloc is required to fill the newly allocated memory with a zero bit pattern; malloc is not required to initialize anything. Fresh heap areas given by malloc may have a zero bit pattern, but one should really make no assumptions about this.
--
Jouko
|
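The initialization rules discussed in this exchange can be sketched in C as follows. This is a minimal illustration, not a conformance test: the names `global_value`, `read_global` and `calloc_is_zeroed` are invented here, and treating all-bits-zero as 0.0 assumes IEEE 754 doubles. The sketch deliberately never reads an uninitialized automatic variable or malloc'ed memory, since those values are indeterminate.

```c
#include <stddef.h>
#include <stdlib.h>

/* Static storage duration: initialized to zero before the program starts. */
static double global_value;

/* Hypothetical accessor, so the value can be inspected from elsewhere. */
double read_global(void)
{
    return global_value;
}

/* Hypothetical check: returns 1 if every byte of a calloc'ed array of
   n doubles is zero, 0 if not, and -1 if the allocation failed. */
int calloc_is_zeroed(size_t n)
{
    double *p = calloc(n, sizeof *p);
    if (p == NULL)
        return -1;

    const unsigned char *bytes = (const unsigned char *)p;
    int ok = 1;
    for (size_t i = 0; i < n * sizeof *p; i++) {
        if (bytes[i] != 0)
            ok = 0;
    }
    free(p);
    return ok;
}
```

Replacing `calloc` with `malloc` in such a check may well still report zeroed memory on a fresh heap, which is exactly why it proves nothing: the standard gives no guarantee there.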
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to div0 | div0 wrote:
> Jérôme M. Berger wrote:
>> div0 wrote:
>>> Jérôme M. Berger wrote:
>>>> That depends. In C/C++, the default value for any global variable
>>>> is to have all bits set to 0 whatever that means for the actual data
>>>> type.
>>> No it's not, it's always uninitialized.
>>>
>> According to the C89 standard and onwards it *must* be initialized
>> to 0. If it isn't then your implementation isn't standard compliant
>> (needless to say, gcc, Visual, llvm, icc and dmc are all standard
>> compliant, so you won't have any difficulty checking).
>
> Ah, I only do C++, where the standard is to not initialise. I didn't know the two specs had diverged like that.
>
The specs haven't diverged and C++ has mostly the same behaviour as C where global variables are concerned. The only difference is that if the global variable is a class with a constructor, then that constructor gets called after the memory is zeroed out.
Jerome
--
mailto:jeberger@free.fr
http://jeberger.free.fr
Jabber: jeberger@jabber.fr
|
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don | Don wrote:
> bearophile wrote:
>> kai:
>>> Any ideas? Am I somehow not hitting a vital compiler optimization?
>>
>> DMD compiler doesn't perform many optimizations, especially on floating point computations.
>
> More precisely:
> In terms of optimizations performed, DMD isn't too far behind gcc. But it performs almost no optimization on floating point. Also, the inliner doesn't yet support the newer D features (this won't be hard to fix) and the scheduler is based on Pentium1.
Have to be careful when talking about floating point optimizations. For example,
x/c => x * 1/c
is not done because of roundoff error. Also,
0 * x => 0
is also not done because it is not a correct replacement if x is a NaN.
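Both rewrites can be checked empirically with a short C sketch, assuming IEEE 754 double arithmetic evaluated in double precision (as on x86-64 with SSE). The function names are invented for this illustration. The first function counts how often dividing and multiplying by the reciprocal round differently; the second shows what 0 * x yields for special values.

```c
#include <math.h>

/* Hypothetical helper: count pairs (x, c) with 1 <= x, c <= limit for
   which x / c and x * (1.0 / c) give different doubles. */
int count_reciprocal_mismatches(int limit)
{
    int mismatches = 0;
    for (int c = 1; c <= limit; c++) {
        double r = 1.0 / c;            /* first rounding happens here */
        for (int x = 1; x <= limit; x++) {
            double q = (double)x / c;  /* one rounding in total */
            double p = (double)x * r;  /* second rounding on top of r */
            if (q != p)
                mismatches++;
        }
    }
    return mismatches;
}

/* Hypothetical helper: 1 if 0 * x is a NaN, which rules out folding
   the product to a plain 0. */
int zero_times_is_nan(double x)
{
    return isnan(0.0 * x) != 0;
}
```

A familiar mismatch is x = 3, c = 10: 3.0 / 10.0 gives the double nearest to 0.3, while 3.0 * 0.1 gives 0.30000000000000004. For the second rewrite, 0 * x is a NaN not only when x is a NaN but also when x is infinite.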
|
May 16, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | bearophile wrote:
> DMD compiler doesn't perform many optimizations,
This is simply false. DMD does an excellent job with integer and pointer operations. It does a so-so job with floating point.
There are probably over a thousand optimizations at all levels that dmd does with integer and pointer code.
Compare the generated code with and without -O. Even without -O, dmd does a long list of optimizations (such as common subexpression elimination).
|
May 17, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright:
> This is simply false. DMD does an excellent job with integer and pointer
> operations. It does a so-so job with floating point.
> There are probably over a thousand optimizations at all levels that dmd does
> with integer and pointer code.
You are of course right, I understand your feelings, I am a stupid -.-
I must be more precise in my posts. You are right that surely dmd performs numerous optimizations. What I meant to say was a comparison with other compilers, particularly ldc. And even then generic words about a generic comparison aren't useful. So I am sorry.
Bye,
bearophile
|
May 17, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 5/16/2010 4:15 PM, Walter Bright wrote:
> bearophile wrote:
>> DMD compiler doesn't perform many optimizations,
>
> This is simply false. DMD does an excellent job with integer and pointer operations. It does a so-so job with floating point.
>
> There are probably over a thousand optimizations at all levels that dmd does with integer and pointer code.
>
> Compare the generated code with and without -O. Even without -O, dmd does a long list of optimizations (such as common subexpression elimination).
While it's false that DMD doesn't do many optimizations, it's true that it's behind more modern compiler optimizers.
I've been working to fix some of the grossly bad holes in dmd's inliner, which is
one area that's just obviously lacking (see bug 2008). But gcc and ldc (and
likely msvc, though I lack any direct knowledge) are simply a decade or so ahead.
It's not a criticism of dmd or a suggestion that the priorities are in the
wrong place, just a point of fact. They've got larger teams of people and are
spending significant time on just improving and adding optimizations.
Later,
Brad
|
May 17, 2010 Re: Loop optimization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright:
> is not done because of roundoff error. Also,
> 0 * x => 0
> is also not done because it is not a correct replacement if x is a NaN.
I have done a little experiment, compiling this D1 code with LDC:
import tango.stdc.stdio: printf;
void main(char[][] args) {
    double x = cast(double)args.length;
    double y = 0 * x;
    printf("%f\n", y);
}
I think the asm generated by ldc shows what you say:
ldc -O3 -release -inline -output-s test
_Dmain:
    pushl %ebp
    movl %esp, %ebp
    andl $-16, %esp
    subl $32, %esp
    movsd .LCPI1_0, %xmm0
    movd 8(%ebp), %xmm1
    orps %xmm0, %xmm1
    subsd %xmm0, %xmm1
    pxor %xmm0, %xmm0
    mulsd %xmm1, %xmm0
    movsd %xmm0, 4(%esp)
    movl $.str, (%esp)
    call printf
    xorl %eax, %eax
    movl %ebp, %esp
    popl %ebp
    ret $8
So I have added an extra "unsafe floating point" optimization:
ldc -O3 -release -inline -enable-unsafe-fp-math -output-s test
_Dmain:
    subl $12, %esp
    movl $0, 8(%esp)
    movl $0, 4(%esp)
    movl $.str, (%esp)
    call printf
    xorl %eax, %eax
    addl $12, %esp
    ret $8
GCC has similar switches.
Bye,
bearophile
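A rough C counterpart of the experiment above, for trying the GCC side: the sketch below keeps the multiplicand unknown at compile time and inspects the sign of the result. The function names are made up for this illustration, and IEEE 754 doubles are assumed. Under default GCC settings the multiplication has to stay, because the result depends on x; with `-ffast-math`, which as far as the GCC documentation describes it enables `-ffinite-math-only` and `-fno-signed-zeros` among others, the compiler is allowed to fold it to a constant zero.

```c
#include <math.h>

/* Hypothetical counterpart of the D snippet: x is only known at run time. */
double zero_times(double x)
{
    return 0.0 * x;
}

/* Hypothetical helper: 1 if v is a zero with its sign bit set. */
int is_negative_zero(double v)
{
    return v == 0.0 && signbit(v) != 0;
}
```

Without the unsafe switches, `zero_times(-3.0)` yields a negative zero and `zero_times(3.0)` a positive one, so even for finite inputs the product is not simply the constant 0. Comparing the assembly from `gcc -O3 -S` with and without `-ffast-math` should show a difference similar to the two ldc listings.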
|