May 16, 2010
Ali Çehreli:
> I don't see it in the spec. Is that an old or an unintended feature?

It's a compiler bug; don't use that bracketless syntax in your programs.
Don is fighting to fix such problems (and I have written several posts and bug reports on that stuff).

Bye,
bearophile
May 16, 2010
strtr wrote:
> == Quote from Don (nospam@nospam.com)'s article
>> strtr wrote:
>>> == Quote from bearophile (bearophileHUGS@lycos.com)'s article
>>>> But the bigger problem in your code is that you are performing operations on
>>> NaNs (that's the default initialization of FP values in D), and operations on NaNs
>>> are usually quite a bit slower.
>>>
>>> I didn't know that. Is it the same for inf?
>> Yes, nan and inf are usually the same speed. However, it's very CPU
>> dependent, and even *within* a CPU! On Pentium 4, for example, for x87,
>> nan is 200 times slower than a normal value (!), but on Pentium 4 SSE
>> there's no speed difference at all between nan and normal. I think
>> there's no speed difference on AMD, but I'm not sure.
>> There's almost no documentation on it at all.
> 
> Thanks!
> NaNs being slower I can understand but inf might well be a value you want to use.

Yes. What's happened is that none of the popular programming languages support special IEEE values, so they're given very low priority by chip designers. In the Pentium 4 case, they're implemented entirely in microcode. A 200X slowdown is really significant.

However, 0xFFFF... is a NaN bit pattern, and it is also the bit pattern of a negative integer (-1), so an uninitialized floating-point variable has quite a high probability of being a NaN. I'm certain there are a lot of C programs out there that are inadvertently using NaNs.
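
To illustrate the bit-pattern point, here is a minimal C sketch. It assumes IEEE 754 binary64 doubles, which is the norm on x86 but is not promised by the C standard, and the helper names are made up for this example. It fills a double with 0xFF bytes (what a 64-bit integer holding -1 looks like) and checks whether that decodes to a NaN.

```c
#include <math.h>    /* isnan */
#include <string.h>  /* memset */

/* Hypothetical helper: a double whose bytes are all 0xFF,
   i.e. the same bit pattern as a 64-bit integer holding -1. */
double all_ones_double(void)
{
    double d;
    memset(&d, 0xFF, sizeof d);
    return d;
}

/* 1 if the all-ones pattern decodes to a NaN on this platform. */
int all_ones_is_nan(void)
{
    return isnan(all_ones_double()) != 0;
}

/* 1 if d compares equal to itself; this fails only for NaNs. */
int is_self_equal(double d)
{
    return d == d;
}
```

Infinity, by contrast, behaves like an ordinary ordered value: `is_self_equal(INFINITY)` holds, which is part of why inf is something a program might legitimately want to use.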
May 16, 2010
Jérôme M. Berger wrote:
> div0 wrote:
>> Jérôme M. Berger wrote:
>>> 	That depends. In C/C++, the default value for any global variable
>>> is to have all bits set to 0 whatever that means for the actual data
>>> type.
>> No it's not, it's always uninitialized.
>>
> 	According to the C89 standard and onwards it *must* be initialized
> to 0. If it isn't then your implementation isn't standard compliant
> (needless to say, gcc, Visual, llvm, icc and dmc are all standard
> compliant, so you won't have any difficulty checking).

Ah, I only do C++, where the standard is to not initialise. I didn't know the two specs had diverged like that.

-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
May 16, 2010
"div0" <div0@users.sourceforge.net> wrote:
>>> Jérôme M. Berger wrote:
>>>> That depends. In C/C++, the default value for any global variable
>>>> is to have all bits set to 0 whatever that means for the actual data
>>>> type.
> Ah, I only do C++, where the standard is to not initialise.

No, in C++ all *global or static* variables are zero-initialized. By default, stack variables are default-initialized, which means that doubles on the stack can have any value (they are uninitialized).

The C function calloc is required to fill the newly allocated memory with an all-zero bit pattern; malloc is not required to initialize anything. Fresh heap areas given by malloc may happen to have a zero bit pattern, but one should really make no assumptions about this.
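
A minimal C sketch of these rules follows. It assumes IEEE 754 doubles, so that the all-zero bit pattern produced by calloc reads back as 0.0 (the standard itself only promises zeroed bits). The names `static_total`, `static_default` and `calloc_sum` are invented for the example.

```c
#include <stdlib.h>  /* calloc, free */

/* File-scope and static objects with no initializer start at zero. */
static double static_total;

double static_default(void)
{
    static double local_static;   /* also zero-initialized */
    return static_total + local_static;
}

/* calloc zero-fills: returns the sum of n fresh doubles,
   or -1.0 if the allocation fails. */
double calloc_sum(size_t n)
{
    double *buf = calloc(n, sizeof *buf);
    double sum = 0.0;
    if (buf == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    free(buf);
    return sum;
}
```

Reading a malloc'ed buffer or an uninitialized automatic double before writing to it carries no such guarantee, so the sketch deliberately does not try.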

-- 
Jouko 

May 16, 2010
div0 wrote:
> Jérôme M. Berger wrote:
>> div0 wrote:
>>> Jérôme M. Berger wrote:
>>>> 	That depends. In C/C++, the default value for any global variable
>>>> is to have all bits set to 0 whatever that means for the actual data
>>>> type.
>>> No it's not, it's always uninitialized.
>>>
>> 	According to the C89 standard and onwards it *must* be initialized
>> to 0. If it isn't then your implementation isn't standard compliant
>> (needless to say, gcc, Visual, llvm, icc and dmc are all standard
>> compliant, so you won't have any difficulty checking).
> 
> Ah, I only do C++, where the standard is to not initialise. I didn't know the two specs had diverged like that.
> 
	The specs haven't diverged and C++ has mostly the same behaviour as
C where global variables are concerned. The only difference is that
if the global variable is a class with a constructor, then that
constructor gets called after the memory is zeroed out.

		Jerome
-- 
mailto:jeberger@free.fr
http://jeberger.free.fr
Jabber: jeberger@jabber.fr



May 16, 2010
Don wrote:
> bearophile wrote:
>> kai:
>>> Any ideas? Am I somehow not hitting a vital compiler optimization?
>>
>> DMD compiler doesn't perform many optimizations, especially on floating point computations.
> 
> More precisely:
> In terms of optimizations performed, DMD isn't too far behind gcc. But it performs almost no optimization on floating point. Also, the inliner doesn't yet support the newer D features (this won't be hard to fix) and the scheduler is based on Pentium1.

Have to be careful when talking about floating point optimizations. For example,

   x/c => x * (1/c)

is not done because of roundoff error. Also,

   0 * x => 0

is also not done because it is not a correct replacement if x is a NaN.
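
Both cases can be checked with a few lines of C. This is only a sketch: it assumes IEEE 754 doubles, arithmetic evaluated in double precision (as with SSE2 on x86-64), and no fast-math flags. The function names are illustrative.

```c
#include <math.h>  /* isnan, signbit */

/* x/c computed directly. */
double divide(double x, double c)
{
    return x / c;
}

/* The rewrite x * (1/c): the reciprocal is rounded first,
   so the product can differ from x/c in the last bit. */
double divide_via_reciprocal(double x, double c)
{
    volatile double r = 1.0 / c;  /* volatile: force rounding to double */
    return x * r;
}

/* 0 * x is not always +0: it is NaN for NaN or infinite x,
   and -0.0 for negative finite x. */
double zero_times(double x)
{
    return 0.0 * x;
}

/* 1 if replacing 0*x by +0 would change the result for this x. */
int folding_changes_result(double x)
{
    double r = zero_times(x);
    return isnan(r) || signbit(r);
}
```

With x = 3 and c = 10, `divide` gives 0.3 while `divide_via_reciprocal` gives 0.30000000000000004. `folding_changes_result` is true for NaN, for infinity and for negative inputs, and false for an ordinary positive value such as 5.0.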
May 16, 2010
bearophile wrote:
> DMD compiler doesn't perform many optimizations,

This is simply false. DMD does an excellent job with integer and pointer operations. It does a so-so job with floating point.

There are probably over a thousand optimizations at all levels that dmd does with integer and pointer code.

Compare the generated code with and without -O. Even without -O, dmd does a long list of optimizations (such as common subexpression elimination).
May 17, 2010
Walter Bright:
> This is simply false. DMD does an excellent job with integer and pointer
> operations. It does a so-so job with floating point.
> There are probably over a thousand optimizations at all levels that dmd does
> with integer and pointer code.

You are of course right, I understand your feelings, I am stupid -.-
I must be more precise in my posts. You are right that dmd surely performs numerous optimizations. What I meant was a comparison with other compilers, particularly ldc. And even then, generic words about a generic comparison aren't useful. So I am sorry.

Bye,
bearophile
May 17, 2010
On 5/16/2010 4:15 PM, Walter Bright wrote:
> bearophile wrote:
>> DMD compiler doesn't perform many optimizations,
> 
> This is simply false. DMD does an excellent job with integer and pointer operations. It does a so-so job with floating point.
> 
> There are probably over a thousand optimizations at all levels that dmd does with integer and pointer code.
> 
> Compare the generated code with and without -O. Even without -O, dmd does a long list of optimizations (such as common subexpression elimination).

While it's false that DMD doesn't do many optimizations, it's true that it's behind more modern compiler optimizers.

I've been working to fix some of the grossly bad holes in dmd's inliner, which is
one area that's just obviously lacking (see bug 2008).  But gcc and ldc (and
likely msvc, though I lack any direct knowledge) are simply a decade or so ahead.
It's not a criticism of dmd or a suggestion that the priorities are in the
wrong place, just a point of fact.  They've got larger teams of people and are
spending significant time on just improving and adding optimizations.

Later,
Brad
May 17, 2010
Walter Bright:

> is not done because of roundoff error. Also,
>     0 * x => 0
> is also not done because it is not a correct replacement if x is a NaN.

I have done a little experiment, compiling this D1 code with LDC:


import tango.stdc.stdio: printf;
void main(char[][] args) {
    double x = cast(double)args.length;
    double y = 0 * x;
    printf("%f\n", y);
}


I think the asm generated by ldc shows what you say:


ldc -O3 -release -inline -output-s test
_Dmain:
	pushl	%ebp
	movl	%esp, %ebp
	andl	$-16, %esp
	subl	$32, %esp
	movsd	.LCPI1_0, %xmm0
	movd	8(%ebp), %xmm1
	orps	%xmm0, %xmm1
	subsd	%xmm0, %xmm1
	pxor	%xmm0, %xmm0
	mulsd	%xmm1, %xmm0
	movsd	%xmm0, 4(%esp)
	movl	$.str, (%esp)
	call	printf
	xorl	%eax, %eax
	movl	%ebp, %esp
	popl	%ebp
	ret	$8



So I have added an extra "unsafe floating point" optimization:

ldc -O3 -release -inline -enable-unsafe-fp-math -output-s test
_Dmain:
	subl	$12, %esp
	movl	$0, 8(%esp)
	movl	$0, 4(%esp)
	movl	$.str, (%esp)
	call	printf
	xorl	%eax, %eax
	addl	$12, %esp
	ret	$8


GCC has similar switches.
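
For anyone who wants to repeat the experiment with GCC, a rough C counterpart of the D test might look like the sketch below. The file name `test.c` and the function name are placeholders. `-ffast-math` is the GCC switch that relaxes the IEEE rules in a comparable way. The generated assembly depends on the GCC version and target, so none is shown here.

```c
/* Compare:  gcc -O3 -S test.c
       with:  gcc -O3 -ffast-math -S test.c */

/* The input comes from the caller, so the compiler cannot
   know its value at compile time. */
double zero_times_count(int count)
{
    double x = (double)count;
    /* Under strict IEEE rules the multiply generally has to stay,
       since a negative count gives -0.0 rather than +0.0. */
    return 0 * x;
}
```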

Bye,
bearophile