November 10, 2006
J Duncan wrote:
> %u wrote:
>> "Do you have any observations about what sorts of things put you
>> in the 20% category?"
>>
>> - Luckily, code requiring sheer processing power, such as math
>> functions (trig, logs...), b-tree creation, and compression, runs
>> only 5-6% slower in D (on average) than with the Intel C
>> compiler.

IMO, that's not really discouraging, considering that Intel C/C++/Fortran seems to be considered 'the best' for Intel platforms <g>

FWIW - DMD (the compiler, not the language) often lags by a good margin in two major areas that I wish would be improved: floating point calculations and recursion. The D FP spec. doesn't have the same maximum precision prohibitions as the C/++ spec. so it can be more heavily optimized and still follow the spec. DMD doesn't currently take advantage of that though.

>> - Unfortunately, processes requiring sheer memory access, like
>> memcopy, mem alloc, de-alloc, and stream copy, are about 15-20%
>> slower in D. (Note that the code is totally identical.)
>>

D uses the DMC lib. for most of that stuff so the difference could be there.

Could it be that Intel is doing whole program optimization to inline things like memcpy and memset, during linkage (Are you using WPO? -- it may be the default, I can't remember)? I've found that for time critical code I can code my own (e.g.: memset) in D so the compiler can inline it and it will be faster. Perhaps those should be in Phobos instead of the C lib.?

>> I'll post these results on my blog once I put them into a good

I didn't see a URL for your blog?

>> graphed format so we can discuss it even further. With my limited
>> knowledge of compilers, what I've seen is that the Intel C compiler
>> has many ingenious optimisations. Maybe there can be a way to put
>> the same ideas into the D compiler. (I hope)
> 
> 
> So are we talking about a GC issue? I think it would be interesting to use D as a front end to C. Then D code could basically be run through the Intel optimizer.
November 10, 2006
Dave wrote:
> 
> Could it be that Intel is doing whole program optimization to inline things like memcpy and memset, during linkage (Are you using WPO? -- it may be the default, I can't remember)? I've found that for time critical code I can code my own (e.g.: memset) in D so the compiler can inline it and it will be faster. Perhaps those should be in Phobos instead of the C lib.?

I've been thinking about this as well.  These functions are possibly a bit much for intrinsics, but it would be fairly trivial to write them in native D or even assembler--would have to inspect the resulting compiled code to see which was better.  The only problem offhand is that DMD does not inline functions containing loops, nor does it inline functions containing ASM blocks, so we'd probably be stuck with a function call even with native D code.

Sean
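
For illustration, a native D fill routine of the kind discussed above might look like the sketch below. The name `fillBytes` is hypothetical (it is not a Phobos or DMC library function), and whether a given compiler inlines it is exactly the open question in this thread.

```d
import std.stdio;

// Hypothetical helper: a plain-D stand-in for a memset-style fill.
// It contains a loop, which (per the post above) DMD at the time
// would not inline, so calling it remains a real function call.
void fillBytes(ubyte[] dst, ubyte value)
{
    foreach (ref b; dst)
        b = value;
}

void main()
{
    auto buf = new ubyte[16];      // zero-initialized
    fillBytes(buf[4 .. 12], 0xFF); // fill only the middle slice
    writeln(buf);
}
```

Comparing the generated code for a routine like this against the C library call would be the way to see which is actually better on a given compiler.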
November 10, 2006
Sean Kelly wrote:
> Dave wrote:
>>
>> Could it be that Intel is doing whole program optimization to inline things like memcpy and memset, during linkage (Are you using WPO? -- it may be the default, I can't remember)? I've found that for time critical code I can code my own (e.g.: memset) in D so the compiler can inline it and it will be faster. Perhaps those should be in Phobos instead of the C lib.?
> 
> I've been thinking about this as well.  These functions are possibly a bit much for intrinsics, but it would be fairly trivial to write them in native D or even assembler--would have to inspect the resulting compiled code to see which was better.  The only problem offhand is that DMD does not inline functions containing loops, nor does it inline functions containing ASM blocks, so we'd probably be stuck with a function call even with native D code.
> 

Good points - I'd forgotten about not inlining loops.. The way I've "inlined" things like memset() is to just write a foreach if needed.

> Sean
November 11, 2006
Dave wrote:
> Sean Kelly wrote:
>> Dave wrote:
>>>
>>> Could it be that Intel is doing whole program optimization to inline things like memcpy and memset, during linkage (Are you using WPO? -- it may be the default, I can't remember)? I've found that for time critical code I can code my own (e.g.: memset) in D so the compiler can inline it and it will be faster. Perhaps those should be in Phobos instead of the C lib.?
>>
>> I've been thinking about this as well.  These functions are possibly a bit much for intrinsics, but it would be fairly trivial to write them in native D or even assembler--would have to inspect the resulting compiled code to see which was better.  The only problem offhand is that DMD does not inline functions containing loops, nor does it inline functions containing ASM blocks, so we'd probably be stuck with a function call even with native D code.
>>
> 
> Good points - I'd forgotten about not inlining loops.. The way I've "inlined" things like memset() is to just write a foreach if needed.
> 

BTW - Since performance has come up a lot lately.. The reason one has to write the loop to get the most out of a simple memset type operation is because things like arr[100..200] = 0; are replaced by a call to memset anyhow. This is something that the compiler really should treat like an intrinsic IMO.

>> Sean
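
A small sketch of the two forms being compared: the slice assignment, which per the post above is replaced by a memset call, and the hand-written loop. Both leave the array in the same state; any speed difference depends on the compiler and is not measured here. The function names are hypothetical and exist only to keep the two forms side by side. In time-critical code the loop would be written directly at the call site, as described above.

```d
import std.stdio;

// Slice assignment: concise, but described above as becoming a memset call.
void clearBySlice(int[] arr, size_t lo, size_t hi)
{
    arr[lo .. hi] = 0;
}

// The same range cleared with an explicit loop written out by hand.
void clearByLoop(int[] arr, size_t lo, size_t hi)
{
    foreach (ref x; arr[lo .. hi])
        x = 0;
}

void main()
{
    auto a = new int[300];
    auto b = new int[300];
    a[] = 7;
    b[] = 7;

    clearBySlice(a, 100, 200);
    clearByLoop(b, 100, 200);

    assert(a == b); // both forms produce the same contents
    writeln(a[99], " ", a[100], " ", a[199], " ", a[200]);
}
```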