February 25, 2009
Mason Green (Zzzzrrr) wrote:
> When I remove -inline there doesn't seem to
> be much of a difference in execution speed.

Try running obj2asm to see if the functions you want inlined are actually inlined or not.
February 25, 2009
On Wed, Feb 25, 2009 at 8:42 AM, Walter Bright <newshound1@digitalmars.com> wrote:
> Mason Green (Zzzzrrr) wrote:
>>
>> When I remove -inline there doesn't seem to
>> be much of a difference in execution speed.
>
> Try running obj2asm to see if the functions you want inlined are actually inlined or not.
>

perhaps a verbose mode could be added in dmd that prints the pretty printed declaration when a function is inlined. then it would be a simple grep to make sure.

dmd -vi foo.d | grep 'foo\.inc'

telling people to inspect the obj2asm output seems to be popular, but it's hardly user friendly.
February 25, 2009
Tomas Lindquist Olsen wrote:
> perhaps a verbose mode could be added in dmd that prints the pretty
> printed declaration when a function is inlined. then it would be a
> simple grep to make sure.
> 
> dmd -vi foo.d | grep 'foo\.inc'
> 
> telling people to inspect the obj2asm output seems to be popular, but
> it's hardly user friendly.

I know, but it isn't that hard, either, even if you don't know assembler. If the "call" isn't there, it likely got inlined.
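
To make that check less tedious, the "look for the call" test can be wrapped in a small grep helper. The sketch below is only an illustration: `check_inlined` is a made-up helper name, the listing is hand-written rather than real obj2asm output, and the commented-out commands assume dmd and obj2asm are on the PATH and that obj2asm writes its listing to standard output.

```shell
# Hypothetical helper: report whether a disassembly listing still contains
# a call to the given symbol. The match is a substring match, so it also
# finds the symbol inside a mangled D name.
check_inlined() {
    listing=$1
    symbol=$2
    if grep -Eq "call[[:space:]].*${symbol}" "$listing"; then
        echo "call to ${symbol} found: not inlined"
    else
        echo "no call to ${symbol}: likely inlined"
    fi
}

# Intended workflow (assumption: dmd and obj2asm are installed; not run here):
#   dmd -c -O -inline foo.d
#   obj2asm foo.obj > foo.asm
#   check_inlined foo.asm inc

# Self-contained demo on a hand-written listing (illustrative only,
# not actual obj2asm output).
demo=$(mktemp)
cat > "$demo" <<'EOF'
_Dmain:
        push    EAX
        mov     EAX,5
        call    near ptr _D3foo3incFiZi
        pop     ECX
        ret
EOF
check_inlined "$demo" inc
```

Because the match is a plain substring, `inc` would also match a call to something like `increment`; grepping for the full mangled name is safer. A missing call only suggests inlining, since the function may simply never be called from that code.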

Also, if you are trying to optimize the code by trying various tweaks at the statement level, it's much like shooting skeet blindfolded if you don't look at the asm output. It's time consuming and unlikely to be successful.
February 25, 2009
On Wed, Feb 25, 2009 at 3:26 AM, Walter Bright <newshound1@digitalmars.com> wrote:
>
> Also, if you are trying to optimize the code by trying various tweaks at the statement level, it's much like shooting skeet blindfolded if you don't look at the asm output. It's time consuming and unlikely to be successful.

In this case it's not entirely helpful that DMD's inlining rules are completely opaque.  Do you have a list of what DMD will and won't inline, and their justifications?  If not, could you make one?
February 25, 2009
On Wed, Feb 25, 2009 at 9:09 AM, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:
> On Wed, Feb 25, 2009 at 3:26 AM, Walter Bright <newshound1@digitalmars.com> wrote:
>>
>> Also, if you are trying to optimize the code by trying various tweaks at the statement level, it's much like shooting skeet blindfolded if you don't look at the asm output. It's time consuming and unlikely to be successful.
>
> In this case it's not entirely helpful that DMD's inlining rules are completely opaque.  Do you have a list of what DMD will and won't inline, and their justifications?  If not, could you make one?
>

Also, looking at the DMD frontend source is *not* an acceptable option.
February 26, 2009
Jarrett Billingsley wrote:
> In this case it's not entirely helpful that DMD's inlining rules are
> completely opaque.  Do you have a list of what DMD will and won't
> inline, and their justifications?  If not, could you make one?

In the immortal words of Oggie-Ben-Doggie, "use the source, Luke".

In this case, the source is FuncDeclaration::canInline() in /dmd/src/dmd/inline.c.

Yes, I know, but it's all there is at the moment.
February 26, 2009
Jarrett Billingsley wrote:
> Also, looking at the DMD frontend source is *not* an acceptable option.

I knew you'd say that <g>.

On the other hand, inlining or not is, like register allocation and any other optimization, highly implementation-dependent. If you're going to micro-optimize at that level, it really is worthwhile to get familiar with obj2asm and the relevant compiler source code.

It'll save you much time in the long run, and will pay off in being able to write consistently faster code.

Or, you could sign up for http://www.astoriaseminar.com/compiler-construction.html <g>.
February 26, 2009
On Wed, Feb 25, 2009 at 8:59 PM, Walter Bright <newshound1@digitalmars.com> wrote:
> Jarrett Billingsley wrote:
>>
>> Also, looking at the DMD frontend source is *not* an acceptable option.
>
> I knew you'd say that <g>.

I knew you'd suggest it ;)

> On the other hand, inlining or not is, like register allocation and any other optimizations, highly implementation dependent. If you're going to micro-optimize at that level, it really is worthwhile to get familiar with obj2asm and the relevant compiler source code.

True.  However, defining what the compiler does in these optimizations is not just in the interest of performance, but also in the interest of correctness and other implementations.  If everyone can see what DMD is and isn't inlining, they can ask "why" or "why not"; they can correct you if you make a mistake; they can suggest optimizations you might not have thought of; and they can see optimizations that fall out as a consequence of the language that they might not have considered when making their own compiler.

Furthermore, things like NRVO either need to be specified in the language or specified in the ABI.  You told me before that static opCall for structs is just as efficient as constructors because of NRVO; I didn't and still don't buy it for exactly the reasons you just now gave: optimizations are highly implementation-dependent.  It's this kind of stuff that needs to be specified: is NRVO required, or just _really really nice to have_?  Insert many other optimizations here.
February 26, 2009
Jarrett Billingsley wrote:
> True.  However defining what the compiler does in these optimizations
> is not just in the interest of performance, but also in the interest
> of correctness and other implementations.

Optimization should have nothing to do with correctness.

> If everyone can see what
> DMD is and isn't inlining, they can ask "why" or "why not"; they can
> correct you if you make a mistake; they can suggest optimizations you
> might not have thought of; and they can see optimizations that fall
> out as a consequence of the language that they might not have
> considered when making their own compiler.

If they're working at that level, why avoid looking at the compiler source? Optimization suggestions from someone who knows how compilers work are much more likely to be viable.

> Furthermore things like NRVO either need to be specified in the
> language or specified in the ABI.  You told me before that static
> opCall for structs is just as efficient as constructors because of
> NRVO; I didn't and still don't buy it for exactly the reasons you just
> now gave: optimizations are highly implementation-dependent.  It's
> this kind of stuff that needs to be specified: is NRVO required, or
> just _really really nice to have_?  Insert many other optimizations
> here.

If an optimization is required, then yes, it needs to go in the spec. But inlining is not required.

Let me put it another way. There are *thousands* of optimizations the compiler does, and they often have some very complex interactions. Even enumerating them all would be an enormous time sink. There's nothing particularly special about inlining as opposed to constant folding, dead code elimination, register allocation, instruction scheduling, strength reduction, etc., etc.

Even if I wrote such a tome, it would be a waste of time to read it. The easiest, quickest way to see if an optimization happened is to look at the obj2asm output.

Remember the thread a while back about how dmd did a terrible job generating arithmetic code? A quick check with obj2asm showed that the speed problem had nothing to do with the code generation; the time was all sucked up by a library module (since fixed).
February 26, 2009
Tue, 24 Feb 2009 14:08:26 -0500, Mason Green wrote:

>> Have you profiled your D code? What has the profiling told you? Have you seen where you allocate memory, to move such allocations away from inner loops, or just reduce their number?
> 
> No, I have not profiled the D code other than using an FPS counter... :-) To be honest, I'm fairly light on experience when it comes to profiling. Do you have any suggestions on how to make it happen?

The material seems lacking so I've started a series of posts on profiling.  Here's the first one:

http://snakecoder.wordpress.com/2009/02/26/profiling-with-dmd-on-windows/

I already have some material for the second one, profiling Blaze.  ;-)