August 25, 2016
On Thursday, 25 August 2016 at 18:07:14 UTC, Cecil Ward wrote:
> On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
>> [...]
>
> I think that here the optimisation is only because LDC can “see” the text of the method. When expansion is not possible, that would be the real test.

(Assuming LDC behaves like GDC. I'm unfamiliar with LDC, I'm ashamed to admit.)
August 25, 2016
On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
> I found it hard to believe LDC generates such crappy code when

Yes that's right, there was an error in my script! What I've posted is actually the asm without -O.

> Sure, Foo.foo() and use() could return a constant, but otherwise it can't get much better than this.

The problem here is that the example is bad with too aggressive optimizations, because the CALLs are eliminated even though there is no inlining.

Here's better code to illustrate the idea:

°°°°°°°°°°°°°°°°°°°°°°
interface Foo
{
    int foo() const;
}

int use(const(Foo) foo)
{
    return foo.foo() + foo.foo();
}
°°°°°°°°°°°°°°°°°°°°°°


And I'd expect this asm for a 'const optimization' in use():

push rbp
push rbx
push rax
mov rbx, rdi
mov rax, qword ptr [rbx]
call qword ptr [rax+08h]
add eax, eax // const funct = no side effect, so add the 1st result = save a CALL
add rsp, 08h
pop rbx
pop rbp
ret

August 25, 2016
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
...
> A useful phrase I saw today: “declaration of intent given by the programmer to the compiler”.

Particular dream wish-list items of mine: some kind of mechanism that could express operator properties, class properties and arithmetic identities / identity operations. Examples:
* commutativity;
* 1.0 / (1.0 / x) ≡ x, with or without ignoring zero and IEEE weirdness; or
* sin²(x) + cos²(x) ≡ 1
* special values of objects such as zero and one, so that (x ⊛ zero) ≡ x, and that (zero ⊛ x) ≡ x
* D strings could be a first target, so that x ~ "" ≡ x; arrays too
* arithmetic operators’ properties and identities as they apply to complex numbers

Another dream: Strength reductions so that sequences / patterns of operators (back to identities again, sort-of) could be mapped to named helper functions or operators. For example, with strings: s1 ~ s2 ~ s3 ~ … → StringArrayConcat( [] )
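
For illustration, a minimal sketch of what such a helper could look like in D (the name is taken from above; the exact signature is just my assumption). The point is that the whole result is allocated once, instead of once per ~:

°°°°°°°°°°°°°°°°°°°°°°
// Hypothetical rewrite target for s1 ~ s2 ~ s3 ~ …:
// concatenate all parts with a single allocation.
string StringArrayConcat(string[] parts) pure
{
    size_t total = 0;
    foreach (p; parts)
        total += p.length;

    auto buf = new char[](total);
    size_t offset = 0;
    foreach (p; parts)
    {
        buf[offset .. offset + p.length] = p[];
        offset += p.length;
    }
    // buf is not referenced anywhere else, so the cast to immutable is fine
    return cast(string) buf;
}
°°°°°°°°°°°°°°°°°°°°°°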
August 25, 2016
On 08/25/2016 08:15 PM, Basile B. wrote:
> Here's a better code to illustrate the idea:
>
> °°°°°°°°°°°°°°°°°°°°°°
> interface Foo
> {
>     int foo() const;
> }
>
> int use(const(Foo) foo)
> {
>     return foo.foo() + foo.foo();
> }
> °°°°°°°°°°°°°°°°°°°°°°
>
>
> And I'd expect this asm for a 'const optimization' in use():
>
> push rbp
> push rbx
> push rax
> mov rbx, rdi
> mov rax, qword ptr [rbx]
> call qword ptr [rax+08h]
> add eax, eax // const funct = no side effect, so add the 1st result =
> save a CALL
> add rsp, 08h
> pop rbx
> pop rbp
> ret

At least, foo needs to be `pure` for that.
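
A minimal sketch of that, reusing the example from above (whether the optimizer then actually folds the two calls into one is of course still up to the backend):

°°°°°°°°°°°°°°°°°°°°°°
interface Foo
{
    // const + pure: the call cannot have side effects, so two identical
    // calls with nothing in between may be collapsed into a single call.
    int foo() const pure;
}

int use(const(Foo) foo)
{
    return foo.foo() + foo.foo();
}
°°°°°°°°°°°°°°°°°°°°°°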
August 25, 2016
On Thursday, 25 August 2016 at 18:17:21 UTC, Cecil Ward wrote:
> On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:

> * special values of objects such zero, and one, so that that (x ⊛ zero) ≡ x, and that (zero ⊛ x) ≡ x

(Should of course read
        (x ⊛ zero) ≡ zero, and that (one ⊛ x) ≡ x
if you take the operator as being like multiplication.)

August 25, 2016
On Thursday, 25 August 2016 at 18:09:14 UTC, Cecil Ward wrote:
> On Thursday, 25 August 2016 at 18:07:14 UTC, Cecil Ward wrote:
>> On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
>>> [...]
>>
>> I think that here the optimisation is only because LDC can “see” the text of the method. When expansion is not possible, that would be the real test.
>
> (Assuming LDC behaves like GDC. I'm unfamiliar with LDC, I'm ashamed to admit.)

You're right. The question is whether it pays off to optimize heavily for externals. If you build all modules of a binary at once via `ldmd2 m1.d m2.d ...` or via `ldc2 -singleobj m1.d m2.d ...`, LDC emits all the code into a single LLVM module, which can then be optimized very aggressively. Call graphs inside the binary are thus taken care of; if it's a well-encapsulated library with few (or expensive) calls to externals, not being able to optimize those external calls doesn't matter much.

druntime and Phobos are treated as externals. But Johan Engelen already pointed out that LDC could ship with them as LLVM bitcode libraries and then link them in before machine code generation...
August 25, 2016
On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
> The problem here that the example is bad with too agressive optimizations because the CALLs are eliminated despite of no inlining.
>
> [...]
>
> int use(const(Foo) foo)
> {
>     return foo.foo() + foo.foo();
> }

From my perspective, the problem with this example isn't missed optimization potential. It's the code itself. Why waste implementation efforts for such optimizations, if that would only reward people writing such ugly code with an equal performance to a more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster with optimizations turned off and c) IMO simply clearer.
August 26, 2016
On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
> On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
> From my perspective, the problem with this example isn't missed optimization potential. It's the code itself. Why waste implementation efforts for such optimizations, if that would only reward people writing such ugly code with an equal performance to a more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster with optimizations turned off and c) IMO simply clearer.

You're too focused on the example itself (we could find a non-trivial example, but then the generated asm would be longer). The point you miss is that it just *illustrates* what should happen when many calls to a pure const function occur in a single subprogram.
August 26, 2016
On Friday, 26 August 2016 at 05:50:52 UTC, Basile B. wrote:
> On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
>> On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
>> From my perspective, the problem with this example isn't missed optimization potential. It's the code itself. Why waste implementation efforts for such optimizations, if that would only reward people writing such ugly code with an equal performance to a more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster with optimizations turned off and c) IMO simply clearer.
>
> You're too focused on the example itself (we could find a non-trivial example, but then the generated asm would be longer). The point you miss is that it just *illustrates* what should happen when many calls to a pure const function occur in a single subprogram.

I know that it's just an illustration. But I surely don't like any function with repeated calls to this pure function. Why not have the developer code in a sensible style (cache that result once for that whole 'subprogram' manually) if performance is a concern? A compiler penalizing such bad coding style is absolutely fine by me.
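
To make that concrete, a minimal sketch of the manual caching, reusing the Foo interface from above:

°°°°°°°°°°°°°°°°°°°°°°
int use(const(Foo) foo)
{
    const r = foo.foo(); // call the pure const method once...
    return r + r;        // ...and reuse the result instead of calling again
}
°°°°°°°°°°°°°°°°°°°°°°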
August 26, 2016
On 26.08.2016 10:44, kink wrote:
> On Friday, 26 August 2016 at 05:50:52 UTC, Basile B. wrote:
>> On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
>>> On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
>>> From my perspective, the problem with this example isn't missed
>>> optimization potential. It's the code itself. Why waste
>>> implementation efforts for such optimizations, if that would only
>>> reward people writing such ugly code with an equal performance to a
>>> more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster
>>> with optimizations turned off and c) IMO simply clearer.
>>
>> You're too focused on the example itself (we could find a non-trivial
>> example, but then the generated asm would be longer). The point you
>> miss is that it just *illustrates* what should happen when many calls
>> to a pure const function occur in a single subprogram.
>
> I know that it's just an illustration.  But I surely don't like any
> function with repeated calls to this pure function. Why not have the
> developer code in a sensible style (cache that result once for that
> whole 'subprogram' manually) if performance is a concern?

Better performance is better even when it is not the primary concern.

> A compiler penalizing such bad coding style is absolutely fine by me.

It's not the compiler's business to judge coding style, also:

// original code. not "bad".

int foo(int x) pure{ ... }

int bar(int x) pure{ return foo(x) + foo(10-x); }

void main(){
    writeln(bar(5));
}

// ==> inlining

void main(){
    writeln(foo(5)+foo(10-5));
}

// ==> constant folding, "bad" code

void main(){
    writeln(foo(5)+foo(5));
}