January 14, 2014
Some experimental data.

---- test program ----

auto input()
{
    int[] arr;

    foreach (i; 0..6)
    {
        import std.stdio;
        int x;
        readf(" %d ", &x);
        arr ~= x;
    }

    return arr;
}

int main()
{
    import std.algorithm;

    auto arr = input();
    int sum;

    // version algo
    foreach (elem; arr.filter!( x => x % 2 ))
    {
        sum += elem;
    }

    // version raw
    foreach (elem; arr)
    {
        if (elem % 2)
            sum += elem;
    }

    return sum;
}

------

Disassemblies for latest LDC (ldmd2 -O -release -inline):

$ cat test-ldc-algo.dump
0000000000402eb0 <_Dmain>:
  402eb0:	50                   	push   %rax
  402eb1:	e8 fa fe ff ff       	callq  402db0 <_D4test5inputFZAi>
  402eb6:	48 89 c1             	mov    %rax,%rcx
  402eb9:	31 c0                	xor    %eax,%eax
  402ebb:	48 85 c9             	test   %rcx,%rcx
  402ebe:	74 4d                	je     402f0d <_Dmain+0x5d>
  402ec0:	f6 02 01             	testb  $0x1,(%rdx)
  402ec3:	75 0b                	jne    402ed0 <_Dmain+0x20>
  402ec5:	48 83 c2 04          	add    $0x4,%rdx
  402ec9:	48 ff c9             	dec    %rcx
  402ecc:	75 f2                	jne    402ec0 <_Dmain+0x10>
  402ece:	eb 3d                	jmp    402f0d <_Dmain+0x5d>
  402ed0:	31 c0                	xor    %eax,%eax
  402ed2:	eb 34                	jmp    402f08 <_Dmain+0x58>
  402ed4:	66 66 66 2e 0f 1f 84 	data32 data32 nopw %cs:0x0(%rax,%rax,1)
  402edb:	00 00 00 00 00
  402ee0:	03 02                	add    (%rdx),%eax
  402ee2:	66 66 66 66 66 2e 0f 	data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
  402ee9:	1f 84 00 00 00 00 00
  402ef0:	48 85 c9             	test   %rcx,%rcx
  402ef3:	74 1a                	je     402f0f <_Dmain+0x5f>
  402ef5:	48 83 f9 01          	cmp    $0x1,%rcx
  402ef9:	74 12                	je     402f0d <_Dmain+0x5d>
  402efb:	48 ff c9             	dec    %rcx
  402efe:	f6 42 04 01          	testb  $0x1,0x4(%rdx)
  402f02:	48 8d 52 04          	lea    0x4(%rdx),%rdx
  402f06:	74 e8                	je     402ef0 <_Dmain+0x40>
  402f08:	48 85 c9             	test   %rcx,%rcx
  402f0b:	75 d3                	jne    402ee0 <_Dmain+0x30>
  402f0d:	5a                   	pop    %rdx
  402f0e:	c3                   	retq
  402f0f:	bf e0 0a 68 00       	mov    $0x680ae0,%edi
  402f14:	be 9e 01 00 00       	mov    $0x19e,%esi
  402f19:	e8 c2 c2 02 00       	callq  42f1e0 <_d_array_bounds>
  402f1e:	66 90                	xchg   %ax,%ax

$ cat test-ldc-raw.dump
0000000000402eb0 <_Dmain>:
  402eb0:	50                   	push   %rax
  402eb1:	e8 fa fe ff ff       	callq  402db0 <_D4test5inputFZAi>
  402eb6:	48 89 c1             	mov    %rax,%rcx
  402eb9:	31 c0                	xor    %eax,%eax
  402ebb:	48 85 c9             	test   %rcx,%rcx
  402ebe:	74 17                	je     402ed7 <_Dmain+0x27>
  402ec0:	8b 3a                	mov    (%rdx),%edi
  402ec2:	89 fe                	mov    %edi,%esi
  402ec4:	c1 e6 1f             	shl    $0x1f,%esi
  402ec7:	c1 fe 1f             	sar    $0x1f,%esi
  402eca:	21 fe                	and    %edi,%esi
  402ecc:	01 f0                	add    %esi,%eax
  402ece:	48 83 c2 04          	add    $0x4,%rdx
  402ed2:	48 ff c9             	dec    %rcx
  402ed5:	75 e9                	jne    402ec0 <_Dmain+0x10>
  402ed7:	5a                   	pop    %rdx
  402ed8:	c3                   	retq
  402ed9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

So it does inline but end result is still less optimal assembly-wise.
January 14, 2014
On 01/14/2014 04:40 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com>" wrote:
> On Tuesday, 14 January 2014 at 15:34:40 UTC, Dominikus Dittes Scherkl
> wrote:
>> I think it is. It would be far worse if the semantics were related.
>
> No. Array sub script and array initialization are both array-like, so no
> confusion. Same with "+" for concatenation (if you don't do implicit
> casting) and addition, they are both additive-like.

Nope. They are both monoid operations. Addition is commutative, and it is conventional in abstract algebra to use "+" to denote commutative operations only. eg. http://en.wikipedia.org/wiki/Abelian_group#Notation .

assert("123abc" == "123" + "abc"); hurts my eyes.

> The human mind categorize in a fuzzy fashion.
>


January 14, 2014
On Tuesday, 14 January 2014 at 16:32:55 UTC, Timon Gehr wrote:
> assert("123abc" == "123" + "abc"); hurts my eyes.

Doesn't matter. The human brain does not work with algebra, it's not a logic engine. It is a fuzzy abstraction engine. Besides, in unary notation you get "111" for 3, "11111" for 5 etc, but this is rather off-topic...
January 14, 2014
foreach(thought; thoughts)
 if(!thought.isInteresting)
   delete thought;
January 14, 2014
On 01/14/2014 05:50 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com>" wrote:
> On Tuesday, 14 January 2014 at 16:32:55 UTC, Timon Gehr wrote:
>> assert("123abc" == "123" + "abc"); hurts my eyes.
>
> Doesn't matter.

Thanks.

January 14, 2014
On 1/14/14 1:04 AM, Manu wrote:
> /agree completely.
> This is nice, I didn't think of writing statements like that :)
> That's precisely the sort of suggestion I was hoping for. I'll continue
> like this.

I think I died and got to heaven.

Andrei

January 14, 2014
On Tuesday, 14 January 2014 at 09:07:43 UTC, Manu wrote:
> Can anyone comment on the codegen when using these statements? Is it
> identical to my reference 'if' statement?
> Liberal use of loops like this will probably obliterate unoptimised
> performance... :/

I can't comment for every compiler, but when using LLVM as
backend, here is what to expect.

LLVM is really good at inlining. However, it inline mostly
bottom-up, which mean that the front of the subrange will be
inlined in the outer front, popFront of the subrange will be
inlined in the popFront of the outer range, etc...

You'll generally have pretty good performance, but 2 bad
scenarios can happen.

For most ranges, the compiler will find good optimization between
front/popFront/empty and friends. But the way this stuff will be
inlined, you'll get bigger and bigger front/popFront/empty until
they all are inlined in the loop and everything is simplified
away by the optimizer. If you wrap enough range into one another,
you'll reach a point where the compiler will consider that these
method became too big to be good candidate to inline. Things
should improve over time as LLVM team is working on a better top
down inliner.

The second caveat is automatic heap promotion of the closure
frame pointer. Right now, optimizers aren't really good at heap
to stack promotion in D as they mostly don't understand the
runtime (LDC have some progress in that direction, but this is
still quite simplistic). That mean that you can end up with
unwanted heap allocation.

Realistically, it is possible for a compiler to see through any
reasonably size range code, but for now, it is kind of lacking in
some directions.
January 14, 2014
On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
> So it does inline but end result is still less optimal assembly-wise.

The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().

David
January 14, 2014
On Tuesday, 14 January 2014 at 18:38:27 UTC, David Nadlinger wrote:
> On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
>> So it does inline but end result is still less optimal assembly-wise.
>
> The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().
>
> David

I wonder if there is some way the language could help with that in the common case. For example, provide an opApply for all non-infinite ranges that can be used with foreach instead of front/empty/popFront. I imagine opApply is much easier to inline and optimise than trying to piece range operations and state back together to see the underlying loop.
January 14, 2014
On Tuesday, 14 January 2014 at 19:05:22 UTC, Peter Alexander wrote:
> On Tuesday, 14 January 2014 at 18:38:27 UTC, David Nadlinger wrote:
>> On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
>>> So it does inline but end result is still less optimal assembly-wise.
>>
>> The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().
>>
>> David
>
> I wonder if there is some way the language could help with that in the common case. For example, provide an opApply for all non-infinite ranges that can be used with foreach instead of front/empty/popFront. I imagine opApply is much easier to inline and optimise than trying to piece range operations and state back together to see the underlying loop.

Actually, you could give an opApply for infinite ranges too. foreach loops can have breaks, and opApply knows how to handle them.