January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dicebot | Some experimental data. ---- test program ---- auto input() { int[] arr; foreach (i; 0..6) { import std.stdio; int x; readf(" %d ", &x); arr ~= x; } return arr; } int main() { import std.algorithm; auto arr = input(); int sum; // version algo foreach (elem; arr.filter!( x => x % 2 )) { sum += elem; } // version raw foreach (elem; arr) { if (elem % 2) sum += elem; } return sum; } ------ Disassemblies for latest LDC (ldmd2 -O -release -inline): $ cat test-ldc-algo.dump 0000000000402eb0 <_Dmain>: 402eb0: 50 push %rax 402eb1: e8 fa fe ff ff callq 402db0 <_D4test5inputFZAi> 402eb6: 48 89 c1 mov %rax,%rcx 402eb9: 31 c0 xor %eax,%eax 402ebb: 48 85 c9 test %rcx,%rcx 402ebe: 74 4d je 402f0d <_Dmain+0x5d> 402ec0: f6 02 01 testb $0x1,(%rdx) 402ec3: 75 0b jne 402ed0 <_Dmain+0x20> 402ec5: 48 83 c2 04 add $0x4,%rdx 402ec9: 48 ff c9 dec %rcx 402ecc: 75 f2 jne 402ec0 <_Dmain+0x10> 402ece: eb 3d jmp 402f0d <_Dmain+0x5d> 402ed0: 31 c0 xor %eax,%eax 402ed2: eb 34 jmp 402f08 <_Dmain+0x58> 402ed4: 66 66 66 2e 0f 1f 84 data32 data32 nopw %cs:0x0(%rax,%rax,1) 402edb: 00 00 00 00 00 402ee0: 03 02 add (%rdx),%eax 402ee2: 66 66 66 66 66 2e 0f data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1) 402ee9: 1f 84 00 00 00 00 00 402ef0: 48 85 c9 test %rcx,%rcx 402ef3: 74 1a je 402f0f <_Dmain+0x5f> 402ef5: 48 83 f9 01 cmp $0x1,%rcx 402ef9: 74 12 je 402f0d <_Dmain+0x5d> 402efb: 48 ff c9 dec %rcx 402efe: f6 42 04 01 testb $0x1,0x4(%rdx) 402f02: 48 8d 52 04 lea 0x4(%rdx),%rdx 402f06: 74 e8 je 402ef0 <_Dmain+0x40> 402f08: 48 85 c9 test %rcx,%rcx 402f0b: 75 d3 jne 402ee0 <_Dmain+0x30> 402f0d: 5a pop %rdx 402f0e: c3 retq 402f0f: bf e0 0a 68 00 mov $0x680ae0,%edi 402f14: be 9e 01 00 00 mov $0x19e,%esi 402f19: e8 c2 c2 02 00 callq 42f1e0 <_d_array_bounds> 402f1e: 66 90 xchg %ax,%ax $ cat test-ldc-raw.dump 0000000000402eb0 <_Dmain>: 402eb0: 50 push %rax 402eb1: e8 fa fe ff ff callq 402db0 <_D4test5inputFZAi> 402eb6: 48 89 c1 mov %rax,%rcx 402eb9: 31 c0 xor %eax,%eax 402ebb: 48 85 c9 test %rcx,%rcx 402ebe: 74 17 je 402ed7 <_Dmain+0x27> 402ec0: 8b 3a mov (%rdx),%edi 402ec2: 89 fe mov %edi,%esi 402ec4: c1 e6 1f shl $0x1f,%esi 402ec7: c1 fe 1f sar $0x1f,%esi 402eca: 21 fe and %edi,%esi 402ecc: 01 f0 add %esi,%eax 402ece: 48 83 c2 04 add $0x4,%rdx 402ed2: 48 ff c9 dec %rcx 402ed5: 75 e9 jne 402ec0 <_Dmain+0x10> 402ed7: 5a pop %rdx 402ed8: c3 retq 402ed9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) So it does inline but end result is still less optimal assembly-wise. |
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ola Fosheim Grøstad | On 01/14/2014 04:40 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com>" wrote: > On Tuesday, 14 January 2014 at 15:34:40 UTC, Dominikus Dittes Scherkl > wrote: >> I think it is. It would be far worse if the semantics were related. > > No. Array sub script and array initialization are both array-like, so no > confusion. Same with "+" for concatenation (if you don't do implicit > casting) and addition, they are both additive-like. Nope. They are both monoid operations. Addition is commutative, and it is conventional in abstract algebra to use "+" to denote commutative operations only. eg. http://en.wikipedia.org/wiki/Abelian_group#Notation . assert("123abc" == "123" + "abc"); hurts my eyes. > The human mind categorize in a fuzzy fashion. > |
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On Tuesday, 14 January 2014 at 16:32:55 UTC, Timon Gehr wrote:
> assert("123abc" == "123" + "abc"); hurts my eyes.
Doesn't matter. The human brain does not work with algebra, it's not a logic engine. It is a fuzzy abstraction engine. Besides, in unary notation you get "111" for 3, "11111" for 5 etc, but this is rather off-topic...
|
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | foreach(thought; thoughts) if(!thought.isInteresting) delete thought; |
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ola Fosheim Grøstad | On 01/14/2014 05:50 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com>" wrote: > On Tuesday, 14 January 2014 at 16:32:55 UTC, Timon Gehr wrote: >> assert("123abc" == "123" + "abc"); hurts my eyes. > > Doesn't matter. Thanks. |
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 1/14/14 1:04 AM, Manu wrote:
> /agree completely.
> This is nice, I didn't think of writing statements like that :)
> That's precisely the sort of suggestion I was hoping for. I'll continue
> like this.
I think I died and got to heaven.
Andrei
|
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On Tuesday, 14 January 2014 at 09:07:43 UTC, Manu wrote:
> Can anyone comment on the codegen when using these statements? Is it
> identical to my reference 'if' statement?
> Liberal use of loops like this will probably obliterate unoptimised
> performance... :/
I can't comment for every compiler, but when using LLVM as
backend, here is what to expect.
LLVM is really good at inlining. However, it inline mostly
bottom-up, which mean that the front of the subrange will be
inlined in the outer front, popFront of the subrange will be
inlined in the popFront of the outer range, etc...
You'll generally have pretty good performance, but 2 bad
scenarios can happen.
For most ranges, the compiler will find good optimization between
front/popFront/empty and friends. But the way this stuff will be
inlined, you'll get bigger and bigger front/popFront/empty until
they all are inlined in the loop and everything is simplified
away by the optimizer. If you wrap enough range into one another,
you'll reach a point where the compiler will consider that these
method became too big to be good candidate to inline. Things
should improve over time as LLVM team is working on a better top
down inliner.
The second caveat is automatic heap promotion of the closure
frame pointer. Right now, optimizers aren't really good at heap
to stack promotion in D as they mostly don't understand the
runtime (LDC have some progress in that direction, but this is
still quite simplistic). That mean that you can end up with
unwanted heap allocation.
Realistically, it is possible for a compiler to see through any
reasonably size range code, but for now, it is kind of lacking in
some directions.
|
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dicebot | On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
> So it does inline but end result is still less optimal assembly-wise.
The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().
David
|
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Tuesday, 14 January 2014 at 18:38:27 UTC, David Nadlinger wrote:
> On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
>> So it does inline but end result is still less optimal assembly-wise.
>
> The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().
>
> David
I wonder if there is some way the language could help with that in the common case. For example, provide an opApply for all non-infinite ranges that can be used with foreach instead of front/empty/popFront. I imagine opApply is much easier to inline and optimise than trying to piece range operations and state back together to see the underlying loop.
|
January 14, 2014 Re: foreach thoughts | ||||
---|---|---|---|---|
| ||||
Posted in reply to Peter Alexander | On Tuesday, 14 January 2014 at 19:05:22 UTC, Peter Alexander wrote:
> On Tuesday, 14 January 2014 at 18:38:27 UTC, David Nadlinger wrote:
>> On Tuesday, 14 January 2014 at 16:31:00 UTC, Dicebot wrote:
>>> So it does inline but end result is still less optimal assembly-wise.
>>
>> The problem, by the way, seems to be that the program actually contains two loops that LLVM cannot merge: One in the filter() constructor (skipping the initial run of even elements), and the actual foreach loop in main().
>>
>> David
>
> I wonder if there is some way the language could help with that in the common case. For example, provide an opApply for all non-infinite ranges that can be used with foreach instead of front/empty/popFront. I imagine opApply is much easier to inline and optimise than trying to piece range operations and state back together to see the underlying loop.
Actually, you could give an opApply for infinite ranges too. foreach loops can have breaks, and opApply knows how to handle them.
|
Copyright © 1999-2021 by the D Language Foundation