January 15, 2014
On 15 January 2014 02:16, Dicebot <public@dicebot.lv> wrote:

> On Tuesday, 14 January 2014 at 15:37:21 UTC, Manu wrote:
>
>> Sorry, that was 2 separate points although it didn't look like it.
>> The first question was assuming full optimisation, is it equivalent to the
>> if statement I demonstrate with full optimisation in practise? (I'll do
>> some tests, but people must have already experimented in various
>> circumstances, and identified patterns?)
>>
>
> Don't think so. LDC is best at doing such transformations per my observations, but there is still major room for improvement in general.


So you're saying that using these primitives like until! and filter! will
not produce the same code as my if statement when optimised?
That's worrying. I'll have to try it out.
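
For concreteness, the comparison I have in mind is roughly this (a minimal
sketch with made-up Thing/sum names, not code from the thread):

import std.algorithm : filter;

struct Thing { bool isInteresting; int value; }

// Range-based form:
int sumFiltered(Thing[] things)
{
    int sum = 0;
    foreach (t; things.filter!(t => t.isInteresting))
        sum += t.value;
    return sum;
}

// The hand-written form I'd hope it boils down to:
int sumPlain(Thing[] things)
{
    int sum = 0;
    foreach (t; things)
    {
        if (!t.isInteresting)
            continue;
        sum += t.value;
    }
    return sum;
}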

>> I'd like to know if anybody can see any path towards inlined
>> lambdas/literals/micro-functions in non-optimised builds?
>> I guess the first question is, is this a problem that is even worth
>> addressing? How many people are likely to object in principle?
>>
>
> I'd object against it in general. An important trait of such builds is code gen that matches the original source code as much as possible. Mass inlining will kill that. But I don't see why you use "debug" and "non-optimised" as synonyms here. For example, in DMD "-release" vs "-debug" are orthogonal to "-O". I see no problem with doing an optimised debug build.
>

You can't step through a -O build, or inspect the values of most variables, so you can't really debug.


January 15, 2014
On 15 January 2014 03:08, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org
> wrote:

> On 1/14/14 1:04 AM, Manu wrote:
>
>> /agree completely.
>> This is nice, I didn't think of writing statements like that :)
>> That's precisely the sort of suggestion I was hoping for. I'll continue
>> like this.
>>
>
> I think I died and got to heaven.


The problem is I initially tried to use an all-out std.algorithm approach, with map and reduce and stuff...
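
Something along these lines, roughly (a sketch of the style I mean, not my
actual code):

import std.algorithm : filter, map, reduce;

struct Thing { bool isInteresting; int value; }

int total(Thing[] things)
{
    // Everything as one chained pipeline instead of a plain loop.
    // reduce is given an explicit seed of 0 so an empty result is fine.
    return reduce!((a, b) => a + b)(0,
        things.filter!(t => t.isInteresting)
              .map!(t => t.value * 2));
}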


January 15, 2014
On 15 January 2014 02:56, Daniel Murphy <yebbliesnospam@gmail.com> wrote:

> foreach(thought; thoughts)
>  if(!thought.isInteresting)
>    delete thought;
>

That's a redundant filter, your code effectively does nothing ;)


January 15, 2014
On 15 January 2014 02:30, Dicebot <public@dicebot.lv> wrote:

> Some experimental data.
>
> ---- test program ----
> ...
> So it does inline, but the end result is still less optimal assembly-wise.
>

Right, thanks for that.
I'm quite surprised by how bad that turned out actually, and with LDC,
which is usually the best at optimising that sort of thing.
Need to do some intensive experimentation... but this is a bit concerning.

Cheers.


January 15, 2014
On Wednesday, 15 January 2014 at 01:42:07 UTC, Manu wrote:
> Right, thanks for that.
> I'm quite surprised by how bad that turned out actually, and with LDC,
> which is usually the best at optimising that sort of thing.
> Need to do some intensive experimentation... but this is a bit concerning.

GCC might do some loop merging, I haven't checked. It's just that this pattern doesn't tend to appear too much in traditional C/C++ code (after all, who splits up their loops into two parts just for fun?), so it could be that the LLVM people just never really bothered to write a pass to merge single loops that have been split (as opposed to classical loop fusion, where the loop ranges are the same, but the operations performed/target data different).

Maybe it would be possible to implement something like this fairly easily using the existing LLVM loop analyses though.
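
To make the shape concrete (an illustrative sketch, not the test program
quoted above): a filter!-style range roughly turns one plain loop into an
outer loop with a nested "skip ahead to the next match" loop, and it is that
split form which a fusion-like pass would have to fold back together:

void doWork(int x) {}

// Plain form: a single loop, one predicate branch per element.
void plain(int[] a)
{
    foreach (x; a)
        if (x > 0)
            doWork(x);
}

// Roughly what the range-based version lowers to: the loop has been split
// into a pre-skip scan plus the body, with extra bounds checks.
void split(int[] a)
{
    size_t i = 0;
    while (i < a.length)
    {
        while (i < a.length && !(a[i] > 0))   // skip non-matching elements
            ++i;
        if (i >= a.length)
            break;
        doWork(a[i]);
        ++i;
    }
}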

David
January 15, 2014
On 15 January 2014 12:22, David Nadlinger <code@klickverbot.at> wrote:

> On Wednesday, 15 January 2014 at 01:42:07 UTC, Manu wrote:
>
>> Right, thanks for that.
>> I'm quite surprised by how bad that turned out actually, and with LDC,
>> which is usually the best at optimising that sort of thing.
>> Need to do some intensive experimentation... but this is a bit concerning.
>>
>
> GCC might do some loop merging, I haven't checked. It's just that this pattern doesn't tend to appear too much in traditional C/C++ code (after all, who splits up their loops into two parts just for fun?), so it could be that the LLVM people just never really bothered to write a pass to merge single loops that have been split (as opposed to classical loop fusion, where the loop ranges are the same, but the operations performed/target data different).
>
> Maybe it would be possible to implement something like this fairly easily using the existing LLVM loop analyses though.


Okay. As long as the problem is understood and a solution seems realistic. The closure allocation is also an important problem to fix, but that one's been on the list a long time.
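
(For anyone new to that issue: the closure allocation is the GC heap
allocation of the enclosing stack frame that can happen when a lambda
captures a local. A minimal sketch, with made-up names:)

import std.algorithm : filter;

// The lambda captures `threshold`, a local of interesting(). Because the
// returned range keeps using it after the function returns, the frame has
// to be heap-allocated; that's the closure allocation in question.
auto interesting(int[] things, int threshold)
{
    return things.filter!(x => x > threshold);
}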


January 15, 2014
On Wed, Jan 15, 2014 at 11:38:13AM +1000, Manu wrote:
> On 15 January 2014 02:56, Daniel Murphy <yebbliesnospam@gmail.com> wrote:
> 
> > foreach(thought; thoughts)
> >  if(!thought.isInteresting)
> >    delete thought;
> >
> 
> That's a redundant filter, your code effectively does nothing ;)

And 'delete' is deprecated. :-P
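
(For reference, the usual replacement is destroy() plus, if you really want
the memory back immediately, an explicit GC.free. A sketch; only safe if
nothing else still references the object:)

import core.memory : GC;

class Thought {}

void discard(Thought t)
{
    destroy(t);              // run the destructor and invalidate the object
    GC.free(cast(void*) t);  // optionally hand the memory back to the GC now
}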


T

-- 
Bomb technician: If I'm running, try to keep up.
January 15, 2014
On Wednesday, 15 January 2014 at 02:22:22 UTC, David Nadlinger wrote:
> On Wednesday, 15 January 2014 at 01:42:07 UTC, Manu wrote:
>> Right, thanks for that.
>> I'm quite surprised by how bad that turned out actually, and with LDC,
>> which is usually the best at optimising that sort of thing.
>> Need to do some intensive experimentation... but this is a bit concerning.
>
> GCC might do some loop merging, I haven't checked. It's just that this pattern doesn't tend to appear too much in traditional C/C++ code (after all, who splits up their loops into two parts just for fun?), so it could be that the LLVM people just never really bothered to write a pass to merge single loops that have been split (as opposed to classical loop fusion, where the loop ranges are the same, but the operations performed/target data different).
>
> Maybe it would be possible to implement something like this fairly easily using the existing LLVM loop analyses though.
>
> David

Actually this is a useful optimization technique when you need to skip a lot of objects that way. See http://www.onversity.com/load/d-loop.pdf
January 15, 2014
On Wednesday, 15 January 2014 at 07:08:14 UTC, deadalnix wrote:
> Actually this is a useful optimization technique when you need to skip a lot of objects that way. See http://www.onversity.com/load/d-loop.pdf

Well, yes, except for the fact that the case discussed here isn't an instance of that optimization. Even the "pre-skip" loop performs two conditional jumps per array element. And worse, the form the loop is in causes LLVM to miss two additional optimization opportunities, compared to the plain C (D) version:

First, LLVM actually optimizes away the even/odd conditional branch in the plain version, replacing it with a shl/sar/and combination. The rotated loop structure in the filter version seems to cause the optimizer to miss that opportunity.

And second, as a consequence the loop in the filter version is no longer a good candidate for vectorization, whereas LLVM will happily emit AVX/… instructions in the plain case if you let it.
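
(For readers without the earlier test program at hand: the "plain version"
being talked about is a loop of roughly this shape. An illustrative sketch
only, not the code from the quoted post:)

// A single loop with an even/odd branch. An optimizer can typically turn
// the branch into straight-line arithmetic and then vectorize the loop;
// per the above, the filter-based form loses both opportunities.
long tally(const(int)[] a)
{
    long sum = 0;
    foreach (x; a)
    {
        if (x & 1)
            sum += x;   // odd
        else
            sum -= x;   // even
    }
    return sum;
}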

David
January 15, 2014
On Tue, 14 Jan 2014 19:07:33 +1000,
Manu <turkeyman@gmail.com> wrote:

> On 14 January 2014 19:04, Manu <turkeyman@gmail.com> wrote:
> 
> > On 14 January 2014 18:36, Jakob Ovrum <jakobovrum@gmail.com> wrote:
> >
> >> On Tuesday, 14 January 2014 at 08:23:05 UTC, Manu wrote:
> >>
> >>> 1. A termination condition (ie, while)
> >>>
> >>> foreach(t; things) iterates each thing, but it's common in traditional
> >>> for
> >>> loops to have an && in the second 'while' term, to add an additional
> >>> termination condition.
> > >>> for(i=0; i<things.length && !things[i].someCondition; ++i)
> >>>
> >>> Or with foreach:
> >>> foreach(i, t; things)
> >>> {
> >>>   if(t.someCondition)
> >>>     break;
> >>>   ...
> >>> }
> >>>
> >>
> >> foreach(t; things.until!(t => t.someCondition))
> >> {
> >> }
> >>
> >> Unfortunately foreach over a range does not automatically support an index loop variable. We could add something like std.range.enumerate to support this, but I think it's a common enough requirement that a language amendment is warranted (there are some subtleties involved in implementing it though - specifically when combined with automatic tuple expansion).
> >>
> >>
> >>  2. A filter
> >>>
> >>> The other thing is the ability to skip uninteresting elements. This is
> >>> typically performed with the first line of the loop testing a condition,
> >>> and then continue:
> >>> foreach(i, t; things)
> >>> {
> >>>   if(!t.isInteresting)
> >>>     continue;
> >>>   ...
> >>> }
> >>>
> >>
> >> foreach(t; things.filter!(t => t.isInteresting))
> >> {
> >> }
> >>
> >> Ditto about the index loop variable.
> >>
> >>
> >>  I've tried to approach the problem with std.algorithm, but I find the
> >>> std.algorithm statement to be much more noisy and usually longer when the
> >>> loops are sufficiently simple (as they usually are in my case, which is
> >>> why
> >>> the trivial conditions are so distracting by contrast).
> >>>
> >>
> >> The two examples above look a *lot* cleaner and less noisy (declarative!) to me than the imperative approach using if-break or if-continue.
> >
> >
> > /agree completely.
> > This is nice, I didn't think of writing statements like that :)
> > That's precisely the sort of suggestion I was hoping for. I'll continue
> > like this.
> >
> 
> Can anyone comment on the codegen when using these statements? Is it
> identical to my reference 'if' statement?
> Liberal use of loops like this will probably obliterate unoptimised
> performance... :/

For a moment, I thought the performance crew lost you. :)
D's foreach is awesome. Remember this benchmark?
http://togototo.wordpress.com/2013/08/23/benchmarks-round-two-parallel-go-rust-d-scala-and-nimrod/
The top entry's performance is due to how well LDC2 optimized
the foreach; otherwise the code is essentially the same as the C++
version.

-- 
Marco