AliasSeq seems to compile slightly faster with static foreach
January 05, 2018
There was a recent PR for Phobos where Seb added static to a bunch of foreach's that used AliasSeq. It hadn't actually occurred to me that that was legal (I've basically just been using static foreach where foreach with AliasSeq doesn't work), but it is legal (which I suppose isn't surprising when you think about it; I just hadn't). However, that got me to wondering if such a change was purely aesthetic or whether it might actually have an impact on build times - particularly since running dub test for one of my recent projects keeps taking longer and longer. So, I added static to a bunch of foreach's over AliasSeqs in that project to see if it would have any effect. The result was that dub test went from about 16.5 seconds on my system to about 15.8 seconds - and that's just by adding static to the foreach's over AliasSeqs, not fundamentally changing what any of the code did. That's not a huge speed up, but it's definitely something and far more than I was expecting.

Of course, you have to be careful with such a change, because static foreach doesn't introduce a new scope, and double braces are potentially required where they weren't before, but given that I'd very much like to streamline that test build, adding static to those foreach's was surprisingly worthwhile.
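To illustrate the scoping caveat, here is a minimal sketch (not code from the project being discussed; `totalSize` is a hypothetical name). Because static foreach pastes its body into the enclosing scope once per element, a local declared in the body would be declared three times here; the inner pair of braces gives each copy its own scope.

```d
import std.meta : AliasSeq;

// Hypothetical helper, for illustration only: sums the sizes of a few types.
size_t totalSize()
{
    size_t total;

    // Outer braces belong to static foreach and do NOT open a scope.
    // Inner braces are an ordinary block statement, so each pasted copy
    // of the body gets its own `tmp`.
    static foreach (T; AliasSeq!(int, long, short))
    {{
        T tmp = T.init;
        total += tmp.sizeof;
    }}

    return total;
}

// Checked at compile time: 4 (int) + 8 (long) + 2 (short).
static assert(totalSize() == 14);
```

With only a single pair of braces, the second copy of `T tmp` would conflict with the first, which is the kind of breakage to watch for when adding static to an existing foreach.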

Taking it a step further, I tried switching some of the static foreach's over to using array literals, since they held values rather than types, and that seemed to have minimal impact on the time to run dub test. However, by switching to using std.range.only, it suddenly was taking more like 11.8 seconds. So, with a few small changes, I cut the time to run dub test down by almost a third.
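For reference, the three forms being compared look roughly like this. It is a toy sketch with hypothetical function names, not the project's actual tests, and it says nothing about timing by itself; any speed difference would have to be measured on a real build.

```d
import std.meta : AliasSeq;
import std.range : only;

// Values held in an AliasSeq.
int sumAliasSeq()
{
    int sum;
    static foreach (v; AliasSeq!(1, 2, 3))
    {
        sum += v;
    }
    return sum;
}

// The same values as an array literal.
int sumArrayLiteral()
{
    int sum;
    static foreach (v; [1, 2, 3])
    {
        sum += v;
    }
    return sum;
}

// The same values as a std.range.only range.
int sumOnly()
{
    int sum;
    static foreach (v; only(1, 2, 3))
    {
        sum += v;
    }
    return sum;
}

// All three unroll to the same additions.
static assert(sumAliasSeq() == 6);
static assert(sumArrayLiteral() == 6);
static assert(sumOnly() == 6);
```

All three only work this way because the elements are values; an AliasSeq of types has no array literal or only equivalent.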

- Jonathan M Davis

January 05, 2018
On Friday, 5 January 2018 at 13:10:25 UTC, Jonathan M Davis wrote:
> There was a recent PR for Phobos where Seb added static to a bunch of foreach's that used AliasSeq. It hadn't actually occurred to me that that was legal (I've basically just been using static foreach where foreach with AliasSeq doesn't work), but it is legal (which I suppose isn't surprising when you think about it; I just hadn't). However, that got me to wondering if such a change was purely aesthetic or whether it might actually have an impact on build times - particularly since running dub test for one of my recent projects keeps taking longer and longer. So, I added static to a bunch of foreach's over AliasSeqs in that project to see if it would have any effect. The result was that dub test went from about 16.5 seconds on my system to about 15.8 seconds - and that's just by adding static to the foreach's over AliasSeqs, not fundamentally changing what any of the code did. That's not a huge speed up, but it's definitely something and far more than I was expecting.
>
> Of course, you have to be careful with such a change, because static foreach doesn't introduce a new scope, and double braces are potentially required where they weren't before, but given that I'd very much like to streamline that test build, adding static to those foreach's was surprisingly worthwhile.
>
> Taking it a step further, I tried switching some of the static foreach's over to using array literals, since they held values rather than types, and that seemed to have minimal impact on the time to run dub test. However, by switching to using std.range.only, it suddenly was taking more like 11.8 seconds. So, with a few small changes, I cut the time to run dub test down by almost a third.
>
> - Jonathan M Davis

It does not make any sense to me as to why using only instead of AliasSeq resulted in a speedup. I would've expected no change or worse performance. Any theories?
January 05, 2018
On Friday, January 05, 2018 13:16:52 Meta via Digitalmars-d wrote:
> It does not make any sense to me as to why using only instead of AliasSeq resulted in a speedup. I would've expected no change or worse performance. Any theories?

I don't know. It's probably related to however foreach over an AliasSeq is implemented. The fact that only is faster than using an array literal doesn't surprise me though, since CTFE probably does a bunch of extra, unnecessary allocation when dealing with array literals.

Maybe the speed difference between AliasSeq and only is a sign of something that could be improved in the compiler's implementation, or maybe only is just fundamentally faster for one reason or another. I don't know. But from what I can tell, the speed difference is large enough that it's kind of crazy to use AliasSeq with values when static foreach and only will work just as well.

- Jonathan M Davis

January 07, 2018
On 05.01.2018 14:10, Jonathan M Davis wrote:
> Taking it a step further, I tried switching some of the static foreach's
> over to using array literals, since they held values rather than types, and
> that seemed to have minimal impact on the time to run dub test. However, by
> switching to using std.range.only, it suddenly was taking more like 11.8
> seconds. So, with a few small changes, I cut the time to run dub test down
> by almost a third.

This is weird, as the compiler will apply the following rewrites:

static foreach(i;only(0,1,2,3,4)){}

=>

static foreach(i;{
    int[] r=[];
    foreach(i;only(0,1,2,3,4)){
        r~=i;
    }
    return r;
}()){}

=> // (using CTFE)

static foreach(i;[0,1,2,3,4]){}

=> // (uses shortcut; not instantiating the AliasSeq template)

static foreach(i;AliasSeq!(0,1,2,3,4)){}
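
The middle step of that rewrite can be reproduced by hand. The following is only a sketch of the described lowering written as ordinary D (the names `collected` and `sumOverOnly` are made up for illustration); it is not what the compiler literally emits.

```d
import std.range : only;

// Hand-written version of the second step: a function literal that walks
// the range and appends each element to an array, evaluated via CTFE.
enum int[] collected = {
    int[] r = [];
    foreach (i; only(0, 1, 2, 3, 4))
    {
        r ~= i;
    }
    return r;
}();

// After CTFE the range has become a plain array of values.
static assert(collected == [0, 1, 2, 3, 4]);

// Iterating the range directly with static foreach visits the same values.
int sumOverOnly()
{
    int sum;
    static foreach (i; only(0, 1, 2, 3, 4))
    {
        sum += i;
    }
    return sum;
}

static assert(sumOverOnly() == 10);
```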
January 07, 2018
On Sunday, January 07, 2018 09:59:30 Timon Gehr via Digitalmars-d wrote:
> On 05.01.2018 14:10, Jonathan M Davis wrote:
> > Taking it a step further, I tried switching some of the static foreach's over to using array literals, since they held values rather than types, and that seemed to have minimal impact on the time to run dub test. However, by switching to using std.range.only, it suddenly was taking more like 11.8 seconds. So, with a few small changes, I cut the time to run dub test down by almost a third.
>
> This is weird, as the compiler will apply the following rewrites:
>
> static foreach(i;only(0,1,2,3,4)){}
>
> =>
>
> static foreach(i;{
>      int[] r=[];
>      foreach(i;only(0,1,2,3,4)){
>          r~=i;
>      }
>      return r;
> }()){}
>
> => // (using CTFE)
>
> static foreach(i;[0,1,2,3,4]){}
>
> => // (uses shortcut; not instantiating the AliasSeq template)
>
> static foreach(i;AliasSeq!(0,1,2,3,4)){}

I don't know. Looking at only's implementation, it uses a static array internally, not a dynamic one, so I would not have expected the lowering that you describe, but I don't know enough about the details of how the compiler works to know what it would really do - especially during CTFE - and I'd expect you to know far more about that than I would.

Either way, with my project at least, with the unit tests that I was generating using foreach or static foreach, using static foreach with AliasSeq was slightly faster than using normal foreach, using array literals with static foreach was about the same as using AliasSeq with static foreach, and using only with static foreach was way faster than either. It was enough faster that switching all of my non-static foreach's over AliasSeqs of values to static foreach with only (or lockstep with iota and only where I needed indices) resulted in dub test taking a bit over half as long as it did before. Now, I had a lot of unit tests using foreach with AliasSeqs of values, which is why the impact was so great in my case, but anecdotally, it implies that the compiler treats a static foreach over a std.range.only quite differently from one over an array literal or AliasSeq.

- Jonathan M Davis