May 21, 2018
On Monday, 21 May 2018 at 15:00:09 UTC, Dennis wrote:
> I want to be convinced that Range programming works like a charm, but the procedural approaches remain more flexible (and faster too) it seems. Thanks for the example.
>
On Monday, 21 May 2018 at 22:11:42 UTC, Dennis wrote:
> In this case I used drop to drop lines, not characters. The exception was thrown by the joiner it turns out.
>  ...
> From the benchmarking I did, I found that ranges are easily an order of magnitude slower even with compiler optimizations:

My general experience is that range programming works quite well. It's especially useful when used to do lazy processing and as a result minimize memory allocations. I've gotten quite good performance with these techniques (see my DConf talk slides: https://dconf.org/2018/talks/degenhardt.html).

Your benchmarks are not against the file split case, but if you benchmarked that you may have also seen it as slow. In that case you may be hitting specific areas where there are opportunities for performance improvement in the standard library. One is that joiner is slow (PR: https://github.com/dlang/phobos/pull/6492). Another is that the write[fln] routines are much faster when operating on a single large object than on many small objects. For example, it's faster to call write[fln] with an array of 100 characters than: (a) calling it 100 times with one character; (b) calling it once, with 100 characters as individual arguments (template form); (c) calling it once with a range of 100 characters, each processed one at a time.
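A minimal sketch of the difference in call pattern between the single-array form and case (a). The helper names `writeWhole` and `writePerChar` are hypothetical and used only for illustration. The sketch shows the shape of the calls and does not measure anything.

```d
import std.stdio : File, stdout;

/// Single large object: the whole array is passed to write in one call.
void writeWhole(File f, const(char)[] data)
{
    f.write(data);
}

/// Many small objects: one write call per character (case (a) above).
/// Assumes ASCII input so that each code unit is a complete character.
void writePerChar(File f, const(char)[] data)
{
    foreach (char c; data)
        f.write(c);
}

void main()
{
    // Both calls produce the same output; only the number of write calls differs.
    writeWhole(stdout, "hello, world\n");
    writePerChar(stdout, "hello, world\n");
}
```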

When joiner is used as in your example, you not only hit the joiner performance issue, but the write[fln] issue as well. This is due to something that may not be obvious at first: when joiner is used to concatenate arrays or ranges, it flattens out the array/range into a single range of elements. So, rather than writing a line at a time, your example is effectively passing a character at a time to write[fln].
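The flattening can be checked directly. This is a small sketch, assuming an array of strings as input; the helper name `flatten` is hypothetical. The element type of the joined range is a single character, not a line.

```d
import std.algorithm : equal, joiner;
import std.range : ElementType, walkLength;
import std.traits : Unqual;

/// Hypothetical helper: joins lines with a newline separator.
auto flatten(string[] lines)
{
    return lines.joiner("\n");
}

void main()
{
    string[] lines = ["ab", "cd", "ef"];

    // The result is a range of characters (dchar for strings), not a range of lines.
    alias Elem = Unqual!(ElementType!(typeof(flatten(lines))));
    static assert(is(Elem == dchar));

    // 6 letters plus 2 separators, visited one element at a time.
    assert(flatten(lines).walkLength == 8);
    assert(flatten(lines).equal("ab\ncd\nef"));
}
```

Anything that consumes this range, write[fln] included, sees individual characters rather than whole lines.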

So, in the file split case, using byLine in an imperative fashion as in my example will have the effect of passing a full line at a time to write[fln], rather than individual characters. Mine will be faster, but not because it's imperative. The same thing could be achieved procedurally.
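As an illustration that the line-at-a-time behavior does not depend on an imperative loop, here is a sketch that keeps a range pipeline but hands each full line to write. The function name `copyLines` and its `skip`/`count` parameters are hypothetical, not taken from the earlier examples in this thread.

```d
import std.algorithm : each;
import std.range : drop, take;
import std.stdio : File, KeepTerminator, stdout;

/// Skips `skip` lines of `input`, then copies the next `count` lines to `output`.
/// Each write call receives one full line, not individual characters.
void copyLines(File input, File output, size_t skip, size_t count)
{
    input.byLine(KeepTerminator.yes)
         .drop(skip)
         .take(count)
         .each!(line => output.write(line));
}

void main()
{
    // Use a temporary file as input so the sketch is self-contained.
    auto src = File.tmpfile();
    src.write("a\nb\nc\nd\n");
    src.rewind();

    copyLines(src, stdout, 1, 2);
}
```

The point is that joiner is avoided, so no flattening to characters happens before the data reaches write.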

Regarding the benchmark programs you showed: this is very interesting, and it would certainly be worth looking into further. One thing I wonder is whether the performance penalty is due to a lack of inlining when crossing library boundaries. The imperative versions aren't crossing these boundaries. If you're willing, you could try adding LDC's LTO options and see what happens. There are some instructions in the release notes for LDC 1.9.0 (https://github.com/ldc-developers/ldc/releases). Make sure you use the form that includes druntime and phobos.

--Jon