September 23, 2016 Re: topN using a heap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On 09/23/2016 11:40 AM, Stefan Koch wrote:
> On Friday, 23 September 2016 at 15:30:20 UTC, Andrei Alexandrescu wrote:
>> Work is blocked by https://issues.dlang.org/show_bug.cgi?id=16528,
>> which is quite a head-scratcher. Any ideas for a workaround? Thanks!
>> -- Andrei
>
> annotate the templates.
Nagonna work for generic functions -- Andrei
|
September 23, 2016 Re: topN using a heap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Friday, 23 September 2016 at 16:09:18 UTC, Andrei Alexandrescu wrote:
>
> BTW, as I commented in https://issues.dlang.org/show_bug.cgi?id=16517, using the new topN implementation instead of sort to compute the median over google's 1-grams is over 11x faster using dmd.
>
That's a very nice result. Both the absolute numbers and that all three sets achieve similar performance, as they different distribution characteristics.
--Jon
|
September 24, 2016 Re: topN using a heap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jon Degenhardt | On 09/23/2016 04:45 PM, Jon Degenhardt wrote: > That's a very nice result. Both the absolute numbers and that all three > sets achieve similar performance, as they different distribution > characteristics. Got curious so I tried to patch my ldc installation (0.17.1 (DMD v2.068.2, LLVM 3.8.0)), but sadly that didn't work out because sorting.d uses the new syntax in various places. Probably the best baseline is the equivalent C++ code using the mature implementation of nth_element, see http://paste.ofcode.org/ieZPdchjzTXbDcG3LvsYBP (also pasted at the end of this message). Here's what I got: http://paste.ofcode.org/6feyBLRiJtieHdZ3bmGaUW, see also text below. The topN code produced with dmd is faster in virtually all cases (has parity for -f 4, which I suspect is a degenerate case with most elements equal, which exercises only a small part of the algorithm). For C++ I used -O3 -DNDEBUG, any other flags I should use? As an aside, I enjoy poking fun at the stunning inefficiency of iostreams in my public talks; they delivered again :o). Andrei =========================== shell $ g++ -O3 -DNDEBUG -omedian_by_sort_cpp issue16517.cpp $ g++ -O3 -DNDEBUG -DTOPN -omedian_by_topn_cpp issue16517.cpp $ dmd -O -inline -release -ofmedian_by_sort -boundscheck=off issue16517.d $ dmd -O -inline -release -version=topn -ofmedian_by_topn -boundscheck=off issue16517.d $ cut -f 2 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort_cpp median (via sort): 1972; lines: 10512769; total time (ms): 3419.11; sort time (ms): 314.086 $ cut -f 2 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort median (via sort): 1972; lines: 10512769; total time (ms): 1462; sort time (ms): 399 $ cut -f 2 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn_cpp median (via nth_element): 1972; lines: 10512769; total time (ms): 3255.41; nth_element time (ms): 40.676 $ cut -f 2 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn median (via topN): 1972; lines: 10512769; total time (ms): 1211; topN time (ms): 32 $ cut -f 3 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort_cpp median (via sort): 3; lines: 10512769; total time (ms): 2949.11; sort time (ms): 297.814 $ cut -f 3 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort median (via sort): 3; lines: 10512769; total time (ms): 1351; sort time (ms): 382 $ cut -f 3 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn_cpp median (via nth_element): 3; lines: 10512769; total time (ms): 2316.75; nth_element time (ms): 65.041 $ cut -f 3 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn median (via topN): 3; lines: 10512769; total time (ms): 973; topN time (ms): 59 $ cut -f 4 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort_cpp median (via sort): 2; lines: 10512769; total time (ms): 2614.47; sort time (ms): 351.285 $ cut -f 4 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_sort median (via sort): 2; lines: 10512769; total time (ms): 1269; sort time (ms): 361 $ cut -f 4 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn_cpp median (via nth_element): 2; lines: 10512769; total time (ms): 2443.17; nth_element time (ms): 76.211 $ cut -f 4 /tmp/googlebooks-eng-all-1gram-20120701-0 | ./median_by_topn median (via topN): 2; lines: 10512769; total time (ms): 998; topN time (ms): 77 =========================== =========================== issue16517.cpp #include <iostream> #include <cstdio> #include <ctime> #include <vector> #include <algorithm> using namespace std; int main() { clock_t swTotal, swSort; swTotal = clock(); vector<double> values; double num; while (cin >> num) { values.push_back(num); } size_t medianIndex = values.size() / 2; swSort = clock(); #ifdef TOPN const char* method = "nth_element"; nth_element(values.begin(), values.begin() + medianIndex, values.end()); #else const char* method = "sort"; sort(values.begin(), values.end()); #endif clock_t done = clock(); printf("median (via %s): %g; lines: %zu; total time (ms): %g; %s time (ms): %g\n", method, values[medianIndex], values.size(), (done - swTotal) * 1000. / CLOCKS_PER_SEC, method, (done - swSort) * 1000. / CLOCKS_PER_SEC); } =========================== =========================== issue16517.d import std.stdio; import std.array : appender; import std.algorithm : sort, topN; import std.conv; import std.range; import std.datetime: StopWatch; void main() { StopWatch swTotal, swSort; swTotal.start; Appender!(double[]) values; foreach (line; stdin.byLine) values.put(line.to!double); size_t medianIndex = values.data.length/2; swSort.start; version(topn) { topN(values.data, medianIndex); auto method = "topN"; } else { sort(values.data); auto method = "sort"; } swTotal.stop; swSort.stop; writefln("median (via %s): %g; lines: %d; total time (ms): %d; %s time (ms): %d", method, values.data[medianIndex], values.data.length, swTotal.peek.msecs, method, swSort.peek.msecs); } =========================== |
September 25, 2016 Re: topN using a heap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Saturday, 24 September 2016 at 18:22:51 UTC, Andrei Alexandrescu wrote:
> Got curious so I tried to patch my ldc installation (0.17.1 (DMD v2.068.2, LLVM 3.8.0)), but sadly that didn't work out because sorting.d uses the new syntax in various places.
>
> Probably the best baseline is the equivalent C++ code using the mature implementation of nth_element, see http://paste.ofcode.org/ieZPdchjzTXbDcG3LvsYBP (also pasted at the end of this message). Here's what I got: http://paste.ofcode.org/6feyBLRiJtieHdZ3bmGaUW, see also text below.
>
> The topN code produced with dmd is faster in virtually all cases (has parity for -f 4, which I suspect is a degenerate case with most elements equal, which exercises only a small part of the algorithm). For C++ I used -O3 -DNDEBUG, any other flags I should use?
>
> [snip]
>
I added the topN pull request to a local LDC build (current LDC master). Also built the C++ nth_element and sort program you listed. (GCC 4.9.3, compiler line: g++-mp-4.9 -O3 -static-libgcc -static-libstdc++ -std=c++11).
Did 7 runs for each variant on the three fields in ngram file. Median times are below.
Median sort times (ms):
Field 2 Field 3 Field 4
DMD 351 348 326
LDC 273 261 245
C++ 281 260 246
Median topN / nth_element times (ms):
Field 2 Field 3 Field 4
LDC 21 44 57
C++ 41 60 71
Looking very good indeed!
|
September 25, 2016 Re: topN using a heap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jon Degenhardt | On 9/25/16 4:19 AM, Jon Degenhardt wrote:
> On Saturday, 24 September 2016 at 18:22:51 UTC, Andrei Alexandrescu wrote:
>> Got curious so I tried to patch my ldc installation (0.17.1 (DMD
>> v2.068.2, LLVM 3.8.0)), but sadly that didn't work out because
>> sorting.d uses the new syntax in various places.
>>
>> Probably the best baseline is the equivalent C++ code using the mature
>> implementation of nth_element, see
>> http://paste.ofcode.org/ieZPdchjzTXbDcG3LvsYBP (also pasted at the end
>> of this message). Here's what I got:
>> http://paste.ofcode.org/6feyBLRiJtieHdZ3bmGaUW, see also text below.
>>
>> The topN code produced with dmd is faster in virtually all cases (has
>> parity for -f 4, which I suspect is a degenerate case with most
>> elements equal, which exercises only a small part of the algorithm).
>> For C++ I used -O3 -DNDEBUG, any other flags I should use?
>>
>> [snip]
>>
> I added the topN pull request to a local LDC build (current LDC master).
> Also built the C++ nth_element and sort program you listed. (GCC 4.9.3,
> compiler line: g++-mp-4.9 -O3 -static-libgcc -static-libstdc++ -std=c++11).
>
> Did 7 runs for each variant on the three fields in ngram file. Median
> times are below.
>
> Median sort times (ms):
>
> Field 2 Field 3 Field 4
> DMD 351 348 326
> LDC 273 261 245
> C++ 281 260 246
>
> Median topN / nth_element times (ms):
>
> Field 2 Field 3 Field 4
> LDC 21 44 57
> C++ 41 60 71
>
> Looking very good indeed!
Thanks for this work! -- Andrei
|
Copyright © 1999-2021 by the D Language Foundation