Thread overview | |||||||||
---|---|---|---|---|---|---|---|---|---|
|
October 19, 2010 [Issue 5077] New: std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=5077 Summary: std.algorithm.schwartzSort is slow Product: D Version: D2 Platform: x86 OS/Version: Windows Status: NEW Keywords: performance Severity: enhancement Priority: P2 Component: Phobos AssignedTo: nobody@puremagic.com ReportedBy: bearophile_hugs@eml.cc --- Comment #0 from bearophile_hugs@eml.cc 2010-10-18 19:08:43 PDT --- Here are few benchmarks that show that schwartzSort() is much slower than sort(). (While in Python the usage of the 'key' argumement, analogous to a Schwartz sort, usually leads to faster code. But Python and D have very different compilation structure, so comparisons are hazardous at best.) Timings 2.1 GHz CPU, , DATA_LEN=1_000_000, best of 4, seconds: none: 0.3 sort: 4.1 sort: 3.4 (alternative) schwartz: 17.9 I have no idea why the "alternative" sort is faster than the normal sort, I was expecting the opposite. ----------------------------------------- import std.stdio: writeln; import std.algorithm: schwartzSort, sort; import std.random: Random, uniform; import std.typecons: Tuple; enum SortType { none, sort, schwartz } enum DATA_LEN = 1_000_000; enum sort_type = SortType.sort; alias Tuple!(double, "x", double, "y") P; void main() { auto data = new P[DATA_LEN]; auto rnd = Random(1); foreach (ref el; data) { double x = uniform(0.0, 1.0, rnd); double y = uniform(0.0, 1.0, rnd); el = P(x, y); } if (data.length < 50) writeln(data); static if (sort_type == SortType.schwartz) { schwartzSort!((p) { return p.y; })(data); schwartzSort!((p) { return p.x; })(data); schwartzSort!((p) { return p.y; })(data); } static if (sort_type == SortType.sort) { sort!q{ a.y < b.y }(data); sort!q{ a.x < b.x }(data); sort!q{ a.y < b.y }(data); /* // alternative sort!((P a, P b) { return a.y < b.y; })(data); sort!((P a, P b) { return a.x < b.x; })(data); sort!((P a, P b) { return a.y < b.y; })(data); */ } if (data.length < 50) writeln(data); } -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
October 19, 2010 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 Peter Alexander <peter.alexander.au@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |peter.alexander.au@gmail.co | |m --- Comment #1 from Peter Alexander <peter.alexander.au@gmail.com> 2010-10-19 00:05:18 PDT --- I get quite similar timings: sort: 3.8 alt. sort: 3.4 schwartz: 25.8 This is with -inline -O -release. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
October 19, 2010 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 Andrei Alexandrescu <andrei@metalanguage.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |andrei@metalanguage.com --- Comment #2 from Andrei Alexandrescu <andrei@metalanguage.com> 2010-10-19 06:37:48 PDT --- This is a misunderstanding of Schwartz sort is supposed to do. D's equivalent of Python's key argument is not Schwartz sort, but instead: sort!q{p.x < p.y}(data); i.e. the keys are not copied but instead a projection is used for the comparison. That's your "alt" sort. Schwartz sort is meant for usage when the key computation (in this case a simple member access) is too expensive to be done more than once. To avoid that, schwartzSort creates an array of the computed keys and then sorts the keys and the original arrays in lockstep. It is expected that schwartzSort is considerably slower than sort for cheap keys. It is also expected that "alt" sort is the best of the breed because it has the fastest key computation. I'm leaving this open in case you have new experiments that do reveal a problem. Otherwise, feel free to close it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
October 22, 2010 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 --- Comment #3 from bearophile_hugs@eml.cc 2010-10-22 04:52:28 PDT --- (In reply to comment #2) Thank you for your answers. > This is a misunderstanding of Schwartz sort is supposed to do. D's equivalent of Python's key argument is not Schwartz sort, but instead: > > sort!q{p.x < p.y}(data); > > i.e. the keys are not copied but instead a projection is used for the comparison. That's your "alt" sort. In that D benchmark there are 3 kinds of sort, two use the normal sort() the other uses schwartzSort. The "alternative" version is still a normal sort. The only difference is that it uses a delegate, as you see from the code: (P a, P b) { return a.y < b.y; } instead of a "template lambda": q{ a.y < b.y } Regarding Python, years ago Python2 used to have just a sort like this, with the "cmp" argument: s.sort(cmp) From the docs: http://docs.python.org/library/stdtypes.html#mutable-sequence-types cmp specifies a custom comparison function of two arguments (list items) which should return a negative, zero or positive number depending on whether the first argument is considered smaller than, equal to, or larger than the second argument: cmp=lambda x,y: cmp(x.lower(), y.lower()). The default value is None. Recent Python2 versions use have a more complex sort signature: s.sort([cmp[, key[, reverse]]]) Where beside the "cmp" that's still present, there is the "key" function: key specifies a function of one argument that is used to extract a comparison key from each list element: key=str.lower. The default value is None. Python3 has removed the cmp argument. In the bug report I was referring to "key" that's a function that takes a single argument and return a single transformed item. So in Python if you use the "key" you are performing a Schwartz sort. C source code of CPython is available online, if you use "key" CPython builds a temporary array of the transformed data, that is used to sort the true data. > I'm leaving this open in case you have new experiments that do reveal a problem. Otherwise, feel free to close it. The performance of schwartzSort is too much low, so in my opinion the bug report needs to be open still. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
October 22, 2010 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 --- Comment #4 from bearophile_hugs@eml.cc 2010-10-22 04:55:33 PDT --- Sorry, I have forgotten a Python version of the code: from random import random, seed from operator import itemgetter SortType_none = 0 SortType_sort = 1 SortType_schwartz = 2 DATA_LEN = 1000 # ********** sort_type = SortType_schwartz def main(): seed(1) data = [(random(), random()) for _ in xrange(DATA_LEN)] if len(data) < 50: print data if sort_type == SortType_schwartz: data.sort(key=itemgetter(1)) data.sort(key=itemgetter(0)) data.sort(key=itemgetter(1)) if sort_type == SortType_sort: data.sort(cmp=lambda a, b: cmp(a[1], b[1])) data.sort(cmp=lambda a, b: cmp(a[0], b[0])) data.sort(cmp=lambda a, b: cmp(a[1], b[1])) if len(data) < 50: print data main() -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
January 09, 2011 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 Andrei Alexandrescu <andrei@metalanguage.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED AssignedTo|nobody@puremagic.com |andrei@metalanguage.com -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 21, 2011 [Issue 5077] std.algorithm.schwartzSort is slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc | http://d.puremagic.com/issues/show_bug.cgi?id=5077 --- Comment #5 from bearophile_hugs@eml.cc 2011-06-21 15:53:29 PDT --- See bug 6192 for more -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
Copyright © 1999-2021 by the D Language Foundation