Jump to page: 1 2
Thread overview
Work done so far
Jan 21, 2008
bearophile
fastSort benchmarking (was Re: Work done so far)
Jan 21, 2008
Matti Niemenmaa
Jan 21, 2008
Robert Fraser
Jan 21, 2008
Christopher Wright
Jan 21, 2008
bearophile
Jan 21, 2008
Matti Niemenmaa
Jan 21, 2008
bearophile
Jan 21, 2008
torhu
Jan 21, 2008
Sean Kelly
Jan 21, 2008
BCS
Jan 22, 2008
bearophile
Jan 22, 2008
BCS
January 21, 2008
Hello, in this long post I explain what's the current situation with the D libs I am developing (I call them just "d libs" for the lack of a better name).

http://www.fantascienza.net/leonardo/so/libs_d.zip

I am showing the current version (2.48, but I update them almost daily. In the zip there is the "meta" package too, not written by me, that is used by just a bit of d libs code) of the code now because their growth is now slow enough, they already contain most of the things I need, and because Walter and Sean have just said to me:

Walter>If you have a faster qsort, and want to contribute it, please do so! Same goes for other faster routines you've written.<

Sean Kelly>I'd be interested in seeing that.  I've been able to beat the D sort for some data sets and match it in others, but not beat it across the board.<

In those d libs ("func" module) there is the fastSort() too I was talking about, plus lot of other things [note: that sort of mine is really simple, and it looks much similar to the sort already present in Phobos, but it run faster to me. In the code you can find some speed tests too. It's a quicksort plus a Simple Insertion Sort. But if you need a faster sort I suggest to adapt something much better than the naive 60-lines long sort I have written, like the Timsort used by Python. I have tested the speed of my code only in few situations, not long and lot]. It's about 388 KB of clean D code, with full unittests and ddocs. They are mostly functions, but there are few small classes too.

I post them as open source hoping to see part of that stuff added to Phobos (or Tango, if you want, but they may need some time to be ported). They run on D v1.025, so they may need some little changes to run with D v2.x. But so far Python developers have refused all my offered improvements to the Python std lib, so my hopes of helping D std lib are nearly zero already :o) I'll use part of this stuff anyway, just as I use my bag of tricks in developing my Python code.

Probably among those things there are functions/classes may be just nice-looking toys of little practical usage (example: on D v2.x xrange() is probably nearly useless), few ones are just different ways of doing things already present in Phobos, but I think (many) other things can be useful.

Those d libs are a mixed bag of things, but behind more than 70% of those things there is a common rationale: I think D is a multi-level language. I have seen I can use D to write in a low-level style (like using ASM or programming in a low-level-C-like style), or I can use it with a style more similar to Python code.

Usually (but not always) D programs written in lower level style are faster, but if you have used a lot both lower level language (C, Delphi, etc) and scripting languages (Python, Ruby, Perl) (and maybe some Functional language too (Scheme, etc)) you know that there's a big need for both levels of coding.

You can write all the program in a low-level style, but it often requires lot of coding time, lot of code, and lot of debugging. On the other hand you can write most programs in Python, but sometimes you feel the need to have a language that produces faster code, that allows you to access raw memory bytes to implement your data structure filled with pointers, etc.

Many programs I have written have some "hot spots", critical parts, that must be fast, and the are often quite larger parts of the program that are I/O bound, GUI stuff, or parts that process little data, etc. So I have felt the need of a language that allows me to write the critical parts in a low-level style, while allowing me to write higher-level code in all the other parts of the program. D and Lisp are among the few languages that I think allow such multi-level style of programming without forcing you to use another language (well, D you use ASM, that's another language, but it's built-in, you don't need another compiler to compile C/Pyrex/Cython/ShedSkin extensions like in Python). (I am sure there are other kinds of programs that need to be uniformly fast. For such programs a single-level language may be enough). (Probably you can do multi-level programming in C++ too, but I think C++ is a quite complex language). If you have used only C/C++/Delphi then probably you need more time to understand why a higher level of coding is often very very positive.

D (like D. v2.x) may allow such higher-level of coding adding some things to the core language. But I have tried to add some things using the language itself. The results aren't perfectly fine because some things are missing (example: D doesn't offer a simple way to scan two lazy iterables in parallel), some syntax isn't nice, but maybe some of the things I have written may be useful, easy and nice-looking enough.

So most of the things you can find in those d libs are meant to be used in places where running speed (and memory) isn't the main concern (but usually I have coded them as fast as possible). Where you need max running speed you can just use C-style constructs. But if you have to write a 10-lines D script to process some text, if you have to write a part of code that's not meant to be maximally fast, etc, they may be useful. So you hopefully end writing those parts in less time, with less code and less bugs.

Those d libs contain mixed things (about 130, and they can be combined in many ways):
- They contain some higher level functions you can find in Python. Python has its faults, and I don't believe it's the best possible language, but I believe its design has many nice things, that I have tried to copy. So I am not ashamed by the fact that many of the things you can find in those d libs are so much similar to Python syntax (and I think D already contains lot of things copied from Python).
- Some support of lazy generators, especially some of the things you can find in the standard Python module itertools:
http://docs.python.org/lib/module-itertools.html
- Some fast mathematical/string functions missing in Phobos that I need to use.

As I've said, some of the things you can find in those d libs may end being just useless toys (and probably I'll move/remove them away when I am convinced that I'll not use them, this d lib is just few months old, so I have used it only few hundred times) but they are (or they want to be, failing) easy to use and safe. If they are hard to use, or they need too much accesses to the docs to be used, then they have missed most of their purpose, because they are meant for the parts of the code where you want max *coding&debugging* speed instead of running speed.


For example, here are the functions/classes of the main module (func.d), other things can be found in the other modules:
AA             create an AA of given types
AAKeyType
AAValType
all            true if all items of the given iterable are true
any            true if one or more items of the given iterable are true
apply          return result of function applied to given arguments
array          copy of items of iterable class, array, etc
arrayMul       return given array multiplied 'times' times
ArrayType      return the type of the base elements of an n-dimensional array
ArrayType1     like ArrayType, but it goes down just 1 level.
autoAssign     to be used with a mixin, to assign some attributes inside class constructors
azip           zip arrays
BaseType       combines IterType, ArrayType1 and AAKeyType
BaseTypedef    template, gives the base type of type type-defined one or more times
cinterval      closed arithmetic progressions of chars
clear          quickly clear an AA/dynamic array in-place, and return it too
DA             create a dynamic array of given type
DeconstArrType if given a static array return the type of a dynamc array with the same item type
dict           builds an AA from given keys and values
doTable        return array with the results of fun called with no arguments range times
dup            return the (dynamic) duplicate of the given AA/array.
equalAA        true if the two given AAs are equal
fastSort       faster than the .sort builtin
filter         to filter an iterable
filterAA       to filter an AA
flatten        return the flattened 1-dimensional array of the elements.
floatLen       len of float array
frequency      return AA, its keys are unique items, the values are their frequencies
fromKeys       builds an AA from given keys and value
groupBy        like Python grouby
intersectionAA return an AA with (key,val) pairs present in both input AAs
intLen         len of int array
IsAA
IsArray
IsCallable
IsDynamicArray
isIn           true if item is into iterable
isInSub        true if a subseq is into iterable
IsIterable
IsPointer
IsStaticArray
IsStruct
IterType       given the type of an iterable class, return the type of the items it yields.
joinArrays
leftRotate     rotate array to the left
len            len of any iterable, lazy ones too
Lets1          generate assigment instructions with a mixin
Lets2          generate assigment instructions with a mixin
Lets3          generate assigment instructions with a mixin
map            to map a function on one more arrays
map2           to map a function to a single iterable with foreach(x, y; items)
max
maxs           return all max among arguments
min
mins           return all min among arguments
moduleName     given __FILE__ return the name of the module it is called in
partition      return given iterable divided into blocks
peek           return an element of the given array/AA/iterable object
posMax         return index of max into given iterable
posMin         return index of min into given iterable
put            alias of writef
putr           alias of writefln
Raises         return true if given lazy expression raises specified exceptions
range
rightRotate    rotate array to the right
SeriesGen1     generates a natural series from min to max, for mixin, 1 replacement
SeriesGen2     generates a natural series from min to max, for mixin, 2 replacements
SeriesGen3     generates a natural series from min to max, for mixin, 3 replacements
SeriesGen4     generates a natural series from min to max, for mixin, 4 replacements
sorted         return sorted iterable, according to key() too
sortedAA       return sorted AA, according to key() too
splitArray     splits according to the given delimiter
stableSort     accepts a sorting predicate too
StaticEval     to execute a function at compile time
staticSort     compile-time sort
stdinRead      quickly read all stdin into a string
str            alias of format
strLen         len of string array
Struct         defines the type of a struct that contains given types
StructArr      like struct, but the input types are arrays (takes tye type of their elements)
structCmp      cmp among structs
sum            to sum the items of an iterable
toStruct       puts given values into a struct
Tuple          to define a tuple.
xcycle         return an infinite generator object that yields the input items in a cycle
xmap           iterative version of map
xmap2          iterative version of map2
xrange
Xrepeat        yields 'el' 'times' times, or forever
xsplitArray    lazy version of splitArray
xstdin         stdin wrapper, yields stdin lines or reads it all
xzip           lazy zip fit for "foreach"
zip            zips different arrays, return struct array


Many other things can be found in the other modules, d.math, d.string, d.random, etc.


Some usage examples of my D libs, mostly coming from the unittests. This shows just a part (~40-50%) of the function/classes that are present, the module d.func alone contains lot of things (few things are copied from Tango, like the Kiss RND generator, few little templates, etc, so far I have used this d libs only for personal use, but if necessary I can remove those things, they are all labeled):

import d.all;

void main() {
    // tinyvector module example, they are quite fast
    alias TinyVector!(float, 3) TyV;
    auto v1 = new TyV;
    auto v2 = new TyV;
    putr(str(v1)); // prints <[nan, nan, nan]>
    v1.set(10, 20, 30);
    v2.set(1, 2, 3);
    putr(v1.sqrLength(), ' ', v1.dot(v2)); // prints 1400.0 140.0

    // comb module example, a lazy iterable combinations
    foreach(p; xpermutations("abc", false))
        put(p, ' '); // prints abc bac cab acb bca cba
    putr();

    // random module example
    seed(10);
    putr(shuffled("abcdefghil"[])); // prints iecbgldfah

    // math module example, very fast approximate 1/sqrt
    putr(appInvSqrt(2)); // prints 0.70693
    putr(countBitUint(0b1010_0100__0100_0101__0000_0010__0001_0001));
    // prints 9

    // string module examples
    // Smart and numerically-aware sort:
    putr(numSorted(["aa35", "Aa6", "aA261"], true)); // prints ["Aa6", "aa35", "aA261"]

    // func module, for functional-style programming and more
    int add3(int x, int y, int z) { return x + y + z; }
    putr(apply(&add3, 2, 3, 4)); // prints 9
    // lists iterable objects and copies arrays/AAs/etc:
    putr(array("ab cd ef".xsplit())); // prints ["ab", "cd", "ef"]
    // Helpers to create AAs:
    putr(dict("abc", "def")); // prints ['a': 'd', 'b': 'e', 'c': 'f']
    putr(fromKeys("ab", 5)); // prints ['a': 5, 'b': 5]
    // More functional stuff
    int neg(int x) { return -x; }
    putr(map(&neg, xrange(3))); // prints [0, -1, -2]
    int sum3(int x, int y, int z) { return x + y + z; }
    putr(map(&sum3, [100, 200], [10, 20, 30], [1, 2, 3])); // prints [111, 222]
    // lazy version too (here "listed" anyway)
    putr(array(xmap((int x){return -x;}, [3, 1, -5, 11]))); // prints [-3, -1, 5, -11]
    int[] data2 = [33, -12, 0, 15];
    putr(filter((int x){ return x > 0; },data2)); // prints [33, 15]
    int[char[]] aa = ["aa":2, "BB":1, "cc":-2, "DD":0, "ee":5];
    bool fil(string k, int v) { return k > "DD" && v < 2; }
    putr(filterAA(&fil, aa)); // ["cc": -2]
    int[] ve1 = [1, 2, 3, 4, 5, 6];
    int[] ve2 = [3, 4, 5, 6, 7];
    int[] ve3 = [-1, -2, -3, -4];
    putr(azip(ve1, ve2, ve3, ve1));
    // prints [[1, 3, -1, 1], [2, 4, -2, 2], [3, 5, -3, 3], [4, 6, -4, 4]]
    char[][] az1 = ["hello", "how", "are"];
    auto az2 = [70, 60, 50, 40, 30];
    float[] az3 = [1.9, 2.8, 3.7, 4.6, 5.5, 6.4];
    auto az4 = "abcdefghi";
    putr(zip(az1, az2, az3, az4));
    // prints [<"hello", 70, 1.9, 'a'>, <"how", 60, 2.8, 'b'>, <"are", 50, 3.7, 'c'>]
    putr(range(3, 10)); // prints [3, 4, 5, 6, 7, 8, 9]
    // lazy too:
    putr(array(xrange(3, 10))); // prints [3, 4, 5, 6, 7, 8, 9]
    putr(cinterval('g', 'a', -1)); // prints gfedcba
    putr(sum([1.5, 2.5, 3.5], 2.0)); // prints 9.5
    // lazyly too:
    putr(sum(xrange(5), 0)); // prints 10
    putr(max(1, 2, 3)); // prints 3
    putr(max([1, 2, 3, 4, 5, 6])); // prints 6
    putr(max(range(10))); // prints 9
    putr(min([-1, -2, -3, -4, -5, -6], &abs)); // prints -1
    // find all max/min:
    putr(maxs([[-1,-2], [-1,-3], [-1,-3], [-1,-2]], (int[] a){return map(&abs,a);}));
    // prints: [[-1, -3], [-1, -3]]
    // to find the index of max/min:
    putr(posMax([1, 2, 7, 4, 5, 6])); // prints: 2
    // all()/any() (they work on iterable objects, AAs, etc):
    class IsNegative { bool opCall(int x) { return x < 0; } }
    putr(all([-1:0, -2:0, -3:0], new IsNegative)); // prints: true
    // sorting with key function.
    // There is a sortedAA too that works on keys&values
    putr(sorted([10, 1, -20, 0, 1], &abs)); // prints: [0, 1, 1, 10, -20]
    putr(flatten([[[1], [2]], [[3], [4], [5, 6]]]));
    // prints: [1, 2, 3, 4, 5, 6]
    int[] ap = [1, 2, 3 ,4 ,5, 6, 7];
    putr(partition(ap, 3)); // prints: [[1, 2, 3], [4, 5, 6]]
    // iterable objects too:
    putr(partition(xrange(7))); // prints: [[0, 1], [2, 3], [4, 5]]
    putr(isIn('f', "felipe")); // prints: true
    putr(isInSub([1.1, 1.2], [-1.5, 2.3, 1.6, 1.1, 1.2]));
    // prints: true
    putr(frequency([4, 5, 4, 5, 4])); // prints: [4: 3, 5: 2]
    // lazily cycle any kind of iterable
    // The doTable builds an array calling a callable a certain number of times
    auto c1 = xcycle("Abcd");
    putr(doTable(&c1.next, 17)); // prints: AbcdAbcdAbcdAbcdA
    // lazily groups items of any kind of iterable according to given key():
    bool myisupper(char c) { return c >= 'A' && c <= 'Z'; }
    putr(array(groupBy("ALLOrachENEPE", &myisupper)));
    // prints: ["ALLO", "rach", "ENEPE"]
    putr(array(groupBy([1, 1, 1, 2, 2, 3, 3, 3])));
    // prints [[1, 1, 1], [2, 2], [3, 3, 3]]
    char[][int] aa1 = [1:"aa", 2:"bb", 3:"cc", 4:"dd"];
    char[][int] aa2 = [1:"aa", 2:"BB", -3:"cc"];
    // Intersection among AAs:
    putr(intersectionAA(aa1, aa2)); // prints: [1: "aa"]
    // There is xstdin() that allows to read all the stdin very quicky,
    // or lazily iterate on its lines.
    // cmp among structs:
    struct Sc2 { int x, y;}
    putr(structCmp(Sc2(7, 7), Sc2(7, 8))); // prints -1
    // == among AAs:
    putr(equalAA(['a':2, 'b':3], ['a':2, 'b':3])); // prints: true
}


Those functions are designed to be fast and very flexible, they usually contain code to manage AA, arrays, strings and iterable objects in the faster way.

One of the most complex parts of those d libs are str/repr/put/putr of the d.string module. putr() is a function like writefln(), but it doesn't perform the triky formatting (as write/writeln of D 2.x), and it prints most things *way* better than writefln (it's slower too, so if you need a simpler faster printing you can use writefln. putr/put aren't meant to replace writefln/writef), fixing lot and lot of the bad ways it prints things (you can find many examples in the unittests of the putr/put). The "put"/"putr" names too are shorter and simpler to write. Maybe you don't like some of the design choices present inside that put/putr (like the automatic struct printing), but I think they are nice :-)

Here are few examples of putr usage (like in Python inside arrays, AAs, etc it uses repr instead of str, so it escapes chars, etc):

import d.string: putr;

void main() {
    putr([1, 2, 3]); // [1, 2, 3]
    putr(["1", "2", "3"]); // ["1", "2", "3"]
    putr("hel\"lo"); // hel"lo
    putr("hel\"lo"); // hel"lo
    putr(["he\"llo"]); // ["he\"llo"]
    putr([["Ab", "c"], ["D", "ef"]]); // [["Ab", "c"], ["D", "ef"]]
    putr([1:"aa", 2:"ba", 3:"bb"]); // [1: "aa", 2: "ba", 3: "bb"]

    struct S3 { int x; float y; int z;}
    putr(S3(2, 7.1, 8)); // <2, 7.1, 8>

    union U { int x; char c; float f; }
    U u;
    u.x = 100;
    putr(u); // {100}

    putr(["ab", "ba"]); // ["ab", "ba"]
    putr(['a':'d','b':'e']); // ['a': 'd', 'b': 'e']

    char[int] aa_empty;
    putr(aa_empty); // AA!(int, char)

    class Cl0 { int a; this(int b){a=b;} }
    Cl0 cl0;
    putr(cl0); // null
    cl0 = new Cl0(100);
    putr(cl0); // test.main.Cl0

    putr(cast(creal)-53-55i); // -53.0-55.0i
    putr(cast(cdouble)-7-0i); // -7.0+0.0i
    putr(cast(idouble)-5i); // -5.0i

    typedef int T;
    T t = 10;
    putr(t); // 10

    auto vv = [null, null, null];
    putr(vv); // [null, null, null]

    putr([`"hello"`]); // ["\"hello\""]
    putr(["`hello`"]); // ["`hello`"]

    putr("abc\0\25de"); // abc..de
    putr(["abc\0\25de"]); // ["abc\x00\x15de"]
}


To compare, if you replace that first line with this one:

import std.stdio: putr = writefln;

to use writefln, and you comment the lines that print those structs/enums, you obtain this output:

[1,2,3]
[1,2,3]
hel"lo
hel"lo
[he"llo]
[[Ab,c],[D,ef]]
[1:[a,a],2:[b,a],3:[b,b]]
<error>
<error>
[ab,ba]
[a:d,b:e]
[]
null
test2.main.Cl0
-53+-55i
-7+0i
-5
10
[0000,0000,0000]
["hello"]
[`hello`]


Explaining the rationale behind every function/class requires lot of space, so I can't explain everything here, I am open to questions (I can give my not-spam-shielded email address too, if necessary). Most of those things are similar to Python ones, so you can find the rationale in the Python docs, but there are few differences, like in the groupBy (it doesn't return a lazy iterable of tuples that contain as first element the head and as second element an iterable to the group, but the group here is a plain array to increase usage simplicity). The sorted/sortedAA functions take a key function instead of a cmp function like most sorting things you can find in D because for the purposes I was talking about it's simpler and shorter to use (but it requires more RAM).

New versions of the d lib will follow, I am currently adding xpartition() to func.d, that is quite hairy to write (but very easy to use). And I am writing a fast Deque class too (it's slower than the Deque of the C++ STL just because DMD doesn't perform method inlining at all in my benchmarks).

Bye,
bear hugs,
bearophile
January 21, 2008
bearophile wrote:
> Walter>If you have a faster qsort, and want to contribute it, please do so! Same goes for other faster routines you've written.<
> 
> Sean Kelly>I'd be interested in seeing that.  I've been able to beat the D sort for some data sets and match it in others, but not beat it across the board.<
> 
> In those d libs ("func" module) there is the fastSort() too I was talking about, plus lot of other things [note: that sort of mine is really simple, and it looks much similar to the sort already present in Phobos, but it run faster to me. In the code you can find some speed tests too. It's a quicksort plus a Simple Insertion Sort. But if you need a faster sort I suggest to adapt something much better than the naive 60-lines long sort I have written, like the Timsort used by Python. I have tested the speed of my code only in few situations, not long and lot].

This is still pretty impressive. Results from a sort benchmark (and the benchmark itself, which requires Tango) are attached. Your code is quite short but still knocks the socks off the built-in sort.

It appears to do a lot of comparing, though---and thus, when made to use a custom comparing function instead of the built-in "<" operator it quickly slows down. It's still much better than the built-in sort (especially since the built-in sort can't even take a custom comparator), but for a more general sort it's not that great. Unless I completely messed things up when converting it to use a comparing function, that is. Which is entirely possible.

One other change I had to make is to add "j &&" to this line in _aqs, because
when cmp was a function "return a > b" instead of "return a < b" it would result
in array bounds errors:
do j--; while (j && cmp(a, v[j]));

I see that Sean has changed the Tango sort to use introspective sort as well, so the results for that don't apply any more (http://www.dsource.org/projects/tango/ticket/571) but they're included anyway.

Key for understanding the numbers:

For each algorithm, the first column set is for random, the second for already sorted, the third for sorted and then reversed, and the fourth for "median-of-3 killer" data, specially crafted to bring worst-case behaviour in median-of-3 quicksorts. The Tango sort used to hit a stack overflow and thus isn't tested with them.

In each column set, the first column is the average, the second the minimum, and the third the maximum time taken.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi


January 21, 2008
Matti Niemenmaa wrote:
> It appears to do a lot of comparing, though---and thus, when made to use a custom comparing function instead of the built-in "<" operator it quickly slows down.

Mergesort generally does fewer compares than quicksort (that's why it's the default sorting algorithm for objects, but not primitives, in Java), although it has the cost of copying & allocation there.
January 21, 2008
Matti Niemenmaa:
> It appears to do a lot of comparing, though---and thus, when made to use a custom comparing function instead of the built-in "<" operator it quickly slows down. It's still much better than the built-in sort (especially since the built-in sort can't even take a custom comparator), but for a more general sort it's not that great.

I see. Notes:
- Two or more sorts can be kept, one for native data, one for more complex sorting, etc. (That's what I do in my d libs)
- Some of the situations where you use cmp() can be managed with a string mixin, maybe this can speed things up some.
- A better support of inlining by DMD may change the situation some.
- In many simple situations where speed and data size are less important a key() function is simpler to use than cmp(). See sorted/sortedAA in d libs. If uou use a key() you often don't need a cmp() too.
- Try adapting Timsort used by Python, probably it's not easy to find something better: http://py-stablesort.sourceforge.net/

Bye,
bearophile
January 21, 2008
Robert Fraser wrote:
> Matti Niemenmaa wrote:
>> It appears to do a lot of comparing, though---and thus, when made to use a custom comparing function instead of the built-in "<" operator it quickly slows down.
> 
> Mergesort generally does fewer compares than quicksort (that's why it's the default sorting algorithm for objects, but not primitives, in Java), although it has the cost of copying & allocation there.

There are in-place merge sorts, though I'm not sure how costly it is to do that. One implementation I see right now will cost O(n * n * log n) in the worst case, which is clearly unacceptable.
January 21, 2008
Matti Niemenmaa wrote:
 > This is still pretty impressive. Results from a sort benchmark (and the
> benchmark itself, which requires Tango) are attached. Your code is quite short but still knocks the socks off the built-in sort.

Did you have look at Andrei's new sort template in std.algorithm?  It was added in 2.008. I don't know if he plans to optimize it further or not.  It can be used like this:

import std.algorithm, std.functional;

sort!(less)(array);     // predicate function from std.functional
sort!("a < b")(array);  // comparison inlined by string mixin
January 21, 2008
bearophile wrote:
> Notes:
> - Two or more sorts can be kept, one for native data, one for more complex
> sorting, etc. (That's what I do in my d libs)

True.

> - Some of the situations where you use cmp() can be managed with a string
> mixin, maybe this can speed things up some.

Also true, this is what Andrei does in 2.0's std.algorithm, as torhu said. And it probably does speed things up.

> - A better support of inlining by DMD may change the situation some.

In some cases, yes, but in this general case of passing a function pointer I don't think it can help.

> - In many simple situations where speed and data size are less important a
> key() function is simpler to use than cmp(). See sorted/sortedAA in d libs.
> If uou use a key() you often don't need a cmp() too.

If speed is less important you probably don't care that much about the algorithm  being used either. ;-)

> - Try adapting Timsort used by Python, probably it's not easy to find
> something better: http://py-stablesort.sourceforge.net/

I haven't tested timsort but I gather its main selling point is that it's stable. As such I don't think it'll be faster than an average quicksort (although I could be completely wrong, of course), since, as far as I can tell, it's a highly optimized version of merge sort.

As the docs say: "an adaptive, stable, natural mergesort, modestly called
timsort". The situations in which one needs a stable sort are, IMHO, rare enough that it should be considered a special case and thereby a separate function/method.

I'm willing to concede without evidence that timsort is among the fastest stable sorts, but I'll admit I'd be a bit surprised if even a hand-optimized C implementation working on arrays would beat stdlib's qsort() or your algorithm, for instance.

I can't test it since the original code uses the Python API instead of sorting a void* or the like. If you find a more portable implementation or feel like porting it yourself, feel free to benchmark it and see how it performs.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
January 21, 2008
Matti Niemenmaa wrote:
> 
> Key for understanding the numbers:
> 
> For each algorithm, the first column set is for random, the second for already sorted, the third for sorted and then reversed, and the fourth for "median-of-3 killer" data, specially crafted to bring worst-case behaviour in median-of-3 quicksorts. The Tango sort used to hit a stack overflow and thus isn't tested with them.

The ticket you linked has revised numbers for the Tango sort, if anyone is interested.  Also, I think it's worth noting that the built-in sort routine actually does call a function for comparison (TypeInfo.cmp) -- there's simply no way to override it for a particular type.  Given this, the average performance of the built-in sort is really excellent.


Sean
January 21, 2008
Matti Niemenmaa:
> If speed is less important you probably don't care that much about the algorithm
>   being used either. ;-)

This doesn't change the fact that in many situations in your code you want to use a key() instead of a cmp().


> I haven't tested timsort but I gather its main selling point is that it's stable. As such I don't think it'll be faster than an average quicksort (although I could be completely wrong, of course), since, as far as I can tell, it's a highly optimized version of merge sort.

It beats quicksort in the common case of partially ordered input, I think. With quite varied definition of "partially". Plus other situations too, like partial reverse ordering, etc. In real world situations you usually have partially ordered data, or you have to merge two partially sorted sequences, etc. In such very common cases Timsort is probably among the best. And it's fast enough in the other cases too. You can take a look at the comparison count too.


> The situations in which one needs a stable sort are, IMHO, rare enough that it should be considered a special case and thereby a separate function/method.

In my d libs there is a stable sort too. A stable sort is really useful in lot of situations, like sorting structs (multi type tuples). And Timosort was faster anyway than the not-stable sort used before it... I presume it's good mostly when you have to sort objects with a custom comparator (unlike the fastSort I have written in my D libs).


> If you find a more portable implementation

The site I have shown you has the most independent version of it you can find, I think. Jython too uses it (and I have heard that recently the Sun JavaVM may use it too, but I am not sure).

Bye,
bearophile
January 21, 2008
Reply to bearophile,

> Hello, in this long post I explain what's the current situation with
> the D libs I am developing (I call them just "d libs" for the lack of
> a better name).
> 
> http://www.fantascienza.net/leonardo/so/libs_d.zip
> 
> I am showing the current version (2.48, but I update them almost
> daily.
> 

do you have them in SVN somewhere? If not I would be willing to let youput them in scrapple.


« First   ‹ Prev
1 2