April 23, 2013
On Tuesday, 23 April 2013 at 17:42:08 UTC, Xinok wrote:

> Choosing a random pivot will always invoke "average" performance where as finding the median of N elements usually achieves better performance on sorted lists. You also have to be careful that equal elements are distributed equally among both partitions, else a list with many elements equal to the pivot will still achieve poor performance. (equal elements would all fall onto one side of the pivot)
>
> Introsort is simply a quick sort which falls back to a heap sort after so many recursions to guarantee O(n log n) performance. I've implemented this using a median of five and it works excellent. This is what I plan to contribute to Phobos.

I like the sound of that!

Regarding my previous post, it's perhaps worth mentioning Go in the comparison, it uses QuickSort with median-of-three for small sizes and median of medians-of-three for larger sizes [1].  And it actually sorts the three medians in median-of-three, which sounds like a good thing to do.  They cite "Engineering a Sort Function" (1993) by Bentley and McIlroy as inspiration [2].

Ivan Kazmenko.

-----

[1] http://golang.org/src/pkg/sort/sort.go#L104
[2] http://www.enseignement.polytechnique.fr/informatique/profs/Luc.Maranget/421/09/bentley93engineering.pdf
April 24, 2013
23-Apr-2013 05:17, Xinok пишет:
> On Saturday, 20 April 2013 at 16:35:25 UTC, Dmitry Olshansky wrote:
>> And this all is good but TimSort allocates O(N) memory. The constant
>> in front of N is smallish less then 1.0 but it could cause some grief.
>
> Worst case is O(n/2), but it starts small and doubles in size as needed.
> On a partially sorted array, it may only use a small amount of memory.

Good to know, still it's something to keep in mind.
Especially for allocation-free minded folks.

-- 
Dmitry Olshansky
April 24, 2013
24-Apr-2013 01:09, Ivan Kazmenko пишет:
> And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:
>> I filed a bug report for this issue a year ago:
>> http://d.puremagic.com/issues/show_bug.cgi?id=7767
>>
>> I've been meaning to fix this issue myself. Time allowing, I'll do it
>> soon.
>
> What I wonder now, what would be the goal of such a fix?
>
> One possible goal would be to have an O(n log n) worst case sort.  And
> that would perhaps cost some speed and/or memory on average.  Besides,
> such a sort function is already implemented (TimSort), so it's just a
> matter of setting the default then.

A good unstable sort can do the job faster (at the very least not slower) then a good stable sort.

I'm looking forward to a version of Introsort that Xinok has in mind as a "Q-sort fix".



-- 
Dmitry Olshansky
April 25, 2013
On Friday, 19 April 2013 at 22:37:45 UTC, Ivan Kazmenko wrote:
> So, why isn't TimSort the default?

I would actually argue for this, not for safety (introsort is an adequate solution) but for different reasons. Timsort is stable and it's faster on data with low entropy, being nearly instantaneous on already sorted lists. I would guess it's the better choice for most cases. Then for those cases where stable sorting isn't necessary and unstable sorting would be faster, the user could choose the second option.
1 2 3
Next ›   Last »