Porting my Integer Sorting Algorithms to D - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Porting my Integer Sorting Algorithms to D

Thread overview

Porting my Integer Sorting Algorithms to D
Feb 23, 2014 Nordlöw
Feb 23, 2014 Russel Winder
Feb 23, 2014 Nordlöw
Feb 23, 2014 Nordlöw
Feb 23, 2014 Nordlöw
Feb 23, 2014 Russel Winder
Feb 23, 2014 Nordlöw
Feb 23, 2014 Russel Winder
Feb 23, 2014 Nordlöw
Feb 23, 2014 francesco cattoglio
Feb 23, 2014 Russel Winder
Feb 24, 2014 Nordlöw

February 23, 2014

Porting my Integer Sorting Algorithms to D

Posted by Nordlöw

Nordlöw

I have a couple of self-implemented C++ integer sort algorithms lying around in my codebase.

I also have a parallel merge sort on top of them that uses Intel TBB to give some
further speedups.

I have tweaked them to also work for floats and doubles, through some interesting bit-fiddling tips I found on the net.

Would anybody be interested in reviewing these to give suggestions on how to best port it to Phobos?

/Per

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Russel Winder
in reply to Nordlöw

Russel Winder

Posted in reply to Nordlöw

On Sun, 2014-02-23 at 14:09 +0000, "Nordlöw" wrote:
> I have a couple of self-implemented C++ integer sort algorithms lying around in my codebase.

What is the basic sort algorithm? Radix sort is generally seen as the best for sorting integer values currently. But that doesn't mean there is better, just that that is the one to beat.

Which requires benchmarks. Which requires a framework for running
benchmarks. As far as I am aware D hasn't got one of these as yet, but
it needs one. Such things exists in Python (I am currently playing with
benchmark) and , I am sure, other dynamic languages. It should be
relatively easy to do something with D.

> I also have a parallel merge sort on top of them that uses Intel
> TBB to give some
> further speedups.

std.parallelism needs some work to compete with TBB. Though I am not sure "like" is a term I would use for the TBB API.

> I have tweaked them to also work for floats and doubles, through some interesting bit-fiddling tips I found on the net.
> 
> Would anybody be interested in reviewing these to give suggestions on how to best port it to Phobos?

I guess I just volunteered.

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Nordlöw
in reply to Russel Winder

Nordlöw

Posted in reply to Russel Winder

On Sunday, 23 February 2014 at 15:10:36 UTC, Russel Winder wrote:
> On Sun, 2014-02-23 at 14:09 +0000, "Nordlöw" wrote:
>> I have a couple of self-implemented C++ integer sort algorithms lying around in my codebase.
>
> What is the basic sort algorithm? Radix sort is generally seen as the
> best for sorting integer values currently. But that doesn't mean there
> is better, just that that is the one to beat.
>
> Which requires benchmarks. Which requires a framework for running
> benchmarks. As far as I am aware D hasn't got one of these as yet, but
> it needs one. Such things exists in Python (I am currently playing with
> benchmark) and , I am sure, other dynamic languages. It should be
> relatively easy to do something with D.
> 
>> I also have a parallel merge sort on top of them that uses Intel TBB to give some
>> further speedups.
>
> std.parallelism needs some work to compete with TBB. Though I am not
> sure "like" is a term I would use for the TBB API.
>
>> I have tweaked them to also work for floats and doubles, through some interesting bit-fiddling tips I found on the net.
>> 
>> Would anybody be interested in reviewing these to give suggestions on how to best port it to Phobos?
>
> I guess I just volunteered.

The non-inplace radix sort algorithm is the most interesting. I have a test and benchmarking suite along with it. Here's an extract of it using GCC 4.8.2 with -O3 settings on my Intel Quad-Core with 8 Hyperthreads. Results are promising. These benchmarks have been verfied to produce correct results:

Element Count: 1000000
Number of tries: 5
Do in place: 0
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float    1947812us 16 4.27x 4.02x 4.87x 4.37x
double   2022670us 16 2.30x 2.09x 3.25x 2.31x
uint8_t  1673208us  8 10.4x 10.0x 9.32x 10.6x
uint16_t 1800614us 16 10.1x 9.37x 9.52x 10.1x
uint32_t 1936704us 16 5.45x 4.99x 5.77x 5.42x

Should I place the code on a Github repo or send it to you by email?

Thx,
Per

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Nordlöw
in reply to Nordlöw

Nordlöw

Posted in reply to Nordlöw

On Sunday, 23 February 2014 at 16:16:48 UTC, Nordlöw wrote:
> On Sunday, 23 February 2014 at 15:10:36 UTC, Russel Winder wrote:
>> On Sun, 2014-02-23 at 14:09 +0000, "Nordlöw" wrote:
>>> I have a couple of self-implemented C++ integer sort algorithms lying around in my codebase.
>>
>> What is the basic sort algorithm? Radix sort is generally seen as the
>> best for sorting integer values currently. But that doesn't mean there
>> is better, just that that is the one to beat.
>>
>> Which requires benchmarks. Which requires a framework for running
>> benchmarks. As far as I am aware D hasn't got one of these as yet, but
>> it needs one. Such things exists in Python (I am currently playing with
>> benchmark) and , I am sure, other dynamic languages. It should be
>> relatively easy to do something with D.
>> 
>>> I also have a parallel merge sort on top of them that uses Intel TBB to give some
>>> further speedups.
>>
>> std.parallelism needs some work to compete with TBB. Though I am not
>> sure "like" is a term I would use for the TBB API.
>>
>>> I have tweaked them to also work for floats and doubles, through some interesting bit-fiddling tips I found on the net.
>>> 
>>> Would anybody be interested in reviewing these to give suggestions on how to best port it to Phobos?
>>
>> I guess I just volunteered.
>
> The non-inplace radix sort algorithm is the most interesting. I have a test and benchmarking suite along with it. Here's an extract of it using GCC 4.8.2 with -O3 settings on my Intel Quad-Core with 8 Hyperthreads. Results are promising. These benchmarks have been verfied to produce correct results:
>
> Element Count: 1000000
> Number of tries: 5
> Do in place: 0
> ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
> float    1947812us 16 4.27x 4.02x 4.87x 4.37x
> double   2022670us 16 2.30x 2.09x 3.25x 2.31x
> uint8_t  1673208us  8 10.4x 10.0x 9.32x 10.6x
> uint16_t 1800614us 16 10.1x 9.37x 9.52x 10.1x
> uint32_t 1936704us 16 5.45x 4.99x 5.77x 5.42x
>
> Should I place the code on a Github repo or send it to you by email?
>
> Thx,
> Per

After having take a look at algorithm.d I guess we should use CTFE-logic to autoconfigure to use radix sort when ElementType isInteger or float or double. string and real could probably be sorted aswell.

To make this configuration optimal an optional pre-configuration benchmarking pass may be need in order to give optimal speeds. Such a pass should for each sort implementation figure out a suitable limit [low,high] for the size range being sorted.

/Per

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Nordlöw
in reply to Nordlöw

Nordlöw

Posted in reply to Nordlöw

I did some more benchmarking:

These are all using non-inplace radix sort.
This can be useful in some cases where in-place is not needed.

Performance gain increase with size input array compared to std::sort as radix sort in "some regard" is O(n) and quick sort is O(n*log n).

Break sizes (when radix sort wins) for different element types:
uint8: <1024
uint16: <1024
uint32: >= 4096
float: >= 8192
double: >= 16384

Number of tries: 3
Do in place: 0
Element Count: 1024
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 2390us 16 0.483x 0.892x 0.386x 0.898x
double 809us 16 0.152x 0.155x 0.059x 0.0822x
uint8_t 1104us 8 10.9x 10.9x 1.65x 2.9x
uint16_t 1813us 16 1.38x 1.4x 0.889x 1.44x
uint32_t 707us 16 0.286x 0.281x 0.137x 0.231x
Element Count: 2048
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 1511us 16 0.518x 0.513x 0.21x 0.28x
double 1520us 16 0.268x 0.27x 0.0997x 0.144x
uint8_t 2603us 8 11.8x 11.8x 3.81x 12.2x
uint16_t 1451us 16 1.03x 1.1x 0.675x 1.03x
uint32_t 1531us 16 0.592x 0.575x 0.44x 0.583x
Element Count: 4096
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 3099us 16 0.939x 0.904x 0.785x 0.925x
double 3411us 16 0.504x 0.483x 0.13x 0.392x
uint8_t 2777us 8 6.69x 6.64x 1.7x 7.37x
uint16_t 3133us 16 1.91x 1.9x 0.458x 1.46x
uint32_t 3681us 16 1.15x 1.14x 0.229x 0.539x
Element Count: 8192
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 6760us 16 1.49x 1.54x 0.422x 0.921x
double 7285us 16 0.84x 0.825x 0.333x 0.897x
uint8_t 5895us 8 7.19x 6.89x 5.59x 7.77x
uint16_t 6730us 16 3.18x 3.23x 1.5x 3.52x
uint32_t 7095us 16 1.75x 1.7x 0.4x 1.89x
Element Count: 16384
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 14410us 16 2.28x 2.19x 0.732x 2.29x
double 15066us 16 1.19x 1.08x 0.539x 1.23x
uint8_t 12104us 8 7.8x 7.65x 7.22x 7.84x
uint16_t 14570us 16 4.81x 4.69x 0.906x 4.96x
uint32_t 15034us 16 2.61x 2.48x 0.537x 2.6x
Element Count: 32768
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 30764us 16 2.82x 2.62x 2.61x 2.67x
double 31569us 16 1.6x 1.46x 0.764x 1.58x
uint8_t 26176us 8 8.23x 8.52x 2.8x 8.49x
uint16_t 30947us 16 6.45x 6.13x 6.32x 6.91x
uint32_t 32785us 16 3.53x 3.39x 1.09x 3.89x
Element Count: 65536
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 63166us 16 3.24x 3.12x 2.12x 3.25x
double 71154us 16 1.83x 1.81x 1.66x 2.02x
uint8_t 56285us 8 8.52x 8.43x 4.22x 9.35x
uint16_t 64761us 16 7.68x 8.02x 2.54x 8.32x
uint32_t 68576us 16 4.16x 3.93x 2.45x 4.4x
Element Count: 131072
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 137296us 16 3.42x 3.47x 3.13x 3.73x
double 139881us 16 2.08x 1.85x 1.93x 2.08x
uint8_t 111393us 8 9.04x 9.15x 5.99x 9.45x
uint16_t 128848us 16 8.47x 8.12x 4.65x 8.53x
uint32_t 134586us 16 4.64x 4.29x 2.37x 4.56x
Element Count: 262144
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 275099us 16 3.87x 3.66x 3.02x 3.86x
double 294763us 16 2.22x 1.93x 2.82x 2.18x
uint8_t 233671us 8 9.53x 9.65x 5.06x 9.92x
uint16_t 259553us 16 8.93x 8.43x 5.6x 9.33x
uint32_t 282111us 16 5.16x 4.57x 4.37x 5.09x
Element Count: 524288
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 573934us 16 3.96x 3.8x 3.25x 4.05x
double 622996us 16 2.16x 2.05x 2.74x 2.29x
uint8_t 482522us 8 10x 9.82x 5.71x 10.1x
uint16_t 525446us 16 9.3x 8.76x 5.64x 9.49x
uint32_t 595237us 16 5.42x 4.95x 4.86x 5.44x
Element Count: 1048576
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 1202785us 16 4.22x 3.99x 4.9x 4.24x
double 1322048us 16 2.39x 2.13x 3.51x 2.45x
uint8_t 1009387us 8 10.3x 10.2x 8.83x 10.5x
uint16_t 1070777us 16 9.6x 9x 9.45x 9.59x
uint32_t 1242144us 16 5.37x 5.09x 6.07x 5.54x
Element Count: 2097152
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 2539646us 16 4.42x 4.15x 5.77x 4.46x
double 2686074us 16 2.46x 2.19x 3.64x 2.45x
uint8_t 2136319us 8 10.8x 10.7x 9.93x 10.8x
uint16_t 2199277us 16 9.7x 9x 11.1x 9.6x
uint32_t 2619062us 16 5.78x 5.28x 7.01x 5.83x
Element Count: 4194304
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 5388683us 16 4.64x 4.46x 6.76x 4.72x
double 5617282us 16 2.54x 2.29x 4x 2.56x
uint8_t 4449453us 8 11.1x 10.9x 12.7x 11.2x
uint16_t 4546878us 16 9.96x 9.24x 12.8x 9.91x
uint32_t 5466031us 16 5.87x 5.43x 7.94x 5.92x
Element Count: 8388608
ElementType Reference(std::sort) Radix radix_sort descending_radix_sort parallel_radix_sort tbb_parallel_radix_sort
float 11029468us 16 4.73x 4.51x 6.91x 4.75x
double 11801211us 16 2.65x 2.38x 4.21x 2.62x
uint8_t 9363439us 8 11.4x 11.3x 13.5x 11.4x
uint16_t 9569329us 16 10.2x 9.4x 14x 10x
uint32_t 11413279us 16 6.05x 5.56x 8.58x 5.94x

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Russel Winder
in reply to Nordlöw

Russel Winder

Posted in reply to Nordlöw

On Sun, 2014-02-23 at 16:16 +0000, "Nordlöw" wrote:
[…]
> 
> Should I place the code on a Github repo or send it to you by email?

BitBucket or GitHub work for me: if we keep this public others may join in…

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Russel Winder
in reply to Nordlöw

Russel Winder

Posted in reply to Nordlöw

On Sun, 2014-02-23 at 16:59 +0000, "Nordlöw" wrote:
> I did some more benchmarking:

We definitely need a benchmarking framework so as to ensure we can have multiple runs of each datapoint calculating mean and standard deviation for each data point, and for grouping and ANOVA. I am guessing there isn't already one.

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Nordlöw
in reply to Russel Winder

Nordlöw

Posted in reply to Russel Winder

I guess working on a branch of Phobos (std.algorithm) is preferrable when we get to the D implementation right?

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Russel Winder
in reply to Nordlöw

Russel Winder

Posted in reply to Nordlöw

On Sun, 2014-02-23 at 17:33 +0000, "Nordlöw" wrote:
> I guess working on a branch of Phobos (std.algorithm) is preferrable when we get to the D implementation right?

I just asked effectively that question re std.parallelism on the DMD Concurrency list, only to find I wasn't a member so the email is pending moderation.

I am hoping there is a way of working on a self-contained component of Phobos without having to clone Phobos and do a feature branch.
-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

February 23, 2014

Re: Porting my Integer Sorting Algorithms to D

Posted by Nordlöw
in reply to Russel Winder

Nordlöw

Posted in reply to Russel Winder

I guess this component would be integrated into std.algorithm when its ready, right?

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation