Thread overview
qsort
Jul 13, 2005
Mike
Jul 13, 2005
Andrew Fedoniouk
Jul 13, 2005
Uwe Salomon
Jul 14, 2005
Andrew Fedoniouk
Jul 13, 2005
Uwe Salomon
Jul 14, 2005
Jan-Eric Duden
Jul 14, 2005
Ben Hinkle
Jul 14, 2005
Uwe Salomon
Jul 15, 2005
Andrew Fedoniouk
July 13, 2005
has anyone written a qsort() for D? We need one that supports delegates.

If not I will probably have to write one myself.


July 13, 2005
"Mike" <Mike_member@pathlink.com> wrote in message news:db34a9$9hs$1@digitaldaemon.com...
> has anyone written a qsort() for D? We need one that supports delegates.
>
> If not I will probably have to write one myself.
>
>

harmonia/tools.d

// "smart" sort implementation, works 1.3 times faster than D builtin sort

template sort(T)
{
  void sort(T[] arr, int function(in T l, in T r) cmp)
  {
      const int quick_sort_threshold = 9;

      if(arr.length < 2) return;

      void swap_if_less( inout T t1, inout T t2 )
      {
        if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; }
      }
      void swap( inout T t1, inout T t2 )
      {
        T t = t1; t1 = t2; t2 = t;
      }

      int  stack[80];
      int* top = stack;
      int  limit = arr.length;
      int  base = 0;

      for(;;)
      {
          int len = limit - base;

          int i;
          int j;
          int pivot;

          if(len > quick_sort_threshold)
          {
              // we use base + len/2 as the pivot
              pivot = base + len / 2;
              swap(arr[base], arr[pivot]);

              i = base + 1;
              j = limit - 1;

              // now ensure that *i <= *base <= *j
              swap_if_less(arr[j], arr[i]);
              swap_if_less(arr[base],arr[i]);
              swap_if_less(arr[j],arr[base]);

              for(;;)
              {
                  do i++; while( cmp(arr[i],arr[base]) < 0 );
                  do j--; while( cmp(arr[base],arr[j]) < 0 );
                  if( i > j )
                  {
                      break;
                  }
                  swap(arr[i], arr[j]);
              }

              swap(arr[base], arr[j]);

              // now, push the largest sub-array
              if(j - base > limit - i)
              {
                  top[0] = base;
                  top[1] = j;
                  base   = i;
              }
              else
              {
                  top[0] = i;
                  top[1] = limit;
                  limit  = j;
              }
              top += 2;
          }
          else
          {
              // the sub-array is small, perform insertion sort
              j = base;
              i = j + 1;

              for(; i < limit; j = i, i++)
              {
                  for(; cmp(arr[j + 1],arr[j]) < 0; j--)
                  {
                      swap(arr[j + 1],arr[j]);
                      if(j == base)
                          break;
                  }
              }
              if(top > stack)
              {
                  top  -= 2;
                  base  = top[0];
                  limit = top[1];
              }
              else
                  break;
          }
      }
  }
}


July 13, 2005
> harmonia/tools.d
>
> // "smart" sort implementation, works 1.3 times faster than D builtin sort

Did you ever test it with input data like this:

1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6

That means a lot of sorted elements, and a bunch of unsorted afterwards.

Ciao
uwe
July 13, 2005
> // "smart" sort implementation, works 1.3 times faster than D builtin sort

You can also look into Indigo. List has a very good QuickSort, but it is a little bit difficult to read because of the specialties of the List implementation, and it does not support special sorting functions. But i have included such a function for Vector, and after changing it to receive the comparison function as a template argument instead of a regular parameter, it needs 70% of the time of the D builtin sort. And it has the great advantage that it is linear for radix sort cases (i.e. a lot of elements, but only few different values: 3 3 4 5 2 2 2 2 3 4 7 5 3 4 5 2 2 2 etc.).

But keep in mind: The library is LGPL'd, and as it is a derivative work, your program has to be conformant to the GPL.

Ciao
uwe
July 14, 2005
"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.stvdh7qm6yjbe6@sandmann.maerchenwald.net...
>> harmonia/tools.d
>>
>> // "smart" sort implementation, works 1.3 times faster than D builtin sort
>
> Did you ever test it with input data like this:
>
> 1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6
>
> That means a lot of sorted elements, and a bunch of unsorted afterwards.

I did.

'sort!' is mine, 'sort' is D's builtin.
less number - the better.

sequence from 1 to 1_000_000

  sort! - 203 ms
  sort  - 265 ms

takes 76% of D time

sequence from 1_000_000 to 1

  sort! - 125 ms
  sort  - 171 ms

takes 71% of D time
-------------------------------------------------------------
Theory is here:
http://www.seeingwithc.org/topic2html.html
-------------------------------------------------------------

ATTN!!!: As algorithms like this
belong to the whole humanity then my
implementation is completely free :-P

------------------------------------------------------------
//1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6

int icmp(in int l, in int r) { return l - r; }

int main(char[][] args)
{
    ProcessTimesCounter counter = new ProcessTimesCounter();
    const uint size = 1000000;
    int[] data = new int[size];
    for(uint i = 0; i < size; ++i)
        data[i] = i;

   data[size - 6] = 13;
   data[size - 5] = 12;
   data[size - 4] = 9;
   data[size - 3] = -1;
   data[size - 2] = 7;
   data[size - 1] = 6;

    counter.start();
           sort!(int)(data,&icmp);
           //data.sort;
    counter.stop();
    ProcessTimesCounter.interval_type  ms1 = counter.milliseconds();
    writef("ms=%d\n",ms1);

    return 0;
}

Andrew.
http://terrainformatica.com




July 14, 2005
Andrew Fedoniouk wrote:
> "Mike" <Mike_member@pathlink.com> wrote in message news:db34a9$9hs$1@digitaldaemon.com...
> 
>>has anyone written a qsort() for D? We need one that supports delegates.
>>
>>If not I will probably have to write one myself.
>>
>>
> 
> 
> harmonia/tools.d
> 
> // "smart" sort implementation, works 1.3 times faster than D builtin sort
> 
> template sort(T)
> {
>   void sort(T[] arr, int function(in T l, in T r) cmp)
>   {
>       const int quick_sort_threshold = 9;
> 
>       if(arr.length < 2) return;
> 
>       void swap_if_less( inout T t1, inout T t2 )
>       {
>         if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; }
>       }
>       void swap( inout T t1, inout T t2 )
>       {
>         T t = t1; t1 = t2; t2 = t;
>       }
> 
>       int  stack[80];
>       int* top = stack;
>       int  limit = arr.length;
>       int  base = 0;
> 
>       for(;;)
>       {
>           int len = limit - base;
> 
>           int i;
>           int j;
>           int pivot;
> 
>           if(len > quick_sort_threshold)
>           {
>               // we use base + len/2 as the pivot
>               pivot = base + len / 2;
>               swap(arr[base], arr[pivot]);
> 
>               i = base + 1;
>               j = limit - 1;
> 
>               // now ensure that *i <= *base <= *j
>               swap_if_less(arr[j], arr[i]);
>               swap_if_less(arr[base],arr[i]);
>               swap_if_less(arr[j],arr[base]);
> 
>               for(;;)
>               {
>                   do i++; while( cmp(arr[i],arr[base]) < 0 );
>                   do j--; while( cmp(arr[base],arr[j]) < 0 );
>                   if( i > j )
>                   {
>                       break;
>                   }
>                   swap(arr[i], arr[j]);
>               }
> 
>               swap(arr[base], arr[j]);
> 
>               // now, push the largest sub-array
>               if(j - base > limit - i)
>               {
>                   top[0] = base;
>                   top[1] = j;
>                   base   = i;
>               }
>               else
>               {
>                   top[0] = i;
>                   top[1] = limit;
>                   limit  = j;
>               }
>               top += 2;
>           }
>           else
>           {
>               // the sub-array is small, perform insertion sort
>               j = base;
>               i = j + 1;
> 
>               for(; i < limit; j = i, i++)
>               {
>                   for(; cmp(arr[j + 1],arr[j]) < 0; j--)
>                   {
>                       swap(arr[j + 1],arr[j]);
>                       if(j == base)
>                           break;
>                   }
>               }
>               if(top > stack)
>               {
>                   top  -= 2;
>                   base  = top[0];
>                   limit = top[1];
>               }
>               else
>                   break;
>           }
>       }
>   }
> }
> 
> 


Your Qsort has one draw-back its O(N^2) (worst-case). You can fix this, by switching to heap sort if your stack gets too large.

:)

Regards,
Jan-Eric
July 14, 2005
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:db3h2l$k2s$1@digitaldaemon.com...
>
> "Mike" <Mike_member@pathlink.com> wrote in message news:db34a9$9hs$1@digitaldaemon.com...
>> has anyone written a qsort() for D? We need one that supports delegates.
>>
>> If not I will probably have to write one myself.
>
> harmonia/tools.d
>
> // "smart" sort implementation, works 1.3 times faster than D builtin sort

Have you analyzed why it is faster? I'd be curious. My first guess from looking at phobos/internal/qsort.d is that the templated version has more inlinable code. The algorithm looks the same to me at first glance. Two differences I notice are a quicker switch to insertion sort (9 vs 7) and inlined swap (dmd's calls the typeinfo's swap).


July 14, 2005
> Have you analyzed why it is faster? I'd be curious. My first guess from
> looking at phobos/internal/qsort.d is that the templated version has more
> inlinable code. The algorithm looks the same to me at first glance. Two
> differences I notice are a quicker switch to insertion sort (9 vs 7) and
> inlined swap (dmd's calls the typeinfo's swap).

That should be the reason. Function calls are quite expensive. I have switched my quicksort from receiving the comparison function as a normal parameter to receiving it as a template parameter, and that has increased its speed from 1.2 times Phobos to 0.7 times Phobos (the algorithm is quite different, too, but that is not relevant for this comparison).

Ciao
uwe
July 15, 2005
"Ben Hinkle" <ben.hinkle@gmail.com> wrote in message news:db5mhn$2lfo$1@digitaldaemon.com...
> "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:db3h2l$k2s$1@digitaldaemon.com...
>>
>> "Mike" <Mike_member@pathlink.com> wrote in message news:db34a9$9hs$1@digitaldaemon.com...
>>> has anyone written a qsort() for D? We need one that supports delegates.
>>>
>>> If not I will probably have to write one myself.
>>
>> harmonia/tools.d
>>
>> // "smart" sort implementation, works 1.3 times faster than D builtin sort
>
> Have you analyzed why it is faster? I'd be curious. My first guess from looking at phobos/internal/qsort.d is that the templated version has more inlinable code. The algorithm looks the same to me at first glance. Two differences I notice are a quicker switch to insertion sort (9 vs 7) and inlined swap (dmd's calls the typeinfo's swap).
>
>

It is all about function calls:
I've removed cmp comparator function and result is even more spectacular:

'sort!' is mine,
'sort!-inline', is mine but with inlined cmp function,
'sort' is D's builtin.

less number - the better.

sequence from 1 to 1_000_000

  sort!-inline - 78 ms
  sort! - 203 ms
  sort  - 265 ms

'sort!'  takes 76% of D time
'sort!-inline'  takes 29% of D's builtin sort time

sequence from 1_000_000 to 1

  sort!-inline - 46 ms
  sort! - 125 ms
  sort  - 171 ms

'sort!'  takes 71% of D time
'sort!-inline'  26% of D time

--------------
template sort(T)
{
  void sort(T[] arr)
  {
      int cmp(in T l, in T r) { return l - r; }
      .....................
  }
}