Thread overview
A few range questions
Jan 05, 2016
Charles Smith
Jan 05, 2016
Gary Willoughby
Jan 05, 2016
Ali Çehreli
Jan 05, 2016
Charles Smith
Jan 05, 2016
Ali Çehreli
Jan 05, 2016
Charles Smith
January 05, 2016
Hi,

I'm trying to implement counting sort with ranges; however, hit a roadblock here while testing. I'm using the following code:

    int[1_000_000] arr;

    void main() {
        import std.conv : to;
        import std.datetime;
        import std.random;
        import std.stdio;

        for(int i; i < arr.length; ++i)
            arr[i] = uniform(0, 1000);

        auto benchmarks = benchmark!(algorithmSort, countingSort)(1);

        writeln("Agorithm's Sort: ", to!Duration(benchmarks[0]));
        writeln("Counting Sort: ", to!Duration(benchmarks[1]));
    }

    void algorithmSort() {
        import std.algorithm : sort;

        auto result = arr[].sort;
    }

    void countingSort() {
        import std.algorithm : count, map;
        import std.range : array, chain, iota, repeat;

        auto result = 1_000.iota.map!(a => a.repeat(count(arr[], a))).chain.array;
    }

1. `arr[].sort` is changing arr in place. Any way to not do that?
2. I noticed that result within `countingSort` is an array of arrays. I thought `chain` would concat them... did I miss something obvious?
3. It's kind of hard to compare performance because of (1), but is there a better way to do this?
January 05, 2016
On Tuesday, 5 January 2016 at 18:47:30 UTC, Charles Smith wrote:
> 1. `arr[].sort` is changing arr in place. Any way to not do that?

Use this instead:

auto result = sort(arr[].dup);

.dup duplicates the array and sort(...) uses the std.algorithm sort and not the built-in array sort method.

> 2. I noticed that result within `countingSort` is an array of arrays. I thought `chain` would concat them... did I miss something obvious?

No idea yet.

January 05, 2016
On 01/05/2016 10:47 AM, Charles Smith wrote:

>          auto result = 1_000.iota.map!(a => a.repeat(count(arr[],
> a))).chain.array;

You are traversing the array per index value (1000 times), which can be done once up front:

enum elementCount = 1_000_000;
enum valueCount = 1000;

void main() {
    import std.conv : to;
    import std.datetime;
    import std.random;
    import std.stdio;

    int[elementCount] arr;

    for(int i; i < arr.length; ++i)
        arr[i] = uniform(0, valueCount);

    // Now they get their own arrays:
    auto benchmarks = benchmark!(() => algorithmSort(arr.dup),
                                 () => countingSort(arr.dup))(1);

    writeln("Algorithm's Sort: ", to!Duration(benchmarks[0]));
    writeln("Counting Sort   : ", to!Duration(benchmarks[1]));
}

auto algorithmSort(int[] arr) {
    import std.algorithm : sort, sum;

    auto result = sort(arr[]);
    return result.sum;
}

auto countingSort(int[] arr) {
    import std.algorithm : count, map, joiner, sum, each;
    import std.range : array, repeat, enumerate;

    uint[valueCount] hist;
    arr.each!(a => ++hist[a]);

    auto result = hist[].enumerate.map!(t => t[0].repeat(t[1])).joiner;
    // To make sure that we consume the lazy range:
    return result.sum;
}

> 2. I noticed that result within `countingSort` is an array of arrays. I
> thought `chain` would concat them... did I miss something obvious?

I think .joiner is what you're looking for.

> 3. It's kind of hard to compare performance because of (1), but is there
> a better way to do this?

I changed the code to pass each function a different array.

My results with -O -inline -noboundscheck:

Algorithm's Sort: 76 ms, 910 μs, and 8 hnsecs
Counting Sort   : 21 ms, 563 μs, and 9 hnsecs

Ali

January 05, 2016
On Tuesday, 5 January 2016 at 19:42:42 UTC, Ali Çehreli wrote:
> You are traversing the array per index value (1000 times), which can be done once up front:

Good point. I guess I assumed `map!` was magic.


>
>     // Now they get their own arrays:
>     auto benchmarks = benchmark!(() => algorithmSort(arr.dup),
>                                  () => countingSort(arr.dup))(1);
>

I'm not sure what this explicitly is doing... Are you defining an anonymous function? If so, that seems really useful.

>
>     uint[valueCount] hist;
>     arr.each!(a => ++hist[a]);
>
>     auto result = hist[].enumerate.map!(t => t[0].repeat(t[1])).joiner;
>     // To make sure that we consume the lazy range:
>     return result.sum;
> }
> I think .joiner is what you're looking for.
>

Thanks a lot for this. Coming from a webdev background I'm ashamed I didn't guess `join`. That said, I don't think the example for `chain` should compile then. Just as an aside, I was searching the terms `concat` and `flatten ` when I was looking for this.

> > 3. It's kind of hard to compare performance because of (1),
> but is there
> > a better way to do this?
>
> I changed the code to pass each function a different array.
>
> My results with -O -inline -noboundscheck:
>
> Algorithm's Sort: 76 ms, 910 μs, and 8 hnsecs
> Counting Sort   : 21 ms, 563 μs, and 9 hnsecs
>
> Ali

Awesome

January 05, 2016
On 01/05/2016 01:48 PM, Charles Smith wrote:
> On Tuesday, 5 January 2016 at 19:42:42 UTC, Ali Çehreli wrote:

>>     // Now they get their own arrays:
>>     auto benchmarks = benchmark!(() => algorithmSort(arr.dup),
>>                                  () => countingSort(arr.dup))(1);
>>
>
> I'm not sure what this explicitly is doing... Are you defining an
> anonymous function? If so, that seems really useful.

Yes. Since benchmark() expects no-parameter functions, that's a useful way of testing functions that take parameters.

However, I should have created the arrays before calling benchmark(). Otherwise, the timings include .dup as well:

    auto asArr = arr.dup;
    auto csArr = arr.dup;

    auto benchmarks = benchmark!(() => algorithmSort(asArr),
                                 () => countingSort(csArr))(10);

>> I think .joiner is what you're looking for.
> I was searching the terms `concat` and `flatten ` when I was
> looking for this.

Yep, I always start searching for 'flatten' and then remember joiner. :p

> That said, I don't think the example for `chain` should compile then.

That worked because chain() received just a single range (of ranges), in which case it had no effect.

>> My results with -O -inline -noboundscheck:
>>
>> Algorithm's Sort: 76 ms, 910 μs, and 8 hnsecs
>> Counting Sort   : 21 ms, 563 μs, and 9 hnsecs

> Awesome

I am amazed! :) countingSort() beats algorithmSort() almost always. :) Here is the final version of the program with 10 million elements and 4 million values (array that are so large cannot be allocated on the stack for me; so, I used new):

enum elementCount = 10_000_000;
enum valueCount = 4_000_000;

void main() {
    import std.conv : to;
    import std.datetime;
    import std.random;
    import std.stdio;

    auto arr = new int[elementCount];

    for(int i; i < arr.length; ++i)
        arr[i] = uniform(0, valueCount);

    auto asArr = arr.dup;
    auto csArr = arr.dup;

    // Now they get their own arrays:
    auto benchmarks = benchmark!(() => algorithmSort(asArr),
                                 () => countingSort(csArr))(10);

    writeln("Algorithm's Sort: ", to!Duration(benchmarks[0]));
    writeln("Counting Sort   : ", to!Duration(benchmarks[1]));
}

auto algorithmSort(int[] arr) {
    import std.algorithm : sort, sum;

    arr.sort();
    return arr.sum;
}

auto countingSort(int[] arr) {
    import std.algorithm : count, map, joiner, sum, each;
    import std.range : array, repeat, enumerate;

    auto hist = new uint[valueCount];
    arr.each!(a => ++hist[a]);

    auto result = hist[].enumerate.map!(t => t[0].repeat(t[1])).joiner;
    return result.sum;
}

Now they are comparable:

Algorithm's Sort: 3 secs, 881 ms, 146 μs, and 1 hnsec
Counting Sort   : 3 secs, 990 ms, 315 μs, and 4 hnsecs

Ali

January 05, 2016
On Tuesday, 5 January 2016 at 22:34:52 UTC, Ali Çehreli wrote:
> >> Algorithm's Sort: 76 ms, 910 μs, and 8 hnsecs
> >> Counting Sort   : 21 ms, 563 μs, and 9 hnsecs
>
> > Awesome
>
> I am amazed! :) countingSort() beats algorithmSort() almost always. :)

Yeah, I've been reading an algorithm book, and this was in there.

It has O(elementCount + valueCount) time complexity under a correct implementation, so it is fast. Also its best case is quick sort's worst case (lots of duplicated data), at which point it has O(elementCount) time complexity. I think D's sort uses timsort, so I'd expect it to be no worse.