Thread overview
Zipped sorting
Sep 25, 2012
bearophile
Sep 25, 2012
cal
Sep 25, 2012
bearophile
Sep 25, 2012
Dmitry Olshansky
Sep 25, 2012
bearophile
Sep 25, 2012
Dmitry Olshansky
Sep 27, 2012
bearophile
September 25, 2012
This line of code sorts two arrays in "lock step", according to the items of the first array:

zip(first, second).sort!q{a[0] < b[0]}();

This code is handy, nice looking, short and sufficiently easy to recognize once you have seen it before.

It's one case where D code is better than a Python version (and this returns two tuples instead of two lists):

first, second = zip(*sorted(izip(first, second), key=itemgetter(0)))

But with DMD on shortish arrays of about 20-30 items (each one 24 bytes long) I've seen that D code 3-5 times slower (to be sure of such timings I have to write down a proper benchmark) than sorting the items with simple manually written sorting D code, like a bubble sort or something similar (where inside swaps items of both arrays). So if this sorting is done in inner loops or critical parts of code, you can't use that zip+sort :-(

Bye,
bearophile
September 25, 2012
On Tuesday, 25 September 2012 at 02:17:53 UTC, bearophile wrote:
> This line of code sorts two arrays in "lock step", according to the items of the first array:
>
> zip(first, second).sort!q{a[0] < b[0]}();

Tangential to your point, but I hadn't seen the q{} syntax used before. Are there reasons to favor this over other string literals?
September 25, 2012
cal:

> but I hadn't seen the q{} syntax used before. Are there reasons to favor this over other string literals?

That's essentially a string literal, but it avoids the editor to color it as an uniform string, and they must contain valid D tokens: http://dlang.org/lex.html

Bye,
bearophile
September 25, 2012
On 25-Sep-12 06:18, bearophile wrote:
> This line of code sorts two arrays in "lock step", according to the
> items of the first array:
>
> zip(first, second).sort!q{a[0] < b[0]}();
>
> This code is handy, nice looking, short and sufficiently easy to
> recognize once you have seen it before.
>

Analyzing asm dump should help. But either way zip-sort heavily relies on proper inlining and I suspect it's not fully "unrolled".
Did you try DMD or other compilers?

> It's one case where D code is better than a Python version (and this
> returns two tuples instead of two lists):
>
> first, second = zip(*sorted(izip(first, second), key=itemgetter(0)))
>
> But with DMD on shortish arrays of about 20-30 items (each one 24 bytes
> long) I've seen that D code 3-5 times slower (to be sure of such timings
> I have to write down a proper benchmark) than sorting the items with
> simple manually written sorting D code, like a bubble sort or something
> similar (where inside swaps items of both arrays). So if this sorting is
> done in inner loops or critical parts of code, you can't use that
> zip+sort :-(
>
> Bye,
> bearophile


-- 
Dmitry Olshansky
September 25, 2012
Dmitry Olshansky:

> Analyzing asm dump should help.

In this case it's a good amount of asm code.


> But either way zip-sort heavily relies on proper inlining and I suspect it's not fully "unrolled".

I don't know if that's enough.


> Did you try DMD or other compilers?

In past I have used LDC often, but nowadays I use only DMD.

Bye,
bearophile
September 25, 2012
On 25-Sep-12 15:23, bearophile wrote:
> Dmitry Olshansky:
>
>> Analyzing asm dump should help.
>
> In this case it's a good amount of asm code.
>

This sounds like you are not interested in the cause of this at all ;)

I've dig it and it looks mostly fine. However for a small number of elements it will most of the time will fall through to the insertion sort, so the analysis  can be limited to just testing your bubble/etc. sort vs Phobos equivalent. Just grab optimisticInsertionSort out of Phobos std.algorithm and compare.

>> But either way zip-sort heavily relies on proper inlining and I
>> suspect it's not fully "unrolled".
>
> I don't know if that's enough.

Well the other way to test is sort array of tuples with std.sort
and then with your function. Also it's important to see the actual code that you compare otherwise we are just wasting time here.

>> Did you try DMD or other compilers?
>
> In past I have used LDC often, but nowadays I use only DMD.
Well I'd try GDC just to check if it's a codegen/optimizer issue.

-- 
Dmitry Olshansky
September 27, 2012
Dmitry Olshansky:

> This sounds like you are not interested in the cause of this at all ;)

Right, this I was a little lazy, because I was a bit depressed because of similar problems. I will do more tests.

Later,
bearophile