Thread overview  


January 27, 2012 simultaneous multiple key sorting algorithm  

 
Here's a multikey sorting problem that's bothering me; I think it's well researched but I can't find the right keywords. Say we have a collection of people with height and hair length information. We want to get people who are "tall and longhaired". More precisely, we want to order people by rank(p.height) + rank(p.hairlength), where rank(p.k) is the function that returns the rank of person p if ordered by key k. The brute force approach would essentially compute the two ranks and then sort by their sum. That's three sorts and at least one temporary array. But I think things can be done considerably better. Any ideas? Thanks, Andrei 
January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Andrei Alexandrescu  On 01/27/2012 08:36 PM, Andrei Alexandrescu wrote: > Here's a multikey sorting problem that's bothering me; I think it's well > researched but I can't find the right keywords. > > Say we have a collection of people with height and hair length > information. We want to get people who are "tall and longhaired". More > precisely, we want to order people by rank(p.height) + > rank(p.hairlength), where rank(p.k) is the function that returns the > rank of person p if ordered by key k. > > The brute force approach would essentially compute the two ranks and > then sort by their sum. That's three sorts and at least one temporary > array. But I think things can be done considerably better. Any ideas? > > > Thanks, > > Andrei http://en.wikipedia.org/wiki/Kd_tree Sclytrack 
January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to sclytrack  On 1/27/12 3:07 PM, sclytrack wrote:
> On 01/27/2012 08:36 PM, Andrei Alexandrescu wrote:
>> Here's a multikey sorting problem that's bothering me; I think it's well
>> researched but I can't find the right keywords.
>>
>> Say we have a collection of people with height and hair length
>> information. We want to get people who are "tall and longhaired". More
>> precisely, we want to order people by rank(p.height) +
>> rank(p.hairlength), where rank(p.k) is the function that returns the
>> rank of person p if ordered by key k.
>>
>> The brute force approach would essentially compute the two ranks and
>> then sort by their sum. That's three sorts and at least one temporary
>> array. But I think things can be done considerably better. Any ideas?
>>
>>
>> Thanks,
>>
>> Andrei
>
>
> http://en.wikipedia.org/wiki/Kd_tree
Thanks. I think kd trees are good for general operations (insertion, deletion, and nearest neighbor query) whereas here we're interested in sorting.
Andrei

January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Andrei Alexandrescu  On Sat, 28 Jan 2012 06:36:49 +1100, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > Here's a multikey sorting problem that's bothering me; I think it's well researched but I can't find the right keywords. > > Say we have a collection of people with height and hair length information. We want to get people who are "tall and longhaired". More precisely, we want to order people by rank(p.height) + rank(p.hairlength), where rank(p.k) is the function that returns the rank of person p if ordered by key k. > > The brute force approach would essentially compute the two ranks and then sort by their sum. That's three sorts and at least one temporary array. But I think things can be done considerably better. Any ideas? Just a further clarification ... If a person is tall and has short hair, would they be included in the sort? In other words, are you ONLY looking for tall AND longhaired people or are you looking for relative ranking of tallness and hairiness?  Derek Parnell Melbourne, Australia 
January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Derek  On 1/27/12 5:27 PM, Derek wrote:
> On Sat, 28 Jan 2012 06:36:49 +1100, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>
>> Here's a multikey sorting problem that's bothering me; I think it's
>> well researched but I can't find the right keywords.
>>
>> Say we have a collection of people with height and hair length
>> information. We want to get people who are "tall and longhaired".
>> More precisely, we want to order people by rank(p.height) +
>> rank(p.hairlength), where rank(p.k) is the function that returns the
>> rank of person p if ordered by key k.
>>
>> The brute force approach would essentially compute the two ranks and
>> then sort by their sum. That's three sorts and at least one temporary
>> array. But I think things can be done considerably better. Any ideas?
>
> Just a further clarification ...
>
> If a person is tall and has short hair, would they be included in the
> sort? In other words, are you ONLY looking for tall AND longhaired
> people or are you looking for relative ranking of tallness and hairiness?
Everybody is included in the sort: the size of the array before and after the sort is the same. Then, of course, the top K problem is very interesting too.
Andrei

January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Andrei Alexandrescu  >>> The brute force approach would essentially compute the two ranks and then sort by their sum. That's three sorts and at least one temporary
I think, you are mixing two problem levels:
Level one (theory):
How are the two properties weighted against each other
and can be combined to a total weight (a weightfunction).
Level two (programming):
how can this be expressed in an optimal algorithmical way,
given D2.
Your question suggests, the the (naive) sumfunction is
already the solution to level one.
But I suspect, that the solution for the problem requires
rather to find a proper weight function than a
combinatationalgorithm.
Christof

January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Christof Schardt  On 1/27/12 5:35 PM, Christof Schardt wrote: >>>> The brute force approach would essentially compute the two ranks and >>>> then sort by their sum. That's three sorts and at least one temporary > > I think, you are mixing two problem levels: > > Level one (theory): > How are the two properties weighted against each other > and can be combined to a total weight (a weightfunction). There's no weighing. The sorting order is rank(k1) + rank(k2). > Level two (programming): > how can this be expressed in an optimal algorithmical way, > given D2. > > Your question suggests, the the (naive) sumfunction is > already the solution to level one. This is not naive, because the quantities being summed are ranks, not key values. > But I suspect, that the solution for the problem requires > rather to find a proper weight function than a > combinatationalgorithm. No, I think this is a confusion. Andrei 
January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Andrei Alexandrescu  On Friday, 27 January 2012 at 19:36:49 UTC, Andrei Alexandrescu wrote: > Here's a multikey sorting problem that's bothering me; I think it's well researched but I can't find the right keywords. > > Say we have a collection of people with height and hair length information. We want to get people who are "tall and longhaired". More precisely, we want to order people by rank(p.height) + rank(p.hairlength), where rank(p.k) is the function that returns the rank of person p if ordered by key k. > > The brute force approach would essentially compute the two ranks and then sort by their sum. That's three sorts and at least one temporary array. But I think things can be done considerably better. Any ideas? I don't think you can get much better than this. Knowing the rank (for one key) for one element requires looking at all the other elements. Thus, knowing the rank of all elements isn't possible in linear time or faster. Knowing the ranks of only one key doesn't get you much information regarding the final result. Even the item ranked last on one key can tie for first place in combination with the other. These observations seem to impose a strict dependency on the order the data is calculated. I don't see any shortcuts, but I'd love to be proven wrong. You may want to ask on http://cstheory.stackexchange.com/ . 
January 27, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to sclytrack  A kd tree isn't directly applicable here  the problem is that the input data isn't normalized, so you can't just drop people into a 2d grid consisting of one axis being height, and the other axis being hair length. You need to 'normalize' the heights and lengths into ranks first, *then* insert them, and normalizing the lengths > ranks is an NlogN process to begin with. Constructing a kd tree isn't free either  you either have to presort your inputs, or sortasyougo, which works out to NlogN as well.
On Friday, 27 January 2012 at 20:07:44 UTC, sclytrack wrote:
> On 01/27/2012 08:36 PM, Andrei Alexandrescu wrote:
>> Here's a multikey sorting problem that's bothering me; I think it's well
>> researched but I can't find the right keywords.
>>
>> Say we have a collection of people with height and hair length
>> information. We want to get people who are "tall and longhaired". More
>> precisely, we want to order people by rank(p.height) +
>> rank(p.hairlength), where rank(p.k) is the function that returns the
>> rank of person p if ordered by key k.
>>
>> The brute force approach would essentially compute the two ranks and
>> then sort by their sum. That's three sorts and at least one temporary
>> array. But I think things can be done considerably better. Any ideas?
>>
>>
>> Thanks,
>>
>> Andrei
>
>
> http://en.wikipedia.org/wiki/Kd_tree
>
>
> Sclytrack

January 28, 2012 Re: simultaneous multiple key sorting algorithm  

 
Posted in reply to Andrei Alexandrescu  On Fri, 27 Jan 2012 16:45:41 0600, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > On 1/27/12 5:35 PM, Christof Schardt wrote: >>>>> The brute force approach would essentially compute the two ranks and >>>>> then sort by their sum. That's three sorts and at least one temporary >> >> I think, you are mixing two problem levels: >> >> Level one (theory): >> How are the two properties weighted against each other >> and can be combined to a total weight (a weightfunction). > > There's no weighing. The sorting order is rank(k1) + rank(k2). Weighting doesn't imply a simple linear constant. The rank function is an example of a nonlinear weighting scheme. Nonlinear weighting functions are common in complex sorting and classification problems. 
« First ‹ Prev 

Copyright © 19992016 by Digital Mars ®, All Rights Reserved