Jump to page: 1 2 3
Thread overview
Sorting array or AssocArrays by value.
May 04, 2008
Me Here
May 04, 2008
Derek Parnell
May 04, 2008
Me Here
May 04, 2008
Derek Parnell
May 04, 2008
Me Here
May 04, 2008
Xinok
May 04, 2008
Frits van Bommel
May 04, 2008
Me Here
May 04, 2008
BCS
May 04, 2008
Derek Parnell
May 04, 2008
pragma
May 04, 2008
bearophile
May 04, 2008
Me Here
May 04, 2008
bearophile
May 04, 2008
Me Here
May 05, 2008
bearophile
May 04, 2008
BCS
May 04, 2008
BCS
May 04, 2008
Manfred Nowak
May 05, 2008
Me Here
May 05, 2008
BCS
May 04, 2008
Hi all,

A little advice please. Using D v1.028.

I have (currently) an array of uints. I need to output these sorted, which
.sort does admirably,
but I need to know what index is associated with each value. How?

Springing from my own use of the word "associated", I thought that maybe I
should be using
and associative array for this. But again the problem is that I would need to
sort the keys
by their associated values.

In Perl I'd do

my @array = ...;

## Get the indexes ordered by their values
my @orderIndexes = sort{
     $array[ $a ] <=> $array[ $b ]
} 0 .. $#array;

for my $i ( @orderedIndexes ) {
    printf "%d : %d\n2, $i, $array[ $i ];
}

Is there anything built-in or in Phobos that will help me here? Or do I need to write my own sort?

Cheers, b.
-- 

May 04, 2008
On Sun, 4 May 2008 07:58:16 +0000 (UTC), Me Here wrote:

> Hi all,
> 
> A little advice please. Using D v1.028.
> 
> I have (currently) an array of uints. I need to output these sorted, which
> .sort does admirably,
> but I need to know what index is associated with each value. How?
> 
> Springing from my own use of the word "associated", I thought that maybe I
> should be using
> and associative array for this. But again the problem is that I would need to
> sort the keys
> by their associated values.
> 
> In Perl I'd do
> 
> my @array = ...;
> 
> ## Get the indexes ordered by their values
> my @orderIndexes = sort{
>      $array[ $a ] <=> $array[ $b ]
> } 0 .. $#array;
> 
> for my $i ( @orderedIndexes ) {
>     printf "%d : %d\n2, $i, $array[ $i ];
> }
> 
> Is there anything built-in or in Phobos that will help me here? Or do I need to write my own sort?
> 
> Cheers, b.

Try this 'sort' of thing ...

import std.stdio;
void main()
{
   string[string] theArray;
   theArray["one"] = "neung";
   theArray["two"] = "song";
   theArray["three"] = "saam";
   theArray["four"] = "sii";
   theArray["five"] = "hah";

   foreach(string k; theArray.keys.sort)
      writefln("Key: %10s ==> Data: %s", k, theArray[k]);

}

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
May 04, 2008
Me Here:
> I have (currently) an array of uints. I need to output these sorted, which
> .sort does admirably,
> but I need to know what index is associated with each value. How?

My libs are designed to solve such problems too, take a look at sortedAA inside the func module: http://www.fantascienza.net/leonardo/so/libs_d.zip

Bye,
bearophile
May 04, 2008
Derek Parnell wrote:

> On Sun, 4 May 2008 07:58:16 +0000 (UTC), Me Here wrote:
> 
> > Hi all,
> > 
> > A little advice please. Using D v1.028.
> > 
> > I have (currently) an array of uints. I need to output these sorted, which
> > .sort does admirably,
> > but I need to know what index is associated with each value. How?
> > 
> > Springing from my own use of the word "associated", I thought that maybe I
> > should be using
> > and associative array for this. But again the problem is that I would need
> > to sort the keys
> > by their associated values.
> > 
> > In Perl I'd do
> > 
> > my @array = ...;
> > 
> > ## Get the indexes ordered by their values
> > my @orderIndexes = sort{
> >      $array[ $a ] <=> $array[ $b ]
> > } 0 .. $#array;
> > 
> > for my $i ( @orderedIndexes ) {
> >     printf "%d : %d\n2, $i, $array[ $i ];
> > }
> > 
> > Is there anything built-in or in Phobos that will help me here? Or do I need to write my own sort?
> > 
> > Cheers, b.
> 
> Try this 'sort' of thing ...
> 
> import std.stdio;
> void main()
> {
>    string[string] theArray;
>    theArray["one"] = "neung";
>    theArray["two"] = "song";
>    theArray["three"] = "saam";
>    theArray["four"] = "sii";
>    theArray["five"] = "hah";
> 
>    foreach(string k; theArray.keys.sort)
>       writefln("Key: %10s ==> Data: %s", k, theArray[k]);
> 
> }

I need them ordered by the /values/ not the keys.

Eg. Given, (whether array or hash):

//index/key  0  1  2  3  4  5  6  7 8 uint a[] =    [ 3, 1,  8, 6, 4, 9, 2, 5, 7 ];

I need to output:
// value  - index
1 - 1
2 - 6
3 = 0
4 - 4
5 - 7
6 - 3
7 - 8
8 - 2
9 - 5

Cheers, b.
-- 

May 04, 2008
On Sun, 4 May 2008 13:11:23 +0000 (UTC), Me Here wrote:

> I need them ordered by the /values/ not the keys.
> 
> Eg. Given, (whether array or hash):
> 
> //index/key  0  1  2  3  4  5  6  7 8 uint a[] =    [ 3, 1,  8, 6, 4, 9, 2, 5, 7 ];
> 
> I need to output:
> // value  - index
> 1 - 1
> 2 - 6
> 3 = 0
> 4 - 4
> 5 - 7
> 6 - 3
> 7 - 8
> 8 - 2
> 9 - 5

The trick I use for this is to have two AAs. When the set is small, performance isn't really an issue.


import std.stdio;
void main()
{
   int[int] theArray;
   theArray[0] = 3;
   theArray[1] = 1;
   theArray[2] = 8;
   theArray[3] = 6;
   theArray[4] = 4;
   theArray[5] = 9;
   theArray[6] = 2;
   theArray[7] = 5;
   theArray[8] = 7;

   int[int] altArray;
   foreach(int i; theArray.keys)
   {
       altArray[ theArray[i] ] = i;
   }

   foreach(int i; altArray.keys.sort)
       writefln("%s - %s", i, altArray[i]);


}

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
May 04, 2008
Derek Parnell wrote:

> On Sun, 4 May 2008 13:11:23 +0000 (UTC), Me Here wrote:
> 
> > I need them ordered by the values not the keys.
> > 
> > Eg. Given, (whether array or hash):
> > 
> > //index/key  0  1  2  3  4  5  6  7 8 uint a[] =    [ 3, 1,  8, 6, 4, 9, 2, 5, 7 ];
> > 
> > I need to output:
> > // value  - index
> > 1 - 1
> > 2 - 6
> > 3 = 0
> > 4 - 4
> > 5 - 7
> > 6 - 3
> > 7 - 8
> > 8 - 2
> > 9 - 5
> 
> The trick I use for this is to have two AAs. When the set is small, performance isn't really an issue.
> 
> 
> import std.stdio;
> void main()
> {
>    int[int] theArray;
>    theArray[0] = 3;
>    theArray[1] = 1;
>    theArray[2] = 8;
>    theArray[3] = 6;
>    theArray[4] = 4;
>    theArray[5] = 9;
>    theArray[6] = 2;
>    theArray[7] = 5;
>    theArray[8] = 7;
> 
>    int[int] altArray;
>    foreach(int i; theArray.keys)
>    {
>        altArray[ theArray[i] ] = i;
>    }
> 
>    foreach(int i; altArray.keys.sort)
>        writefln("%s - %s", i, altArray[i]);
> 
> 
> }

Problem: That works fine if the values in the origianl array are unique.
But in this case thay are not. This is frequency information.
The index to the array is numeric value of words in a file.
The values are counts of the occurance of that value in the file,

A truncated sample showing the first of many duplicates.
    0 :      5780646
    1 :       219016
    2 :       138623
    3 :       100037
    4 :        89876
    5 :        61845
    6 :        60890
    7 :        41191
    8 :        60231
    9 :        36885 *
...
   70 :        13069
   71 :        36885 *
   72 :        16121
   73 :        10721

I need to out put this as either:
...
36885: 9, 70
...
or
36885: 9
36885: 70

Cheers, b.
-- 

May 04, 2008
Me Here wrote:
> Derek Parnell wrote:
> 
>> On Sun, 4 May 2008 13:11:23 +0000 (UTC), Me Here wrote:
>>
>>> I need them ordered by the values not the keys.
>>>
>>> Eg. Given, (whether array or hash):
>>>
>>> //index/key  0  1  2  3  4  5  6  7 8 uint a[] =    [ 3, 1,  8, 6, 4, 9, 2, 5, 7 ];
>>>
>>> I need to output:
>>> // value  - index
>>> 1 - 1
>>> 2 - 6
>>> 3 = 0
>>> 4 - 4
>>> 5 - 7
>>> 6 - 3
>>> 7 - 8
>>> 8 - 2
>>> 9 - 5
>> The trick I use for this is to have two AAs. When the set is small,
>> performance isn't really an issue.
>>
>>
>> import std.stdio;
>> void main()
>> {
>>    int[int] theArray;
>>    theArray[0] = 3;
>>    theArray[1] = 1;
>>    theArray[2] = 8;
>>    theArray[3] = 6;
>>    theArray[4] = 4;
>>    theArray[5] = 9;
>>    theArray[6] = 2;
>>    theArray[7] = 5;
>>    theArray[8] = 7;
>>
>>    int[int] altArray;
>>    foreach(int i; theArray.keys)
>>    {
>>        altArray[ theArray[i] ] = i;
>>    }
>>       foreach(int i; altArray.keys.sort)
>>        writefln("%s - %s", i, altArray[i]);
>>        
>>
>> }
> 
> Problem: That works fine if the values in the origianl array are unique.
> But in this case thay are not. This is frequency information. The index to the array is numeric value of words in a file. The values are counts of the occurance of that value in the file,
> 
> A truncated sample showing the first of many duplicates.
>     0 :      5780646
>     1 :       219016
>     2 :       138623
>     3 :       100037
>     4 :        89876
>     5 :        61845
>     6 :        60890
>     7 :        41191
>     8 :        60231
>     9 :        36885 *
> ...
>    70 :        13069
>    71 :        36885 *
>    72 :        16121
>    73 :        10721
> 
> I need to out put this as either:
> ...
> 36885: 9, 70
> ...
> or 36885: 9
> 36885: 70
> 
> Cheers, b.

I suggest you write your own sort function which will sort two arrays accordingly. Shell sort would probably be the easiest to implement this for.
May 04, 2008
Me Here wrote:
> Problem: That works fine if the values in the origianl array are unique.
> But in this case thay are not. This is frequency information. The index to the array is numeric value of words in a file. The values are counts of the occurance of that value in the file,

Try putting them in an array of structs with overloaded opCmp, and then sorting that. For example:
-----
module test;

import tango.io.Stdout;
import tango.text.convert.Format;

struct Pair {
    int a, b;

    int opCmp(Pair* other) {
        if (a != other.a)
            return typeid(typeof(a)).compare(&a, &other.a);
        return typeid(typeof(b)).compare(&b, &other.b);
    }

    char[] toString() {
        return Format("{}:{}", a, b);
    }
}

void main() {
    int[int] aa = [43: 2, 42: 12, 10: 3, 100: 1000, 101: 200, 7: 3];
    Stdout(aa).newline;

    auto arr = new Pair[aa.length];
    size_t i = 0;
    foreach (key, val ; aa)
        arr[i++] = Pair(val, key);

    arr.sort;

    Stdout(arr).newline;
}
-----
output:
=====
{ 100=>1000, 101=>200, 7=>3, 10=>3, 42=>12, 43=>2 }
[ 2:43, 3:7, 3:10, 12:42, 200:101, 1000:100 ]
=====

Notes:
* Tango is only used for formatting and could easily be replaced by appropriate Phobos routines.
* I used typeid().compare because simple subtraction might overflow for integer data >= int.sizeof.
* You may want to use an unsigned type for the frequency. Just change the types of the AA and the Pair member data. Optionally, template Pair on the types if you need to do this for multiple data types.
May 04, 2008
bearophile wrote:

> Me Here:
> > I have (currently) an array of uints. I need to output these sorted, which
> > .sort does admirably,
> > but I need to know what index is associated with each value. How?
> 
> My libs are designed to solve such problems too, take a look at sortedAA inside the func module:  http://www.fantascienza.net/leonardo/so/libs_d.zip
> 
> Bye,
> bearophile

sortAA() looks like it would do the trick, but could you expand on the example a litle for me. I'm sure the sub signature should tell me all I need to know, but I'm having a problem  working out how to call it?

--------
TyKey[] sortedAA(TyKey,TyVal,TyFun)(TyVal[TyKey] aa, TyFun key);
void valGetter();
Return the keys of an AA sorted according to the result of the given
function/delegate 'key' applied on two arguments, the key and value.

'key' can be any callable, or it can be the predefined &valGetter to sort the AA keys according to the values.

Example:
Sort the keys of an AA according to the values:
int[string] a = ["DD":3, "aa":1, "BB":2];
sortedAA(d, &valGetter) ==> ["aa", "BB", "DD"]

If you use this to sort an AA according to values, using a valGetter function, you are paying just about 5% speed penality.

Throws ArgumentException if key returns void.
----------------

If I define my hash as

uint freq[ ushort ];

How do I write valGetter() ?

Cheers, b.
-- 

May 04, 2008
Me Here:
> How do I write valGetter() ?

Use the predefined one, my libs are supposed to minimize the work for the programmer ;-)

Bye,
bearophile
« First   ‹ Prev
1 2 3