[Issue 5502] New: More handy ways to create associative arrays
January 29, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5502

           Summary: More handy ways to create associative arrays
           Product: D
           Version: D2
          Platform: All
        OS/Version: Windows
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2011-01-29 10:11:56 PST ---
I think some helpers to create built-in associative arrays will be useful. Such helpers may be built-in methods of the AAs, or free functions in std.algorithm, or a mix of the two.

Three of the most common ways to build an associative array (examples in Python 2.6.6):

1) One of the most useful ways to create an AA is from a sequence of pairs. The pairs may come from zipping two iterables:

>>> a = range(1, 10)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> b = range(10, 100, 10)
>>> b
[10, 20, 30, 40, 50, 60, 70, 80, 90]
>>> zip(a, b)
[(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60), (7, 70), (8, 80), (9, 90)]
>>> dict(zip(a, b))
{1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6: 60, 7: 70, 8: 80, 9: 90}


Or from sorting and then grouping:

>>> from itertools import groupby
>>> s = "abaacaabbabaa"
>>> t = "".join(sorted(s))
>>> t
'aaaaaaaabbbbc'
>>> [(h, len(list(g))) for h,g in groupby(t)]
[('a', 8), ('b', 4), ('c', 1)]
>>> dict((h, len(list(g))) for h,g in groupby(t))
{'a': 8, 'c': 1, 'b': 4}


Phobos in DMD 2.051 already has zip() and group(), which return such iterables
of pairs. So the only thing needed is a free function to build an AA from such
sequences, or better, a built-in AA method like AA.fromPairs().

See also the Scala xs.toMap method: http://www.scala-lang.org/api/current/scala/collection/immutable/List.html


See also AA.byPair() from bug 5466. AA.fromPairs(AA.byPair()) is an identity.
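
The round trip can be illustrated in Python, the language of the examples above. This is only a sketch: items() stands in for the proposed AA.byPair() and dict() for the proposed AA.fromPairs(), neither of which is claimed to exist in Phobos under those names.

```python
aa = {1: 10, 2: 20, 3: 30}

pairs = list(aa.items())  # stands in for the proposed AA.byPair()
rebuilt = dict(pairs)     # stands in for the proposed AA.fromPairs()

# Building an AA from its own pairs gives back an equal AA.
print(rebuilt == aa)  # True
```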

----------------------

2) Often you need to create an associative array that represents key-frequency pairs, but you want to avoid wasting time sorting the items and then grouping them:


>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> s = "abaacaabbabaa"
>>> for c in s: d[c] += 1
...
>>> d
defaultdict(<type 'int'>, {'a': 8, 'c': 1, 'b': 4})


This amounts to building a bag (multiset) data structure.
For this purpose I don't suggest adding a built-in method to AAs. A free
frequency() function in std.algorithm will be enough.
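
A minimal sketch of what such a helper might do, written in Python like the examples above. The name frequency() comes from this proposal; the signature and the return type (a plain dict) are assumptions made for illustration, not an existing Phobos or Python API.

```python
def frequency(items):
    """Count how often each item occurs, in one pass and without sorting."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time a key is seen
        counts[item] = counts.get(item, 0) + 1
    return counts

freq = frequency("abaacaabbabaa")
print(sorted(freq.items()))  # [('a', 8), ('b', 4), ('c', 1)]
```

Unlike the sort-then-group approach in point 1, this needs only hashable items, not an ordering on them.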

----------------------

3) A simple and common way to create an associative array is from an iterable of keys, with constant values:


>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> dict.fromkeys(a)
{1: None, 2: None, 3: None, 4: None, 5: None, 6: None, 7: None, 8: None, 9: None}
>>> dict.fromkeys(a, 1)
{1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}


This may become a built-in AA method like AA.fromKeys(), or a similar free
function in std.algorithm. (In a statically typed language such as D, the type
of the values must somehow be given or known, even when fromKeys() is allowed
to use a default value.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 09, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=5502



--- Comment #1 from bearophile_hugs@eml.cc 2012-12-09 12:24:44 PST ---
The first suggestion was implemented:

https://github.com/D-Programming-Language/phobos/compare/467db2b45d92...c850e68b0379


This works:

import std.stdio, std.array, std.algorithm;
void main() {
    dchar[] s = "abaacaabbabaa"d.dup;
    s.sort();
    uint[dchar] aa = ['a':8, 'b':4, 'c':1];
    assert(group(s).assocArray() == aa);
}

February 06, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=5502



--- Comment #2 from bearophile_hugs@eml.cc 2013-02-06 04:38:07 PST ---
Currently if you want to create an associative array with constant values you have to use something like:


import std.array: assocArray;
import std.range: zip, repeat;
void main() {
    bool[dchar] dcharSet = assocArray(zip("ABCD", repeat(true)));
}



In Python, dict has a handy class method, "fromkeys":


>>> dict.fromkeys("ABCD", True)
{'A': True, 'C': True, 'B': True, 'D': True}


A possible similar Phobos function:


import std.array: AAFromKeys;
void main() {
    bool[dchar] dcharSet = AAFromKeys("ABCD", true);
}

May 24, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=5502



--- Comment #3 from bearophile_hugs@eml.cc 2013-05-24 11:22:04 PDT ---
Regarding the second suggestion, now I prefer a more general solution.

One possible name for this free function that returns an associative array is "gather", as maybe in Perl6.

As a use case, this program shows some anagrams, given a file of distinct lowercase words:


import std.stdio, std.algorithm, std.range, std.string, std.file;

void main() {
    string[][const ubyte[]] anags;
    foreach (string w; std.array.splitter("unixdict.txt".readText))
        anags[w.dup.representation.sort().release.idup] ~= w;

    writefln("%-(%s\n%)", anags
                          .byValue
                          .filter!q{ a.length > 2 }
                          .take(20));
}



A gather() higher-order function would let you remove the foreach and replace
it with:


const anags = std.array.splitter("unixdict.txt".readText)
              .gather(w => w.dup.representation.sort().release.idup);


gather() takes a callable that computes the associative array key for each
item; items that share a key are collected in an array under that key.
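
The intended semantics can be sketched in Python. This is an illustration under assumptions, not the proposed Phobos implementation: the gather() signature is a guess, and the short word list is made up to stand in for unixdict.txt.

```python
def gather(items, key):
    """Group items into a dict of lists, keyed by key(item)."""
    groups = {}
    for item in items:
        # setdefault creates the empty list the first time a key appears
        groups.setdefault(key(item), []).append(item)
    return groups

# Hypothetical sample input standing in for the word file.
words = ["listen", "silent", "enlist", "google", "elgoog", "cat"]

# Sorting a word's letters gives a key shared by all of its anagrams.
anags = gather(words, lambda w: "".join(sorted(w)))
print(anags["eilnst"])  # ['listen', 'silent', 'enlist']
```

Each value keeps its items in input order, which matches what the foreach with `~=` does in the D program above.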

May 25, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=5502



--- Comment #4 from bearophile_hugs@eml.cc 2013-05-25 04:03:30 PDT ---
(In reply to comment #3)

> One possible name for this free function that returns an associative array is "gather", as maybe in Perl6.

I was wrong, in Perl6 it's named "classify" (List.classify):
http://doc.perl6.org/routine/classify


multi sub    classify(&mapper, *@values) returns Hash:D
multi method classify(List:D: &mapper)   returns Hash:D

Transforms a list of values into a hash representing the classification of those values according to a mapper; each hash key represents the classification for one or more of the incoming list values, and the corresponding hash value contains an array of those list values classified by the mapper into the category of the associated key.

Examples:

say classify { $_ %% 2 ?? 'even' !! 'odd' }, (1, 7, 6, 3, 2);
# ("odd" => [1, 7, 3], "even" => [6, 2]).hash

say ('hello', 1, 22/7, 42, 'world').classify: { .Str.chars }
# ("5" => ["hello", "world"], "1" => [1], "8" => [22/7], "2" => [42]).hash
