Trying Phobos some more

February 12, 2014
Posted by bearophile
Permalink
bearophile
Permalink
One way to find what's missing among Phobos ranges and higher order functions is to try to use them. This is a small Rosettacode task, given a file of words, it asks to find the longest pair of anagrams that have not even one letter in the same position:

http://rosettacode.org/wiki/Anagrams/Deranged_anagrams

I don't fully understand the Clojure solution:


(let
  [words    (re-seq #"\w+" (slurp "unixdict.txt"))
   anagrams (filter second (vals (group-by sort words)))
   deranged (remove #(some true? (apply map = %)) anagrams)]
  (prn (last (sort-by #(count (first %)) deranged))))


I have written two D solutions, the first tries to be short and the second to be faster. This is the first solution (look in that page for the fast solution):


void main() {
    import std.stdio, std.file, std.algorithm, std.string,
           std.array;

    string[][dstring] anags;
    foreach (const w; "unixdict.txt".readText.split)
        anags[w.array.sort().release.idup] ~= w;

    anags
    .byValue
    .map!(anags => anags.cartesianProduct(anags)
                   .filter!q{a[].equal!{ a != b }})
    .join
    .minPos!q{ a[0].length > b[0].length }[0]
    .writeln;
}



A better short D version:

void main() {
    import std.stdio, std.file, std.algorithm, std.string,
           std.array;

    "unixdict.txt"
    .readText
    .split
    .hashGroupBy!(w => w.sorted.release)
    .byValue
    .map!(anags => anags
                   .pairwise
                   .filter!q{a[].equal!q{ a != b }})
    .join
    .max!q{ a[0].length }
    .writeln;
}


Where:

hashGroupBy returns an associative array of the equivalence classes, given a key function (http://d.puremagic.com/issues/show_bug.cgi?id=9842 ).

sorted is similar to an array+sort. But it allocates a new array, so the its output should be implicitly castable to const (http://d.puremagic.com/issues/show_bug.cgi?id=5076  https://d.puremagic.com/issues/show_bug.cgi?id=12140 ).

pairwise is a range that yields the pairs, but only the lower triangle. So it's like two nested loops. In this program it's up to twice faster than using cartesianProduct (http://d.puremagic.com/issues/show_bug.cgi?id=6788 ).

max and min accept an optional mapping key function, as in Python (http://d.puremagic.com/issues/show_bug.cgi?id=4705 ).

Bye,
bearophile
Forums