Thread overview
A set type implemented as an AA wrapper
Mar 12, 2020
mark
Mar 12, 2020
Ferhat Kurtulmuş
Mar 12, 2020
Simen Kjærås
Mar 12, 2020
mark
Mar 12, 2020
mark
Mar 12, 2020
H. S. Teoh
Mar 12, 2020
mark
March 12, 2020
I use sets a lot and since I believe that D's rbtree is O(lg n) for add/remove/in and that D's AA is O(1) for these, I want to implement a set in terms of an AA.

Below is the code I've got so far. It allows for add and remove. However, it has three problems (that I know of):

XXX: I need to use an if on the struct to restrict T to be a type that supports toHash and opEquals (i.e., to be a valid AA key)

YYY: The range() method is clearly not good D style but I don't know how to support foreach (item; aaset) ...

ZZZ: I can't figure out how to support the in operator.

// XXX how do I write the if to ensure T is a valid AA key that suppors
// toHash & opEquals?
struct AAset(T) {
    private {
        alias Unit = void[0];
        enum unit = Unit.init;
        Unit[T] set;
    }
    size_t length() const { return set.length; }
    void add(T item) { set[item] = unit; }
    bool remove(T item) { return set.remove(item); }

    // YYY is there a better way so that people can just use
    //  foreach (var; aaset)
    auto range() { return set.byKey; }

    bool opBinaryRight(string op)(T lhs) { // ZZZ doesn't work
        static if (op == "in") return lhs in set;
        else static assert(0, "operator " ~ op ~ " not supported");
    }
    // TODO union(), intersection(), difference(), symmetric_difference()
}

unittest {
    import std.algorithm: sort;
    import std.array: array;
    import std.range: enumerate;
    import std.stdio: writeln;
    import std.typecons: Tuple;

    alias Pair = Tuple!(int, "count", string, "word");
    auto inputs = [Pair(1, "one"), Pair(2, "two"), Pair(3, "three"),
                   Pair(4, "four"), Pair(4, "two"), Pair(5, "five"),
                   Pair(6, "six")];
    AAset!string words;
    assert(words.length == 0);
    foreach (pair; inputs) {
        words.add(pair.word);
        assert(words.length == pair.count);
    }
    auto len = words.length;
    assert(!words.remove("missing"));
    assert(words.remove("one"));
    assert(words.length == len - 1);
    auto expected = ["five", "four", "six", "three", "two"];
    foreach (i, word; words.range.array.sort.enumerate)
        assert(word == expected[i]);
    /*
    assert("Z" !in words);
    assert("three" !in words);
    */
}
March 12, 2020
On Thursday, 12 March 2020 at 08:51:24 UTC, mark wrote:
> I use sets a lot and since I believe that D's rbtree is O(lg n) for add/remove/in and that D's AA is O(1) for these, I want to implement a set in terms of an AA.

> XXX: I need to use an if on the struct to restrict T to be a type that supports toHash and opEquals (i.e., to be a valid AA key)
>

Maybe there is another way for this. But I would consider using:
std.traits.hasMember

> YYY: The range() method is clearly not good D style but I don't
opApply can be used for it.

> ZZZ: I can't figure out how to support the in operator.

V* opBinaryRight(string op) with in should return a pointer for valid entries and null for non-existing entries. Thus, "if(key in AA){...}" works as expected. That is how AA implementation of druntime works.

Recently I ve ported it for -betterC. You can take a look at:
https://github.com/aferust/bcaa
March 12, 2020
On Thursday, 12 March 2020 at 08:51:24 UTC, mark wrote:
> I use sets a lot and since I believe that D's rbtree is O(lg n) for add/remove/in and that D's AA is O(1) for these, I want to implement a set in terms of an AA.
>
> Below is the code I've got so far. It allows for add and remove. However, it has three problems (that I know of):
>
> XXX: I need to use an if on the struct to restrict T to be a type that supports toHash and opEquals (i.e., to be a valid AA key)

I'd suggest simply testing if an AA with that key type is valid:

    struct AAset(T) if (is(int[T]))


> YYY: The range() method is clearly not good D style but I don't know how to support foreach (item; aaset) ...

As Ferhat points out, you could use opApply for this. There's also the option of implenting the range primitives front, popFront() and empty. However, the easiest solution is the one you've already chosen, combined with alias this:

    struct AAset(T) if (is(int[T])) {
        // stuffs...
        auto range() { return set.byKey; }
        alias range this;
    }


> ZZZ: I can't figure out how to support the in operator.

'in' for AAs returns a pointer to the value it finds, or null if no item is found. This pointer does not implicitly convert to bool when returned from a function, so you need to compare it to null:

    bool opBinaryRight(string op)(T lhs) {
        static if (op == "in") return (lhs in set) != null;
        else static assert(0, "operator " ~ op ~ " not supported");
    }

I would also suggest using a template specialization instead of static if and static assert:

    bool opBinaryRight(string op : "in")(T lhs) {
        return (lhs in set) != null;
    }

--
  Simen
March 12, 2020
On Thursday, 12 March 2020 at 11:33:25 UTC, Simen Kjærås wrote:
[snip]
> I'd suggest simply testing if an AA with that key type is valid:
>
>     struct AAset(T) if (is(int[T]))

That's very subtle, but it works.

> As Ferhat points out, you could use opApply for this. There's also the option of implenting the range primitives front, popFront() and empty. However, the easiest solution is the one you've already chosen, combined with alias this:
>
>     struct AAset(T) if (is(int[T])) {
>         // stuffs...
>         auto range() { return set.byKey; }
>         alias range this;
>     }

Again, subtle, and again it works!

> I would also suggest using a template specialization instead of static if and static assert:
>
>     bool opBinaryRight(string op : "in")(T lhs) {
>         return (lhs in set) != null;
>     }

Thanks, you've solved all the issued I had! Here's the complete revised code:

struct AAset(T) if (is(int[T])) {
    private {
        alias Unit = void[0];
        enum unit = Unit.init;
        Unit[T] set;
    }
    size_t length() const { return set.length; }
    void add(T item) { set[item] = unit; }
    bool remove(T item) { return set.remove(item); }
    auto range() { return set.byKey; }
    alias range this;
    bool opBinaryRight(string op: "in")(T lhs) {
        return (lhs in set) != null;
    }
}

unittest {
    import std.algorithm: sort;
    import std.array: array;
    import std.range: enumerate;
    import std.stdio: writeln;
    import std.typecons: Tuple;

    writeln("unittest for the aaset library.");
    alias Pair = Tuple!(int, "count", string, "word");
    immutable inputs = [Pair(1, "one"), Pair(2, "two"), Pair(3, "three"),
                        Pair(4, "four"), Pair(4, "two"), Pair(5, "five"),
                        Pair(6, "six")];
    AAset!string words;
    assert(words.length == 0);
    foreach (pair; inputs) {
        words.add(pair.word);
        assert(words.length == pair.count);
    }
    immutable len = words.length;
    assert(!words.remove("missing"));
    assert(words.remove("one"));
    assert(words.length == len - 1);
    immutable expected = ["five", "four", "six", "three", "two"];
    foreach (i, word; words.array.sort.enumerate)
        assert(word == expected[i]);
    assert("Z" !in words);
    assert("three" in words);
}
March 12, 2020
Just in case anyone would like to use it, I've put it on github and also added it as a dub package.
https://code.dlang.org/packages/aaset
March 12, 2020
On Thu, Mar 12, 2020 at 08:51:24AM +0000, mark via Digitalmars-d-learn wrote: [...]
> YYY: The range() method is clearly not good D style but I don't know
> how to support foreach (item; aaset) ...

The usual idiom is to overload .opSlice, then you can do:

	foreach (item; aaset[]) { ... }

IOW rename .range to .opSlice.


> ZZZ: I can't figure out how to support the in operator.

Note that 'x in aa' returns a pointer, not a bool.  You could try:

	bool opBinaryRight(string op : "in")(T lhs) {
		return (lhs in set) !is null;
	}


T

-- 
The richest man is not he who has the most, but he who needs the least.
March 12, 2020
On Thursday, 12 March 2020 at 16:02:14 UTC, H. S. Teoh wrote:
> On Thu, Mar 12, 2020 at 08:51:24AM +0000, mark via Digitalmars-d-learn wrote: [...]
>> YYY: The range() method is clearly not good D style but I don't know
>> how to support foreach (item; aaset) ...
>
> The usual idiom is to overload .opSlice, then you can do:
>
> 	foreach (item; aaset[]) { ... }
>
> IOW rename .range to .opSlice.
>
>
>> ZZZ: I can't figure out how to support the in operator.
>
> Note that 'x in aa' returns a pointer, not a bool.  You could try:
>
> 	bool opBinaryRight(string op : "in")(T lhs) {
> 		return (lhs in set) !is null;
> 	}

I renamed .range to .opSlice (and deleted the alias range this) and  foreach(item; aaset) works fine.

As for the != null, that was a mistake (due to Java) which I've now fixed.

I'm hoping to add union & intersection soon too.

Thanks.