Thread overview
Re: Map with maintained insertion order
Mar 24, 2012
Jonathan M Davis
Mar 24, 2012
Brad Anderson
Mar 24, 2012
Andrej Mitrovic
Mar 24, 2012
Andrej Mitrovic
Mar 24, 2012
H. S. Teoh
Mar 24, 2012
H. S. Teoh
Mar 24, 2012
Andrej Mitrovic
Mar 24, 2012
Andrej Mitrovic
Mar 24, 2012
Andrej Mitrovic
March 24, 2012
On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> Does someone have a map implementation that maintains the insertion order of the keys?
> 
> E.g.:
> 
> string[string] map;
> map["foo"] = "x";
> map["bar"] = "y";
> 
> When iterating over map keys I want to visit "foo" first, then "bar", and so on.

I can't think of any data structure that does that off the top of my head, though there may very well be one. If you don't mind the extra memory overhead, it wouldn't be all that hard to do it with multiple maps wrapped in a single type. You could do something like

class InsertedOrderMap
{
    string[string] _map;
    RedBlackTree!(Tuple!(int, string), "a[0] < b[0]") _insertionOrder;

    ...
}

_map holds the normal map and _insertionOrder holds pairs of the insertion ordered and the key to _map.

You can then have your range iterate over _insertionOrder and use the keys to return either the keys, values, or pairs of keys and values from _map, depending on what you want to iterate over.

That _does_ require having two data structures in one, and there may a be a better way to do it, but just off the top of my head, that's the best that I can come up with. I don't know how you could have a data structure where it's efficient to index by both insertion order and key without effectively having two data structures. But I really should go reread my data structures books. It's been too long since I studied that stuff.

- Jonathan M Davis
March 24, 2012
On 3/23/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> Does someone have a map implementation that maintains the insertion order of the keys?

Here's a sloppy implementation:

module test;

import std.stdio;
import std.traits;

template KeyType(V : V[K], K)
    if (isAssociativeArray!(V[K]))
{
    alias K KeyType;
}

template ValueType(V : V[K], K)
    if (isAssociativeArray!(V[K]))
{
    alias V ValueType;
}

template BaseElement(T : U[], U)
{
    alias U BaseElement;
}

struct LinkedHash(T)
    if (isAssociativeArray!T)
{
    alias KeyType!T Key;
    alias ValueType!T Value;

    static if (isArray!Value)
        template isValue(T) { enum bool isValue = (is(T == Value) ||
is(BaseElement!Value == T)); }
    else
        template isValue(T) { enum bool isValue = is(T == Value); }

    Value[] payload;
    size_t[Key] keyToIndex;
    Key[size_t] indexToKey;
    size_t lastIndex;

    size_t getNewIndex()
    {
        payload.length = payload.length + 1;
        return lastIndex++;
    }

    size_t* getKeyIndex(string key)
    {
        if (auto index = key in keyToIndex)
            return index;
        else
            return null;
    }

    Value opIndex(in Key key)
    {
        if (auto index = getKeyIndex(key))
            return payload[*index];
        else
            assert(0, "Key '" ~ key ~ "' not found.");
    }

    void opIndexAssign(V)(V value, Key key)
        if (isValue!V)
    {
        if (auto index = getKeyIndex(key))
            payload[*index] = value;
        else
        {
            auto index = getNewIndex();
            payload[index] = value;
            keyToIndex[key] = index;
            indexToKey[index] = key;
        }
    }

    void opIndexOpAssign(string op, V)(V value, Key key)
        if (isValue!V)
    {
        if (auto index = getKeyIndex(key))
            mixin("payload[*index] " ~ op ~ "= value;");
        else
        {
            auto index = getNewIndex();
            mixin("payload[index] " ~ op ~ "= value;");
            keyToIndex[key] = index;
            indexToKey[index] = key;
        }
    }

    int opApply(scope int delegate(ref Key, ref Value) dg)
    {
        foreach (index; 0 .. lastIndex)
        {
            auto result = dg(indexToKey[index], payload[index]);
                if (result)
                    return result;
        }

        return 0;
    }
}

void main()
{
    auto names = ["1", "2", "3", "4", "5", "6", "7", "8"];

    LinkedHash!(string[string]) stringMap;
    foreach (name; names)
        stringMap[name] ~= "test";

    foreach (key, val; stringMap)
        writefln("%s: %s", key, val);

    LinkedHash!(string[][string]) stringArrMap;
    stringArrMap["100"] = ["foo"];
    stringArrMap["101"] ~= "bar";

    foreach (name; names)
        stringArrMap[name] ~= "test";

    foreach (key, val; stringArrMap)
        writefln("%s: %s", key, val);
}
March 24, 2012
On 3/24/12, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> I can't think of any data structure that does that off the top of my head

Java has it and they call it a LinkedHashMap: http://docs.oracle.com/javase/1.4.2/docs/api/java/util/LinkedHashMap.html

> That _does_ require having two data structures in one, and there may a be a better way to do it, but just off the top of my head, that's the best that I can come up with. I don't know how you could have a data structure where it's efficient to index by both insertion order and key without effectively having two data structures.

Yeah it's not efficient, but in my use-case I'm not looking so much for efficiency but convenience. Sometimes I have to look up a key based on a name to modify some values, and I have to keep the order of the keys.
March 24, 2012
On Fri, Mar 23, 2012 at 06:07:42PM -0700, Jonathan M Davis wrote:
> On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> > Does someone have a map implementation that maintains the insertion order of the keys?
> > 
> > E.g.:
> > 
> > string[string] map;
> > map["foo"] = "x";
> > map["bar"] = "y";
> > 
> > When iterating over map keys I want to visit "foo" first, then "bar", and so on.
> 
> I can't think of any data structure that does that off the top of my head, though there may very well be one.
[...]

Yes there is one. It's called a B-tree. And it's probably total overkill
for what you need. :-) But also, B-trees have O(log n) access time, not
O(1). :-(


T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.
March 24, 2012
On Fri, Mar 23, 2012 at 06:33:54PM -0700, H. S. Teoh wrote:
> On Fri, Mar 23, 2012 at 06:07:42PM -0700, Jonathan M Davis wrote:
> > On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> > > Does someone have a map implementation that maintains the insertion order of the keys?
> > > 
> > > E.g.:
> > > 
> > > string[string] map;
> > > map["foo"] = "x";
> > > map["bar"] = "y";
> > > 
> > > When iterating over map keys I want to visit "foo" first, then "bar", and so on.
> > 
> > I can't think of any data structure that does that off the top of my head, though there may very well be one.
> [...]
> 
> Yes there is one. It's called a B-tree. And it's probably total overkill
> for what you need. :-) But also, B-trees have O(log n) access time, not
> O(1). :-(
[...]

Actually, nevermind that. B-trees are probably not what you're looking for at all, and they aren't really suited to the task anyway (you want to sort by insertion order, not key order).

What you *could* do is implement a hash augmented with next/prev pointers in each slot, IOW, a linked list. The hash itself stores a pointer to the head and tail of the list. When a new slot is created, the tail of the list is updated to link to it. When a slot is deleted, its neighbours are updated. This will maintain O(1) performance of the hash.

In fact, you could use the current AA implementation if you wrap your data in a struct that contains the extra pointers, and wrap around the AA operations to update these pointers as needed.


T

-- 
Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
March 24, 2012
On 3/24/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> On 3/23/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
>> Does someone have a map implementation that maintains the insertion order of the keys?
>
> Here's a sloppy implementation:

I forgot to make opIndex ref, but also the isValue check fails on
structs with alias this, it's better to use is(typeof()) instead of
directly checking the type. The fix would be:

static if (isArray!Value)
{
    template isValue(T)
    {
       enum bool isValue = is(typeof( { Value v; T t; v = t; } )) ||
is(BaseElement!Value == T);
    }
}
else
{
    template isValue(T) { enum bool isValue = is(typeof( { Value v; T
t; v = t; } )); }
}
March 24, 2012
On 3/24/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> static if (isArray!Value)
> {
>     template isValue(T)
>     {
>        enum bool isValue = is(typeof( { Value v; T t; v = t; } )) ||
> is(BaseElement!Value == T);
>     }
> }

Correction, it's:
template isValue(T)
{
    enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
    || is(typeof( { BaseElement!Value v; T t; v = t; } ))
    || is(BaseElement!Value == T);
}

Complicated beast, but I have to take into acount both 'alias this' tricks and the fact that ~= works when RHS is either an element type or an array.
March 24, 2012
On 3/24/12, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> Correction, it's:
> template isValue(T)
> {
>     enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
>     || is(typeof( { BaseElement!Value v; T t; v = t; } ))
>     || is(BaseElement!Value == T);
> }

Last check not needed, so:
> template isValue(T)
> {
>     enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
>     || is(typeof( { BaseElement!Value v; T t; v = t; } ));
> }

Anyway I'll stop spamming.
March 24, 2012
On Saturday, 24 March 2012 at 01:07:56 UTC, Jonathan M Davis wrote:
> On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
>> Does someone have a map implementation that maintains the insertion
>> order of the keys?
>> 
>> E.g.:
>> 
>> string[string] map;
>> map["foo"] = "x";
>> map["bar"] = "y";
>> 
>> When iterating over map keys I want to visit "foo" first, then "bar", and
>> so on.
>
> I can't think of any data structure that does that off the top of my head,
> though there may very well be one. If you don't mind the extra memory
> overhead, it wouldn't be all that hard to do it with multiple maps wrapped
> in a single type. You could do something like
>
> class InsertedOrderMap
> {
>     string[string] _map;
>     RedBlackTree!(Tuple!(int, string), "a[0] < b[0]") _insertionOrder;
>
>     ...
> }
>
> _map holds the normal map and _insertionOrder holds pairs of the insertion
> ordered and the key to _map.
>
> You can then have your range iterate over _insertionOrder and use the keys
> to return either the keys, values, or pairs of keys and values from _map,
> depending on what you want to iterate over.
>
> That _does_ require having two data structures in one, and there may a be a
> better way to do it, but just off the top of my head, that's the best that I
> can come up with. I don't know how you could have a data structure where
> it's efficient to index by both insertion order and key without effectively
> having two data structures. But I really should go reread my data structures
> books. It's been too long since I studied that stuff.
>
> - Jonathan M Davis

Ruby's Hash in 1.9 maintains insertion order.  It does it using a linked list between the nodes as they are created (so yes, essentially two data structures).

Regards,
Brad Anderson