Thread overview
Container templates and constancy of elements
Mar 14, 2012
Stewart Gordon
Mar 14, 2012
H. S. Teoh
Mar 14, 2012
Ali Çehreli
Mar 14, 2012
Stewart Gordon
March 14, 2012
With some containers, such as lists and trees (binary search trees aside), it doesn't matter if the elements can change state, since the data structure remains well-formed.

However, with others, such as sets and AAs, it's desirable if the keys don't mutate.  Any operations on them won't work right if there are keys in the wrong hash slots or out of order (depending on the underlying data structure) because they have changed since being put into the container.  Of course, this doesn't apply to _values_ in an AA, since these can safely mutate.

In Java and D1, it seems you just have to go on trust if you want your container to accept key types other than strict value types.  But can we do better in D2 - short of forcing the key type to be immutable, which would hinder the container's usefulness?

But it seems D2 has taken one step in that general direction by automatically tail-consting the key type of an AA.  But it doesn't stop modifications to the key through mutable references to the data.  And if the key is of class type, you can still modify the object, since D2 conflates constancy of an object reference with constancy of the object itself.  This is in itself silly.  Really, D should have tail-const built in for stuff like this.  OK, so there's Rebindable, but I've found it to be a PITA when trying to do generic programming with it, as well as leading to this AA key anomaly because it isn't a built-in feature.


I suppose there are a few possibilities for what constancy to apply to elements of a data structure to which changes might affect the structure's integrity:

(a) force the type to be tail-const if it's an array, otherwise don't add any constancy
(b) force the type to be tail-const if it's an array, or fully const if it's a class
(c) force the type to be tail-immutable if it's an array, otherwise don't add any constancy
(d) force the type to be tail-immutable if it's an array, or fully immutable if it's a class
(e) don't add any constancy, but just rely on trust


Currently, AAs implement (a).  (d) guarantees that changes to the data that mess up the data structure will never happen, at least if the programmer doesn't bypass the const system.  (e) is the route D1 and Java are forced to take.  Am I right in thinking that sets and maps in the C++ STL take the same path?


Anyway, what are people's thoughts on the right way to go here?

Stewart.
March 14, 2012
On Wed, Mar 14, 2012 at 06:51:54PM +0000, Stewart Gordon wrote:
> With some containers, such as lists and trees (binary search trees aside), it doesn't matter if the elements can change state, since the data structure remains well-formed.
> 
> However, with others, such as sets and AAs, it's desirable if the keys don't mutate.  Any operations on them won't work right if there are keys in the wrong hash slots or out of order (depending on the underlying data structure) because they have changed since being put into the container.  Of course, this doesn't apply to _values_ in an AA, since these can safely mutate.

IMO, AA keys should be *implicitly* immutable. That is, when you declare something like:

	int[ubyte[]] x;

then the key type of x should be immutable(ubyte)[], not ubyte[].

Otherwise, it just doesn't make any sense, and causes several of the AA-related bugs currently on the bugtracker.


> In Java and D1, it seems you just have to go on trust if you want your container to accept key types other than strict value types. But can we do better in D2 - short of forcing the key type to be immutable, which would hinder the container's usefulness?
> 
> But it seems D2 has taken one step in that general direction by automatically tail-consting the key type of an AA.  But it doesn't stop modifications to the key through mutable references to the data.

Exactly. AA keys must be immutable, no matter what. Of course, to interact nicely with existing code, methods like .opIndex or .get should also accept mutable keys for lookup purposes, and .idup them when a new entry needs to be created.


> And if the key is of class type, you can still modify the object, since D2 conflates constancy of an object reference with constancy of the object itself.  This is in itself silly.  Really, D should have tail-const built in for stuff like this.

This is a major PITA. Especially if you're trying to be const-correct in your code, then it leads to nonsense like being unable to traverse a linked list inside a const method, because the iteration pointer itself is const (whereas it's the *data* it's pointing to that's const) so you can't overwrite the loop variable. However, you *can* recursively search the list.

I find this to be a major WAT in D.


[...]
> I suppose there are a few possibilities for what constancy to apply to elements of a data structure to which changes might affect the structure's integrity:
> 
> (a) force the type to be tail-const if it's an array, otherwise don't add any constancy
> (b) force the type to be tail-const if it's an array, or fully const if it's a class
> (c) force the type to be tail-immutable if it's an array, otherwise don't add any constancy
> (d) force the type to be tail-immutable if it's an array, or fully immutable if it's a class
> (e) don't add any constancy, but just rely on trust
> 
> 
> Currently, AAs implement (a).

Which is prone to bugs.


> (d) guarantees that changes to the data that mess up the data structure will never happen, at least if the programmer doesn't bypass the const system.
[...]

I believe this is the best way to go. Well, at least for AA's this is needed. Otherwise there will always be the possibility that AA's will malfunction when the key changes behind the container's back.


T

-- 
Just because you survived after you did it, doesn't mean it wasn't stupid!
March 14, 2012
On 03/14/2012 12:24 PM, H. S. Teoh wrote:
> On Wed, Mar 14, 2012 at 06:51:54PM +0000, Stewart Gordon wrote:
>> With some containers, such as lists and trees (binary search trees
>> aside), it doesn't matter if the elements can change state, since
>> the data structure remains well-formed.
>>
>> However, with others, such as sets and AAs, it's desirable if the
>> keys don't mutate.  Any operations on them won't work right if there
>> are keys in the wrong hash slots or out of order (depending on the
>> underlying data structure) because they have changed since being put
>> into the container.  Of course, this doesn't apply to _values_ in an
>> AA, since these can safely mutate.
>
> IMO, AA keys should be *implicitly* immutable. That is, when you declare
> something like:
>
> 	int[ubyte[]] x;
>
> then the key type of x should be immutable(ubyte)[], not ubyte[].

Yes, the internally stored copy of the key should be immutable(ubyte[]). Please note the difference from yours, but I guess that extra protection is just to protect the developers from themselves by avoiding mutating the key in the implementation.

But the conceptual key type of the AA should be const(ubyte[]). The reason is, a const parameter is welcoming: It says "I accept both mutable and immutable as arguments". On the other hand, an immutable parameter is restricting: It says "I demand only immutable as argument".

The following opIndex accepts  keys of five different types of immutability:

class SingleElementAA
{
    immutable(uint[]) key;
    int value;

    ref int opIndex(const(uint[]) key)
    {
        return value;
    }
}

void main()
{
    auto aa = new SingleElementAA();

    uint[] mutable;
    immutable(uint)[] element_immutable;
    immutable(uint[]) all_immutable;
    enum uint[] enum_immutable = [ 1 ];

    aa[mutable] = 42;
    aa[element_immutable] = 43;
    aa[all_immutable] = 44;
    aa[enum_immutable] = 45;
    aa[[0, 1]] = 42;          // literal key
}

Ali

March 14, 2012
On Wed, 14 Mar 2012 14:51:54 -0400, Stewart Gordon <smjg_1998@yahoo.com> wrote:

> With some containers, such as lists and trees (binary search trees aside), it doesn't matter if the elements can change state, since the data structure remains well-formed.
>
> However, with others, such as sets and AAs, it's desirable if the keys don't mutate.  Any operations on them won't work right if there are keys in the wrong hash slots or out of order (depending on the underlying data structure) because they have changed since being put into the container.  Of course, this doesn't apply to _values_ in an AA, since these can safely mutate.
>
> In Java and D1, it seems you just have to go on trust if you want your container to accept key types other than strict value types.  But can we do better in D2 - short of forcing the key type to be immutable, which would hinder the container's usefulness?
>
> But it seems D2 has taken one step in that general direction by automatically tail-consting the key type of an AA.  But it doesn't stop modifications to the key through mutable references to the data.  And if the key is of class type, you can still modify the object, since D2 conflates constancy of an object reference with constancy of the object itself.  This is in itself silly.  Really, D should have tail-const built in for stuff like this.  OK, so there's Rebindable, but I've found it to be a PITA when trying to do generic programming with it, as well as leading to this AA key anomaly because it isn't a built-in feature.
>
>
> I suppose there are a few possibilities for what constancy to apply to elements of a data structure to which changes might affect the structure's integrity:
>
> (a) force the type to be tail-const if it's an array, otherwise don't add any constancy
> (b) force the type to be tail-const if it's an array, or fully const if it's a class
> (c) force the type to be tail-immutable if it's an array, otherwise don't add any constancy
> (d) force the type to be tail-immutable if it's an array, or fully immutable if it's a class
> (e) don't add any constancy, but just rely on trust
>
>
> Currently, AAs implement (a).  (d) guarantees that changes to the data that mess up the data structure will never happen, at least if the programmer doesn't bypass the const system.  (e) is the route D1 and Java are forced to take.  Am I right in thinking that sets and maps in the C++ STL take the same path?
>
>
> Anyway, what are people's thoughts on the right way to go here?

IMO, @safe code should only allow d, @system/trusted should allow e.

And the compiler shouldn't be "helping" you by adding const annotations.  That is:

int[char[]] aa;

this line should either compile, and the type should be int[char[]] aa, or it should not compile.

D is supposed to be able to do "at your own risk" bare-metal types of things.  This seems like it unnecessarily limits code.

-Steve
March 14, 2012
On 14/03/2012 19:24, H. S. Teoh wrote:
<snip>
> Exactly. AA keys must be immutable, no matter what. Of course, to
> interact nicely with existing code, methods like .opIndex or .get should
> also accept mutable keys for lookup purposes, and .idup them when a new
> entry needs to be created.

This would rely on class authors making sure they define .idup.  We would also need a deep version of .idup for array-of-array and array-of-class types.

<snip>
>> (d) guarantees that changes to the data that mess up the data
>> structure will never happen, at least if the programmer doesn't bypass
>> the const system.
> [...]
>
> I believe this is the best way to go. Well, at least for AA's this is
> needed. Otherwise there will always be the possibility that AA's will
> malfunction when the key changes behind the container's back.

Thinking about it, if we're going to go this far, maybe we could just make the key type fully immutable whatever it is.  This would enable the key variable in a foreach to be ref for efficiency (especially useful if it's a struct type) and still prevent changing of the key through it.

But this precludes implementing hash slots as arrays, at least if you want to be able to delete elements.  I might have to rethink my strategy here as well.  A linked list would get around it but increase the memory allocation overhead - not sure if this is worth worrying about.

Stewart.