Thread overview
AA key conversion woes
Apr 17, 2012
H. S. Teoh
Apr 17, 2012
Dmitry Olshansky
Apr 17, 2012
bearophile
Apr 17, 2012
bearophile
Apr 18, 2012
H. S. Teoh
Apr 18, 2012
bearophile
Apr 18, 2012
Jonathan M Davis
Apr 18, 2012
Somedude
Apr 17, 2012
Timon Gehr
April 17, 2012
I'm having some major roadblocks with the AA implementation w.r.t. how to store/convert AA key types correctly. I've been working on this for a while now but still cannot find a satisfactory solution; hopefully y'all can help.

First, in order to take advantage of the compiler's auto-conversion magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to avoid template bloat, we must take const(Key) as argument for all key lookup methods. So Key must at least be stored as const. But since we have to do this already, might as well do it right: Keys should be stored as immutable.

The nice thing about this is that we can now *guarantee* that AA's won't malfunction due to the user changing stored key values via an external mutable reference.

The bad thing about this is how to deal with the various key types that the user might try to pass in:

1) For value types like int, there's no problem: immutable(int) interconverts freely with int via assignment, so we can store int keys however we like, and we can always hand back unqualified ints to the user. So if the user declares int[int], .keys will give int[], and if the user declares const(int)[int], then .keys will give back const(int)[]. No problem.

2) Where things get hairy is when non-trivial keys are stored. Take arrays for example. If the user declares int[int[]], then we need an .idup so that we can store the key as immutable(int)[] (or should that be immutable(int[])?). But now if the user passes in an int[] key, we will need to .idup it in order to store it. This is acceptable, but what should .keys return?  If I were the user, I'd expect int[][], but since we can't implicitly convert immutable(int)[] to int[], we need a .dup for each key.  Which introduces unnecessary overhead, since for the most part the user doesn't need to change the keys anyway. But handing back immutable(int)[][] from .keys will break a lot of existing code.

3) With arrays, things are still not too bad because we have .dup and .idup. But what about structs or classes? We do not have .idup if the key type is specified as non-immutable, so how should the keys be stored? (Besides, do we *want* to clone objects used as AA keys in the first place?) And what type should .keys return?

4) What should be done if the key type is shared or inout? (I'm tempted to say outright prohibit shared, but people may not like their code breaking if they're actually using that in existing code.)

I'm tempted to propose that the language should be changed so that AA keys are *always* immutable implicitly. That is, writing int[int] is exactly the same as writing int[immutable(int)], and .keys will always return immutable. This is the "correct" approach IMO, and existing code should be fixed if this breaks them. This will simplify a lot of the mess above (we can still support .idup for when the user wants to lookup mutable keys, etc., but there will be no concessions for .keys to hand back mutable keys -- .dup them yourself).

But I'm expecting some negative reactions to this hardline approach. :-)

But even then, I'm considering if .keys should return a mutable array if the key is a value type (e.g., it's harmless for int[int].keys to return int[] because the int's are a copy anyway, and this way user code won't be unnecessarily straitjacketed to propagate immutable throughout).

I'd like to hear how people think this should work, so that the new AA implementation will be more acceptable.


T

-- 
Never trust an operating system you don't have source for! -- Martin Schulze
April 17, 2012
On 17.04.2012 22:41, H. S. Teoh wrote:
> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.
>
> The nice thing about this is that we can now *guarantee* that AA's won't
> malfunction due to the user changing stored key values via an external
> mutable reference.
>
> The bad thing about this is how to deal with the various key types that
> the user might try to pass in:
>
> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.
>
> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.
>
> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?
>
> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)
>
> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable. This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys, etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).
>
> But I'm expecting some negative reactions to this hardline approach. :-)
>
+1 actually. Given H[Key] assoc;
I just expect assoc.keys() to return array of immutable(Key). It most likely needs to be allocated.

> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
>


-- 
Dmitry Olshansky
April 17, 2012
H. S. Teoh:

> I'm tempted to propose that the language should be changed
> so that AA keys are *always* immutable implicitly. That is,
> writing int[int] is exactly the same as writing
> int[immutable(int)],

See:
http://d.puremagic.com/issues/show_bug.cgi?id=6253

Bye,
bearophile
April 17, 2012
On 04/17/2012 08:41 PM, H. S. Teoh wrote:
> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.
>
> The nice thing about this is that we can now *guarantee* that AA's won't
> malfunction due to the user changing stored key values via an external
> mutable reference.
>
> The bad thing about this is how to deal with the various key types that
> the user might try to pass in:
>
> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.
>
> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.
>

How? This is the current behavior:
int[int[]] aa;
static assert(is(typeof(aa.keys)==const(int)[][]));

A change from const(int)[][] to immutable(int)[][] shouldn't break a significant amount of code.

> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?
>

They could implement .dup or .idup if they want those semantics. If the container cannot figure out how to get an immutable key from a mutable key, then indexing with mutable keys is disabled.

> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)
>

immutable is implicitly shared, but implicit .idup should be disabled. For inout there would need to be a general solution for resolving templated types. For example:

struct S(T){
    int[] x;
    auto opResolveInout(string op){ ... } // op is "const"/immutable/""
}

> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable.

Probably it should return an array of tail-immutable keys. Additionally, as Andrej proposes, there should be ikeys that returns a tail-immutable array.

> This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys,

Lookup should succeed without implicit .idup whenever supportable. Otherwise I fail to see the benefit of the implicit .idup behavior and it should be dropped entirely.

> etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).
>
> But I'm expecting some negative reactions to this hardline approach. :-)
>
> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
>

'keys' should return an array of tail-immutable keys.
int[int] => typeof(keys)==int[], typeof(ikeys)==immutable(int)[]
int[int[]] => typeof(keys) == immutable(int)[][], typeof(ikeys)==immutable(int[])[].

etc.

April 17, 2012
> http://d.puremagic.com/issues/show_bug.cgi?id=6253

Maybe it needs a bit more explanation. It goes according to this idea:
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

If I define:
Foo[] a;
I expect those Foo items to be mutable.

If I see:
int[Foo]
I expect those Foo keys to be mutable.

If I see:
int[Foo]
I expect those Foo keys to be mutable.

If I see:
int[immutable Foo]
I expect those Foo keys to be immutable.

If I see a int[Foo] and I get immutable Foo keys, I am astonished.

Not doing what I am saying here will add another special case to D language. Avoiding many special cases is one the reasons I program in D instead of C++.

Bye,
bearophile
April 18, 2012
On Tue, Apr 17, 2012 at 10:13:49PM +0200, bearophile wrote:
> >http://d.puremagic.com/issues/show_bug.cgi?id=6253
> 
> Maybe it needs a bit more explanation. It goes according to this idea: http://en.wikipedia.org/wiki/Principle_of_least_astonishment
> 
> If I define:
> Foo[] a;
> I expect those Foo items to be mutable.
> 
> If I see:
> int[Foo]
> I expect those Foo keys to be mutable.
> 
> If I see:
> int[Foo]
> I expect those Foo keys to be mutable.
> 
> If I see:
> int[immutable Foo]
> I expect those Foo keys to be immutable.
> 
> If I see a int[Foo] and I get immutable Foo keys, I am astonished.
> 
> Not doing what I am saying here will add another special case to D language. Avoiding many special cases is one the reasons I program in D instead of C++.
[...]

So you're basically saying that we should refuse all non-immutable keys in AA's?


T

-- 
Designer clothes: how to cover less by paying more.
April 18, 2012
H. S. Teoh:

> So you're basically saying that we should refuse all non-immutable keys in AA's?

Something like that.

Bye,
bearophile
April 18, 2012
On Tuesday, April 17, 2012 14:17:38 H. S. Teoh wrote:
> So you're basically saying that we should refuse all non-immutable keys in AA's?

I would not be against enforcing that all key types be immutable or implicitly convertible to immutable. So, something like int[Foo] would be illegal unless Foo was a struct which could be implicitly converted to immutable.

But there's certainly an argument for allowing non-immutable key types in the declaration and automatically tacking on immutable so that int[Foo] is legal, but is of course really int[immutable Foo].

Allowing int[Foo] has the benefit of brevity but with a cost in clarity. The fact that int[Foo] is currently allowed though is a definite argument against making it illegal.

- Jonathan M Davis
April 18, 2012
On Tue, 17 Apr 2012 14:41:42 -0400, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:

> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.

This is probably the best solution.  I will note that dcollections does not require immutable keys for maps, and likely would need work to be able to do immutable keys anyway.

The one part where I think immutable keys can be hindering/confusing is for classes.  A coder may write something like:

class Foo
{
   private int _x;
   @property int x() { return _x;}
}

In effect, Foo is immutable, since there is no valid way to change Foo.  But having Foo not be actually marked as immutable is nice in that you don't have to deal with the lack of tail-immutable for classes.

In such a case, effectively, int[Foo] is really never going to be a problem.  But int[immutable(Foo)] will not allow you to access the x property.

Yes, there are better ways to mark Foo, but it still poses an unnecessary restriction.

> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.

Sounds good (think you meant int[const(int)])

> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.

My first reaction is, just don't allow it.  If you are going to store immutable keys, require immutable keys.

BTW, immutable(int[]) is likely going to be required to be stored as a key, since this can also be a key:

struct S
{
   int[]
}

and there's no way to turn that into immutable(int)[].  So you will have to deal with this situation, might as well do it now.

> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?

.dup and .idup should not be used lightly.  Specifically, setting a hash value should be an amortized O(1) operation.  Using idup will make it O(unbounded) :)

> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)

I think with the current implementation, inout is disallowed on a member field.  And for good reason.  inout only really makes sense inside an inout function.

> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable. This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys, etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).

I think it already does this.  But I don't like it.  I think it's better to just require immutable and be done with it.

People will say we can't do that because int[int[]] currently compiles, but somehow it was ok to make int[int[]] silently turn into int[immutable(int[])], which broke quite a bit of code.  I see this as a similar scenario.

> But I'm expecting some negative reactions to this hardline approach. :-)
>
> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
April 18, 2012
Le 17/04/2012 22:13, bearophile a écrit :
>> http://d.puremagic.com/issues/show_bug.cgi?id=6253
> If I see a int[Foo] and I get immutable Foo keys, I am astonished.
> 
> Not doing what I am saying here will add another special case to D language. Avoiding many special cases is one the reasons I program in D instead of C++.
> 
> Bye,
> bearophile

I agree, but if I had to choose, I'd rather be astonished than forced to have a program twice as slow because I need to duplicate all my keys for the compiler to be happy.