View mode: basic / threaded / horizontal-split · Log in · Help
April 17, 2012
AA key conversion woes
I'm having some major roadblocks with the AA implementation w.r.t. how
to store/convert AA key types correctly. I've been working on this for a
while now but still cannot find a satisfactory solution; hopefully y'all
can help.

First, in order to take advantage of the compiler's auto-conversion
magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
avoid template bloat, we must take const(Key) as argument for all key
lookup methods. So Key must at least be stored as const. But since we
have to do this already, might as well do it right: Keys should be
stored as immutable.

The nice thing about this is that we can now *guarantee* that AA's won't
malfunction due to the user changing stored key values via an external
mutable reference.

The bad thing about this is how to deal with the various key types that
the user might try to pass in:

1) For value types like int, there's no problem: immutable(int)
interconverts freely with int via assignment, so we can store int keys
however we like, and we can always hand back unqualified ints to the
user. So if the user declares int[int], .keys will give int[], and if
the user declares const(int)[int], then .keys will give back
const(int)[]. No problem.

2) Where things get hairy is when non-trivial keys are stored. Take
arrays for example. If the user declares int[int[]], then we need an
.idup so that we can store the key as immutable(int)[] (or should that
be immutable(int[])?). But now if the user passes in an int[] key, we
will need to .idup it in order to store it. This is acceptable, but what
should .keys return?  If I were the user, I'd expect int[][], but since
we can't implicitly convert immutable(int)[] to int[], we need a .dup
for each key.  Which introduces unnecessary overhead, since for the most
part the user doesn't need to change the keys anyway. But handing back
immutable(int)[][] from .keys will break a lot of existing code.

3) With arrays, things are still not too bad because we have .dup and
.idup. But what about structs or classes? We do not have .idup if the
key type is specified as non-immutable, so how should the keys be
stored? (Besides, do we *want* to clone objects used as AA keys in the
first place?) And what type should .keys return?

4) What should be done if the key type is shared or inout? (I'm tempted
to say outright prohibit shared, but people may not like their code
breaking if they're actually using that in existing code.)

I'm tempted to propose that the language should be changed so that AA
keys are *always* immutable implicitly. That is, writing int[int] is
exactly the same as writing int[immutable(int)], and .keys will always
return immutable. This is the "correct" approach IMO, and existing code
should be fixed if this breaks them. This will simplify a lot of the
mess above (we can still support .idup for when the user wants to lookup
mutable keys, etc., but there will be no concessions for .keys to hand
back mutable keys -- .dup them yourself).

But I'm expecting some negative reactions to this hardline approach. :-)

But even then, I'm considering if .keys should return a mutable array if
the key is a value type (e.g., it's harmless for int[int].keys to return
int[] because the int's are a copy anyway, and this way user code won't
be unnecessarily straitjacketed to propagate immutable throughout).

I'd like to hear how people think this should work, so that the new AA
implementation will be more acceptable.


T

-- 
Never trust an operating system you don't have source for! -- Martin Schulze
April 17, 2012
Re: AA key conversion woes
On 17.04.2012 22:41, H. S. Teoh wrote:
> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.
>
> The nice thing about this is that we can now *guarantee* that AA's won't
> malfunction due to the user changing stored key values via an external
> mutable reference.
>
> The bad thing about this is how to deal with the various key types that
> the user might try to pass in:
>
> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.
>
> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.
>
> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?
>
> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)
>
> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable. This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys, etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).
>
> But I'm expecting some negative reactions to this hardline approach. :-)
>
+1 actually. Given H[Key] assoc;
I just expect assoc.keys() to return array of immutable(Key). It most 
likely needs to be allocated.

> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
>


-- 
Dmitry Olshansky
April 17, 2012
Re: AA key conversion woes
H. S. Teoh:

> I'm tempted to propose that the language should be changed
> so that AA keys are *always* immutable implicitly. That is,
> writing int[int] is exactly the same as writing
> int[immutable(int)],

See:
http://d.puremagic.com/issues/show_bug.cgi?id=6253

Bye,
bearophile
April 17, 2012
Re: AA key conversion woes
On 04/17/2012 08:41 PM, H. S. Teoh wrote:
> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.
>
> The nice thing about this is that we can now *guarantee* that AA's won't
> malfunction due to the user changing stored key values via an external
> mutable reference.
>
> The bad thing about this is how to deal with the various key types that
> the user might try to pass in:
>
> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.
>
> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.
>

How? This is the current behavior:
int[int[]] aa;
static assert(is(typeof(aa.keys)==const(int)[][]));

A change from const(int)[][] to immutable(int)[][] shouldn't break a 
significant amount of code.

> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?
>

They could implement .dup or .idup if they want those semantics. If the 
container cannot figure out how to get an immutable key from a mutable 
key, then indexing with mutable keys is disabled.

> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)
>

immutable is implicitly shared, but implicit .idup should be disabled. 
For inout there would need to be a general solution for resolving 
templated types. For example:

struct S(T){
    int[] x;
    auto opResolveInout(string op){ ... } // op is "const"/immutable/""
}

> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable.

Probably it should return an array of tail-immutable keys. Additionally, 
as Andrej proposes, there should be ikeys that returns a tail-immutable 
array.

> This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys,

Lookup should succeed without implicit .idup whenever supportable. 
Otherwise I fail to see the benefit of the implicit .idup behavior and 
it should be dropped entirely.

> etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).
>
> But I'm expecting some negative reactions to this hardline approach. :-)
>
> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
>

'keys' should return an array of tail-immutable keys.
int[int] => typeof(keys)==int[], typeof(ikeys)==immutable(int)[]
int[int[]] => typeof(keys) == immutable(int)[][], 
typeof(ikeys)==immutable(int[])[].

etc.
April 17, 2012
Re: AA key conversion woes
> http://d.puremagic.com/issues/show_bug.cgi?id=6253

Maybe it needs a bit more explanation. It goes according to this 
idea:
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

If I define:
Foo[] a;
I expect those Foo items to be mutable.

If I see:
int[Foo]
I expect those Foo keys to be mutable.

If I see:
int[Foo]
I expect those Foo keys to be mutable.

If I see:
int[immutable Foo]
I expect those Foo keys to be immutable.

If I see a int[Foo] and I get immutable Foo keys, I am astonished.

Not doing what I am saying here will add another special case to 
D language. Avoiding many special cases is one the reasons I 
program in D instead of C++.

Bye,
bearophile
April 18, 2012
Re: AA key conversion woes
On Tue, Apr 17, 2012 at 10:13:49PM +0200, bearophile wrote:
> >http://d.puremagic.com/issues/show_bug.cgi?id=6253
> 
> Maybe it needs a bit more explanation. It goes according to this
> idea:
> http://en.wikipedia.org/wiki/Principle_of_least_astonishment
> 
> If I define:
> Foo[] a;
> I expect those Foo items to be mutable.
> 
> If I see:
> int[Foo]
> I expect those Foo keys to be mutable.
> 
> If I see:
> int[Foo]
> I expect those Foo keys to be mutable.
> 
> If I see:
> int[immutable Foo]
> I expect those Foo keys to be immutable.
> 
> If I see a int[Foo] and I get immutable Foo keys, I am astonished.
> 
> Not doing what I am saying here will add another special case to D
> language. Avoiding many special cases is one the reasons I program
> in D instead of C++.
[...]

So you're basically saying that we should refuse all non-immutable keys
in AA's?


T

-- 
Designer clothes: how to cover less by paying more.
April 18, 2012
Re: AA key conversion woes
H. S. Teoh:

> So you're basically saying that we should refuse all 
> non-immutable keys in AA's?

Something like that.

Bye,
bearophile
April 18, 2012
Re: AA key conversion woes
On Tuesday, April 17, 2012 14:17:38 H. S. Teoh wrote:
> So you're basically saying that we should refuse all non-immutable keys
> in AA's?

I would not be against enforcing that all key types be immutable or implicitly 
convertible to immutable. So, something like int[Foo] would be illegal unless 
Foo was a struct which could be implicitly converted to immutable.

But there's certainly an argument for allowing non-immutable key types in the 
declaration and automatically tacking on immutable so that int[Foo] is legal, 
but is of course really int[immutable Foo].

Allowing int[Foo] has the benefit of brevity but with a cost in clarity. The 
fact that int[Foo] is currently allowed though is a definite argument against 
making it illegal.

- Jonathan M Davis
April 18, 2012
Re: AA key conversion woes
On Tue, 17 Apr 2012 14:41:42 -0400, H. S. Teoh <hsteoh@quickfur.ath.cx>  
wrote:

> I'm having some major roadblocks with the AA implementation w.r.t. how
> to store/convert AA key types correctly. I've been working on this for a
> while now but still cannot find a satisfactory solution; hopefully y'all
> can help.
>
> First, in order to take advantage of the compiler's auto-conversion
> magic (such as using literal [1,2,3] for ubyte[] keys, etc.) and to
> avoid template bloat, we must take const(Key) as argument for all key
> lookup methods. So Key must at least be stored as const. But since we
> have to do this already, might as well do it right: Keys should be
> stored as immutable.

This is probably the best solution.  I will note that dcollections does  
not require immutable keys for maps, and likely would need work to be able  
to do immutable keys anyway.

The one part where I think immutable keys can be hindering/confusing is  
for classes.  A coder may write something like:

class Foo
{
   private int _x;
   @property int x() { return _x;}
}

In effect, Foo is immutable, since there is no valid way to change Foo.   
But having Foo not be actually marked as immutable is nice in that you  
don't have to deal with the lack of tail-immutable for classes.

In such a case, effectively, int[Foo] is really never going to be a  
problem.  But int[immutable(Foo)] will not allow you to access the x  
property.

Yes, there are better ways to mark Foo, but it still poses an unnecessary  
restriction.

> 1) For value types like int, there's no problem: immutable(int)
> interconverts freely with int via assignment, so we can store int keys
> however we like, and we can always hand back unqualified ints to the
> user. So if the user declares int[int], .keys will give int[], and if
> the user declares const(int)[int], then .keys will give back
> const(int)[]. No problem.

Sounds good (think you meant int[const(int)])

> 2) Where things get hairy is when non-trivial keys are stored. Take
> arrays for example. If the user declares int[int[]], then we need an
> .idup so that we can store the key as immutable(int)[] (or should that
> be immutable(int[])?). But now if the user passes in an int[] key, we
> will need to .idup it in order to store it. This is acceptable, but what
> should .keys return?  If I were the user, I'd expect int[][], but since
> we can't implicitly convert immutable(int)[] to int[], we need a .dup
> for each key.  Which introduces unnecessary overhead, since for the most
> part the user doesn't need to change the keys anyway. But handing back
> immutable(int)[][] from .keys will break a lot of existing code.

My first reaction is, just don't allow it.  If you are going to store  
immutable keys, require immutable keys.

BTW, immutable(int[]) is likely going to be required to be stored as a  
key, since this can also be a key:

struct S
{
   int[]
}

and there's no way to turn that into immutable(int)[].  So you will have  
to deal with this situation, might as well do it now.

> 3) With arrays, things are still not too bad because we have .dup and
> .idup. But what about structs or classes? We do not have .idup if the
> key type is specified as non-immutable, so how should the keys be
> stored? (Besides, do we *want* to clone objects used as AA keys in the
> first place?) And what type should .keys return?

.dup and .idup should not be used lightly.  Specifically, setting a hash  
value should be an amortized O(1) operation.  Using idup will make it  
O(unbounded) :)

> 4) What should be done if the key type is shared or inout? (I'm tempted
> to say outright prohibit shared, but people may not like their code
> breaking if they're actually using that in existing code.)

I think with the current implementation, inout is disallowed on a member  
field.  And for good reason.  inout only really makes sense inside an  
inout function.

> I'm tempted to propose that the language should be changed so that AA
> keys are *always* immutable implicitly. That is, writing int[int] is
> exactly the same as writing int[immutable(int)], and .keys will always
> return immutable. This is the "correct" approach IMO, and existing code
> should be fixed if this breaks them. This will simplify a lot of the
> mess above (we can still support .idup for when the user wants to lookup
> mutable keys, etc., but there will be no concessions for .keys to hand
> back mutable keys -- .dup them yourself).

I think it already does this.  But I don't like it.  I think it's better  
to just require immutable and be done with it.

People will say we can't do that because int[int[]] currently compiles,  
but somehow it was ok to make int[int[]] silently turn into  
int[immutable(int[])], which broke quite a bit of code.  I see this as a  
similar scenario.

> But I'm expecting some negative reactions to this hardline approach. :-)
>
> But even then, I'm considering if .keys should return a mutable array if
> the key is a value type (e.g., it's harmless for int[int].keys to return
> int[] because the int's are a copy anyway, and this way user code won't
> be unnecessarily straitjacketed to propagate immutable throughout).
>
> I'd like to hear how people think this should work, so that the new AA
> implementation will be more acceptable.
>
>
> T
April 18, 2012
Re: AA key conversion woes
Le 17/04/2012 22:13, bearophile a écrit :
>> http://d.puremagic.com/issues/show_bug.cgi?id=6253
> If I see a int[Foo] and I get immutable Foo keys, I am astonished.
> 
> Not doing what I am saying here will add another special case to D
> language. Avoiding many special cases is one the reasons I program in D
> instead of C++.
> 
> Bye,
> bearophile

I agree, but if I had to choose, I'd rather be astonished than forced to
have a program twice as slow because I need to duplicate all my keys for
the compiler to be happy.
Top | Discussion index | About this forum | D home