Associative array literal: length wrong when duplicate keys found

January 31, 2017
Hi.

I wanted to check whether a few variables of the same type are all distinct, in a quick and dirty way.  I tried to do it similarly to Python's "len(set(value_list)) == len(value_list)" idiom, using an associative array (AA).  I then found out that when an AA is initialized with a literal, its length is the number of key:value pairs written, regardless of whether some of the keys are equal.

A minimized example:

-----
import std.stdio;
void main () {
	auto aa = [1 : 2, 1 : 3];
	writeln (aa.length, " ", aa); // 2 [1:3, ]
}
-----

See, the length is 2, but iteration over aa yields only one key:value pair.  Also note the trailing comma in the output, which is another sign of internal confusion.

My question is, what's the state of this?  Is this a bug?  Or should it be forbidden to have such an initializer?  Or maybe it is a feature with some actual merit?

Ivan Kazmenko.

January 31, 2017
On Tuesday, 31 January 2017 at 14:15:58 UTC, Ivan Kazmenko wrote:
> Hi.
>
> I wanted to check whether a few variables of the same type are all distinct, in a quick and dirty way.  I tried to do it similarly to Python's "len(set(value_list)) == len(value_list)" idiom, using an associative array (AA).  I then found out that when an AA is initialized with a literal, its length is the number of key:value pairs written, regardless of whether some of the keys are equal.
>
> A minimized example:
>
> -----
> import std.stdio;
> void main () {
> 	auto aa = [1 : 2, 1 : 3];
> 	writeln (aa.length, " ", aa); // 2 [1:3, ]
> }
> -----
>
> See, the length is 2, but iteration over aa yields only one key:value pair.  Also note the trailing comma in the output, which is another sign of internal confusion.
>
> My question is, what's the state of this?  Is this a bug?  Or should it be forbidden to have such an initializer?  Or maybe it is a feature with some actual merit?
>
> Ivan Kazmenko.

It's a bug, please report it. The initializer should be statically disallowed.

Adding a .dup works around the problem.
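
A minimal sketch of that workaround, assuming the behaviour described above. The helper name pairTable is made up for illustration; the keys are passed in at run time so that nothing can be rejected statically:

```d
// Hypothetical helper (not from the original post): builds an AA from a
// literal whose keys may coincide at run time, then copies it with .dup
// so that the reported length matches the entries actually stored.
int[int] pairTable(int a, int b)
{
    return [a : 2, b : 3].dup;
}

unittest
{
    auto same = pairTable(1, 1); // duplicate key
    assert(same.length == 1);
    assert(1 in same);

    auto diff = pairTable(1, 2); // distinct keys
    assert(diff.length == 2);
}
```

Compile with -unittest to run the checks.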

By the way, you can do sets like this, avoiding storing any dummy values, only keys:

-----
struct Set(T)
{
	void[0][T] data;
	
	void insert(T x)
	{
		data[x] = (void[0]).init;
	}
	
	void remove(T x)
	{
		data.remove(x);
	}
	
	bool opBinaryRight(string op : "in")(T e)
	{
		return !!(e in data);
	}
	
	// other things like length, etc.
}

unittest
{
	Set!int s;
	s.insert(4);
	assert(4 in s);
	s.remove(4);
	assert(4 !in s);
}
-----

January 31, 2017
On Tuesday, 31 January 2017 at 17:20:00 UTC, John Colvin wrote:
> It's a bug, please report it. The initializer should be statically disallowed.
>
> Adding a .dup works around the problem.

OK.  Hmm, but the real use case was a bit more complicated, more like:

-----
int n = 10;
foreach (i; 0..n)
    foreach (j; 0..n)
        foreach (k; 0..n)
            ... and maybe a couple more ...
            if ([i: true, j: true, k: true].length == 3)
                {...} // i, j, k is a set of distinct values
-----

Here, we don't know i, j and k statically, yet the problem is the same.
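
For run-time values like these, one way to sidestep the literal entirely is to insert the keys one at a time. This is only a sketch: allDistinct is a hypothetical helper name, and it assumes the key type can be used as an AA key.

```d
// Hypothetical helper: true if no value occurs twice among the arguments.
// Keys are inserted individually, so no AA literal is involved.
bool allDistinct(T)(T[] values...)
{
    bool[T] seen;
    foreach (v; values)
    {
        if (v in seen)
            return false; // v was already inserted: a repeated value
        seen[v] = true;
    }
    return true;
}

unittest
{
    // Count ordered triples of pairwise distinct values from 0 .. n.
    int n = 4, count = 0;
    foreach (i; 0 .. n)
        foreach (j; 0 .. n)
            foreach (k; 0 .. n)
                if (allDistinct(i, j, k))
                    ++count;
    assert(count == 4 * 3 * 2);
}
```

The early return also avoids building the whole table once a repeat has been seen.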

Anyway, I'll file a bug report.

> By the way, you can do sets like this, avoiding storing any dummy values, only keys:
>
> struct Set(T)
> {
> 	void[0][T] data;
> 	
> 	void insert(T x)
> 	{
> 		data[x] = (void[0]).init;
> 	}
> 	
> 	void remove(T x)
> 	{
> 		data.remove(x);
> 	}
> 	
> 	bool opBinaryRight(string op : "in")(T e)
> 	{
> 		return !!(e in data);
> 	}
> 	
> 	// other things like length, etc.
> }
>
> unittest
> {
> 	Set!int s;
> 	s.insert(4);
> 	assert(4 in s);
> 	s.remove(4);
> 	assert(4 !in s);
> }

Yeah, thanks for the recipe!  I usually use bool[key] since it does not add much overhead, but would definitely like a real set (void[0] or otherwise) when performance matters.
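
As a rough sketch of how the set recipe could carry the Python idiom from the first post: the length member and the distinctViaSet helper below are additions for illustration and are not part of the recipe as posted.

```d
// Cut-down version of the Set recipe, with a length member added
// (the addition is an assumption, filling in "other things like length").
struct Set(T)
{
    void[0][T] data; // zero-size values: only the keys carry information

    void insert(T x)
    {
        data[x] = (void[0]).init;
    }

    size_t length()
    {
        return data.length;
    }
}

// Hypothetical helper mirroring len(set(value_list)) == len(value_list).
bool distinctViaSet(T)(T[] values...)
{
    Set!T s;
    foreach (v; values)
        s.insert(v);
    return s.length == values.length;
}

unittest
{
    assert(distinctViaSet(1, 2, 3));
    assert(!distinctViaSet(1, 2, 2));
}
```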

Ivan Kazmenko.

February 03, 2017
On Tuesday, 31 January 2017 at 19:45:33 UTC, Ivan Kazmenko wrote:
> On Tuesday, 31 January 2017 at 17:20:00 UTC, John Colvin wrote:
>> It's a bug, please report it. The initializer should be statically disallowed.
>
> Anyway, I'll file a bug report.

Hmm, found it:  https://issues.dlang.org/show_bug.cgi?id=15290

I'll add details about my use case to the report, for what it's worth.