Thread overview | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
April 12, 2009 Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Sometimes you want to use an associative array just to track keys you've seen, or count distinct keys, and you don't care about the values. The language will let you declare an array such as void[int]; but you can't actually do anything with it: void main() { void[int] voidmap; // This compiles voidmap[1] = void; // This doesn't } My question is, is this a common or useful enough special case to warrant inclusion in the language? Or should I just go quietly and use a bool or a byte or something? |
April 12, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to Doctor J | On 12.04.2009 17:50, Doctor J wrote:
> Sometimes you want to use an associative array just to track keys you've seen, or count distinct keys, and you don't care about the values. The language will let you declare an array such as void[int]; but you can't actually do anything with it:
>
> void main()
> {
> void[int] voidmap; // This compiles
> voidmap[1] = void; // This doesn't
> }
>
> My question is, is this a common or useful enough special case to warrant inclusion in the language? Or should I just go quietly and use a bool or a byte or something?
>
>
You can just use an associative array to make a set type, which has the operations you need. Tango has a HashSet, but I assume you're using phobos. The implementation below is very basic, but it did what I needed at the time. No difference, union, or intersection operations, though.
struct Set(T) {
private bool[T] data_;
void add(T val) { data_[val] = true; } ///
void remove(T val) { data_.remove(val); } ///
bool opIn_r(T val) { return (val in data_) != null; } ///
size_t length() { return data_.length; } ///
void rehash() { data_.rehash; } ///
T[] toArray() { return data_.keys; } ///
///
static Set!(T) opCall(T[] values=null)
{
Set!(T) set;
foreach (T val; values)
set.add(val);
return set;
}
///
int opApply(int delegate(ref T) dg)
{
int result = 0;
foreach (key, val; data_) {
result = dg(key);
if (result)
break;
}
return result;
}
}
|
April 12, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to Doctor J | Doctor J: In my libs there is a Set(), but it maps to the built-in AAs: http://www.fantascienza.net/leonardo/so/dlibs/sets.html (and adds set methods). Adding sets to the language as AAs with void values, plus set methods is nice. Bye, bearophile |
April 12, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to Doctor J | == Quote from Doctor J (nobody@nowhere.com)'s article > Sometimes you want to use an associative array just to track keys you've seen, or count distinct keys, and you don't care about the values. The language will let you declare an array such as void[int]; but you can't actually do anything with it: > void main() > { > void[int] voidmap; // This compiles > voidmap[1] = void; // This doesn't > } > My question is, is this a common or useful enough special case to warrant inclusion in the language? Or should I just go quietly and use a bool or a byte or something? IMHO using AAs as makeshift sets is a terrible solution for a few reasons: 1. Without the void[someType] hack, it wastes at least one byte per element, sometimes a lot more depending on alignment issues. If you use the void[SomeType] hack, you get really screwy syntax, to the point where a library type would have much better syntax. This means you gain nothing except maybe usability at compile time by having it builtin. 2. A proper set type needs to have some extra methods like intersect, etc. If we're going to add all this, we should step back and figure out how to add sets to the language the right way instead of doing it the most immediately expedient way and being stuck with it. 3. D's current AA implementation kind of sucks anyhow, at least when it interacts with conservative GC. You probably don't want to build too much more stuff on top of it. That said, sets should be in either the language or in the standard lib. There are at least a few good implementations (Tango, Dcollections, probably a bunch more out there). Including one with an appropriate license and maybe some consistency tweaks in Phobos wouldn't be too hard technically, though it might be politically. On the other hand, I'm not sure if it makes sense from a consistency perspective to have AAs as a builtin, first class type and sets as a library type. I'm not sure whether this argues more for AAs being a library type or sets being builtin, but the inconsistency is just weird. |
April 12, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha:
> 1. Without the void[someType] hack, it wastes at least one byte per element, sometimes a lot more depending on alignment issues.
I think it never wastes one byte. The GC allocates blocks of memory with a length as a power of 2, for small sizes. You can see this very well if you do few experiments with big AAs.
Bye,
bearophile
|
April 13, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | == Quote from bearophile (bearophileHUGS@lycos.com)'s article > dsimcha: > > 1. Without the void[someType] hack, it wastes at least one byte per element, sometimes a lot more depending on alignment issues. > I think it never wastes one byte. The GC allocates blocks of memory with a length as a power of 2, for small sizes. You can see this very well if you do few experiments with big AAs. > Bye, > bearophile What if the key is a long? |
April 13, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha wrote:
> On the other hand, I'm not sure if it makes sense from a consistency perspective
> to have AAs as a builtin, first class type and sets as a library type. I'm not
> sure whether this argues more for AAs being a library type or sets being builtin,
> but the inconsistency is just weird.
Especially since an associative array should have a .keys property that returns a set.
(Incidentally, I also think the natural set operations, like intersection and mutual exclusion, are just as handy for maps as for sets.)
The natural conclusion is that AAs should be library types.
I like the fact that D provides literal syntax for AAs, but I think the correct implementation is for the compiler to pass the values from those literal expressions into a library type constructor.
--benji
|
April 13, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha:
> What if the key is a long?
At the moment an AA like this:
byte[long] aa;
allocates 16 or more bytes/pair, so it needs the same memory as:
long[long] aa;
A built-in set can of course use only about 8 bytes (plus few empty cells in the hash) for a:
HashSet!(long)
Bye,
bearophile
|
April 13, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to Benji Smith | Benji Smith: > Especially since an associative array should have a .keys property that returns a set. I don't agree. I think associative arrays should have .keys/.values/.items that return a lazy view that acts like a .set/.list/.list of pairs. Such "lazy views" don't actually store anything, they are very light. This design is now present in Python3, Java and I have done very similar things in my dlibs (named xkeys/xvalues/xitems in my dlibs, but xkeys isn't a set-like thing yet). > (Incidentally, I also think the natural set operations, like intersection and mutual exclusion, are just as handy for maps as for sets.) It's less semantically clean to define certain set operations on AAs, because for example you have to decide what to do when keys are equal but their values are not. You can avoid such semantic troubles altogether performing set operations just only on the lazy view of the keys. > The natural conclusion is that AAs should be library types. I agree that probably now it's better to keep built-in only the syntax of AAs and keep their semantics into D modules code. But I think what we/I have discussed here doesn't imply such conclusion :-) I think you can do all the things I have explained here even with a fully built-in. I'm waiting for a more community-driven design of D ;-) Bye, bearophile |
April 13, 2009 Re: Associative arrays with void values | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 2009-04-13 01:53:41 -0400, bearophile <bearophileHUGS@lycos.com> said: >> (Incidentally, I also think the natural set operations, like >> intersection and mutual exclusion, are just as handy for maps as for sets.) > > It's less semantically clean to define certain set operations on AAs, because for example you have to decide what to do when keys are equal but their values are not. You could pass an alias to a function that would perform the merge between values. That way, there's not ambiguity. > You can avoid such semantic troubles altogether performing set operations just only on the lazy view of the keys. That'd require two passes: one for the keys, a second for getting the values from both sides (merging the values). It's going to be less efficient. -- Michel Fortin michel.fortin@michelf.com http://michelf.com/ |
Copyright © 1999-2021 by the D Language Foundation