Thread overview | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
February 05, 2007 Sets | ||||
---|---|---|---|---|
| ||||
Is there a built-in way to create a set in D? Like an associative array, only without the value? I tried: void[KeyType] setName; It compiled. But there doesn't seem to be a way to add an item to the set. Am I missing something? Or is it just a bug that this even compiled? I know I could just use a dummy value, but that's just not very elegant. Thanks! |
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | On Mon, 5 Feb 2007 17:44:10 +0000 (UTC), Michiel wrote:
> Is there a built-in way to create a set in D? Like an associative array, only without the value?
>
> I tried: void[KeyType] setName;
>
> It compiled. But there doesn't seem to be a way to add an item to the set.
>
> Am I missing something? Or is it just a bug that this even compiled?
>
> I know I could just use a dummy value, but that's just not very elegant.
>
> Thanks!
I just use: bool[Keytype] setName;
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Derek Parnell | Derek Parnell wrote:
> I just use: bool[Keytype] setName;
The problem is that the set of possible keys isn't fixed (I'll often use char[] as key-type). So with your idea, there are two possibilities for elements not in the set. Either they're not in the array at all, or they are, but their value is false.
This gives me a very PHPish feeling. :)
I'd feel better about it if sets were built into D. But I know using a dummy value is an option. And that's what I'll use if I have no other choice.
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | Michiel wrote:
> Derek Parnell wrote:
>
>> I just use: bool[Keytype] setName;
>
> The problem is that the set of possible keys isn't fixed (I'll often use char[] as
> key-type). So with your idea, there are two possibilities for elements not in the
> set. Either they're not in the array at all, or they are, but their value is false.
>
> This gives me a very PHPish feeling. :)
>
> I'd feel better about it if sets were built into D. But I know using a dummy value
> is an option. And that's what I'll use if I have no other choice.
The behavior for sets was cleaner before their behavior was changed such that an opIndex lookup throws an exception if the lookup fails. Previously, it would return the default value, which for bool would be 'false'. Insertions were still a bit messy as setName[key] = true, but it was better than nothing. Personally, I'm not fond of how AA indexing and 'in' works now. It feels like a kludge.
Sean
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly wrote:
> The behavior for sets was cleaner before their behavior was changed such that an opIndex lookup throws an exception if the lookup fails. Previously, it would return the default value, which for bool would be 'false'. Insertions were still a bit messy as setName[key] = true, but it was better than nothing. Personally, I'm not fond of how AA indexing and 'in' works now. It feels like a kludge.
Ah, I see. I understand how the old behavior was handy for faking sets. But to be honest, I like the new behavior better in general. That old behavior was much more ambiguous. And the 'in' notation is more intuitive for people with a background in basic set theory.
My problems are:
* Declaring the set: You have to give a value type, even though you don't really
need it.
* Adding to the set: You have to assign a dummy value, which isn't very elegant.
I think the void[KeyType] notation for sets is very intuitive. All it further needs is an add(KeyType element) function. I think it can be well defined and would be a nice addition to the D language.
Do the D developers read this newsgroup? Or should I mail them with this suggestion?
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | Michiel wrote:
> Sean Kelly wrote:
>
>> The behavior for sets was cleaner before their behavior was changed such
>> that an opIndex lookup throws an exception if the lookup fails.
>> Previously, it would return the default value, which for bool would be
>> 'false'. Insertions were still a bit messy as setName[key] = true, but
>> it was better than nothing. Personally, I'm not fond of how AA indexing
>> and 'in' works now. It feels like a kludge.
>
> Ah, I see. I understand how the old behavior was handy for faking sets. But to be
> honest, I like the new behavior better in general. That old behavior was much more
> ambiguous. And the 'in' notation is more intuitive for people with a background in
> basic set theory.
>
> My problems are:
> * Declaring the set: You have to give a value type, even though you don't really
> need it.
> * Adding to the set: You have to assign a dummy value, which isn't very elegant.
>
> I think the void[KeyType] notation for sets is very intuitive. All it further
> needs is an add(KeyType element) function. I think it can be well defined and
> would be a nice addition to the D language.
>
> Do the D developers read this newsgroup? Or should I mail them with this suggestion?
Walter reads this newsgroup, so there's nothing in particular you need to do.
The lack of a set type and the use of void[KeyType] for sets has been discussed a few times in the past, I think. It makes sense to me too, at first glance. But I suspect it would require a whole different codepath in the compiler to deal with because you can't have arrays of voids etc. You really want a different data structure to store a set. And as you mentioned it should have an 'add' method which would be meaningless for ordinary AA's. Following AA syntax could also lead to annoying special cases for generic templates that would have to be prepared for the V==void case:
foo(K,V)(V[K] aa) { ...V val; // whoops V could be void! }
Anyway, all that says to me that void[KeyType] for sets is not really a slam dunk, and a simple set() class in the std library might be a better option.
I'm surprised Sean didn't mention it, but the Tango replacement for the standard library does have Set as well as Bag (aka "multi-set") collection types.
--bb
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bill Baxter | > The lack of a set type and the use of void[KeyType] for sets has been discussed a few times in the past, I think. It makes sense to me too, at first glance. But I suspect it would require a whole different codepath in the compiler to deal with because you can't have arrays of voids etc. You really want a different data structure to store a set. I think creating a whole different data structure just for the set isn't worth the trouble. For now, I've created the following simple code, which I will use for now. It still uses dummy values, but hides this behind a template and an array function: --------------------------------- template Set(T) { alias bit[T] Set; } void add(T)(inout bit[T] set, T element) { set[element] = 1; } --------------------------------- Set!(char[]) set; set.add("one"); set.add("two"); assert("one" in set); assert("two" in set); assert(!("fortytwo" in set)); --------------------------------- > And as you mentioned it should have an 'add' method which would be meaningless for ordinary AA's. For ordinary AA's, another parameter could be passed for the value, which could default to the type's init value. > Following AA syntax could also lead to > annoying special cases for generic templates that would have to be > prepared for the V==void case: > foo(K,V)(V[K] aa) { ...V val; // whoops V could be void! } Isn't the same thing true for many situations? What if you wanted to compare a V object with an integer? The compiler would generate an error for any V for which no such comparison is defined. I would simply expect the compiler to complain in your example. I'm not saying it's easy. But it should be worth it for the same reason that associative arrays are included into the core language. |
February 05, 2007 Re: Sets, please | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | Michiel wrote:
> Is there a built-in way to create a set in D? Like an associative array, only without the value?
Sets are so useful, and aa with void value could work as sets quite well.
Walter ... pleeease? :)
|
February 05, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | Reply to Michiel,
> Derek Parnell wrote:
>
>> I just use: bool[Keytype] setName;
>>
> The problem is that the set of possible keys isn't fixed (I'll often
> use char[] as key-type). So with your idea, there are two
> possibilities for elements not in the set. Either they're not in the
> array at all, or they are, but their value is false.
>
> This gives me a very PHPish feeling. :)
>
> I'd feel better about it if sets were built into D. But I know using a
> dummy value is an option. And that's what I'll use if I have no other
> choice.
>
you might look at the (<key> "in" <set>) syntax
bool[Keys] set;
set[key1] = false;
set[key2] = true;
(key1 in set) // is true (!= null to be totally correct)
(key2 in set) // is true
(key3 in set) // is false (== null)
a good set syntax would be nice
void[key] set; // no storage for each member
set[key1] = true; // adds member
set[key1] = false; // removes member
if(set[key1]) // test for member
set.keys // get set
...
|
February 06, 2007 Re: Sets | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michiel | Michiel wrote: >> The lack of a set type and the use of void[KeyType] for sets has been >> discussed a few times in the past, I think. It makes sense to me too, >> at first glance. But I suspect it would require a whole different >> codepath in the compiler to deal with because you can't have arrays of >> voids etc. You really want a different data structure to store a set. > > I think creating a whole different data structure just for the set isn't worth the > trouble. Maybe not for casual use, but if for some reason you are creating huge sets, you'll probably care whether it uses 1GB or 2GB of memory. It's not a "whole different" data structure that's required, just a modified version of the AA data structure that doesn't store values. But still that makes it a different data structure that will require different code in D. I'm kinda just playing devil's advocate here. I was actually one of the people who previously commented that void[KeyType] should just work. But if it's going into the core language it better be a slam dunk. > For now, I've created the following simple code, which I will use for now. It > still uses dummy values, but hides this behind a template and an array function: > > --------------------------------- > template Set(T) { > alias bit[T] Set; > } > > void add(T)(inout bit[T] set, T element) { > set[element] = 1; > } > --------------------------------- > Set!(char[]) set; > > set.add("one"); > set.add("two"); > > assert("one" in set); > assert("two" in set); > assert(!("fortytwo" in set)); > --------------------------------- I think the type 'bit' is deprecated. Anyway, it's just an alias for 'bool' (see phobos/object.d) so you might as well use bool rather than a type that isn't even listed in the spec: http://www.digitalmars.com/d/type.html > >> And as you mentioned it should have an 'add' method which would be >> meaningless for ordinary AA's. > > For ordinary AA's, another parameter could be passed for the value, which could > default to the type's init value. True. > >> Following AA syntax could also lead to >> annoying special cases for generic templates that would have to be >> prepared for the V==void case: >> foo(K,V)(V[K] aa) { ...V val; // whoops V could be void! } > > Isn't the same thing true for many situations? What if you wanted to compare a V > object with an integer? The compiler would generate an error for any V for which > no such comparison is defined. I would simply expect the compiler to complain in > your example. Yeh, despite what I said, you could also see sets having AA syntax as being a *good* thing for templates, since it would let any template that only cares about keys to match both sets and AA's. But really it's kind of a bogus argument anyway. Templates should use duck typing wherever possible. I.e. the template should be written such that it supports any object with the necessary interface, like .keys and .values methods or a[x] syntax. > I'm not saying it's easy. But it should be worth it for the same reason that > associative arrays are included into the core language. One issue with sets as AAs is that really one typically thinks of sets as a collection of "values" not a collection of "keys". If you make a set type from scratch, you'd be more likely to have a "set.values" member than a "set.keys" member. Here's a thought - one way to deal with that would be for the implementation of void[KeyType] to just return the key itself for any attempts to access an existing key. So myset["key"] would return "key" if key was in the array, or throw an exception if not. That way you could also use ".values" on a void[KeyType] set instead of ".keys", and handle the issue of what auto = foo in myset; should return. It looks like a number of things in dmd/src/phobos/internal/aaA.d would have to be rewritten with special versions for handling void. But it doesn't look like it would be a huge amount of code. In particular the main data structure already seems to allocate nodes as a big void* with size of keysize+valuesize, so that should still be ok with valuesize==0. Then you'd also need some conditionals added to the code that calls into the aaA.d code. --bb |
Copyright © 1999-2021 by the D Language Foundation