Thread overview | |||||||||
---|---|---|---|---|---|---|---|---|---|
|
October 21, 2011 Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
I have many strings and I want to use as associative array kay a sorted concat of two strings (it's a signature of the two strings): import std.algorithm; void main() { string a = "red"; string b = "green"; int[string] aa; //aa[(a ~ b).sort] = 1; //aa[(a ~ b).sort.idup] = 1; aa[(a.dup ~ b.dup).sort.idup] = 1; } I think the first way used to work, the second way was recently forbidden (and you can't use char[] as associative array key). The third way now works, but array.sort will go away. So do you know what's a copy-minimizing (and possibly short) way to perform this, in future? Bye and thank you, bearophile |
October 21, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Thursday, October 20, 2011 21:49:27 bearophile wrote:
> I have many strings and I want to use as associative array kay a sorted concat of two strings (it's a signature of the two strings):
>
>
> import std.algorithm;
> void main() {
> string a = "red";
> string b = "green";
> int[string] aa;
> //aa[(a ~ b).sort] = 1;
> //aa[(a ~ b).sort.idup] = 1;
> aa[(a.dup ~ b.dup).sort.idup] = 1;
> }
>
> I think the first way used to work, the second way was recently forbidden (and you can't use char[] as associative array key). The third way now works, but array.sort will go away.
>
> So do you know what's a copy-minimizing (and possibly short) way to perform
> this, in future?
The elements in a string are immutable. I don't see how the first twe sorts _ever_ worked, unless it was a bug. The reality of the matter is that if you want to sort an array, its elements must be mutable, but if you want to use it as a key in an AA, its elements must be immutable. So, you're either going to have to copy them or cast to immutable.
Also, you're going to have to use std.algorithm.sort eventually, and isn't going to work on chars or wchars (and what you're doing now is buggy unless you're restricting yourself to ASCII). So, you're either going to have to do one of
1. Use dchar[] to sort and then convert it to dstring for the key (either by casting or copying).
2. Use dchar[] to sort and then convert it to a string for the key (which results in a copy).
3. If you're using ASCII only, you could use char[], cast it to ubyte[] to sort it, and then cast it to string to use as the key (or idup it).
But sorting and and immutability are obviously at odds with one another, requiring either a copy or a cast, and sorting really isn't going to work with char[] or wchar[], unless you restrict yourself to values that fit in a UTF-8 or UTF-16 code unit.
- Jonathan M Davis
|
October 21, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | Jonathan M Davis: > The elements in a string are immutable. I don't see how the first twe sorts _ever_ worked, unless it was a bug. Sorry, I meant that it used to work in D1: http://codepad.org/HYdNktNd Later I'll try some of your solutions. Bye, bearophile |
October 21, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Thu, 20 Oct 2011 21:49:27 -0400, bearophile <bearophileHUGS@lycos.com> wrote:
> I have many strings and I want to use as associative array kay a sorted concat of two strings (it's a signature of the two strings):
>
>
> import std.algorithm;
> void main() {
> string a = "red";
> string b = "green";
> int[string] aa;
> //aa[(a ~ b).sort] = 1;
> //aa[(a ~ b).sort.idup] = 1;
> aa[(a.dup ~ b.dup).sort.idup] = 1;
> }
>
> I think the first way used to work, the second way was recently forbidden (and you can't use char[] as associative array key). The third way now works, but array.sort will go away.
>
> So do you know what's a copy-minimizing (and possibly short) way to perform this, in future?
a ~ b should technically be assignable to char[], since it's alread new memory. We may yet get there with pure functions being able to implicit cast to immutable.
You could do this for now, but it's ugly:
char[] concatStr(string[] s...) pure
{
auto len = 0;
foreach(str; s) len += str.length;
auto result = new char[len];
auto slice = result;
foreach(str; s)
{
slice[0..str.length] = str;
slice = slice[str.length..$];
}
return result;
}
...
auto str = concatStr(a, b);
str.sort; // this is going to stop working, you'll have to cast to byte[]
aa[assumeUnique(str)] = 1;
There are also alternatives to AA (such as dcollections' HashMap) which are willing to take non-immutable keys.
-Steve
|
October 21, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | bearophile Wrote: > I have many strings and I want to use as associative array kay a sorted concat of two strings (it's a signature of the two strings): > > > import std.algorithm; > void main() { > string a = "red"; > string b = "green"; > int[string] aa; > //aa[(a ~ b).sort] = 1; > //aa[(a ~ b).sort.idup] = 1; > aa[(a.dup ~ b.dup).sort.idup] = 1; > } > > I think the first way used to work, the second way was recently forbidden (and you can't use char[] as associative array key). The third way now works, but array.sort will go away. > > So do you know what's a copy-minimizing (and possibly short) way to perform this, in future? > > Bye and thank you, > bearophile http://www.d-programming-language.org/phobos/std_algorithm.html#setUnion ? |
October 23, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Thanks for all the answers. Steven Schveighoffer: > a ~ b should technically be assignable to char[], since it's alread new memory. We may yet get there with pure functions being able to implicit cast to immutable. Isn't that kind of the opposite? Is this already in Bugzilla? Some versions I have written, there is no very good solution, it seems: import std.algorithm, std.exception; void main() { string a = "red"; string b = "green"; int[string] aa; // Works in D1. // aa[(a ~ b).sort] = 1; // Works in D2, but will stop working. aa[(a.dup ~ b.dup).sort.idup] = 1; // Works in D2, but will stop working. char[] ab = (a.dup ~ b.dup).sort; aa[assumeUnique(ab)] = 2; // If keys don't actually need to be strings // the code gets a bit simpler. int[immutable(ubyte[])] aa2; // not good // aa2[assumeUnique(sort(cast(ubyte[])(a ~ b)))] = 3; // not good? // aa2[assumeUnique(sort(cast(ubyte[])(a ~ b)).release())] = 3; // just 1 copy, two visible casts, 3 lines auto ab4 = cast(ubyte[])(a ~ b); ab4.sort(); aa2[cast(immutable(ubyte[]))ab4] = 3; // a visible cast and an hidden cast ubyte[] ab2 = cast(ubyte[])(a ~ b); ab2.sort(); aa2[assumeUnique(ab2)] = 3; // 2 lines, more complex looking auto ab3 = sort(cast(ubyte[])(a ~ b)).release(); aa2[assumeUnique(ab3)] = 3; } Bye, bearophile |
October 24, 2011 Re: Chars sorting and copies | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Sat, 22 Oct 2011 23:01:17 -0400, bearophile <bearophileHUGS@lycos.com> wrote: > Thanks for all the answers. > > Steven Schveighoffer: > >> a ~ b should technically be assignable to char[], since it's alread new >> memory. We may yet get there with pure functions being able to implicit >> cast to immutable. > > Isn't that kind of the opposite? No, if a ~ b would call a pure function that returns mutable when possible, then you could use it for mutable, immutable, or const. > Is this already in Bugzilla? Somewhat, although I like using the pure function avenue to ensure immutable does not violate const than the crazy rules we came up with here: http://d.puremagic.com/issues/show_bug.cgi?id=1654 Note this was in the infancy of my D knowledge ;) > Some versions I have written, there is no very good solution, it seems: > > > import std.algorithm, std.exception; > > void main() { > string a = "red"; > string b = "green"; > int[string] aa; > > // Works in D1. > // aa[(a ~ b).sort] = 1; > > // Works in D2, but will stop working. > aa[(a.dup ~ b.dup).sort.idup] = 1; > > // Works in D2, but will stop working. > char[] ab = (a.dup ~ b.dup).sort; > aa[assumeUnique(ab)] = 2; > > // If keys don't actually need to be strings > // the code gets a bit simpler. > int[immutable(ubyte[])] aa2; > > // not good > // aa2[assumeUnique(sort(cast(ubyte[])(a ~ b)))] = 3; > > // not good? > // aa2[assumeUnique(sort(cast(ubyte[])(a ~ b)).release())] = 3; > > // just 1 copy, two visible casts, 3 lines > auto ab4 = cast(ubyte[])(a ~ b); > ab4.sort(); > aa2[cast(immutable(ubyte[]))ab4] = 3; > > // a visible cast and an hidden cast > ubyte[] ab2 = cast(ubyte[])(a ~ b); > ab2.sort(); > aa2[assumeUnique(ab2)] = 3; > > // 2 lines, more complex looking > auto ab3 = sort(cast(ubyte[])(a ~ b)).release(); > aa2[assumeUnique(ab3)] = 3; > } I honestly am not sure your specific problem is solvable without a lot of special cases. It may be you have to write a sub-function that hides the mess. -Steve |
Copyright © 1999-2021 by the D Language Foundation