April 27, 2008 Is all this Invarient **** er... stuff, premature optimisation?
Hi all,

'scuse me for not being familiar with previous or ongoing discussion on this subject, but I'm just coming back to D after a couple of years away.

I have some strings read in from external source that I need to convert to uppercase. A quick look at Phobos and I find std.string has a toupper method.

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = args[ 0 ].toupper();
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(5): function std.string.toupper (invariant(char)[]) does not match parameter types (char[])
junk.d(5): Error: cannot implicitly convert expression (args[0u]) of type char[] to invariant(char)[]
junk.d(5): Error: cannot implicitly convert expression (toupper(cast(invariant(char)[])(args[0u]))) of type invariant(char)[] to char[]

Hm. Okey dokey.

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    writefln( a );
    return 0;
}

junk.d(5): Error: cannot implicitly convert expression (toupper(cast(invariant(char)[])(args[0u]))) of type invariant(char)[] to char[]

Shoulda known :(

import std.stdio;
import std.string;

int main( char[][] args ) {
    string a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:\DMD\TEST\JUNK.EXE

Great! Now I need to replace the bit in the middle:

import std.stdio;
import std.string;

int main( char[][] args ) {
    string a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    a[ 2 .. 4 ] = "XXX";
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(6): Error: slice a[cast(uint)2..cast(uint)4] is not mutable

Wha..? What's the point in having slices if I can't use them?

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = cast(char[]) ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    a[ 2 .. 4 ] = "XXX";
    writefln( a );
    return 0;
}

Finally, it works. But can you see what's going on in line 5 amongst all that casting? Cos I sure can't.
So, I read that all this invariant stuff is about efficiency. For whom? Must be the compiler because it sure ain't about programmer efficiency.

Ah. Maybe I'm meant to ignore the beauty of slices and use strings and method calls for everything?

import std.stdio;
import std.string;

int main( string[] args ) {
    string a = args[ 0 ].toupper();
    a.replace( a[ 2 .. 4 ], "XXX" );
    writefln( a );
    return 0;
}

Compiles clean and runs:

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:\DMD\TEST\JUNK.EXE

But does nothing!

import std.stdio;
import std.string;

int main( string[] args ) {
    string a = args[ 0 ].toupper();
    a = a.replace( a[ 2 .. 4 ], "XXX" );
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:XXXMD\TEST\JUNK.EXE

Finally, it runs. But at what cost? The 'immutable' a has ended up being mutated. I still had to specify the slice, but I had to make another method call to actually do the deed.

Of course, a wasn't really mutated. Instead, args[0] was copied and then mutated and labelled a. Then a was copied and mutated and reassigned the mutated copy.

So, that's two copies of the string, plus a slice, plus an extra method call to achieve what used to be achievable in place on the original string. Which is now immutable, but I'll never need it again.

Of course, on these short 1-off strings it doesn't matter a hoot. But when the strings are 200 to 500 characters a pop and there are 20,000,000 of them. It matters.

Did I suggest this was an optimisation?

Whatever immutability-purity cool aid you've been drinking, please go back to coke. And give us usable libraries and sensible implicit conversions. Cos this sucks bigtime.

b.
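
For comparison, here is a minimal sketch of the "one copy, then edit in place" pattern the post is asking for. It is written against present-day D rather than the 2008 compiler used above, so it assumes `std.uni.toUpperInPlace` is available; `upperAndPatch` is a hypothetical helper name, and the input is assumed to be ASCII and at least five characters long.

```d
import std.stdio : writeln;
import std.uni : toUpperInPlace;

/// Hypothetical helper: returns a mutable, upper-cased copy of `s`
/// with the three characters at indices 2..5 overwritten.
char[] upperAndPatch(const(char)[] s)
{
    char[] a = s.dup;      // the only explicit copy: we now own writable data
    toUpperInPlace(a);     // upper-cases the buffer itself (taken by ref)
    a[2 .. 5] = "XXX";     // slice assignment; both sides are 3 chars long
    return a;
}

void main()
{
    writeln(upperAndPatch(`c:\dmd\test\junk.exe`));
}
```

Taking `const(char)[]` as the parameter type lets the helper accept both `char[]` and `string` arguments without a cast, so the caller's data is never written to.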

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to p9e883002 | <p9e883002@sneakemail.com> wrote:
> Of course, a wasn't really mutated. Instead, args[0] was copied and then
> mutated and labelled a. Then a was copied and mutated and reassigned the
> mutated copy.
>
> So, that's two copies of the string, plus a slice, plus an extra method call to
> achieve what used to be achievable in place on the original string. Which is now
> immutable, but I'll never need it again.
>
> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
> matters.
>
> Did I suggest this was an optimisation?
>
> Whatever immutability-purity cool aid you've been drinking, please go back to
> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
> bigtime.
>
> b.
Is this what you wanted to write?
int main(string[] args)
{
char[] a = cast(char[])args[0];
a[2..5] = "XXX";
writefln(a);
return 0;
}
This compiles and runs, and seems to do what you describe. Sure, there's a
cast there, but it's not all that bad, is it?

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | On Mon, 28 Apr 2008 02:14:19 +0200, Simen Kjaeraas <simen.kjaras@gmail.com> wrote:
> <p9e883002@sneakemail.com> wrote:
>
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then
>> mutated and labelled a. Then a was copied and mutated and reassigned the
>> mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method call to
>> achieve what used to be achievable in place on the original string. Which is now
>> immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
>> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
>> matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go back to
>> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
>> bigtime.
>>
>> b.
>
>
> Is this what you wanted to write?
>
> int main(string[] args)
> {
> char[] a = cast(char[])args[0];
> a[2..5] = "XXX";
> writefln(a);
> return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?
Sorry, forgot the .toupper() call there. Should be
char[] a = cast(char[])args[0].toupper();
-- Simen

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:
><p9e883002@sneakemail.com> wrote:
>
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then mutated and labelled a. Then a was copied and mutated and reassigned the mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method call to achieve what used to be achievable in place on the original string. Which is now immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But when the strings are 200 to 500 characters a pop and there are 20,000,000 of them. It matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go back to coke. And give us usable libraries and sensible implicit conversions. Cos this sucks bigtime.
>>
>> b.
>
>
>Is this what you wanted to write?
>
>int main(string[] args)
>{
> char[] a = cast(char[])args[0];
> a[2..5] = "XXX";
> writefln(a);
> return 0;
>}
>This compiles and runs, and seems to do what you describe. Sure, there's a
>cast there, but it's not all that bad, is it?
Or just add a dup.
int main(string[] args)
{
char[] a = args[0].dup;
a[2..5] = "XXX";
writefln(a);
return 0;
}

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | Simen Kjaeraas wrote:
> <p9e883002@sneakemail.com> wrote:
>
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then
>> mutated and labelled a. Then a was copied and mutated and reassigned the
>> mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method call to
>> achieve what used to be achievable in place on the original string. Which is now
>> immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
>> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
>> matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go back to
>> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
>> bigtime.
>>
>> b.
>
>
> Is this what you wanted to write?
>
> int main(string[] args)
> {
> char[] a = cast(char[])args[0];
> a[2..5] = "XXX";
> writefln(a);
> return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?
I'm no invariant guru, but I don't think that's legal. 'invariant' means the data could be stored in a portion of memory that the OS will not allow the program to write to. So you need to dup it:
char[] a = args[0].dup;
a[2..5] = "XXX";
writefln(a);
return 0;
That stuff like this compiles and seems to work is why we really need to make at least one alternative version of cast. One would be for relatively safe run-of-the-mill casts, like casting float to int, or casting Object to some class (and checking for null), and the other category would be for dangerous, big-red-flag kinds of things like the above. Using the run-of-the-mill cast in the above situation would not be allowed.
--bb
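
A minimal sketch of the copy-then-edit approach described here, with the "dangerous" step made explicit. It assumes present-day D, where `std.exception.assumeUnique` is available for handing a privately owned buffer back as immutable without a second copy; `patched` is a hypothetical helper name, not something from this thread.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

/// Hypothetical helper: copy once, edit in place, return as immutable.
string patched(string s, size_t at, string patch)
{
    char[] buf = s.dup;                     // writable copy; `s` itself is untouched
    buf[at .. at + patch.length] = patch;   // in-place edit of our own buffer
    return assumeUnique(buf);               // our promise: no other mutable reference exists
}

void main()
{
    string original = `C:\DMD\TEST`;
    string changed = patched(original, 2, "XXX");
    writeln(original);   // the source string is unchanged
    writeln(changed);
}
```

Unlike a bare `cast`, the conversion back to immutable is spelled out by name, so the one place where the programmer vouches for uniqueness is easy to find in review.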

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:
> <p9e883002@sneakemail.com> wrote:
>
> Is this what you wanted to write?
>
> int main(string[] args)
> {
> char[] a = cast(char[])args[0];
> a[2..5] = "XXX";
> writefln(a);
> return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?
No. You missed out uppercasing the string before replacing the slice.

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to p9e883002 | On Mon, 28 Apr 2008 02:44:14 +0200, <p9e883002@sneakemail.com> wrote:
> On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas"
> <simen.kjaras@gmail.com> wrote:
>> <p9e883002@sneakemail.com> wrote:
>>
>> Is this what you wanted to write?
>>
>> int main(string[] args)
>> {
>> char[] a = cast(char[])args[0];
>> a[2..5] = "XXX";
>> writefln(a);
>> return 0;
>> }
>> This compiles and runs, and seems to do what you describe. Sure, there's a
>> cast there, but it's not all that bad, is it?
>
> No. You missed out uppercasing the string before replacing the slice.
That's why I replied to my own post stating just that.
Anyways, Gide got it right. A .dup is the correct way, a cast is wrong.
-- Simen

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | On Mon, 28 Apr 2008 02:28:23 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:
> On Mon, 28 Apr 2008 02:14:19 +0200, Simen Kjaeraas <simen.kjaras@gmail.com> wrote:
>
> > <p9e883002@sneakemail.com> wrote:
> >
> > Is this what you wanted to write?
> >
> > int main(string[] args)
> > {
> > char[] a = cast(char[])args[0];
> > a[2..5] = "XXX";
> > writefln(a);
> > return 0;
> > }
> > This compiles and runs, and seems to do what you describe. Sure, there's a
> > cast there, but it's not all that bad, is it?
>
> Sorry, forgot the .toupper() call there. Should be
> char[] a = cast(char[])args[0].toupper();
>
> -- Simen
Okay, you got around the first cast by using
int main( string[] ) {
So now you want to lowercase it again:
import std.stdio;
import std.string;
int main( string[] args) {
char[] a = cast(char[])args[0].toupper();
a[2..5] = "XXX";
a = a.tolower;
writefln(a);
return 0;
}
c:\dmd\test>dmd junk.d
junk.d(7): Error: no property 'tolower' for type 'char[]'
junk.d(7): Error: cannot implicitly convert expression (1) of type int to char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
So, cast a back to being a string, so that we can call tolower() on it and then cast the copied mutated string back to a char[]:
import std.stdio;
import std.string;
int main( string[] args) {
char[] a = cast(char[])args[0].toupper();
a[2..5] = "XXX";
a = cast(char[]) ( ( cast(string)a ).tolower );
writefln(a);
return 0;
}
c:\dmd\test>dmd junk.d
junk.d(7): Error: no property 'tolower' for type 'invariant(char)[]'
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
junk.d(7): Error: cannot implicitly convert expression (0) of type int to char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
Nope. That don't work.
import std.stdio;
import std.string;
int main( string[] args) {
char[] a = cast(char[])args[0].toupper();
a[2..5] = "XXX";
a = cast(char[])tolower( cast(string)a );
writefln(a);
return 0;
}
Finally. It works.
Summary:
If I want to be able to use lvalue slice operations on 'strings' (for efficiency) I have to have them as char[].
If I want to be able to use std.string methods on those same strings, I have to cast them to invariant(char)[] and the results back to char[], which involves at least one copy operation, and probably two.
And the invariant-ness of the string library is done "for efficiency"?
Cheers, b.
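
One way a read-only string routine can sidestep the casting described in the summary is to take `const(char)[]`, which both `char[]` and `string` convert to implicitly. The following is only a sketch in present-day D; `countUpper` is a hypothetical function used for illustration and is not part of std.string.

```d
import std.stdio : writeln;

/// Hypothetical read-only routine: counts upper-case ASCII letters.
/// const(char)[] accepts mutable and immutable arguments alike,
/// with no cast and no copy at the call site.
size_t countUpper(const(char)[] s)
{
    size_t n = 0;
    foreach (char c; s)
        if (c >= 'A' && c <= 'Z')
            ++n;
    return n;
}

void main()
{
    string fixed = "Hello World";
    char[] editable = fixed.dup;     // mutable copy we are free to slice-assign
    editable[0 .. 5] = "HELLO";
    writeln(countUpper(fixed));      // called with a string
    writeln(countUpper(editable));   // called with a char[]
}
```

This only covers functions that read their argument; a function that returns a new string still has to decide which type to hand back.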

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to Simen Kjaeraas | 2008/4/28 Simen Kjaeraas <simen.kjaras@gmail.com>:
> int main(string[] args)
> {
> char[] a = cast(char[])args[0];
> a[2..5] = "XXX";
> writefln(a);
> return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?
Yes, it's extremely bad. Casting away invariant is UNDEFINED BEHAVIOR, and should never be done.
You should never need an explicit cast just to handle text!

April 28, 2008 Re: Is all this Invarient **** er... stuff, premature optimisation?
Posted in reply to p9e883002 | 2008/4/28 <p9e883002@sneakemail.com>:
> import std.string;
>
> int main( string[] args) {
> char[] a = cast(char[])args[0].toupper();

**** UNDEFINED BEHAVIOR ****

(1) args might be placed in a hardware-locked read-only segment. Then the following line would fail
(2) there might be other pointers to the string, which expect it never to change.

> a[2..5] = "XXX";
> a = cast(char[])tolower( cast(string)a );
> writefln(a);
> return 0;
> }
>
> Finally. It works.

But not necessarily on all architectures, because of the undefined behavior.

This is how you do it without undefined behavior.

import std.string;

int main( string[] args) {
    string a = args[0].toupper();
    a = a[0..2] ~ "XXX" ~ a[5..$];
    a = a.tolower();
    writefln(a);
    return 0;
}