Is all this Invarient **** er... stuff, premature optimisation?
Apr 27, 2008
p9e883002
April 27, 2008
Hi all,

'scuse me for not being familiar with previous or ongoing discussion on this subject, but I'm just coming back to D after a couple of years away.

I have some strings read in from an external source that I need to convert to uppercase. A quick look at Phobos and I find std.string has a toupper function.

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = args[ 0 ].toupper();
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(5): function std.string.toupper (invariant(char)[]) does not match
parameter types (char[])
junk.d(5): Error: cannot implicitly convert expression (args[0u]) of type char[]
to invariant(char)[]
junk.d(5): Error: cannot implicitly convert expression (toupper(cast(invariant
(char)[])(args[0u]))) of type invariant(char)[] to char[]

Hm. Okey dokey.

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    writefln( a );
    return 0;
}

junk.d(5): Error: cannot implicitly convert expression (toupper(cast(invariant
(char)[])(args[0u]))) of type invariant(char)[] to char[]

Shoulda known :(

import std.stdio;
import std.string;

int main( char[][] args ) {
    string a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:\DMD\TEST\JUNK.EXE

Great! Now I need to replace the bit in the middle:

import std.stdio;
import std.string;

int main( char[][] args ) {
    string a = ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    a[ 2 .. 4 ] = "XXX";
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(6): Error: slice a[cast(uint)2..cast(uint)4] is not mutable

Wha..? What's the point in having slices if I can't use them?

import std.stdio;
import std.string;

int main( char[][] args ) {
    char[] a = cast(char[]) ( cast(invariant(char)[]) args[ 0 ] ).toupper();
    a[ 2 .. 4 ] = "XXX";
    writefln( a );
    return 0;
}

Finally, it works. But can you see what's going on in line 5 amongst all that casting? Cos I sure can't.

So, I read that all this invariant stuff is about efficiency. For whom? Must be the compiler because it sure ain't about programmer efficiency.

Ah. Maybe I'm meant to ignore the beauty of slices and use strings and method calls for everything?

import std.stdio;
import std.string;

int main( string[] args ) {
    string a = args[ 0 ].toupper();
    a.replace(  a[ 2 .. 4 ], "XXX" );
    writefln( a );
    return 0;
}

Compiles clean and runs:

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:\DMD\TEST\JUNK.EXE

But does nothing!

import std.stdio;
import std.string;

int main( string[] args ) {
    string a = args[ 0 ].toupper();
    a = a.replace(  a[ 2 .. 4 ], "XXX" );
    writefln( a );
    return 0;
}

c:\dmd\test>dmd junk.d

c:\dmd\test>junk
C:XXXMD\TEST\JUNK.EXE

Finally, it runs. But at what cost? The 'immutable' a has ended up being mutated. I still had to specify the slice, but I had to make another method call to actually do the deed.

Of course, a wasn't really mutated. Instead, args[0] was copied and then mutated and labelled a. Then a was copied and mutated and reassigned the mutated copy.

So, that's two copies of the string, plus a slice, plus an extra method call to achieve what used to be achievable in place on the original string. Which is now immutable, but I'll never need it again.

Of course, on these short 1-off strings it doesn't matter a hoot. But when the strings are 200 to 500 characters a pop and there are 20,000,000 of them. It matters.

Did I suggest this was an optimisation?

Whatever immutability-purity cool aid you've been drinking, please go back to coke. And give us usable libraries and sensible implicit conversions. Cos this sucks bigtime.

b.


April 28, 2008
<p9e883002@sneakemail.com> wrote:

> Of course, a wasn't really mutated. Instead, args[0] was copied and then
> mutated and labelled a. Then a was copied and mutated and reassigned the
> mutated copy.
>
> So, that's two copies of the string, plus a slice, plus an extra method call to
> achieve what used to be achievable in place on the original string. Which is now
> immutable, but I'll never need it again.
>
> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
> matters.
>
> Did I suggest this was an optimisation?
>
> Whatever immutability-purity cool aid you've been drinking, please go back to
> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
> bigtime.
>
> b.


Is this what you wanted to write?

int main(string[] args)
{
  char[] a = cast(char[])args[0];
  a[2..5] = "XXX";
  writefln(a);
  return 0;
}
This compiles and runs, and seems to do what you describe. Sure, there's a
cast there, but it's not all that bad, is it?
April 28, 2008
On Mon, 28 Apr 2008 02:14:19 +0200, Simen Kjaeraas <simen.kjaras@gmail.com> wrote:

> <p9e883002@sneakemail.com> wrote:
>
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then
>> mutated and labelled a. Then a was copied and mutated and reassigned the
>> mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method call to
>> achieve what used to be achievable in place on the original string. Which is now
>> immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
>> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
>> matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go back to
>> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
>> bigtime.
>>
>> b.
>
>
> Is this what you wanted to write?
>
> int main(string[] args)
> {
>    char[] a = cast(char[])args[0];
>    a[2..5] = "XXX";
>    writefln(a);
>    return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?


Sorry, forgot the .toupper() call there. Should be
  char[] a = cast(char[])args[0].toupper();

-- Simen
April 28, 2008
On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:

><p9e883002@sneakemail.com> wrote:
>
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then mutated and labelled a. Then a was copied and mutated and reassigned the mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method
>> call to
>> achieve what used to be achievable in place on the original string.
>> Which is now
>> immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But
>> when the
>> strings are 200 to 500 characters a pop and there are 20,000,000 of
>> them. It
>> matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go
>> back to
>> coke. And give us usable libraries and sensible implicit conversions.
>> Cos this sucks
>> bigtime.
>>
>> b.
>
>
>Is this what you wanted to write?
>
>int main(string[] args)
>{
>   char[] a = cast(char[])args[0];
>   a[2..5] = "XXX";
>   writefln(a);
>   return 0;
>}
>This compiles and runs, and seems to do what you describe. Sure, there's a
>cast there, but it's not all that bad, is it?

Or just add a dup.

int main(string[] args)
{
   char[] a = args[0].dup;
   a[2..5] = "XXX";
   writefln(a);
   return 0;
}
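
For readers following along in present-day D, here is a minimal sketch of the copy-then-mutate idea suggested above. It assumes current spellings (`invariant` is now `immutable`, and `string` is `immutable(char)[]`); the helper name `patchCopy` is hypothetical and not from the thread. It uses `.dup` to get a mutable copy and `std.exception.assumeUnique` to hand the buffer back as a `string` without a second copy.

```d
import std.exception : assumeUnique;

// Hypothetical helper, not from the thread: copy once, patch the copy,
// return it as an immutable string. Assumes original.length >= 5.
string patchCopy(string original)
{
    char[] a = original.dup;   // mutable copy; the original is left untouched
    a[2 .. 5] = "XXX";         // slice assignment needs equal lengths (3 == 3)
    return assumeUnique(a);    // no copy; safe only because 'a' is not shared
}
```

Calling `patchCopy` on a string leaves the caller's original intact, which is the property the cast-based version cannot promise.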
April 28, 2008
Simen Kjaeraas wrote:
> <p9e883002@sneakemail.com> wrote:
> 
>> Of course, a wasn't really mutated. Instead, args[0] was copied and then
>> mutated and labelled a. Then a was copied and mutated and reassigned the
>> mutated copy.
>>
>> So, that's two copies of the string, plus a slice, plus an extra method call to
>> achieve what used to be achievable in place on the original string. Which is now
>> immutable, but I'll never need it again.
>>
>> Of course, on these short 1-off strings it doesn't matter a hoot. But when the
>> strings are 200 to 500 characters a pop and there are 20,000,000 of them. It
>> matters.
>>
>> Did I suggest this was an optimisation?
>>
>> Whatever immutability-purity cool aid you've been drinking, please go back to
>> coke. And give us usable libraries and sensible implicit conversions. Cos this sucks
>> bigtime.
>>
>> b.
> 
> 
> Is this what you wanted to write?
> 
> int main(string[] args)
> {
>   char[] a = cast(char[])args[0];
>   a[2..5] = "XXX";
>   writefln(a);
>   return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?

I'm no invariant guru, but I don't think that's legal.  'invariant' means the data could be stored in a portion of memory that the OS will not allow the program to write to.  So you need to dup it:

   char[] a = args[0].dup;
   a[2..5] = "XXX";
   writefln(a);
   return 0;

That stuff like this compiles and seems to work is why we really need to make at least one alternative version of cast. One would be for relatively safe run-of-the-mill casts, like casting float to int, or casting Object to some class (and checking for null), and the other category would be for dangerous big-red-flag kinds of things like the above. Using the run-of-the-mill cast in the above situation would not be allowed.

--bb
April 28, 2008
On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:
> <p9e883002@sneakemail.com> wrote:
> 
> Is this what you wanted to write?
> 
> int main(string[] args)
> {
>    char[] a = cast(char[])args[0];
>    a[2..5] = "XXX";
>    writefln(a);
>    return 0;
> }
> This compiles and runs, and seems to do what you describe. Sure, there's a
> cast there, but it's not all that bad, is it?

No. You missed out uppercasing the string before replacing the slice.


April 28, 2008
On Mon, 28 Apr 2008 02:44:14 +0200, <p9e883002@sneakemail.com> wrote:

> On Mon, 28 Apr 2008 02:14:19 +0200, "Simen Kjaeraas"
> <simen.kjaras@gmail.com> wrote:
>> <p9e883002@sneakemail.com> wrote:
>>
>> Is this what you wanted to write?
>>
>> int main(string[] args)
>> {
>>    char[] a = cast(char[])args[0];
>>    a[2..5] = "XXX";
>>    writefln(a);
>>    return 0;
>> }
>> This compiles and runs, and seems to do what you describe. Sure, there's a
>> cast there, but it's not all that bad, is it?
>
> No. You missed out uppercasing the string before replacing the slice.


That's why I replied to my own post stating just that.
Anyways, Gide got it right. A .dup is the correct way, a cast is wrong.

-- Simen
April 28, 2008
On Mon, 28 Apr 2008 02:28:23 +0200, "Simen Kjaeraas" <simen.kjaras@gmail.com> wrote:
> On Mon, 28 Apr 2008 02:14:19 +0200, Simen Kjaeraas <simen.kjaras@gmail.com> wrote:
> 
> > <p9e883002@sneakemail.com> wrote:
> >
> >
> > Is this what you wanted to write?
> >
> > int main(string[] args)
> > {
> >    char[] a = cast(char[])args[0];
> >    a[2..5] = "XXX";
> >    writefln(a);
> >    return 0;
> > }
> > This compiles and runs, and seems to do what you describe. Sure, there's a
> > cast there, but it's not all that bad, is it?
> 
> 
> Sorry, forgot the .toupper() call there. Should be
>    char[] a = cast(char[])args[0].toupper();
> 
> -- Simen

Okay, you got around the first cast by using

int main( string[] ) {

So now you want to lowercase it again:

import std.stdio;
import std.string;

int main( string[] args) {
    char[] a = cast(char[])args[0].toupper();
    a[2..5] = "XXX";
    a = a.tolower;
    writefln(a);
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(7): Error: no property 'tolower' for type 'char[]'
junk.d(7): Error: cannot implicitly convert expression (1) of type int to char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]

So, cast a back to being a string, so that we can call tolower() on it and then cast the copied mutated string back to a char[]:

import std.stdio;
import std.string;

int main( string[] args) {
    char[] a = cast(char[])args[0].toupper();
    a[2..5] = "XXX";
    a = cast(char[]) ( ( cast(string)a ).tolower );
    writefln(a);
    return 0;
}

c:\dmd\test>dmd junk.d
junk.d(7): Error: no property 'tolower' for type 'invariant(char)[]'
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]
junk.d(7): Error: cannot implicitly convert expression (0) of type int to char[]
junk.d(7): Error: cannot cast int to char[]
junk.d(7): Error: integral constant must be scalar type, not char[]

Nope. That don't work.

import std.stdio;
import std.string;

int main( string[] args) {
    char[] a = cast(char[])args[0].toupper();
    a[2..5] = "XXX";
    a = cast(char[])tolower( cast(string)a );
    writefln(a);
    return 0;
}

Finally. It works.

Summary:

If I want to be able to use lvalue slice operations on 'strings' (for efficiency) I have to have them as char[].

If I want to be able to use std.string methods on those same strings, I have to cast them to invariant(char)[] and the results back to char[], which involves at least one copy operation, and probably two.

And the invariant-ness of the string library is done "for efficiency"?

Cheers, b.
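
For comparison, a minimal sketch of the same uppercase, patch, lowercase sequence done on one mutable buffer with no casts. It assumes present-day Phobos, where `std.uni` provides `toUpperInPlace` and `toLowerInPlace`; those are not the 2008 `std.string` functions discussed in this post, and the helper name `upperPatchLower` is hypothetical.

```d
import std.uni : toLowerInPlace, toUpperInPlace;

// Hypothetical helper, not from the thread. Assumes src.length >= 5.
// One allocation (the dup); every later step edits that same buffer.
char[] upperPatchLower(const(char)[] src)
{
    char[] a = src.dup;    // accepts char[] or string, yields a mutable copy
    toUpperInPlace(a);     // uppercase without building a new string
    a[2 .. 5] = "XXX";     // lvalue slice assignment, lengths must match
    toLowerInPlace(a);     // lowercase the patched buffer
    return a;
}
```

Taking `const(char)[]` as the parameter type is a design choice in this sketch: both `char[]` and `string` convert to it implicitly, so the caller needs no cast either way.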


April 28, 2008
2008/4/28 Simen Kjaeraas <simen.kjaras@gmail.com>:
>  int main(string[] args)
>  {
>   char[] a = cast(char[])args[0];
>   a[2..5] = "XXX";
>   writefln(a);
>   return 0;
>  }
>  This compiles and runs, and seems to do what you describe. Sure, there's a
>  cast there, but it's not all that bad, is it?

Yes, it's extremely bad. Casting away invariant is UNDEFINED BEHAVIOR, and should never be done.

You should never need an explicit cast just to handle text!
April 28, 2008
2008/4/28  <p9e883002@sneakemail.com>:
>  import std.string;
>
>  int main( string[] args) {
>     char[] a = cast(char[])args[0].toupper();

**** UNDEFINED BEHAVIOR ****
(1) args might be placed in a hardware-locked read-only segment. Then
the following line would fail.
(2) There might be other pointers to the string, which expect it never
to change.

>     a[2..5] = "XXX";
>     a = cast(char[])tolower( cast(string)a );
>     writefln(a);
>     return 0;
>  }
>
>  Finally. It works.

But not necessarily on all architectures, because of the undefined behavior. This is how you do it without undefined behavior.

    import std.stdio;   // needed for writefln
    import std.string;

    int main( string[] args) {
        string a = args[0].toupper();
        a = a[0..2] ~ "XXX" ~ a[5..$];
        a = a.tolower();
        writefln(a);
        return 0;
    }
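
The concatenation idiom above can be wrapped up as a small function. This is only a sketch in present-day D; the name `patched` and its parameters are hypothetical. It shows two things: other references to the original string never see a change, and, unlike slice assignment, the replacement does not have to be the same length as the span it replaces.

```d
// Hypothetical helper, not from the thread: builds a new string from the
// text before the span, the patch, and the text after the span.
string patched(string s, size_t lo, size_t hi, string patch)
{
    return s[0 .. lo] ~ patch ~ s[hi .. $];   // allocates; s is never written to
}
```

With this shape, replacing the two-character span `2 .. 4` with a three-character patch is fine, whereas `a[2 .. 4] = "XXX"` on a mutable array would fail the length check at run time.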