May 27, 2007
Re: string types: const(char)[] and cstring
Walter Bright wrote:
> Reiner Pope wrote:
>> Will there be something in the type system which enables you to safely 
>> say, "This is the only reference to this data, so it's ok for me to 
>> make this invariant" ?
> 
> Safely? No. You will be able to explicitly cast to invariant, however, 
> the programmer will have to ensure it is safe to do so.
> 
>> Does 'scope' happen to have anything to do with that?
> 
> No. Scope just ensures that the reference does not 'escape' the scope 
> it's in.
> 
>> invariant(char)[] createJunk()
>> {
>>     /* scope? */ char[] val = "aaaaa".dup;
>>     size_t index = rand() % 5;
>>     val[index] = rand();
>>
>>     return cast(invariant(char)[]) val;
>> }
>>
>> I mean, do I really need to cast it to invariant there? It's easy to 
>> see that there's only one copy of val's data in existence.
> 
> Easy for you to see, not so easy for the compiler to. And besides:
> 
>     return cast(invariant)val;
> 
> will do the trick more conveniently.

That's an interesting syntax, casting to a trait/attribute with the rest of the type 
inferred.  I presume cast(const) works as well.  (Maybe cast(scope)?  Then again, what's 
the use...)  Given cast(*) where * is invariant/const, is cast(*)T[] the same as 
cast(*(T)[]) or cast(*(T[]))?  That is, does the trait apply to the element type, or the 
array?

-- Chris Nicholson-Sauls
May 27, 2007
Re: string types: const(char)[] and cstring
Walter Bright wrote:
> Reiner Pope wrote:
>> Will there be something in the type system which enables you to safely 
>> say, "This is the only reference to this data, so it's ok for me to 
>> make this invariant" ?
> 
> Safely? No. You will be able to explicitly cast to invariant, however, 
> the programmer will have to ensure it is safe to do so.
> 
>> Does 'scope' happen to have anything to do with that?
> 
> No. Scope just ensures that the reference does not 'escape' the scope 
> it's in.

I must have misunderstood what scope specifies. I had thought that, to 
avoid being escaped, scope specified that your variable may not be 
aliased by another (non-scope) name. In that case, I thought, can't you 
say: "well, when I leave this function, I'm the only one holding a 
reference to this data, so it would be safe to call it invariant (or 
anything else I choose)." I thought a compiler could have a special case 
saying, "at the end of scope, you can safely turn any scope variables 
into whatever you want".

However, I was surprised to find out that the following code compiled 
fine, although it returns a dead object:

Foo foo()
{
    scope Foo f = new Foo();
    Foo g = f;
    return g;
}


  -- Reiner
May 27, 2007
Re: string types: const(char)[] and cstring
Reiner Pope wrote:
> However, I was surprised to find out that the following code compiled 
> fine, although it returns a dead object:

Sadly, it currently isn't enforced.
May 27, 2007
Re: string types: const(char)[] and cstring
Chris Nicholson-Sauls wrote:
> That's an interesting syntax, casting to a trait/attribute with the rest 
> of the type inferred.  I presume cast(const) works as well.  (Maybe 
> cast(scope)?  Then again, what's the use...)  Given cast(*) where * is 
> invariant/const, is cast(*)T[] the same as cast(*(T)[]) or 
> cast(*(T[]))?  That is, does the trait apply to the element type, or the 
> array?

Both.
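
A rough illustration of the "Both." answer, written for later D releases in which the keyword invariant was renamed immutable. That spelling is an assumption relative to the syntax discussed in this thread, and createJunk is borrowed from Reiner's example; treat it as a sketch rather than the final design.

```d
// Build a buffer mutably, then freeze it with a qualifier-only cast.
immutable(char)[] createJunk()
{
    char[] val = "aaaaa".dup;   // fresh copy, no other reference exists
    val[2] = 'z';

    // The qualifier-only cast applies to the array and to its elements.
    static assert(is(typeof(cast(immutable) val) == immutable(char[])));

    // immutable(char[]) converts implicitly to immutable(char)[].
    return cast(immutable) val;
}

void main()
{
    assert(createJunk() == "aazaa");
}
```

The cast is still unchecked: it is only safe here because val is the sole reference to the data.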
May 27, 2007
Re: string types: const(char)[] and cstring
On Sat, 26 May 2007 22:27:18 -0700, Walter Bright wrote:

> Derek Parnell wrote:
>> We seem to have different experience. Most of the code I write deals with
>> changing strings - in other words, manipulating strings is very very common
>> in the sorts of programs I write.
> 
> You'll still be able to concatenate and slice invariant strings. You can 
> also cast a char[] to an invariant, when you're done building it.

While that is interesting, it has not much to do with what I was saying.

You said "strings should be immutable" and I'm saying that seems odd because
my experience is that most strings are meant to be changed. 

So now I'm thinking that we are talking about different things when we use
the word "string". I'm guessing you are really referring to compile-time
generated string data (e.g. literals) rather than run-time generated string
data.


>> So 'const(char)[] x' means that I can change x.ptr and x.length but I
>> cannot change anything that x.ptr points to, right?
> 
> Right.
> 
>> And  'invariant(char)[] x' means that I cannot change x.ptr or x.length and
>> I cannot change anything that x.ptr points to, right?
> 
> Wrong. The difference between const and invariant is that invariant is 
> truly, absolutely, immutable. 

Huh??? Isn't that what I just said? Now I'm even more confused about these
terms. They are just not intuitive, are they?

> Const is only immutable through the 
> reference - another reference to the same data can change it.

Ok ... so this below won't fail ...

 void func(const char[] parm)
 {
     char [] q;
     q = parm;
     q[0] = 'a';
 }

or is the "q = parm" not really permitted.

>> So what syntax is to be used so that x.ptr and x.length cannot be changed
>> but the characters referred to by 'x' can be changed?
> 
> final char[] x;


Given the syntax of the form "  void func(<X> char[] parm) ", is the table
below true ...

*-------------+-----------+-----------*
| <X>         | parm.ptr  | parm[0]   |
|-------------+-----------+-----------|
| const       | mutable   | immutable |
| final       | immutable | mutable   |
| invariant   | immutable | immutable |
| (none)      | mutable   | mutable   |
*-------------+-----------+-----------*


I'm sorry I'm a bit slow on this ... but what is the difference between
"invariant" and "const final" ? Is it that "invariant" is sort of a global
effect but "const final" is only in effect for the specific reference it
occurs on.

I'm not looking forward to reading the docs on this. I hope you get a lot
of people to edit the docs to make it understandable for everyone.

-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
May 27, 2007
Re: string types: const(char)[] and cstring
Derek Parnell wrote:
> You said "strings should be immutable" and I'm saying that seems odd because
> my experience is that most strings are meant to be changed. 

I'm going to argue that your experience is unusual. I do a lot of string 
manipulation (after all, that's what a compiler does) and the strings, 
once constructed, are essentially always immutable. In conversations 
with many others, my experience is commonplace.

But still, in D, nothing prevents you from using mutable strings.

> So now I'm thinking that we are talking about different things when we use
> the word "string". I'm guessing you are really referring to compile-time
> generated string data (e.g. literals) rather than run-time generated string
> data.

I'm referring to the arrays of characters, generated or literals.


>>> So 'const(char)[] x' means that I can change x.ptr and x.length but I
>>> cannot change anything that x.ptr points to, right?
>> Right.
>>
>>> And  'invariant(char)[] x' means that I cannot change x.ptr or x.length and
>>> I cannot change anything that x.ptr points to, right?
>> Wrong. The difference between const and invariant is that invariant is 
>> truly, absolutely, immutable. 
> 
> Huh??? Isn't that what I just said?

No. You said for const you could change x.ptr and x.length, but for 
invariant you could not. For both const and invariant, you can change 
x.ptr and x.length.


> Now I'm even more confused about these
> terms. They are just not intuitive, are they?

The problem is I have failed to explain them. Invariant data can go into 
read-only memory. Const data can be changed by another reference to the 
same data (just like in C++). In other words, const is a read-only 
*view* of the data, whereas invariant data is read-only for all views of it.


>> Const is only immutable through the 
>> reference - another reference to the same data can change it.
> 
> Ok ... so this below won't fail ...
> 
>   void func(const char[] parm)
>   {
>       char [] q;
>       q = parm;
error, q is not const.
>       q[0] = 'a';
>   }
> 
> or is the "q = parm" not really permitted.

Right.

> 
>>> So what syntax is to be used so that x.ptr and x.length cannot be changed
>>> but the characters referred to by 'x' can be changed?
>> final char[] x;
> 
> 
> Given the syntax of the form "  void func(<X> char[] parm) ", is the table
> below true ...
> 
> *-------------+-----------+-----------*
> | <X>         | parm.ptr  | parm[0]   |
> |-------------+-----------+-----------|
> | const       | mutable   | immutable |
> | final       | immutable | mutable   |
> | invariant   | immutable | immutable |
> | (none)      | mutable   | mutable   |
> *-------------+-----------+-----------*

You've got invariant wrong, it's mutable|immutable.


> I'm sorry I'm a bit slow on this ... but what is the difference between
> "invariant" and "const final" ? Is it that "invariant" is sort of a global
> effect but "const final" is only in effect for the specific reference it
> occurs on.

First differences: final is a *storage class*. const and invariant are 
*type constructors*.

final only refers to the actual value that a symbol has, and it means 
that, once a value is assigned to a symbol, that value can never change. 
If the value is a pointer or reference, what it points to *can* be changed.

int x = 3;
final int* p = &x;
p = null; // error, p is final
*p = 1; // ok

const(int)* q = null;
q = &x;  // ok, q is not const, and now *q is 1
*q = 2;  // error, *q is const
*p = 5;  // ok, but now *q is 5, too!
x = 6;   // ok, but now *q is 6

invariant(int)* s = null;
s = &x;  // error, cannot implicitly convert int* to invariant(int)*
int y = 4;
s = cast(invariant(int)*)&y; // ok, trust programmer that y is immutable
*s = 3;  // error, *s is immutable
y = 5;   // undefined behavior, as y is never supposed to change,
         // and compiler assumes *s is still 4

Note that int* can be implicitly converted to const(int)*, and 
invariant(int)* can be implicitly converted to const(int)*.

> I'm not looking forward to reading the docs on this. I hope you get a lot
> of people to edit the docs to make it understandable for everyone.

The thing is actually rather simple, but I am having trouble finding the 
right words to express it. Certainly, the mishmash of C++ const has 
badly muddied the waters about what const means.
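
The view-versus-data distinction above can be sketched in runnable form. This assumes later D releases, where invariant is spelled immutable; the function names are hypothetical and exist only for this illustration.

```d
// const is a read-only *view*: other references may still change the data.
int observeThroughConst()
{
    int x = 3;
    const(int)* q = &x;                          // int* -> const(int)* is implicit
    static assert(!__traits(compiles, *q = 2));  // no writes through q
    x = 6;                                       // but x itself is still mutable
    return *q;                                   // the view sees the new value
}

// immutable data is read-only for every view of it.
int observeImmutable()
{
    // A mutable address does not convert implicitly to immutable.
    static assert(!__traits(compiles, { int m; immutable(int)* bad = &m; }));

    immutable int y = 4;
    immutable(int)* s = &y;   // fine: y really never changes
    const(int)* r = s;        // immutable -> const is implicit
    s = null;                 // the pointer itself can still be rebound
    return *r;
}

void main()
{
    assert(observeThroughConst() == 6);
    assert(observeImmutable() == 4);
}
```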
May 27, 2007
Re: string types: const(char)[] and cstring
Bill Baxter wrote:

>> The same here. I don't have much experience with Java and really don't know
>> why const strings are so useful...
>> Maybe someone could elaborate a little bit more?
> 
> Ditto here.  When I've used java I found it more annoying that strings 
> were immutable than anything else.

When using Java (and Objective-C), I've found it very useful that 
strings (and others) are immutable since they are then thread-safe.

--anders
May 27, 2007
Re: string types: const(char)[] and cstring
On Sun, 27 May 2007 01:09:40 -0700, Walter Bright wrote:

Thanks for taking the time out to help me understand the proposed D
changes. I really appreciate it.

I think that I'm going to have to wait until you have an implementation to
try it on; to see how it fits with my terminology and needs.

> Derek Parnell wrote:
>> You said "strings should be immutable" and I'm saying that seems odd because
>> my experience is that most strings are meant to be changed. 
> 
> I'm going to argue that your experience is unusual. I do a lot of string 
> manipulation (after all, that's what a compiler does) and the strings, 
> once constructed, are essentially always immutable. In conversations 
> with many others, my experience is commonplace.

Ok, we'll leave it at that then. However, the phrase "once constructed" is the
key one, I suspect. It's like saying, once I've finished changing things I
don't want them to change anymore - no argument there. So the idea would be
to work with mutable strings until they are finished being constructed and
then cast them to immutable for the rest of the run time. I'm thinking here
of things like changing case, macro expansion, standardizing file names,
constructing message text, etc ...

> But still, in D, nothing prevents you from using mutable strings.

That's why I can see that I'll be continuing to use 'alias char[] string',
unless you make 'string' the immutable beastie of course <g>

>>>> So 'const(char)[] x' means that I can change x.ptr and x.length but I
>>>> cannot change anything that x.ptr points to, right?
>>> Right.
>>>
>>>> And  'invariant(char)[] x' means that I cannot change x.ptr or x.length and
>>>> I cannot change anything that x.ptr points to, right?
>>> Wrong. The difference between const and invariant is that invariant is 
>>> truly, absolutely, immutable. 
>> 
>> Huh??? Isn't that what I just said?
> 
> No. You said for const you could change x.ptr and x.length, but for 
> invariant you could not. For both const and invariant, you can change 
> x.ptr and x.length.

See, this is what is weird ... I can have an invariant string which can be
changed, thus making it not really invariant in the English language sense.
I'm still thinking that "invariant" means "does not change ever". 

But it seems that I'm wrong ...

invariant char[] x; 
x = "abc".dup;  // The string 'x' now contains "abc";
x = "def".dup;  // The string (which is not supposed to change
                // i.e invariant) has been changed to "def".

Now this is counter-intuitive (read: *WEIRD*), no?

>> Now I'm even more confused about these
>> terms. They are just not intuitive, are they?
> 
> The problem is I have failed to explain them. Invariant data can go into 
> read-only memory. Const data can be changed by another reference to the 
> same data (just like in C++). In other words, const is a read-only 
> *view* of the data, whereas invariant data is read-only for all views of it.

Okay, I've got that now ... but how to remember that two terms that mean
the same in English actually mean different things in D <G>

I think I read that someone suggested that 'const' be a contraction of
'constrained' rather than 'constant' - that might help. And that
'invariant' is longer than 'const' so its effect is 'bigger'.

 invariant char[] x; // The data pointed to by 'x' cannot be changed
                     // by anything anytime during the execution
                     // of the program.
                     // (So how do I populate it then? Hmmmm ...)

 const char[] y;    // The data pointed to by 'y' cannot be changed
                    // by anything anytime during the execution
                    // of the program when using the 'y' variable,
                    // however using another variable that also
                    // refers to y's data, or some of it, is ok.

For example ...

 void func (const char[] a, char[] b)
 {
       a[0] = 'a'; // fails
       b[0] = 'a'; // succeeds
 }

 char[] y = "def".dup;
 func( y, y);
 
>> I'm sorry I'm a bit slow on this ... but what is the difference between
>> "invariant" and "const final" ? Is it that "invariant" is sort of a global
>> effect but "const final" is only in effect for the specific reference it
>> occurs on.
> 
> First differences: final is a *storage class*. const and invariant are 
> *type constructors*.

Thanks. So 'final' means that it can be changed (from its initial default
value) once and only once.

/* --- Scenario #1 --- */
 final int r;
 r = randomer(); // succeeds
 foo(); // fails 

 int randomer() { 
     // Get a random integer between -100 and 100.
     return cast(int)(std.random.rand() % 201) - 100; 
 }
 void foo() { 
   r = randomer(); // success depends on whether or not 'r' 
                   // has already been set.
 }


/* --- Scenario #2 --- */
 final int r;

 foo(); // succeeds
 r = randomer(); // fails

 int randomer() { 
     // Get a random integer between -100 and 100.
     return cast(int)(std.random.rand() % 201) - 100; 
 }
 void foo() { 
   r = randomer(); // success depends on whether or not 'r' 
                   // has already been set.
 }

Is this a run-time check or a compile-time one? If run-time, would it be
possible to somehow 'unfinal' a variable using some implementation
dependent trickery?

>> I'm not looking forward to reading the docs on this. I hope you get a lot
>> of people to edit the docs to make it understandable for everyone.
> 
> The thing is actually rather simple, but I am having trouble finding the 
> right words to express it. 

And thus my comment re editors.

> Certainly, the mishmash of C++ const has 
> badly muddied the waters about what const means.

I have no real knowledge of C++ or its const, and I'm still weirded out by
it all <G>

-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
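
The "build it mutably, then freeze it" workflow described in this post might look like the sketch below. It assumes later D releases (immutable instead of invariant, and string as an alias for immutable(char)[]); shout is a hypothetical helper, while std.exception.assumeUnique and std.ascii.toUpper are Phobos functions from those later releases.

```d
import std.ascii : toUpper;
import std.exception : assumeUnique;

// Hypothetical helper: upper-case a private copy, then freeze it.
string shout(const(char)[] text)
{
    char[] buf = text.dup;       // mutable while under construction
    foreach (ref c; buf)
        c = toUpper(c);          // ASCII letters only
    return assumeUnique(buf);    // no copy; buf is set to null afterwards
}

void main()
{
    string s = shout("abc");
    assert(s == "ABC");

    // The characters are frozen, but the variable can refer to other data.
    static assert(!__traits(compiles, s[0] = 'x'));
    s = "def";
    assert(s == "def");
}
```

Rebinding s does not alter the old characters; it only makes s refer to a different array, which is why an invariant string variable can still be assigned.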
May 27, 2007
Re: string types: const(char)[] and cstring
On Fri, 25 May 2007 19:47:24 -0700, Walter Bright wrote:

> Under the new const/invariant/final regime, what are strings going to be 
> ? Experience with other languages suggest that strings should be 
> immutable. To express an array of const chars, one would write:
> 
> 	const(char)[]
> 
> but while that's clear, it doesn't just flow off the keyboard. Strings 
> are so common this needs an alias, so:
> 
> 	alias const(char)[] cstring;
> 

const(char)[]  // A mutable array of immutable characters?
const(char[])  // An immutable array of mutable characters?
const(const(char)[]) // An immutable array of immutable characters?
char[]         // A mutable array of mutable characters?

What will happen with the .reverse and .sort array properties when used
with const, invariant, and final qualifiers?

-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
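
For the four spellings asked about above, the following sketch shows how they behave in the const system that later D releases ended up with, where const is transitive. That outcome is an assumption relative to this thread, which predates the implementation; qualifierDemo is a hypothetical name.

```d
void qualifierDemo()
{
    char[] data = "abc".dup;

    const(char)[] a = data;      // mutable slice of read-only characters
    a = a[1 .. $];               // ok: the slice itself can be rebound
    static assert(!__traits(compiles, a[0] = 'x'));

    const(char[]) b = data;      // transitive: slice and characters are const
    static assert(!__traits(compiles, b = data));
    static assert(!__traits(compiles, b[0] = 'x'));

    // Nesting const adds nothing once the outer const is applied.
    static assert(is(const(const(char)[]) == const(char[])));

    data[0] = 'z';               // still allowed through the mutable reference
    assert(b[0] == 'z');         // the const view observes the change
    assert(a == "bc");
}

void main()
{
    qualifierDemo();
}
```

Under these rules there is no spelling for "an immutable array of mutable characters" using const alone.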
May 27, 2007
Re: string types: const(char)[] and cstring
Marcin Kuszczak wrote:
> Chris Miller wrote:
> 
>> Actually, while we're at a change for strings, why not bring in something
>> similar to my dstring module, where slicing and indexing never result in
>> an invalid UTF sequence? http://www.dprogramming.com/dstring.php - the
>> code may not be ideal, but it's the concept I'm referring to.
> 
> Yup. That's my opinion also...
> 
> For me advantages of such a string are quite obvious:
> 1. Easy slicing and indexing of utf8 sequences (without corrupting this
> sequence - as mentioned above)
> 2. Common denominator for char[], wchar[] and dchar[]
> 3. For classes which don't need speed it simplifies the API (only one version
> of functions instead of 3)
> 4. With some additional support from language (cast operators to different
> types and opImplicitCast) it can be fully interchangeable with every method
> taking char[], wchar[], dchar[].
> 
> Having another 3 names for string is not very appealing for me. We would
> have 9 official versions of string available in D:
> char[], wchar[], dchar[], string, cwstring, cdstring, tango String!(char),
> tango String!(wchar), tango String!(dchar)
> 
> To write a nice, fully functional library you have to write 3 versions of
> every function which takes different string types (I know, templates make
> it a little bit easier). I'm probably not wrong in saying that in
> reality people just write one version for char[], because it is
> convenient (see: SWT ported from Java). As a result, wchar and dchar are
> treated as second-class citizens in D. Additionally, when people design
> their program for char[], they mostly don't think about issues with slicing
> a char[] utf8 sequence (warning! assumption!), so the default way of writing
> programs is *NOT SAFE*. When you write code and don't care about bare-metal
> speed it is just tedious to do this additional work...
> 
> Having one string, which hides differences between char[], wchar[] and
> dchar[] would solve problem nicely. Adding constness would also be easy.
> And you use only one reserved keyword - string - for everything.
> 
> I would be happy to hear some other opinions from people on NG. Maybe I am
> wrong with above arguments, so probably someone can give
> counterarguments... I think it is a very important issue as it seems that
> most developers around the world are non-native English speakers...
> 
> PS. See also thread on DWT NG.

I agree with you. I don't think that string should be a char[]
alias, whether it's const or not, but rather a class with a char[], dchar[] or
wchar[] representation under the hood and safe slicing by default.

The difficulty is providing enough flexibility to manage the internal
representation correctly: for example, it should be possible to request
UTF-8 even when the text contains multibyte characters (a size
optimization with some CPU cost).

renoX
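
The slicing hazard discussed above can be shown with a short sketch. It uses std.utf.stride and std.utf.validate from later Phobos releases; prefixEnd is a hypothetical helper written for this illustration, not part of dstring or of any library.

```d
import std.exception : assertThrown;
import std.utf : stride, validate, UTFException;

// Hypothetical helper: byte offset just past the first `count` characters.
size_t prefixEnd(const(char)[] s, size_t count)
{
    size_t i = 0;
    for (size_t n = 0; n < count && i < s.length; ++n)
        i += stride(s, i);       // code units used by the character at i
    return i;
}

void main()
{
    string s = "h\u00E9llo";     // the accented e takes two UTF-8 code units
    assert(s.length == 6);       // length counts code units, not characters

    // Slicing by raw index can cut a sequence in half.
    assertThrown!UTFException(validate(s[0 .. 2]));

    // Slicing at a character boundary keeps the text well formed.
    size_t end = prefixEnd(s, 2);
    assert(end == 3);
    validate(s[0 .. end]);
    assert(s[0 .. end] == "h\u00E9");
}
```

A string class with safe slicing would do this boundary arithmetic internally, at the CPU cost mentioned above.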