May 27, 2007
Walter Bright wrote:
> Under the new const/invariant/final regime, what are strings going to be? Experience with other languages suggests that strings should be immutable. To express an array of const chars, one would write:
> 
>     const(char)[]
> 
> but while that's clear, it doesn't just flow off the keyboard. Strings are so common this needs an alias, so:
> 
>     alias const(char)[] cstring;
> 
> Why cstring? Because 'string' appears as both a module name and a common variable name. cstring also implies wstring for wchar strings, and dstring for dchars.
> 
> String literals, on the other hand, will be invariant (which means they can be stuffed into read-only memory). So,
>     typeof("abc")
> will be:
>     invariant(char)[3]
> 
> Invariants can be implicitly cast to const.
> 
> In my playing around with source code, using cstrings seems to work out rather nicely.
> 
> So, why not alias cstring to invariant(char)[]? That way strings really would be immutable. The reason is that mutables cannot be implicitly cast to invariant, meaning that there'd be a lot of casts in the code. Casts are a sledgehammer, and a coding style that requires too many casts is a bad coding style.

Perhaps I should just wait for the implementation, but I'm interested in knowing what your solution to .dup is. Given

   auto foo = "hello".dup;

what is the type of foo?

How do you support both of

   invariant char[] foo = "hello".dup;
   char[] bar = "hello".dup;

  -- Reiner
May 28, 2007
noSpam wrote:
> Walter Bright wrote:
>> Derek Parnell wrote:
>>>  const(char)[]  // A mutable array of immutable characters?
>>>  const(char[])  // An immutable array of mutable characters?
>>>  const(const(char)[]) // An immutable array of immutable characters?
>>>  char[]         // A mutable array of mutable characters?
>>>
>>> What will happen with the .reverse and .sort array properties when used
>>> with const, invariant, and final qualifiers?
>>
>> They'll all fail.
> 
> I think it's better to return a reversed/sorted copy. This will make such a change more backward compatible.

This makes sense. For immutable arrays, the definition should drop "in place" and just return a copy.
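A copy-returning reversal along these lines could be sketched as follows. This is only an illustration in present-day D, not what the built-in .reverse does: it assumes Phobos's `std.algorithm.mutation.reverse` (which postdates this thread), and the helper name `reversedCopy` is hypothetical.

```d
import std.algorithm.mutation : reverse;

/// Hypothetical helper: leaves the read-only input alone and
/// returns a freshly allocated, reversed, mutable copy.
int[] reversedCopy(const(int)[] src)
{
    auto copy = src.dup;   // .dup of const(int)[] gives a mutable int[]
    reverse(copy);         // in-place reversal of the copy only
    return copy;
}

void main()
{
    immutable int[] data = [1, 2, 3];
    assert(reversedCopy(data) == [3, 2, 1]);
    assert(data == [1, 2, 3]);   // the original is untouched
}
```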

May 28, 2007
Myron Alexander wrote:
> noSpam wrote:
>> Walter Bright wrote:
>>> Derek Parnell wrote:
>>>>  const(char)[]  // A mutable array of immutable characters?
>>>>  const(char[])  // An immutable array of mutable characters?
>>>>  const(const(char)[]) // An immutable array of immutable characters?
>>>>  char[]         // A mutable array of mutable characters?
>>>>
>>>> What will happen with the .reverse and .sort array properties when used
>>>> with const, invariant, and final qualifiers?
>>>
>>> They'll all fail.
>>
>> I think it's better to return a reversed/sorted copy. This will make such a change more backward compatible.
> 
> This makes sense. For immutable arrays, the definition should drop "in place" and just return a copy.

Which would be very confusing. This is instead a perfect opportunity to take the *much* better path of finally deprecating the .sort and .reverse "properties". Equally good or better library implementations are possible (and exist). For example, .sort can't take an ordering predicate. Also, the special casing of reversing char[] and wchar[] arrays, preserving the encoded Unicode code points, is definitely (imho) too specialized to belong in the language (runtime) as opposed to a library.

/ Oskar
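To illustrate what a predicate-accepting library sort buys you, here is a minimal sketch in present-day D. It assumes Phobos's `std.algorithm.sorting.sort`, which postdates this thread and is not the routine referred to above; the helper name `sortByLength` is made up for the example.

```d
import std.algorithm.sorting : sort;

/// Hypothetical helper: sorts words in place, shortest first.
void sortByLength(string[] words)
{
    // The ordering predicate is passed as a template argument,
    // something the built-in .sort property has no way to express.
    sort!((a, b) => a.length < b.length)(words);
}

void main()
{
    auto words = ["ccc", "a", "bb"];
    sortByLength(words);
    assert(words == ["a", "bb", "ccc"]);
}
```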
May 28, 2007
Reiner Pope wrote:
> Perhaps I should just wait for the implementation, but I'm interested in knowing what your solution to .dup is. Given
> 
>    auto foo = "hello".dup;
> 
> what is the type of foo?

Most likely a plain (mutable) char[].

> How do you support both of
> 
>    invariant char[] foo = "hello".dup;
>    char[] bar = "hello".dup;

Likely the first will be an error as written, requiring a cast(invariant) to be inserted.
Of course, since it doesn't make much sense to .dup in the example above ("hello" is already invariant, and copying an invariant array but not modifying the copy isn't typically useful) that shouldn't be much of a problem in this case.

For other cases though, I could see how a "unique" (or similar) type constructor that would allow implicit conversion to both mutable and invariant (and const) types could be useful.
For instance, if the strings in your example were replaced by mutable arrays, a "unique char[]" return value of .dup could then be assigned to mutable/const/invariant references without needing casts.
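For comparison, here is how the two cases look in present-day D, as a sketch rather than a statement about the design under discussion. It relies on two later developments: `invariant` was renamed `immutable`, and arrays gained an `.idup` property alongside `.dup`.

```d
void main()
{
    // .dup yields a mutable copy, i.e. a plain char[].
    auto foo = "hello".dup;
    static assert(is(typeof(foo) == char[]));
    foo[0] = 'j';
    assert(foo == "jello");

    // .idup yields an immutable copy, so no cast is needed.
    auto bar = "hello".idup;
    static assert(is(typeof(bar) == immutable(char)[]));
    assert(bar == "hello");
}
```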
May 28, 2007
Oskar Linde wrote:
> Which would be very confusing. This is instead a perfect opportunity to take the *much* better path of finally deprecating the .sort and .reverse "properties". Equally good or better library implementations are possible (and exist). For example, .sort can't take an ordering predicate. Also, the special casing of reversing char[] and wchar[] arrays, preserving the encoded Unicode code points, is definitely (imho) too specialized to belong in the language (runtime) as opposed to a library.
> 
> / Oskar

I see your point and agree.

Regards,

Myron.
May 28, 2007
renoX Wrote:
> Regan Heath wrote:
> > renoX Wrote:
> >> I agree with you, I don't think that the string should be a char[]
> >>  alias, whether it's const or not but a class with
> >> char[],dchar[],wchar[] under the hood representation and safe
> >> slicing by default.
> >> 
> >> The difficulty is providing enough flexibility for correctly managing the internal representation: for example, there should be a way to say "use UTF-8 even though there are multibyte characters" (a size optimization with some CPU cost).
> > 
> > I think the class you describe would be useful, but only for certain types of application.  Many applications (those that deal with ASCII
> 
> Hopefully a rare thing now.

No, sadly it isn't.  Most existing applications these days deal with ASCII or one of the strange code pages (which you'd handle in D with ubyte and appropriate conversion to one of UTF-8, UTF-16 or UTF-32 internally).

Granted, in the case of the code page apps you might want a String class which can be produced by a <codepage>toString() free function which leverages iconv (which is just what I suggested).

However, you may only want to deal with them as UTF-8 internally and therefore not need the functionality provided by the class, opting instead to use 'string' directly.

Sure, in the future I expect/hope people will move to UTF-8, UTF-16, and UTF-32, but I suspect code pages will be haunting us for many years to come.

> > won't need the sorts of
> > things this class provides and can get away with just using
> > 'const(char[])' AKA 'string'.  Basically I think there is ample
> > room for both 'string' as an alias and 'String' as a class to exist
> > at the same time.
> 
> Room of course, but IMHO one should almost always use the class (except in wrappers of native calls) instead of the alias.

I think that's an invalid assertion, specifically your use of the word 'always'.  There are 'almost certainly' (see, my term leaves room for me to be wrong) many cases where the alias would be preferred, most likely for performance reasons, especially if the added functionality isn't required.

In other words, all I'm saying is: sometimes you want it, sometimes you don't.  Both can exist, both can be used and both should be interchangeable (without too much trouble).

Regan
May 28, 2007
Oskar Linde wrote:
> Myron Alexander wrote:
>> noSpam wrote:
>>> Walter Bright wrote:
>>>> Derek Parnell wrote:
>>>>>  const(char)[]  // A mutable array of immutable characters?
>>>>>  const(char[])  // An immutable array of mutable characters?
>>>>>  const(const(char)[]) // An immutable array of immutable characters?
>>>>>  char[]         // A mutable array of mutable characters?
>>>>>
>>>>> What will happen with the .reverse and .sort array properties when used
>>>>> with const, invariant, and final qualifiers?
>>>>
>>>> They'll all fail.
>>>
>>> I think it's better to return a reversed/sorted copy. This will make such a change more backward compatible.
>>
>> This makes sense. For immutable arrays, the definition should drop "in place" and just return a copy.
> 
> Which would be very confusing. This is instead a perfect opportunity to take the *much* better path of finally deprecating the .sort and .reverse "properties". Equally good or better library implementations are possible (and exist). For example, .sort can't take an ordering predicate.

+1 (and thanks for your predicate-accepting sort routine, Oskar!)

> Also, the special casing of reversing char[] and wchar[] arrays, preserving the encoded Unicode code points, is definitely (imho) too specialized to belong in the language (runtime) as opposed to a library.

No opinion there.  What about the special code-point-at-a-time foreach for char[]?  Do you dislike that too?

--bb
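For readers unfamiliar with the feature in question: a foreach over char[] with a dchar loop variable decodes the UTF-8 data one code point at a time. A minimal sketch follows; the helper name `countCodePoints` is hypothetical, and the escape `\u00E9` is used so the example does not depend on the source file's encoding.

```d
/// Hypothetical helper: counts code points rather than UTF-8 code units.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s)   // the language decodes one code point per iteration
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo";          // U+00E9 takes two bytes in UTF-8
    assert(s.length == 6);            // .length counts code units
    assert(countCodePoints(s) == 5);  // foreach (dchar ...) counts code points
}
```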
May 29, 2007
Bill Baxter wrote:
> Oskar Linde wrote:
>> Myron Alexander wrote:
>>> noSpam wrote:
>>>> Walter Bright wrote:
>>>>> Derek Parnell wrote:
>>>>>>  const(char)[]  // A mutable array of immutable characters?
>>>>>>  const(char[])  // An immutable array of mutable characters?
>>>>>>  const(const(char)[]) // An immutable array of immutable characters?
>>>>>>  char[]         // A mutable array of mutable characters?
>>>>>>
>>>>>> What will happen with the .reverse and .sort array properties when used
>>>>>> with const, invariant, and final qualifiers?
>>>>>
>>>>> They'll all fail.
>>>>
>>>> I think it's better to return a reversed/sorted copy. This will make such a change more backward compatible.
>>>
>>> This makes sense. For immutable arrays, the definition should drop "in place" and just return a copy.
>>
>> Which would be very confusing. This is instead a perfect opportunity to take the *much* better path of finally deprecating the .sort and .reverse "properties". Equally good or better library implementations are possible (and exist). For example, .sort can't take an ordering predicate.
> 
> +1 (and thanks for your predicate-accepting sort routine, Oskar!)

+1

> 
>> Also, the special casing of reversing char[] and wchar[] arrays, preserving the encoded Unicode code points, is definitely (imho) too specialized to belong in the language (runtime) as opposed to a library.
> 
> No opinion there.  What about the special code-point-at-a-time foreach for char[]?  Do you dislike that too?
> 

IMHO that should not be in the language. That's why I am opting for a string *library* class/struct which could take care of such cases.

BR
Marcin Kuszczak
(Aarti_pl)
May 29, 2007
Frits van Bommel wrote:
> Reiner Pope wrote:
>> Perhaps I should just wait for the implementation, but I'm interested in knowing what your solution to .dup is. Given
>>
>>    auto foo = "hello".dup;
>>
>> what is the type of foo?
> 
> Most likely a plain (mutable) char[].
> 
>> How do you support both of
>>
>>    invariant char[] foo = "hello".dup;
>>    char[] bar = "hello".dup;
> 
> Likely the first will be an error as written, requiring a cast(invariant) to be inserted.
> Of course, since it doesn't make much sense to .dup in the example above ("hello" is already invariant, and copying an invariant array but not modifying the copy isn't typically useful) that shouldn't be much of a problem in this case.
> 
> For other cases though, I could see how a "unique" (or similar) type constructor that would allow implicit conversion to both mutable and invariant (and const) types could be useful.
> For instance, if the strings in your example were replaced by mutable arrays, a "unique char[]" return value of .dup could then be assigned to mutable/const/invariant references without needing casts.

Funny, that's just what I thought of (including the name unique). When I first thought about it, I thought that such a construct would be very useful and very powerful, but I can't actually think of any use cases except with .dup and other constructor-type functions. (Although supporting them should alone be enough motivation.)
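The closest thing in present-day D is a library convention rather than a "unique" type constructor: Phobos's `std.exception.assumeUnique`, which postdates this thread. The sketch below assumes that function; the helper name `makeGreeting` is hypothetical. Unlike the proposal above, the compiler does not verify uniqueness here, it simply trusts the caller.

```d
import std.exception : assumeUnique;

/// Hypothetical constructor-type function: builds data through a
/// mutable reference, then hands it out as immutable.
string makeGreeting()
{
    char[] buf = "hello".dup;   // the only reference to the new array
    buf[0] = 'j';
    // assumeUnique converts to immutable(char)[] without copying.
    // It is a promise by the caller, not something the compiler checks.
    return assumeUnique(buf);
}

void main()
{
    string s = makeGreeting();
    assert(s == "jello");
}
```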
May 29, 2007
Aarti_pl Wrote:
> Bill Baxter wrote:
> > Oskar Linde wrote:
> >> Myron Alexander wrote:
> >>> noSpam wrote:
> >>>> Walter Bright wrote:
> >>>>> Derek Parnell wrote:
> >>>>>>  const(char)[]  // A mutable array of immutable characters?
> >>>>>>  const(char[])  // An immutable array of mutable characters?
> >>>>>>  const(const(char)[]) // An immutable array of immutable characters?
> >>>>>>  char[]         // A mutable array of mutable characters?
> >>>>>>
> >>>>>> What will happen with the .reverse and .sort array properties when
> >>>>>> used
> >>>>>> with const, invariant, and final qualifiers?
> >>>>>
> >>>>> They'll all fail.
> >>>>
> >>>> I think it's better to return a reversed/sorted copy. This will make such a change more backward compatible.
> >>>
> >>> This makes sense. For immutable arrays, the definition should drop "in place" and just return a copy.
> >>
> >> Which would be very confusing. This is instead a perfect opportunity to take the *much* better path of finally deprecating the .sort and .reverse "properties". Equally good or better library implementations are possible (and exist). For example, .sort can't take an ordering predicate.
> > 
> > +1 (and thanks for your predicate-accepting sort routine, Oskar!)
> 
> +1
> 
> > 
> >> Also, the special casing of reversing char[] and wchar[] arrays, preserving the encoded Unicode code points, is definitely (imho) too specialized to belong in the language (runtime) as opposed to a library.
> > 
> > No opinion there.  What about the special code-point-at-a-time foreach for char[]?  Do you dislike that too?
> > 
> 
> IMHO that should not be in the language. That's why I am opting for a string *library* class/struct which could take care of such cases.

I agree.  I tend to think there are certain things which some apps don't need, in which case they can use the 'string' alias.  Other apps need to do this sort of thing and want a 'String' class to handle it.  I think there is room for both in the Phobos/Tango libraries.

The default language/library support can reverse UTF-8 and UTF-16, but it's not ideal, e.g. convert to UTF-32, reverse, convert back. ;)

Regan
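The "convert to UTF-32, reverse, convert back" route can be sketched in present-day D as below. It assumes Phobos's `std.utf.toUTF32`, `std.utf.toUTF8` and `std.algorithm.mutation.reverse`; the helper name `reverseCodePoints` is hypothetical. Note that it reverses code points only, so combining sequences would come out in the wrong order.

```d
import std.algorithm.mutation : reverse;
import std.utf : toUTF32, toUTF8;

/// Hypothetical helper: reverses a UTF-8 string code point by code point.
string reverseCodePoints(string s)
{
    dchar[] wide = toUTF32(s).dup;   // one array element per code point
    reverse(wide);                   // plain in-place array reversal
    return toUTF8(wide);             // re-encode as UTF-8
}

void main()
{
    assert(reverseCodePoints("h\u00E9llo") == "oll\u00E9h");
}
```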