Thread overview
Bug: string cast overrides string decorator
Nov 18, 2005
Bruno Medeiros
Nov 18, 2005
Ivan Senji
Nov 18, 2005
Derek Parnell
Nov 18, 2005
Regan Heath
Nov 18, 2005
Kris
Nov 18, 2005
Regan Heath
Nov 18, 2005
Kris
Nov 19, 2005
Regan Heath
Nov 19, 2005
Kris
Nov 19, 2005
Ivan Senji
Nov 20, 2005
Bruno Medeiros
Nov 20, 2005
Ivan Senji
November 18, 2005
In the following code:

  auto str = cast(wchar[])("123456"c);
  writefln(typeid(typeof(str)));
  writefln("str: ", (cast(char[])str).length,": ", (cast(char[])str));

the (original) string literal will be of type wchar[$] and not char[$], as the cast takes precedence over the string decorator.


-- 
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to be... unnatural."
November 18, 2005
Bruno Medeiros wrote:
> In the following code:
> 
>   auto str = cast(wchar[])("123456"c);
>   writefln(typeid(typeof(str)));
>   writefln("str: ", (cast(char[])str).length,": ", (cast(char[])str));
> 
> the (original) string literal will be of type wchar[$] and not char[$], as the cast takes precedence over the string decorator.
> 
> 

How is that a bug? The type of the right side is wchar[].

auto x = cast(float)4;

x is float, not int, same thing above.
November 18, 2005
On Fri, 18 Nov 2005 12:54:38 +0100, Ivan Senji wrote:

> Bruno Medeiros wrote:
>> In the following code:
>> 
>>   auto str = cast(wchar[])("123456"c);
>>   writefln(typeid(typeof(str)));
>>   writefln("str: ", (cast(char[])str).length,": ", (cast(char[])str));
>> 
>> the (original) string literal will be of type wchar[$] and not char[$], as the cast takes precedence over the string decorator.
>> 
>> 
> 
> How is that a bug? The type of the right side is wchar[].
> 
> auto x = cast(float)4;
> 
> x is float, not int, same thing above.

Almost the same thing but not quite.

When one uses 'cast(float)', D converts the content of the data to the float format. However, when one uses 'cast(wchar[])', D does not convert the internal representation. I think this is probably the wrong behaviour for D, but that's what it does now, with one exception: the form

  cast(wchar[])"string"

has the compiler setting the storage format for the data.

  int y;
  float x;
  x = cast(float)y;  // Convert y to a float format at runtime
  x = cast(float)(4L); // Convert to float format at compile time

but ...

  char[] y;
  wchar[] x;
  x = cast(wchar[])y; // No conversion takes place.
  x = cast(wchar[])("abc"c); // What should happen here?

To be consistent, no conversion should take place.
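
A minimal sketch of the two behaviours being contrasted, assuming a present-day D compiler and Phobos, which may not match the 2005 compiler discussed here. The helper names `repaint` and `transcode` are made up for illustration; `toUTF16` comes from `std.utf`.

```d
// Run with: rdmd -unittest -main <file>.d
import std.utf : toUTF16;

// Reinterpret the bytes of a UTF-8 array as UTF-16 code units.
// Nothing is converted; only the element type and the length change.
// The byte length must be a multiple of wchar.sizeof.
const(wchar)[] repaint(const(char)[] s)
{
    return cast(const(wchar)[]) s;
}

// Convert UTF-8 to UTF-16 explicitly, producing new data.
wstring transcode(const(char)[] s)
{
    return toUTF16(s);
}

unittest
{
    assert(repaint("abcd").length == 2);   // 4 bytes viewed as 2 wchars
    assert(transcode("abcd").length == 4); // 4 characters, 4 code units
    assert(transcode("abcd") == "abcd"w);
}
```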


-- 
Derek Parnell
Melbourne, Australia
18/11/2005 11:06:44 PM
November 18, 2005
On Fri, 18 Nov 2005 23:18:57 +1100, Derek Parnell <derek@psych.ward> wrote:
> On Fri, 18 Nov 2005 12:54:38 +0100, Ivan Senji wrote:
>
>> Bruno Medeiros wrote:
>>> In the following code:
>>>
>>>   auto str = cast(wchar[])("123456"c);
>>>   writefln(typeid(typeof(str)));
>>>   writefln("str: ", (cast(char[])str).length,": ", (cast(char[])str));
>>>
>>> the (original) string literal will be of type wchar[$] and not char[$],
>>> as the cast takes precedence over the string decorator.
>>>
>>>
>>
>> How is that a bug? The type of the right side is wchar[].
>>
>> auto x = cast(float)4;
>>
>> x is float, not int, same thing above.
>
> Almost the same thing but not quite.
>
> When one uses 'cast(float)', D converts the content of the data to the
> float format.

Exactly! Because it makes no sense not to; you'd never 'paint' an int as a float or vice versa, it's meaningless.

The same is true for char[], wchar[] and dchar[]. Yes, some content will paint just fine, ASCII for example, but others will result in invalid UTF data.

Regan
November 18, 2005
I believe you can draw a parallel with pretty much all array types. What would you expect to happen when casting an int[] to a byte[]? Or an int[] to float[]?

The array /contents/ are not transformed. Instead, the cast changes the array.length property, based upon the difference in the respective source and target element widths. For example:

int[10] i;

byte[] b = cast(byte[]) i;

b.length will now be 40; or something like that. This /can/ actually be useful in certain circumstances, but I expect it's rare. I've used it on one occasion, and was happy about it :-)
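
A small sketch of that length change, assuming a present-day D compiler. The helper name `asBytes` is made up for illustration.

```d
// Run with: rdmd -unittest -main <file>.d

// View the storage of an int array as bytes. The elements are not
// converted; the length scales by int.sizeof / byte.sizeof.
byte[] asBytes(int[] a)
{
    return cast(byte[]) a;
}

unittest
{
    int[10] i;                   // ten ints, all initialised to 0
    byte[] b = asBytes(i[]);
    assert(b.length == 40);      // 10 ints * 4 bytes each
    b[0] = 1;                    // same memory, so the int changes too
    assert(i[0] != 0);
}
```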

Casting char[]/wchar[]/dchar[] is consistent with all the rest. I suppose it's a question as to whether casting *any* array should change the underlying type? If so, what happens when you try to cast multi-dimensional arrays? Or try to cast an array of structs? I suspect the semantics tend to become rather complex? At least with a struct/class one can overload the cast() operator, but only for *one* target type.

My guess is that making the cast() operator invoke utf transcoding would thus make it stand out as a rather special case?

2c
November 18, 2005
On Fri, 18 Nov 2005 14:00:12 -0800, Kris <fu@bar.com> wrote:
> I believe you can draw a parallel with pretty much all array types. What
> would you expect to happen when casting an int[] to a byte[]? Or an int[] to float[]?
>
> The array /contents/ are not transformed. Instead, the cast changes the
> array.length property, based upon the difference in the respective
> source and target element widths. For example:
>
> int[10] i;
>
> byte[] b = cast(byte[]) i;
>
> b.length will now be 40; or something like that. This /can/ actually be
> useful in certain circumstances, but I expect it's rare. I've used it on one occasion, and was happy about it :-)

Me too. I think this behaviour can be really useful.

> Casting char[]/wchar[]/dchar[] is consistent with all the rest. I suppose
> it's a question as to whether casting *any* array should change the
> underlying type? If so, what happens when you try to cast multi-dimensional
> arrays? Or try to cast an array of structs?  I suspect the semantics tend to
> become rather complex? At least with a struct/class one can overload the
> cast() operator, but only for *one* target type.

I've just had an idea/thought.

When we get "array operations", won't we have another method of casting? One which casts each item in the array. In that case, casting int[] to float[] in this manner would convert each item, as opposed to 'painting' it. I believe this could also be very useful.
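
As a rough sketch of the difference, assuming a present-day D compiler and Phobos: `std.conv.to` is used here only as a stand-in for a per-item conversion, it is not the "array operations" feature itself. The helper names `convertEach` and `paint` are made up for illustration.

```d
// Run with: rdmd -unittest -main <file>.d
import std.conv : to;

// Convert each element's value, producing a new array of the same length.
float[] convertEach(int[] a)
{
    return to!(float[])(a);
}

// Reinterpret the same bits as floats; no value conversion happens.
float[] paint(int[] a)
{
    return cast(float[]) a;
}

unittest
{
    int[] a = [1, 2, 3];
    assert(convertEach(a) == [1.0f, 2.0f, 3.0f]);
    assert(paint(a).length == 3);  // same element size, same length
    assert(paint(a)[0] != 1.0f);   // the bits of int 1 are not 1.0f
}
```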

This idea _almost_ helps the char[]/wchar[]/dchar[] case, except that single items in char[] arrays may not represent an entire character. This appears to be one area where they differ significantly from other arrays, where each item in the array is a complete <thing>.
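
To illustrate items versus characters, a short sketch assuming a present-day D compiler and Phobos. The helper names `codeUnits` and `codePoints` are made up for illustration; `count` comes from `std.utf`.

```d
// Run with: rdmd -unittest -main <file>.d
import std.utf : count;

// Number of storage items (UTF-8 code units) in the array.
size_t codeUnits(string s)
{
    return s.length;
}

// Number of code points encoded by those items.
size_t codePoints(string s)
{
    return count(s);
}

unittest
{
    string s = "\u00E9";            // e with an acute accent
    assert(codeUnits(s) == 2);      // two char items in UTF-8
    assert(codePoints(s) == 1);     // one character
    assert("\u00E9"w.length == 1);  // one wchar item in UTF-16
}
```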

> My guess is that making the cast() operator invoke utf transcoding would
> thus make it stand out as a rather special case?

I agree. I think making it explicit is a good middle ground.

Regan
November 18, 2005
"Regan Heath" <regan@netwin.co.nz> wrote
> On Fri, 18 Nov 2005 14:00:12 -0800, Kris <fu@bar.com> wrote:

>> My guess is that making the cast() operator invoke utf transcoding would thus make it stand out as a rather special case?
>
> I agree. I think making it explicit is a good middle ground.

:-)

I actually meant that cast([]) for char transcoding would make it behave differently than all other types of array casting. That would make it a special case, which would be "bad". It would stand out as such in the D spec. Didn't mean it would stand out in the code! Like many others, I have a thing about consistency <g>

Having said that: if, as you indicate, the array handling evolves to include new functionality, then "explicit" transcoding would certainly be a good candidate to consider. There are likely some serious limitations on what it could do though. There's another post in the bugs list with some ICU examples ~ hopefully that indicates some of the difficulties that might be faced.

I do wonder if cast() is the right operator though (because of the limitation of one opCast per struct/class).


November 19, 2005
Derek Parnell wrote:
> On Fri, 18 Nov 2005 12:54:38 +0100, Ivan Senji wrote:
> 
> 
>>Bruno Medeiros wrote:
>>
>>>In the following code:
>>>
>>>  auto str = cast(wchar[])("123456"c);
>>>  writefln(typeid(typeof(str)));
>>>  writefln("str: ", (cast(char[])str).length,": ", (cast(char[])str));
>>>
>>>the (original) string literal will be of type wchar[$] and not char[$], as the cast takes precedence over the string decorator.
>>>
>>>
>>
>>How is that a bug? The type of the right side is wchar[].
>>
>>auto x = cast(float)4;
>>
>>x is float, not int, same thing above.
> 
> 
> Almost the same thing but not quite.
> 
<snip>

I didn't try/want to start another discussion about string literals. All I was trying to do is to explain to Bruno why str from his post has the type it has (because of the cast).

Maybe my example should have been something like:

  static int[] numbers = [1,2];
  auto array = cast(double[])numbers;

  writefln(typeid(typeof(numbers)));
  writefln(typeid(typeof(array)));
  writefln(numbers);
  writefln(array);

prints:
int[]
double[]
[1,2]
[4.24399e-314]

My point was that the type of a cast expression is the type it is being cast to, provided that cast is legal.
November 19, 2005
On Fri, 18 Nov 2005 15:45:43 -0800, Kris <fu@bar.com> wrote:
> "Regan Heath" <regan@netwin.co.nz> wrote
>> On Fri, 18 Nov 2005 14:00:12 -0800, Kris <fu@bar.com> wrote:
>
>>> My guess is that making the cast() operator invoke utf transcoding would
>>> thus make it stand out as a rather special case?
>>
>> I agree. I think making it explicit is a good middle ground.
>
> :-)
>
> I actually meant that cast([]) for char transcoding would make it behave
> differently than all other types of array casting. That would make it a
> special case, which would be "bad". It would stand out as such in the D
> spec. Didn't mean it would stand out in the code! Like many others, I have a thing about consistency <g>

I've noticed. IMO, different things should be expected to behave differently. Of course, it all comes down to whether you think char[] is any different from int[], and if so, whether it is different enough to behave differently.

char[] appears to be somewhat different to int[] (as mentioned in my last post).

> Having said that: if, as you indicate, the array handling evolves to include new functionality, then "explicit" transcoding would certainly be a good
> candidate to consider.

I'm not sure cast-style "array operations" would actually help, since characters are represented by one or more 'char' items in the char[] array. But it seems like a nice idea for converting int[] to float[] and vice versa (not as a replacement for the current paint behaviour, but as an alternative).

> There are likely some serious limitations on what it could do though. There's another post in the bugs list with some ICU
> examples ~ hopefully that indicates some of the difficulties that might be faced.

I've seen it, still digesting it, may reply, may not :)

> I do wonder if cast() is the right operator though (because of the
> limitation of one opCast per struct/class).

That limitation is a problem all by itself, IMO.

Regan
November 19, 2005
"Regan Heath" <regan@netwin.co.nz> wrote ..

> I've seen it, still digesting it, may reply, may not :)

Humour is a great thing <g>

