char[] -> wchar[] cast error
July 12, 2004
If char[].length is not even, casting to a wchar[] gives an "Error: array cast misalignment" in some cases.  Seems to happen only if the char[] is a variable, since 'cast(wchar[])"hello"' works.  Interesting bug...

I'm running DMD 0.95 on Linux.

John

Example:
--------- dtest.d ----------

void main(char[][] args)
{
	wchar[] wstring = cast(wchar[])args[1];
}
----------------------------
$ dmd dtest.d
gcc dtest.o -o dtest -lphobos -lpthread -lm
$ ./dtest hello
Error: array cast misalignment
$ ./dtest hell
$ ./dtest fivec
Error: array cast misalignment
$ ./dtest five
$
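
The pattern above (5-character arguments fail, 4-character arguments pass) fits a cast that reinterprets the bytes rather than converting them: a wchar is 2 bytes, so an odd byte count cannot be split into whole wchar units. A minimal sketch of that check, where `canPaintAsWchar` is a hypothetical helper and not a Phobos function:

```d
// Hypothetical helper (not part of Phobos): reports whether a buffer of
// `byteLength` bytes divides evenly into 2-byte wchar units.
bool canPaintAsWchar(size_t byteLength)
{
    return byteLength % wchar.sizeof == 0;
}

void main()
{
    assert(!canPaintAsWchar("hello".length)); // 5 bytes: misaligned
    assert( canPaintAsWchar("hell".length));  // 4 bytes: 2 wchar units
    assert(!canPaintAsWchar("fivec".length)); // 5 bytes: misaligned
    assert( canPaintAsWchar("five".length));  // 4 bytes: 2 wchar units
}
```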
July 12, 2004
In article <pan.2004.07.12.05.44.19.995572@teqdruid.com>, teqDruid says...
>
>If char[].length is not even, casting to a wchar[] gives an "Error: array cast misalignment" in some cases.

Forgive me, but what possible meaning can there be to *CAST* a char[] array to a wchar[] array? The only thing I can imagine CASTING a char[] array to is a ubyte[] array or a void[] array. Nothing else makes even the remotest conceptual sense (to me). I'm not surprised that this gives an error.

(In fact, as an advocate of type safety, I would even argue that such a cast ought to be a compile-time error, regardless of the length, but D lets you do type-unsafe things).

If you want to CONVERT from UTF-8 to UTF-16, what you need is not a cast, but a function call: std.utf.toUTF16().
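
A minimal sketch of the conversion route, written against a current Phobos rather than the DMD 0.95 library (an assumption; the 2004 signatures may differ in detail). `toWide` is a hypothetical wrapper name used only for illustration:

```d
import std.utf : toUTF16;

// Hypothetical wrapper: transcode UTF-8 to UTF-16 instead of casting.
wstring toWide(const(char)[] utf8)
{
    return toUTF16(utf8);
}

void main(string[] args)
{
    // Works for odd and even lengths alike, because the data is
    // re-encoded rather than reinterpreted.
    wstring w = toWide(args.length > 1 ? args[1] : "hello");
    assert(toWide("hello").length == 5);
}
```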

What are you trying to achieve, exactly?

Arcane Jill


July 12, 2004
way to be kind and understanding jill :|

i would imagine he's trying to convert to utf-16, in which case you're right, he should be using that function.

but i won't lie to you - the manual is never really clear as to what _exactly_ casting a char[] to a wchar[]/dchar[] does!  i would've thought it would convert as well, but oh well..


July 12, 2004
On Mon, 12 Jul 2004 11:05:28 -0400, Jarrett Billingsley <kb3ctd2@yahoo.com> wrote:
> way to be kind and understanding jill :|
>
> i would imagine he's trying to convert to utf-16, in which case you're
> right, he should be using that function.
>
> but i won't lie to you - the manual is never really clear as to what
> _exactly_ casting a char[] to a wchar[]/dchar[] does!  i would've thought it
> would convert as well, but oh well..

The documentation on arrays:
http://www.digitalmars.com/d/arrays.html

says "String literals are implicitly converted between chars, wchars, and dchars as necessary."

So the compiler must know how to convert them, and it should probably convert them on a cast. On the other hand, if these conversions add bloat to the compiler or to the executable produced, it might be best to leave the conversion to library functions. The former has the 'it's so easy' factor, however. ;)

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 12, 2004
On Tue, 13 Jul 2004 09:24:19 +1200, Regan Heath <regan@netwin.co.nz> wrote:

> On Mon, 12 Jul 2004 11:05:28 -0400, Jarrett Billingsley <kb3ctd2@yahoo.com> wrote:
>> way to be kind and understanding jill :|
>>
>> i would imagine he's trying to convert to utf-16, in which case you're
>> right, he should be using that function.
>>
>> but i won't lie to you - the manual is never really clear as to what
>> _exactly_ casting a char[] to a wchar[]/dchar[] does!  i would've thought it
>> would convert as well, but oh well..
>
> Given that the documentation on arrays:
> http://www.digitalmars.com/d/arrays.html
>
> "String literals are implicitly converted between chars, wchars, and dchars as necessary."
>
> Then the compiler must know how to convert them, so it should probably convert them on a cast. Unless these conversions add bloat to the compiler, or executable produced, then it might be best to leave the conversion to library functions. The former has the 'it's so easy' factor however. ;)

I have jumped the fence at least 3 times thinking about this.

My current thought is that it makes sense that casting from char[] to dchar[] should convert the data but only because char and dchar have a specified encoding type.

It does not make sense to convert the data if going to/from a type with no specified encoding i.e. ubyte[] to dchar[] should not attempt any conversion.

If you do not convert the data, as is currently the case, then the cast could cause illegal values in the resulting array.. couldn't it?
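
A small sketch of how that can happen, assuming a current Phobos (`std.utf.isValidDchar`); `paintIsValidUtf32` is a hypothetical helper, and the input length must be a multiple of 4 for the cast to succeed:

```d
import std.utf : isValidDchar;

// Hypothetical helper: repaint UTF-8 bytes as dchar units (no conversion)
// and report whether every resulting unit is a valid code point.
bool paintIsValidUtf32(char[] utf8)
{
    dchar[] painted = cast(dchar[]) utf8; // same bytes, new element type
    foreach (dchar c; painted)
    {
        if (!isValidDchar(c))
            return false;
    }
    return true;
}

void main()
{
    // .dup yields a run-time array, so the cast repaints the bytes
    // instead of converting a literal at compile time.
    // The four ASCII bytes of "hell" form a value above 0x10FFFF in
    // either byte order, so the painted dchar is not valid.
    assert(!paintIsValidUtf32("hell".dup));
}
```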

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 13, 2004
When you use cast(wchar[])"something here", it does a conversion, so I
thought that it also did a conversion in a case like char[] a =
"something here"; wchar[] wa = cast(wchar[])a;

I deleted the post when I realized my stupidity, and posted another one saying that this behavior should be noted in the specs.

On Mon, 12 Jul 2004 08:17:34 +0000, Arcane Jill wrote:

> In article <pan.2004.07.12.05.44.19.995572@teqdruid.com>, teqDruid says...
>>
>>If char[].length is not even, casting to a wchar[] gives an "Error: array cast misalignment" in some cases.
> 
> Forgive me, but what possible meaning can there be to *CAST* a char[] array to a wchar[] array? The only thing I can imagine CASTING a char[] array to is a ubyte[] array or a void[] array. Nothing else makes even the remotest conceptual sense (to me). I'm not surprised that this gives an error.
> 
> (In fact, as an advocate of type safety, I would even argue that such a cast ought to be a compile-time error, regardless of the length, but D lets you do type-unsafe things).
> 
> If you want to CONVERT from UTF-8 to UTF-16, what you need is not a cast, but a function call: std.utf.toUTF16().
> 
> What are you trying to achieve, exactly?
> 
> Arcane Jill

July 13, 2004
In article <opsa1q9btr5a2sq9@digitalmars.com>, Regan Heath says...

>If you do not convert the data, as is currently the case, then the cast could cause illegal values in the resulting array.. couldn't it?

Yes.

Which is why I think it should be either outlawed or made to work.


It would appear that cast(dchar[])"string" converts fine - but that of course is done at compile-time, so there's no run-time overhead. It's just another way of writing a dchar[] literal.

I would be well in favor of extending this behavior to run-time. If
cast(dchar[]) could be made to call std.utf.toUTF32(), things would be a lot
more consistent.
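
For comparison, a sketch of the two routes as they look with a current compiler and Phobos (an assumption; this is not DMD 0.95 code). The literal is given its encoding at compile time, while the run-time value goes through the library call:

```d
import std.utf : toUTF32;

void main()
{
    // Literal: the compiler emits it directly as UTF-32, no run-time work.
    dstring fromLiteral = "string";

    // Run-time value: convert explicitly with the library function.
    string utf8 = "string";
    dstring converted = toUTF32(utf8);

    assert(fromLiteral == converted);
    assert(converted.length == 6);
}
```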

Arcane Jill


July 14, 2004
"Arcane Jill" <Arcane_member@pathlink.com> wrote in message news:cd05jl$25af$1@digitaldaemon.com...
> I would be well in favor of extending this behavior to run-time. If
> cast(dchar[]) could be made to call std.utf.toUTF32(), things would be a lot
> more consistent.

Casting on arrays is done as a type 'paint', because there are many programming tasks where an array of data is built up as one type, then interpreted as another. For example reading things off of disk.
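
A minimal sketch of such a paint, with a hard-coded buffer standing in for data read off disk; `paintAsWords` is a hypothetical helper name. The element values depend on the machine's byte order, so only the length and the shared storage are checked:

```d
// Hypothetical helper: view a byte buffer as 32-bit words without copying.
// The byte length must be a multiple of 4 or the cast fails at run time.
uint[] paintAsWords(ubyte[] raw)
{
    return cast(uint[]) raw;
}

void main()
{
    // Pretend these 8 bytes were just read from a file.
    ubyte[] raw = [1, 0, 0, 0, 2, 0, 0, 0];

    uint[] words = paintAsWords(raw);

    assert(words.length == 2);                // 8 bytes / 4 bytes per uint
    assert(cast(void*) words.ptr == raw.ptr); // same storage, no copy
}
```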


July 14, 2004
On Wed, 14 Jul 2004 02:44:42 -0700, Walter <newshound@digitalmars.com> wrote:

>
> "Arcane Jill" <Arcane_member@pathlink.com> wrote in message
> news:cd05jl$25af$1@digitaldaemon.com...
>> I would be well in favor of extending this behavior to run-time. If
>> cast(dchar[]) could be made to call std.utf.toUTF32(), things would be a lot
>> more consistent.
>
> Casting on arrays is done as a type 'paint', because there are many
> programming tasks where an array of data is built up as one type, then
> interpreted as another. For example reading things off of disk.

I agree that this is the way it should work for types like ubyte, ushort, etc., which have no specified encoding. BUT types with an encoding should be treated differently, and only when casting from one type with an encoding to another type with an encoding.

Using your reading from disk example:

You read the data into a ubyte[] (has NO encoding), then you cast to char[] (has encoding). This does NOT perform any conversion; it 'paints' the ubytes as chars. Alternatively, if you know the encoding of the file, you could read straight into char, wchar, or dchar.

If you next cast from that char[] to dchar[], it SHOULD convert the data, and it can convert the data: it knows the first encoding is UTF-8 and the second encoding is UTF-32. It makes NO sense whatsoever to 'paint' UTF-8 as UTF-32; all you get is an illegal UTF-32 array.

If you need to 'paint' the char[], wchar[], or dchar[] to int[], this would NOT convert the data either, as int has no encoding.

So the rule is, if both have encoding then convert, otherwise paint.
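
As a rough library-level sketch of that rule (this is NOT how D's cast behaves; `encodingAwareCast` is a hypothetical helper, written against a current Phobos):

```d
import std.conv : to;
import std.traits : isSomeChar;

// Hypothetical helper sketching the proposed rule.
auto encodingAwareCast(To, From)(From[] src)
{
    static if (isSomeChar!To && isSomeChar!From)
        return to!(immutable(To)[])(src); // both encoded: transcode
    else
        return cast(const(To)[]) src;     // otherwise: paint the bytes
}

void main()
{
    // "h\u00E9llo" is 6 bytes of UTF-8 but 5 code points.
    assert(encodingAwareCast!dchar("h\u00E9llo").length == 5);

    // ubyte has no encoding, so the 4 bytes are painted as 2 ushorts.
    ubyte[] raw = [1, 2, 3, 4];
    assert(encodingAwareCast!ushort(raw).length == 2);
}
```

The paint branch still requires the byte length to be a multiple of the target element size, just like the built-in cast.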

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
July 15, 2004
It sure would be handy if the compiler could implicitly convert from char[] to wchar[] to dchar[].
