[Issue 7328] New: Allow casting between ubyte[4] and int
January 20, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7328

           Summary: Allow casting between ubyte[4] and int
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody@puremagic.com
        ReportedBy: jmdavisProg@gmx.com


--- Comment #0 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-01-20 11:10:23 PST ---
It would be very nice to be able to cast between arrays of ubyte and integral types as long as they're the same size. So, ubyte[4] -> int, ubyte[2] -> short, etc. Maybe even ubyteArr[0 .. 4] -> int as long as the indices are known at compile time.

As it stands, the only two ways that I can think of doing this are to use a union, e.g.

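import std.traits : isIntegral, Unqual;

union Integer(T)
    if(isIntegral!T)
{
    Unqual!T value;         // the integral value
    ubyte[T.sizeof] array;  // the same bytes, viewed as a static array
}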

or to do some nasty casting, e.g.

ubyte[4] a = (cast(ubyte*)[0x28A].ptr)[0 .. 4];
int b = (cast(int*)a.ptr)[0];

It would be much easier to manipulate buffers (which are generally arrays of ubytes) if casting between static arrays (and preferably even dynamic arrays if the indices are known at compile time) and integral values were allowed, as long as the lengths match of course.

Worst case, something can be added to std.bitmanip to do this, but I'm a bit surprised that casts such as cast(ubyte[4])7 aren't allowed by the compiler.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
January 20, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7328


Alex Rønne Petersen <xtzgzorex@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xtzgzorex@gmail.com


--- Comment #1 from Alex Rønne Petersen <xtzgzorex@gmail.com> 2012-01-20 11:32:33 PST ---
I like the idea, but I'm a bit worried about endianness pitfalls with such a feature...




--- Comment #2 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-01-20 11:38:53 PST ---
A good point. I don't know whether that's enough to make it a bad idea or not. If you're worried about endianness though, the functions in std.bitmanip (e.g. bigEndianToNative and nativeToBigEndian) already take care of it for you, since they put non-native values in static ubyte arrays of the appropriate size (the conversion is dealt with internally in a union). So, maybe that in and of itself effectively solves the problem.
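As a rough illustration of the std.bitmanip route mentioned above, here is a minimal sketch. The wrapper names encodeBE and decodeBE are hypothetical and exist only for this example; nativeToBigEndian and bigEndianToNative are the Phobos functions named in the comment.

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

// Hypothetical helper: store an int in a 4-byte buffer, most significant byte first.
ubyte[4] encodeBE(int value)
{
    return nativeToBigEndian(value);
}

// Hypothetical helper: read the int back from a big-endian buffer.
int decodeBE(ubyte[4] wire)
{
    return bigEndianToNative!int(wire);
}
```

Because the byte order is fixed by the function name rather than by the host, encodeBE(0x28A) should give the bytes 00 00 02 8A on both little-endian and big-endian machines, which is the property a plain reinterpreting cast would not give.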



Peter Alexander <peter.alexander.au@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au@gmail.com


--- Comment #3 from Peter Alexander <peter.alexander.au@gmail.com> 2012-01-20 13:10:29 PST ---
Why does this need to be part of the language? It is trivially implemented as a function.
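For reference, one way such a function could look, as a sketch only. It assumes a union along the lines of the one in comment #0; the names IntBytes, toBytes and fromBytes are hypothetical, and the bytes come out in the host's native order, so the endianness concern from comment #1 still applies.

```d
import std.traits : isIntegral, Unqual;

// Hypothetical union: an integral value overlaid with its bytes.
union IntBytes(T)
    if (isIntegral!T)
{
    Unqual!T value;
    ubyte[T.sizeof] array;
}

// Reinterpret an integral value as its native-endian bytes.
ubyte[T.sizeof] toBytes(T)(T value)
    if (isIntegral!T)
{
    IntBytes!T u;
    u.value = value;
    return u.array;
}

// Reinterpret native-endian bytes as an integral value.
// The parameter type ties the buffer length to T.sizeof,
// so a size mismatch is rejected at compile time.
T fromBytes(T)(ubyte[T.sizeof] bytes)
    if (isIntegral!T)
{
    IntBytes!T u;
    u.array = bytes;
    return u.value;
}
```

With these definitions, fromBytes!int(toBytes(0x28A)) round-trips back to 0x28A, and toBytes applied to a short yields a ubyte[2].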



timon.gehr@gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr@gmx.ch


--- Comment #4 from timon.gehr@gmx.ch 2012-01-20 14:02:56 PST ---
So is most of the language.
It needs to be in the language because it is already there, sort of:

import std.stdio;
struct S{int x;}
void main(){
    writeln(cast(ubyte[4])S(28298298)); // ok
    // writeln(cast(ubyte[4])28298298); // ng
}

I have always considered this an inconsistency. The implementation is a trivial rewrite.




--- Comment #5 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-01-20 14:15:49 PST ---
It doesn't _have_ to be, but as Timon says, it's odd that it isn't, and his example shows that the current situation is inconsistent. I was surprised when the cast didn't work. It seems obvious to me that it would. Maybe the endianness issue is why it doesn't.




--- Comment #6 from timon.gehr@gmx.ch 2012-01-20 14:19:01 PST ---
The issue is the same for structs and any programmer who performs the cast is
aware of it. (otherwise they wouldn't use a cast ;))




--- Comment #7 from Peter Alexander <peter.alexander.au@gmail.com> 2012-01-20 15:16:21 PST ---
(In reply to comment #6)
> The issue is the same for structs and any programmer who performs the cast is
> aware of it. (otherwise they wouldn't use a cast ;))

I am sure there are many programmers who are *not* aware of endianness, even if they know that everything is made up of bytes and may use that cast.

I was unaware of the inconsistency though. Personally I consider the ability to cast a struct to a ubyte[n] an error in the language design also. Consider:

ubyte[8] a = cast(ubyte[8]) iota(0, 8);
writeln(a);

You get [0, 0, 0, 0, 8, 0, 0, 0]

I think this is something that an inexperienced D programmer could write expecting to get [0, 1, 2, 3, 4, 5, 6, 7] back.

Furthermore, you cannot rely on cast(ubyte[N]) to return a reinterpreted struct
because the struct may define opCast for ubyte[N] (imagine a container struct
Array(T, size_t N) that has opCast for T[N] -- casting to ubyte[Array(T,
N).sizeof] will reinterpret in most cases, except when T=ubyte and N=the
sizeof, good luck finding that bug in your generic serialisation code).

Reinterpreting memory should require nasty pointer casts. It's not common (or safe) enough to have convenient syntax, in my opinion.




--- Comment #8 from timon.gehr@gmx.ch 2012-01-20 17:45:30 PST ---
(In reply to comment #7)
> (In reply to comment #6)
> > The issue is the same for structs and any programmer who performs the cast is
> > aware of it. (otherwise they wouldn't use a cast ;))
> 
> I am sure there are many programmers who are *not* aware of endianness, even if they know that everything is made up of bytes and may use that cast.
> 

If they are not aware of all relevant issues, they may not use a type cast. Knowing what one does is a precondition for using a type cast.

> I was unaware of the inconsistency though. Personally I consider the ability to cast a struct to a ubyte[n] an error in the language design also.

If you didn't notice it existed, is it important enough to be called an 'error'?

> Consider:
> 
> ubyte[8] a = cast(ubyte[8]) iota(0, 8);
> writeln(a);
> 
> You get [0, 0, 0, 0, 8, 0, 0, 0]
> 
> I think this is something that an inexperienced D programmer could write expecting to get [0, 1, 2, 3, 4, 5, 6, 7] back.
> 

This is a constructed example. If at all, they'll cast to ubyte[] (and that fails). But inexperienced D programmers don't use type casts. It is the first thing they learn about type casts. If they do, it is their own fault.

> Furthermore, you cannot rely on cast(ubyte[N]) to return a reinterpreted struct
> because the struct may define opCast for ubyte[N] (imagine a container struct
> Array(T, size_t N) that has opCast for T[N] -- casting to ubyte[Array(T,
> N).sizeof] will reinterpret in most cases, except when T=ubyte and N=the
> sizeof, good luck finding that bug in your generic serialisation code).
> 

Wrong. In most cases it will be a compiler error, because the compiler does not fall back to reinterpreting if the struct defines an opCast. opCast is an all-or-nothing thing.

> Reinterpreting memory should require nasty pointer casts. It's not common (or safe) enough to have convenient syntax, in my opinion.

It's only the pointer casts that are unsafe. cast(ubyte[4])1234 is perfectly
@safe. It will even catch size mismatches! (<insert 'good luck finding that
bug' comment here>)

By the way, it is possible to cast between two arbitrary structs of identical size ;).

I would not mind if the feature was removed for structs. I'd just like to restore consistency.
