byte swapping ...

Started by Matthew Wilson, Sep 14, 2003. Posts dated Sep 14 to Sep 16, 2003.

"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com... > Are there any built-in facilities for swapping the order of 16/32/64-bit quantities? No. But you could write a function that uses inline asm and the BSWAP instruction.

Well I just cobbled together

  private uint swap(in uint i)
  {
     uint v_swap = (i & 0xff) << 24
                        | (i & 0xff00) << 8
                        | (i & 0xff0000) >> 8
                        | (i & 0xff000000) << 24;

     return v_swap;
  }

Will the inline asm version be substantially faster than this?

"Walter" <walter@digitalmars.com> wrote in message news:bk10dd$dj3$4@digitaldaemon.com...
>
> "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com...
> > Are there any built-in facilities for swapping the order of 16/32/64-bit quantities?
>
> No. But you could write a function that uses inline asm and the BSWAP instruction.

"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com... > Well I just cobbled together > > private uint swap(in uint i) > { > uint v_swap = (i & 0xff) << 24 > | (i & 0xff00) << 8 > | (i & 0xff0000) >> 8 > | (i & 0xff000000) << 24; > > return v_swap; > } > > Will the inline asm version be substantially faster than this? Yes. It does it in one instruction! asm { naked; bswap EAX ; ret ; }

Beautiful!

"Walter" <walter@digitalmars.com> wrote in message news:bk16f2$mpo$2@digitaldaemon.com...
>
> "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com...
> > Well I just cobbled together
> >
> >   private uint swap(in uint i)
> >   {
> >      uint v_swap = (i & 0xff) << 24
> >                         | (i & 0xff00) << 8
> >                         | (i & 0xff0000) >> 8
> >                         | (i & 0xff000000) << 24;
> >
> >      return v_swap;
> >   }
> >
> > Will the inline asm version be substantially faster than this?
>
> Yes. It does it in one instruction!
>
>     asm
>     {   naked;
>         bswap EAX ;
>         ret ;
>     }

September 14, 2003

Re: byte swapping ...

Posted by Sean L. Palmer
in reply to Matthew Wilson


You have a bug there anyway.  The last bit should be " | (i & 0xff000000) >> 24;"
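
With that correction applied, the function might look like this in present-day D. This is only a sketch: swapBytes is a made-up name (the original post called it swap), and the cross-check against core.bitop.bswap assumes a current D runtime, which postdates this thread.

```d
import core.bitop : bswap;

// Hypothetical name; checks run with: dmd -unittest -main -run <file>.d
uint swapBytes(uint i)
{
    return ((i & 0x0000_00ff) << 24)
         | ((i & 0x0000_ff00) << 8)
         | ((i & 0x00ff_0000) >> 8)
         | ((i & 0xff00_0000) >> 24);   // the original post had << 24 here
}

unittest
{
    assert(swapBytes(0x1122_3344) == 0x4433_2211);
    // Same result as the runtime's byte-swap routine.
    assert(swapBytes(0x1122_3344) == bswap(0x1122_3344u));
}
```

With the original << 24 on the last line, the top byte is shifted out entirely and the low byte of the result is always zero.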

This is the kind of thing you can easily make a template for... this should go into the standard library.

template (class T)
{
    void bswap(inout T i)
    {
        byte* p = cast(byte*)&i;
        for (int b = 0; b < T.size/2; ++b)
            instance swap(byte).swap(p[b], p[T.size-1-b]);
    }
}

You can always specialize it for common types to use the bswap instruction, and on platforms where byte access is slow you can probably find a way to make it work only with words and shifts and masks.
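
For comparison, here is a minimal sketch of the same in-place byte reversal in present-day D syntax. It is not part of the original post: bswapInPlace is a made-up name, the call sites rely on implicit template instantiation (which D gained after this thread), and std.bitmanip.swapEndian is assumed to be available from the current standard library.

```d
import std.bitmanip : swapEndian;

// Hypothetical helper: reverse the bytes of any fixed-size value in place.
// Checks run with: dmd -unittest -main -run <file>.d
void bswapInPlace(T)(ref T value)
{
    ubyte* p = cast(ubyte*) &value;
    foreach (b; 0 .. T.sizeof / 2)
    {
        // Exchange byte b with its mirror image from the other end.
        const tmp = p[b];
        p[b] = p[T.sizeof - 1 - b];
        p[T.sizeof - 1 - b] = tmp;
    }
}

unittest
{
    ushort s = 0x1122;
    bswapInPlace(s);
    assert(s == 0x2211);

    uint u = 0x1122_3344;
    bswapInPlace(u);
    assert(u == 0x4433_2211);
    assert(u == swapEndian(0x1122_3344u));   // library routine agrees

    ulong l = 0x1122_3344_5566_7788UL;
    bswapInPlace(l);
    assert(l == 0x8877_6655_4433_2211UL);
}
```

The byte loop works for any size; specializing the 32-bit case to a single bswap, as suggested above, would be an optimization on top of it.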

Speaking of shifts and masks, I find that on many platforms it's better (and it's never worse) to write the below like this instead, because it has to load fewer mask constants:

uint v_swap = (i & 0xff) << 24
                    | (i & 0xff00) << 8
                    | (i >> 8) & 0xff00
                    | (i >> 24) & 0xff;

If your platform has ROL and ROR rotate instructions, you can go further, since for ABCD you want DCBA:

ABCD ror 8 = DABC,  has D and B in the right place
ABCD rol 8 = BCDA,  has A and C in the right place

So

uint v_swap = ror(i & 0x00ff00ff, 8) | rol(i, 8) & 0x00ff00ff;

On any platform with a word size of x bits, rol(uintx a, int bits) can be done equivalently by (a << bits | a >> (x-bits)), and ror(uintx a, int bits) by (a >> bits | a << (x-bits)).
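
Tracing the bytes through the rotate form: rol(i, 8) & 0x00ff00ff keeps C and A in their final positions, and rotating the masked value i & 0x00ff00ff right by 8 puts D and B in theirs. The sketch below checks this for 32-bit values. The names rotl, rotr and swapViaRotates are made up for illustration; current D runtimes also provide rol and ror in core.bitop, which are not used here so that the shift identities stay visible.

```d
// Portable rotates built from shifts; valid for 0 < bits < 32.
// Checks run with: dmd -unittest -main -run <file>.d
uint rotl(uint a, uint bits) { return (a << bits) | (a >> (32 - bits)); }
uint rotr(uint a, uint bits) { return (a >> bits) | (a << (32 - bits)); }

// Byte swap using one mask constant and two rotates.
// With the two rotations exchanged the result would be BADC, not DCBA.
uint swapViaRotates(uint i)
{
    return rotr(i & 0x00ff_00ff, 8) | (rotl(i, 8) & 0x00ff_00ff);
}

unittest
{
    assert(rotl(0x1122_3344, 8) == 0x2233_4411);   // ABCD rol 8 = BCDA
    assert(rotr(0x1122_3344, 8) == 0x4411_2233);   // ABCD ror 8 = DABC
    assert(swapViaRotates(0x1122_3344) == 0x4433_2211);
}
```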

And many platforms have an instruction for this, I am not sure why low-level languages like C do not have a rotate operator or at least a standard intrinsic.  __lrotl and __lrotr are not standard.

Sean

"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com...
> Well I just cobbled together
>
>   private uint swap(in uint i)
>   {
>      uint v_swap = (i & 0xff) << 24
>                         | (i & 0xff00) << 8
>                         | (i & 0xff0000) >> 8
>                         | (i & 0xff000000) << 24;
>
>      return v_swap;
>   }
>
> Will the inline asm version be substantially faster than this?

> You have a bug there anyway.  The last bit should be " | (i & 0xff000000) >> 24;"

Well spotted. I shall call you Eagle-eye Cherry. :)

> This is the kind of thing you can easily make a template for... this should go into the standard library.
>
> template (class T)
> {
>     void bswap(inout T i)
>     {
>         byte* p = cast(byte*)&i;
>         for (int b = 0; b < T.size/2; ++b)
>             instance swap(byte).swap(p[b], p[T.size-1-b]);
>     }
> }
>
> You can always specialize it for common types to use the bswap instruction, and on platforms where byte access is slow you can probably find a way to make it work only with words and shifts and masks.

Would love to, but I can't bring myself to get D-templating until we've persuaded Walter of the merits of implicit instantiation. :(

> Speaking of shifts and masks, I find that on many platforms it's better (and it's never worse) to write the below like this instead, because it has to load fewer mask constants:
>
> uint v_swap = (i & 0xff) << 24
>                     | (i & 0xff00) << 8
>                     | (i >> 8) & 0xff00
>                     | (i >> 24) & 0xff;

Makes sense. I've used this form.

> if your platform has ROL and ROR rotate shift instructions, since for ABCD, you want DCBA:
>
> ABCD ror 8 = DABC,  has D and B in the right place
> ABCD rol 8 = BCDA,  has A and C in the right place
>
> So
>
> uint v_swap = ror(i & 0x00ff00ff, 8) | rol(i, 8) & 0x00ff00ff;
>
> Since on any platform with word size of x bits, rol(uintx a, int bits) can be done equivalently by (a << bits | a >> (x-bits))
> and ror(uintx a, int bits) can be done equivalently by (a >> bits | a << (x-bits))
>
> And many platforms have an instruction for this, I am not sure why low-level languages like C do not have a rotate operator or at least a standard intrinsic.  __lrotl and __lrotr are not standard.

Agreed. Let's see all these nitty-bitty things built into D.

"Walter" <walter@digitalmars.com> a écrit dans le message de news:bk10dd$dj3$4@digitaldaemon.com... > > "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com... > > Are there any built-in facilities for swapping the order of 16/32/64-bit quantities? > > No. But you could write a function that uses inline asm and the BSWAP instruction. > > But we would like to have intrinsic function for that since it would be portable and it would be more accessible to common programmers that may not have the knowledge on how to do it optimized... And it would also be nice to be able to discover if big or little endian is used on a given platform... Also conversions between different double format that exist on different platform might be interesting... since this would allows us to read binary files that where written on such platform.

"Philippe Mori" <philippe_mori@hotmail.com> wrote in message news:bk4eq6$218m$1@digitaldaemon.com... > But we would like to have intrinsic function for that since it would be > portable and it would be more accessible to common programmers > that may not have the knowledge on how to do it optimized... This might be a good idea. > And it would also be nice to be able to discover if big or little endian is used on a given platform... Already there as a predefined version: version (BigEndian) { ... } version (LittleEndian) { ... } > Also conversions between different double format that exist on different platform might be interesting... since this would allows us to read binary files that where written on such platform. I'll worry about that when we've got a port to a cpu with different double formats!

> I'll worry about that when we've got a port to a cpu with different double formats!

Then you can relax - VAX is dead, and AlphaAXP has support for VAX double format only for backward compatibility. Or should it be "had"? ...since Alpha is almost dead.
