Thread overview
byte swapping ...
Sep 14, 2003
Matthew Wilson
Sep 14, 2003
Walter
Sep 14, 2003
Matthew Wilson
Sep 14, 2003
Walter
Sep 14, 2003
Matthew Wilson
Sep 14, 2003
Sean L. Palmer
Sep 15, 2003
Matthew Wilson
Sep 15, 2003
Philippe Mori
Sep 15, 2003
Walter
Sep 16, 2003
Serge K
Sep 16, 2003
John Boucher
Sep 16, 2003
Serge K
September 14, 2003
Are there any built-in facilities for swapping the order of 16/32/64-bit quantities?


September 14, 2003
"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com...
> Are there any built-in facilities for swapping the order of 16/32/64-bit quantities?

No. But you could write a function that uses inline asm and the BSWAP instruction.


September 14, 2003
Well I just cobbled together

  private uint swap(in uint i)
  {
     uint v_swap = (i & 0xff) << 24
                        | (i & 0xff00) << 8
                        | (i & 0xff0000) >> 8
                        | (i & 0xff000000) << 24;

     return v_swap;
  }

Will the inline asm version be substantially faster than this?


"Walter" <walter@digitalmars.com> wrote in message news:bk10dd$dj3$4@digitaldaemon.com...
>
> "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com...
> > Are there any built-in facilities for swapping the order of 16/32/64-bit quantities?
>
> No. But you could write a function that uses inline asm and the BSWAP instruction.
>
>


September 14, 2003
"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com...
> Well I just cobbled together
>
>   private uint swap(in uint i)
>   {
>      uint v_swap = (i & 0xff) << 24
>                         | (i & 0xff00) << 8
>                         | (i & 0xff0000) >> 8
>                         | (i & 0xff000000) << 24;
>
>      return v_swap;
>   }
>
> Will the inline asm version be substantially faster than this?

Yes. It does it in one instruction!

    asm
    {    naked;
         bswap EAX ;
         ret ;
    }
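
For anyone who wants a complete function rather than a naked fragment, here is a minimal sketch. It assumes a compiler that accepts DMD-style inline asm and defines D_InlineAsm_X86 or D_InlineAsm_X86_64; the name bswap32 and the shift-and-mask fallback are illustrative, not something taken from the posts above.

```d
// Hypothetical helper: byte-swaps a 32-bit value.
// Uses BSWAP where DMD-style inline asm is available, otherwise shifts and masks.
uint bswap32(uint v)
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, v;   // load the argument
            bswap EAX;    // reverse the four bytes
            mov v, EAX;   // store the result back
        }
        return v;
    }
    else version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov EAX, v;
            bswap EAX;
            mov v, EAX;
        }
        return v;
    }
    else
    {
        // Portable fallback for targets without x86 inline asm.
        return (v << 24) | ((v & 0xff00) << 8) | ((v >> 8) & 0xff00) | (v >> 24);
    }
}
```

Unlike the naked version, this does not depend on which register the argument arrives in, at the cost of a load and a store around the instruction.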


September 14, 2003
Beautiful!

"Walter" <walter@digitalmars.com> wrote in message news:bk16f2$mpo$2@digitaldaemon.com...
>
> "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com...
> > Well I just cobbled together
> >
> >   private uint swap(in uint i)
> >   {
> >      uint v_swap = (i & 0xff) << 24
> >                         | (i & 0xff00) << 8
> >                         | (i & 0xff0000) >> 8
> >                         | (i & 0xff000000) << 24;
> >
> >      return v_swap;
> >   }
> >
> > Will the inline asm version be substantially faster than this?
>
> Yes. It does it in one instruction!
>
>     asm
>     {    naked;
>          bswap EAX ;
>          ret ;
>     }
>
>


September 14, 2003
You have a bug there anyway.  The last bit should be " | (i & 0xff000000) >> 24;"

This is the kind of thing you can easily make a template for... this should go into the standard library.

template (class T)
{
    void bswap(inout T i)
    {
        byte* p = cast(byte*)&i;
        for (int b = 0; b < T.size/2; ++b)
            instance swap(byte).swap(p[b], p[T.size-1-b]);
    }
}

You can always specialize it for common types to use the bswap instruction, and on platforms where byte access is slow you can probably find a way to make it work only with words and shifts and masks.
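
In current D template syntax the same idea might be sketched as follows. The name reverseBytes is hypothetical, and ubyte is used rather than byte so the bytes are treated as unsigned; this is a sketch of the generic loop only, with no specializations.

```d
// Hypothetical generic helper: reverses the bytes of any fixed-size value in place.
void reverseBytes(T)(ref T value)
{
    ubyte* p = cast(ubyte*) &value;
    // Swap the outermost pair of bytes, then work inwards.
    foreach (b; 0 .. T.sizeof / 2)
    {
        immutable ubyte tmp = p[b];
        p[b] = p[T.sizeof - 1 - b];
        p[T.sizeof - 1 - b] = tmp;
    }
}
```

Because it reverses the bytes as they sit in memory, the result is the byte-swapped value on both big- and little-endian hosts, and the same template covers 16-, 32- and 64-bit quantities.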

Speaking of shifts and masks, I find that on many platforms it's better (and it's never worse) to write the below like this instead, because it has to load fewer mask constants:

uint v_swap = (i & 0xff) << 24
                    | (i & 0xff00) << 8
                    | (i >> 8) & 0xff00
                    | (i >> 24) & 0xff;

If your platform has ROL and ROR rotate instructions, you can use those instead. For ABCD, you want DCBA:

ABCD ror 8 = DABC,  has D and B in the right place
ABCD rol 8 = BCDA,  has A and C in the right place

So

uint v_swap = rol(i, 8) & 0x00ff00ff | ror(i, 8) & 0xff00ff00;

On any platform with a word size of x bits, rol(uintx a, int bits) is equivalent to (a << bits | a >> (x-bits)),
and ror(uintx a, int bits) is equivalent to (a >> bits | a << (x-bits)).
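
A minimal sketch of the rotate approach for 32-bit values, assuming hand-written rol and ror helpers (hypothetical, defined here from the shift identities above). The masks are placed so that the rol result keeps C and A (0x00ff00ff) and the ror result keeps D and B (0xff00ff00). swapShiftMask is the shift-then-mask form, included for comparison.

```d
// Rotate helpers built from shifts; valid for 0 < bits < 32.
uint rol(uint a, uint bits) { return (a << bits) | (a >> (32 - bits)); }
uint ror(uint a, uint bits) { return (a >> bits) | (a << (32 - bits)); }

// Shift first, then mask, so only 0xff and 0xff00 are needed as constants.
uint swapShiftMask(uint i)
{
    return ((i & 0xff) << 24)
         | ((i & 0xff00) << 8)
         | ((i >> 8) & 0xff00)
         | ((i >> 24) & 0xff);
}

// ABCD rol 8 = BCDA: keep C and A. ABCD ror 8 = DABC: keep D and B.
uint swapRotate(uint i)
{
    return (rol(i, 8) & 0x00ff00ff) | (ror(i, 8) & 0xff00ff00);
}
```

For 0x11223344 both functions give 0x44332211. Whether the rotate form is actually faster depends on the compiler recognising the shift pairs as rotates.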

Many platforms have an instruction for this. I am not sure why low-level
languages like C do not have a rotate operator or at least a standard
intrinsic.  __lrotl and __lrotr are not standard.
Sean

"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk1214$fkj$1@digitaldaemon.com...
> Well I just cobbled together
>
>   private uint swap(in uint i)
>   {
>      uint v_swap = (i & 0xff) << 24
>                         | (i & 0xff00) << 8
>                         | (i & 0xff0000) >> 8
>                         | (i & 0xff000000) << 24;
>
>      return v_swap;
>   }
>
> Will the inline asm version be substantially faster than this?


September 15, 2003
> You have a bug there anyway.  The last bit should be " | (i & 0xff000000) >> 24;"

Well spotted. I shall call you Eagle-eye Cherry. :)

> This is the kind of thing you can easily make a template for... this should
> go into the standard library.
>
> template (class T)
> {
>     void bswap(inout T i)
>     {
>         byte* p = cast(byte*)&i;
>         for (int b = 0; b < T.size/2; ++b)
>             instance swap(byte).swap(p[b], p[T.size-1-b]);
>     }
> }
>
> You can always specialize it for common types to use the bswap instruction,
> and on platforms where byte access is slow you can probably find a way to make it work only with words and shifts and masks.

Would love to, but I can't bring myself to get D-templating until we've persuaded Walter of the merits of implicit instantiation. :(

> Speaking of shifts and masks, I find that on many platforms it's better (and
> it's never worse) to write the below like this instead, because it has to load fewer mask constants:
>
> uint v_swap = (i & 0xff) << 24
>                     | (i & 0xff00) << 8
>                     | (i >> 8) & 0xff00
>                     | (i >> 24) & 0xff;

Makes sense. I've used this form.

> if your platform has ROL and ROR rotate shift instructions, since for ABCD,
> you want DCBA:
>
> ABCD ror 8 = DABC,  has D and B in the right place
> ABCD rol 8 = BCDA,  has A and C in the right place
>
> So
>
> uint v_swap = rol(i, 8) & 0x00ff00ff | ror(i, 8) & 0xff00ff00;
>
> Since on any platform with word size of x bits, rol(uintx a, int bits) can
> be done equivalently by (a << bits | a >> (x-bits))
> and ror(uintx a, int bits) can be done equivalently by (a >> bits | a <<
> (x-bits))
>
> And many platforms have an instruction for this, I am not sure why low-level
> languages like C do not have a rotate operator or at least a standard intrinsic.  __lrotl and __lrotr are not standard.

Agreed. Let's see all these nitty-bitty things built into D.



September 15, 2003
"Walter" <walter@digitalmars.com> wrote in message news:bk10dd$dj3$4@digitaldaemon.com...
>
> "Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bk0vkl$cgh$1@digitaldaemon.com...
> > Are there any built-in facilities for swapping the order of 16/32/64-bit quantities?
>
> No. But you could write a function that uses inline asm and the BSWAP instruction.
>
>

But we would like to have an intrinsic function for that, since it would be
portable and more accessible to ordinary programmers
who may not know how to write an optimized version...

And it would also be nice to be able to discover if big or little endian is used on a given platform...

Also, conversions between the different double formats that exist on different platforms might be interesting, since this would allow us to read binary files that were written on such platforms.


September 15, 2003
"Philippe Mori" <philippe_mori@hotmail.com> wrote in message news:bk4eq6$218m$1@digitaldaemon.com...
> But we would like to have an intrinsic function for that, since it would be
> portable and more accessible to ordinary programmers
> who may not know how to write an optimized version...

This might be a good idea.

> And it would also be nice to be able to discover if big or little endian is used on a given platform...

Already there as a predefined version:

    version (BigEndian)
    { ... }
    version (LittleEndian)
    { ... }
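
As a rough sketch of how those version identifiers might be combined with a byte swap, the following converts a host-order value to big-endian order. The names hostIsBigEndian and hostToBigEndian are hypothetical, and the swap is written with plain shifts and masks so the example stands alone.

```d
// Record the host byte order as a compile-time constant.
version (BigEndian)
    enum bool hostIsBigEndian = true;
else
    enum bool hostIsBigEndian = false;

// Hypothetical helper: returns v laid out in big-endian byte order.
uint hostToBigEndian(uint v)
{
    static if (hostIsBigEndian)
        return v;   // already in the right order
    else
        return (v << 24) | ((v & 0xff00) << 8) | ((v >> 8) & 0xff00) | (v >> 24);
}
```

On either kind of host, the bytes of hostToBigEndian(0x11223344) read from memory in address order are 0x11, 0x22, 0x33, 0x44.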

> Also, conversions between the different double formats that exist on different platforms might be interesting, since this would allow us to read binary files that were written on such platforms.

I'll worry about that when we've got a port to a cpu with different double formats!


September 16, 2003
> I'll worry about that when we've got a port to a cpu with different double formats!

Then you can relax - VAX is dead, and AlphaAXP has support for VAX double format only for backward compatibility. Or should it be "had"? ..since Alpha is almost dead.

