March 14, 2003
Thanks guys.. I know what you mean now..

Ben

"Jonathan Andrew" <Jonathan_member@pathlink.com> wrote in message news:b4r20u$1do2$1@digitaldaemon.com...
> The rotation operator (at least as I've used it) shifts bits around in a
> value, so if you had a value
>
> 11110000,
>
> and rotated it left two positions, you would have
>
> 11000011. The bits just get shifted over left two positions, with the
> ones on the left coming back around to the right. As far as I know, it's
> mostly used for encryption, and multiplication and division within
> processors use rotation (shifts, anyway) for faster operation.
>
> Even small microcontrollers like the HC11 have this capability built into the machine code, so I don't see why there shouldn't be a special operator for it in D.
>
> -Jon
> In article <b4qoiq$16b5$1@digitaldaemon.com>, Ben Woodhead says...
> >
> >I hate to ask, but what is a rotation operator, or what is it for..
> >
> >Thanks Ben
> >"Ziemowit Zglinski" <Ziemowit_member@pathlink.com> wrote in message
> >news:b4pt22$gg1$1@digitaldaemon.com...
> >> While programming a lot of embedded applications, I often miss a
> >> rotation operator in almost all programming languages. This operation is
> >> implemented in all processor architectures and still difficult to
> >> achieve from a high-level language.
> >>
> >> Taking the opportunity of a not yet fully defined language, I propose
> >> to add two rotation operators to the language definition:
> >> Rotate right - @>  or @>>
> >> Rotate left  - <@  or <<@
> >>
> >> The behaviour should be analogous to the logical shift operators, except
> >> the bit shifted out would be shifted in on the opposite side in place
> >> of 0. The number of bits rotated should correspond to the length of the
> >> actual data type.
> >>
> >> Thank You,
> >>
> >> --  Ziemowit
> >>
> >>     ziemek@tera.com.pl
> >
> >
>
>


April 01, 2003
"Sean L. Palmer" <seanpalmer@directvinternet.com> wrote in message news:b4qk0p$12c6$1@digitaldaemon.com...
> BSF/BSR sound mighty useful for doing bit array work or figuring out which power of two to use for textures.

bsf and bsr are implemented as compiler intrinsics, see phobos\intrinsic.d


April 01, 2003
Excellent!  Thank you, Walter.

Can I add a few intrinsics to the wish list?

rol
ror
fsincos

You obviously know about rol and ror.  C has the non-ANSI-standard __lrotl and __lrotr (Microsoft libraries), but hopefully D will have standard intrinsics for this basic op.  The only drag is that for it to be useful you need to know the size of the int type exactly.

fsincos is apparently an x87-only feature (haven't seen it on any other processor yet), but it seems to be the nature of math that if you need either sin or cos, you will likely need the other as well, for the same angle.  It makes sense for a coprocessor to compute them both at once, much as it makes sense to do integer div and mod at the same time.  There is considerable work that doesn't need to be done twice.

Speaking of which, ldiv is a C library function which would be better off being an intrinsic.  Maybe you already do this for DMC.

A throwback to the old fixed-point days: muldiv (do an integer multiply followed by an integer divide, without risk of overflow in the intermediate result, though the final result may well still overflow).

I'm presuming that min, max, and swap will be templatized library functions, not intrinsics, but I consider those to be basic requirements.  Currently swap seems tied up in the TypeInfos, and min and max (thank heavens!) are there in Phobos, but only defined for a few types.  Very nice is the inclusion of min and max for arrays!!  Extremely nice is the inclusion of sum() for arrays!  With that and the upcoming support for componentwise multiplication of arrays, I can express dot product very tidily, if not necessarily efficiently, for fixed-size vectors:  sum(a[] * b[]), which could obviously be far more efficient if the sizes of a and b are known at compile time.

I wonder if the compiler will be able to optimize away this initialization to zero in real sum(real[] n)?

    real result = 0;
    for (uint i = 0; i < n.length; i++)
        result += n[i];
    return result;

could be transformed into:

real result;
if (!n.length)
    result = 0;
else
{
    result = n[0];
    for (uint i = 1; i < n.length; ++i)
        result += n[i];
}
return result;

Doesn't look faster, but if you know at compile time that your array is not zero-length, it does become faster.

Don't ya just love the basics?

Pssst... you keep putting off array component operations.  We could provide them ourselves with templates and free operator overloading, though we'd never be able to make it SIMD'able.  If you see SIMD support in D's future, language support for array ops is the way to go.  If not, free operators would get us part way there today.

I seem to have wandered off-topic.

Sean

"Walter" <walter@digitalmars.com> wrote in message news:b6b2an$1j2a$1@digitaldaemon.com...
>
> "Sean L. Palmer" <seanpalmer@directvinternet.com> wrote in message news:b4qk0p$12c6$1@digitaldaemon.com...
> > BSF/BSR sound mighty useful for doing bit array work or figuring out
> > which power of two to use for textures.
>
> bsf and bsr are implemented as compiler intrinsics, see phobos\intrinsic.d


April 01, 2003
Sean L. Palmer wrote:
> Pssst... you keep putting off array component operations.  We could provide
> them ourselves with templates and free operator overloading, though we'd
> never be able to make it SIMD'able.  If you see SIMD support in D's future,
> language support for array ops is the way to go.  If not, free operators
> would get us part way there today.

I sent Walter code for doing array arithmetic two months ago that supported MMX and 3DNow!.  I'm using an Athlon so I don't have SSE2. MMX in 8-bit was 106% faster on average.  MMX in 16-bit was 56% faster on average.  The 3DNow! code for 32-bit float was 54% faster on average.

April 01, 2003
Send me the interface, I'll code up SSE for it pronto!  Walter, Burton, what assembler do you want to use?  Or are you emitting binary opcodes directly?

Crap, I don't have Linux installed, so it may be hard to test.  I'll port any straight asm or C you give me to SSE though.

I haven't done any SSE2 yet but it's pretty much the same thing.  There's some nicer convenience ops in SSE2, otherwise it's mostly just the same thing but with doubles.  I have a P4 so I could try that too.

MMX is kinda PITA because you have to kick the cpu into that mode, which shadows MMX registers atop the normal FPU registers;  thus you can't easily combine MMX and FPU ops.  Oh well.  Should be using MMX with SSE instead, most of the time.

I thought the Athlon had basic support for SSE emulation?

Sean

"Burton Radons" <loth@users.sourceforge.net> wrote in message news:b6c4t1$29od$1@digitaldaemon.com...
> Sean L. Palmer wrote:
> > Pssst... you keep putting off array component operations.  We could
> > provide them ourselves with templates and free operator overloading,
> > though we'd never be able to make it SIMD'able.  If you see SIMD support
> > in D's future, language support for array ops is the way to go.  If not,
> > free operators would get us part way there today.
>
> I sent Walter code for doing array arithmetic two months ago that supported MMX and 3DNow!.  I'm using an Athlon so I don't have SSE2. MMX in 8-bit was 106% faster on average.  MMX in 16-bit was 56% faster on average.  The 3DNow! code for 32-bit float was 54% faster on average.


April 02, 2003
On Tue, 1 Apr 2003 10:12:06 -0800, Sean L. Palmer <palmer.sean@verizon.net> wrote:

> I haven't done any SSE2 yet but it's pretty much the same thing.  There's
> some nicer convenience ops in SSE2, otherwise it's mostly just the same
> thing but with doubles.  I have a P4 so I could try that too.
>
just out of curiosity, what do you know about the 13 new instructions?  i read some of the documentation, but it seemed like so much useless fluff.

> MMX is kinda PITA because you have to kick the cpu into that mode, which
> shadows MMX registers atop the normal FPU registers;  thus you can't easily
> combine MMX and FPU ops.  Oh well.  Should be using MMX with SSE instead,
> most of the time.
>
it's a shame Intel didn't spend more time on it, eh?

> I thought the Athlon had basic support for SSE emulation?
>
the Athlon XP does.  i'm assuming Mr. Radons has an old Thunderbird core or similar.  anything before the Athlon XP doesn't have SSE.  the most recent XP and MP cores, OTOH, have full SSE2 support (not including the new 13 instructions, of course).

> Sean
>

-- 
Charles "grey wolf" Banas
April 03, 2003
Sean L. Palmer wrote:
> Send me the interface, I'll code up SSE for it pronto!  Walter, Burton, what
> assembler do you want to use?  Or are you emitting binary opcodes directly?

Err, I used templating and DMD's inline assembler.  I'll send it to you in a minute.

April 03, 2003
Sounds good.

DMD inline asm wouldn't by any chance support the P4 opcodes would it?  I can do it the hard way if I have to.

Hopefully I won't have to do this sort of coding much in the future;  this is definitely stuff that belongs inside the compiler.

Sean

"Burton Radons" <loth@users.sourceforge.net> wrote in message news:b6g2en$28jo$1@digitaldaemon.com...
> Sean L. Palmer wrote:
> > Send me the interface, I'll code up SSE for it pronto!  Walter, Burton,
> > what assembler do you want to use?  Or are you emitting binary opcodes
> > directly?
>
> Err, I used templating and DMD's inline assembler.  I'll send it to you in a minute.


April 25, 2003
"Sean L. Palmer" <palmer.sean@verizon.net> wrote in message news:b6bhbj$1tf2$1@digitaldaemon.com...
> Excellent!  Thank you, Walter.
>
> Can I add a few intrinsics to the wish list?
>
> rol
> ror
> fsincos
>
> You obviously know about rol and ror.  C has the non-ANSI-standard __lrotl
> and __lrotr (Microsoft libraries), but hopefully D will have standard
> intrinsics for this basic op.  The only drag is that for it to be useful
> you need to know the size of the int type exactly.

I haven't because I never know what to do with rcl and rcr <g>.


> fsincos is apparently an x87-only feature (haven't seen it on any other
> processor yet), but it seems to be the nature of math that if you need
> either sin or cos, you will likely need the other as well, for the same
> angle.  It makes sense for a coprocessor to compute them both at once,
> much as it makes sense to do integer div and mod at the same time.  There
> is considerable work that doesn't need to be done twice.

I think a better approach is for the compiler to recognize that both sin()
and cos() are being done on the same argument, and internally convert it to
fsincos.


> Speaking of which, ldiv is a C library function which would be better off being an intrinsic.  Maybe you already do this for DMC.

No, I haven't.

> Don't ya just love the basics?

Doing the basics well pays endless dividends.

> Pssst... you keep putting off array component operations.  We could
> provide them ourselves with templates and free operator overloading,
> though we'd never be able to make it SIMD'able.  If you see SIMD support
> in D's future, language support for array ops is the way to go.  If not,
> free operators would get us part way there today.

Right now, getting the basic system working on linux is my priority. I'm currently studying the pthreads manual.


April 25, 2003
"Sean L. Palmer" <palmer.sean@verizon.net> wrote in message news:b6gfj1$2h9u$1@digitaldaemon.com...
> DMD inline asm wouldn't by any chance support the P4 opcodes would it?  I can do it the hard way if I have to.

The DMD inline assembler supports *all* the x86 opcodes, including the AMD 3DNow! ones.