January 11, 2019
On Friday, 11 January 2019 at 09:36:09 UTC, Ethan wrote:
> On Thursday, 10 January 2019 at 21:01:09 UTC, luckoverthere wrote:
>> That's disappointing to learn. Ryzen has four 128-bit AVX units: 2 of them can only do addition and the other 2 can only do multiplication. Not sure how the memory is shared between units, but if it isn't, then it'd need to copy to be able to do an addition then a multiplication.
>
> The good news though is that Ryzen's 128-bit pipeline outperforms my Skylake i7 with this code. So you could say they've optimised for the majority use case.
>
> It's reaaaaaally beneficial to do 256-bit logic for my particular use case here since I'm sampling and operating on 8 32-bit values at a time to produce a 32-bit output. But eh, I've gotta write for the build farm hardware.

Hi Ethan, could you share a piece of code that does that?

Thank you
January 11, 2019
On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics wrote:
> Hi Ethan, could you share a piece of code that does that?
>
> Thank you

Not really.

1) It's very context specific
2) It's for my current employer and is subject to the usual code disclosure NDAs
January 11, 2019
On Friday, 11 January 2019 at 11:47:20 UTC, Ethan wrote:
> On Friday, 11 January 2019 at 11:10:10 UTC, bioinfornatics wrote:
>> Hi Ethan, could you share a piece of code that does that?
>>
>> Thank you
>
> Not really.
>
> 1) It's very context specific
> 2) It's for my current employer and is subject to the usual code disclosure NDAs

OK I understand, no problem 😉
So I could try this idea as a training exercise: for example, take eight 32-bit values and return their sum, or some other reduction.

But I thought AMD had 2 units for addition and 2 units for multiplication. I need to get a better understanding of this topic 🤔
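
A minimal sketch of that training exercise (eight 32-bit values in, one 32-bit sum out), written in plain C as an assumption since the thread names no language. It is not Ethan's code, which is under NDA, and the name `sum8_u32` is hypothetical. The input is treated as two 4-lane halves, which mirrors how a 256-bit operation could be split across 128-bit units; whether a given compiler turns this into SIMD instructions is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum eight 32-bit values the way a SIMD
   horizontal add would. The input is viewed as two 4-lane (128-bit)
   halves that are added lane by lane, then the 4 partial sums are
   folded down to one. This is scalar C, so it builds anywhere.
   Unsigned arithmetic wraps modulo 2^32 on overflow. */
uint32_t sum8_u32(const uint32_t values[8])
{
    uint32_t lanes[4];

    assert(values != NULL);

    /* Step 1: low half + high half (a single 4-lane add in SIMD terms). */
    for (size_t i = 0; i < 4; ++i)
        lanes[i] = values[i] + values[i + 4];

    /* Step 2: fold 4 lanes to 2, then 2 lanes to 1. */
    lanes[0] += lanes[2];
    lanes[1] += lanes[3];
    return lanes[0] + lanes[1];
}
```

Calling `sum8_u32` on an array holding 1 through 8 should give 36. Swapping the additions for another operation (min, max, multiply) would give the "or some other reduction" variants with the same structure.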