May 31, 2023 x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
I have been working on simple wrappers around new(ish) x86 instructions that are not otherwise accessible, together with replacement functions in straight D for machines where the instruction is not available. They are currently for GDC only, as LDC doesn’t support some of the features of GCC inline asm that I rely on, namely named parameters in the asm with the %[name] syntax. Hopefully the LDC maintainers will fix that, so that I can work with either compiler. My routines need more testing and a vast amount of cleanup, so it’s early days.
Is that something that you would be interested in for the D runtime library (for GDC / LDC)? I unfortunately haven’t attacked DMD yet, because it uses a different inline asm syntax and would mean a rewrite. That isn’t a problem, though, because the DMD user gets the pure D replacement anyway, thanks to conditional compilation.
If you are interested, then let me know. I do need help with testing, though, and some advice about unit tests.
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 15:56:45 UTC, Cecil Ward wrote:
> [...]
The instructions are those that were introduced with the Haswell microarchitecture, roughly ten years ago now, so they are becoming usable even for programmers worried about older machines. There are also the fallbacks, as far as I have got with those.
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 15:58:53 UTC, Cecil Ward wrote:
> [...]
It’s been a project to help me learn D and explore the code quality of these compilers. I wrote various assembler languages for a living when I was working some years back, although when C compilers rose to sufficient quality of code generation then we switched to C for x86 at work and asm was much less of a thing, as for everyone.
I have also written a module that allows cached querying of the results of calls to cpuid, so that users can test for availability once only, getting all the checks done before main, so that there’s minimal overhead inside the real code, in loops or wherever. The module calls cpuid many times in a loop, with all the leaf subfunction queries that you might be interested in. That needs more work to be selective, maybe, and I haven’t yet enumerated all of the possibilities, because there are potentially a lot of them, and possibly many that users are not interested in for their use case. So I could perhaps do with a bit of advice there. Again, if this is something that might be of interest, then let me know. It needs a lot of cleanup, once again, to make the code look pretty.
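The caching idea described above could be sketched roughly as follows. This is not the module from the post: `rawLeaf7Ebx` is a hypothetical placeholder that returns a fixed value instead of executing cpuid, and the function names are made up for illustration. The bit positions are those documented for CPUID leaf 7, subleaf 0, register EBX.

```d
// Sketch only: cache feature bits once, before main(), then answer queries
// from the cached word. All names here are illustrative assumptions.

private __gshared uint leaf7Ebx; // filled once by the module constructor

// Hypothetical stand-in for a real query (cpuid with EAX=7, ECX=0, result EBX).
// It returns a fixed value so that the sketch builds and runs anywhere.
private uint rawLeaf7Ebx() nothrow @nogc
{
    return (1u << 3) | (1u << 8); // pretend BMI1 and BMI2 are present
}

// Module constructors run before main(), so the query cost is paid once.
shared static this()
{
    leaf7Ebx = rawLeaf7Ebx();
}

bool hasBmi1() nothrow @nogc { return (leaf7Ebx & (1u << 3)) != 0; }
bool hasAvx2() nothrow @nogc { return (leaf7Ebx & (1u << 5)) != 0; }
bool hasBmi2() nothrow @nogc { return (leaf7Ebx & (1u << 8)) != 0; }
```

Code in a hot loop would then test `hasBmi2()` (or a local copy of its result) rather than calling cpuid again.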
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 15:58:53 UTC, Cecil Ward wrote:
> [...]
Are you aware of intel-intrinsics? https://code.dlang.org/packages/intel-intrinsics
It sounds like you are duplicating the effort; better to team up with that project.
-Johan
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan | On Wednesday, 31 May 2023 at 16:07:55 UTC, Johan wrote:
> [...]
Yes, I am very aware, and was even thinking of using the same names. My goals are rather different, though, and I don’t use the same non-standard __xmm256 type names (or whatever). Those Intel routines don’t have a fallback equivalent for machines where the instruction isn’t available, so there’s some Intel sales promotion in there, since you need a sufficiently new CPU or nothing.
And I’m concentrating solely on D, not trying to write things in C, put another wrapper round that for D, and then hope it all still inlines with zero-overhead parameter passing.
Lastly, those Intel intrinsics are, I assume, unless I’m wrong, restricted to the Intel C/C++ compiler. And I’m GDC/LDC only.
So quite a gulf there and I’m not solely trying to do the same thing. And it’s D first, and with zero overhead being a requirement.
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 16:33:47 UTC, Cecil Ward wrote:
>
> So quite a gulf there and I’m not solely trying to do the same thing. And it’s D first, and with zero overhead being a requirement.
On a different topic: I’d like to develop similar things for AArch64, but that’s an instruction set that’s new to me, so a new learning curve. Do any of our members have ARM64 asm experience, and if so, would they recommend tutorials for experienced asm programmers, beyond what I can google for myself and the official ARM docs, of course? Any tips on starting out would also be welcome, as some stuff looks weird, such as the usage of the carry flag and the bizarre x / w register width conventions. (Aren’t we all ex-DEC on this? :-) with b w l q (o? dq?))
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 16:33:47 UTC, Cecil Ward wrote:
> [...]
You and Johan might be talking past each other here: "intel-intrinsics" in this case refers to p0nce's implementation of Intel's intrinsics (names and semantics) in D. There is no dependency on any Intel software. There are some traps that he has worked around that you will bump into at some point, so I recommend looking closely at what he has done. A subset also works on Arm.
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to max haughton | On Wednesday, 31 May 2023 at 16:45:35 UTC, max haughton wrote:
> On Wednesday, 31 May 2023 at 16:33:47 UTC, Cecil Ward wrote:
>> [...]
Ah, I was indeed misunderstanding. And no harm done as this was a D learning project until I started to think that I might be of some use to someone. Thanks for giving me that link !
|
May 31, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward | On Wednesday, 31 May 2023 at 16:51:42 UTC, Cecil Ward wrote:
> [...]
Ah, just followed that link. No that’s (solely?) SIMD, something I was aware of and so I’m not duplicating that as I haven’t gone near SIMD. The pext instruction would be one instruction that I attacked some time ago, and that would already be fine with ARM as there’s a pure D fallback, but maybe I can find some native ARM equivalent if I study AArch64.
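For readers unfamiliar with pext (parallel bit extract, part of BMI2), a pure D fallback of the kind mentioned above might look like the following. This is a minimal sketch under that assumption, not the code from the post, and `pextFallback` is an illustrative name.

```d
/// Software equivalent of `pext`: gathers the bits of `src` selected by
/// `mask` and packs them into the low-order bits of the result.
ulong pextFallback(ulong src, ulong mask) pure nothrow @nogc @safe
{
    ulong result = 0;
    ulong outBit = 1; // next result bit to fill
    while (mask != 0)
    {
        const ulong lowest = mask & (~mask + 1); // isolate lowest set mask bit
        if (src & lowest)
            result |= outBit;
        outBit <<= 1;
        mask &= mask - 1; // clear that mask bit
    }
    return result;
}

// Plain D, so it can also be checked at compile time.
static assert(pextFallback(0b1011_0110, 0b1111_0000) == 0b1011);
static assert(pextFallback(0xFFFF_FFFF_FFFF_FFFF, 0) == 0);
```

The loop runs once per set bit in the mask, so it is much slower than the single instruction, but it gives the same result on any target.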
So no, this would be something new. Non-SIMD insns for general use. The smallest instructions might be something like andn if I can keep to zero-overhead obviously, seeing as the benefit in the instruction is so tiny anyway. But mind you I could have done with it for graphics bit twiddling manipulation code.
Because I have zero familiarity with the tools, and am very unwell, I would just give the .d files with their inline asm and pure D code to someone experienced who is sufficiently motivated to help out. I wouldn’t be able to do anything on my own.
I would also like some help with some problems with unittest. To test that a native insn conforms to the spec, in respect of its mating up of register passing and the like, I would ideally want to use static asserts. Since I’m testing with x86 boxes on godbolt.org, if the compiler doesn’t mind doing CTFE with asm then all will be well. If I avoid a problem by using static if ( __ctfe ) (or whatever), then I would not be doing a test against the native instruction but against the pure-D replacement, thus defeating the whole point. That is a separate test, albeit one that very much needs doing anyway, but there I would compare the native instruction with the D replacement rather than comparing both against hand-calculated values. The problem with hand-calculated values is that you are just testing against your own understanding of the algorithm, testing your own self against your own ideas, although that has some value in anti-regression testing later on, but that’s a different thing.
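The testing split described above can be sketched as follows. Two points are assumed here. First, `__ctfe` is an ordinary run-time-style variable, so it is tested with a plain `if` rather than `static if`. Second, compile-time evaluation does not execute asm statements, so a static assert can only ever reach the pure D branch. The name `andnWrapped` is hypothetical, and plain D stands in for the inline asm so that the sketch builds with any compiler.

```d
/// Hypothetical wrapper with a compile-time fallback path.
uint andnWrapped(uint a, uint b) pure nothrow @nogc @safe
{
    if (__ctfe)
    {
        // Reached only during compile-time evaluation: pure D replacement.
        return ~a & b;
    }
    else
    {
        // A real wrapper would place its inline asm here. Plain D is used
        // as a stand-in so the sketch stays portable.
        return ~a & b;
    }
}

// Evaluated by CTFE: exercises only the `__ctfe` branch.
static assert(andnWrapped(0b1100, 0b1010) == 0b0010);

// Evaluated at run time (build with -unittest): exercises the other branch
// and compares it with the reference expression.
unittest
{
    foreach (uint a; [0u, 1u, 0xFFFF_FFFFu, 0x1234_5678u])
        foreach (uint b; [0u, 1u, 0xFFFF_FFFFu, 0x8765_4321u])
            assert(andnWrapped(a, b) == (~a & b));
}
```

Under these assumptions the native instruction can only be checked by run-time unittests, while static asserts remain useful for the fallback.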
|
June 01, 2023 Re: x86 intrinsics for sale cheap | ||||
---|---|---|---|---|
| ||||
Posted in reply to Cecil Ward |
A concern here is that inline assembly is unlikely to inline, if it does at all. So you're going to have to be pretty careful that what you do is actually worth the function call, because if it isn't SIMD, it just might not be doing enough work to justify using inline assembly.
If you are able to get a backend to generate the instruction you want using regular D code, then you're good to go, as that'll inline.
My general recommendation here is to not worry about specific instructions unless you really _really_ need to (a very tiny percentage of code fits this, almost to the point of not being worth considering). Instead, focus on making your D code communicate to the backend what you intend. Even if it doesn't do the job today, in 2 years' time it could generate significantly better assembly.
|
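As an illustration of writing regular D that states the intent and leaving instruction selection to the backend, the following sketch shows three idioms whose semantics match the BMI1 instructions andn, blsr and blsi. Whether a given compiler actually emits those instructions depends on the backend, the optimisation level and the enabled target features, so the generated code is worth checking (for example on godbolt.org). The function names are illustrative only.

```d
/// (NOT a) AND b: the operation performed by `andn`.
uint andNot(uint a, uint b) pure nothrow @nogc @safe
{
    return ~a & b;
}

/// Clears the lowest set bit: the operation performed by `blsr`.
uint clearLowestSetBit(uint x) pure nothrow @nogc @safe
{
    return x & (x - 1);
}

/// Isolates the lowest set bit: the operation performed by `blsi`.
uint isolateLowestSetBit(uint x) pure nothrow @nogc @safe
{
    return x & (0 - x);
}

// Plain D, so the semantics can be checked at compile time on any target.
static assert(andNot(0b1100, 0b1010) == 0b0010);
static assert(clearLowestSetBit(0b1011_0000) == 0b1010_0000);
static assert(isolateLowestSetBit(0b1011_0000) == 0b0001_0000);
static assert(clearLowestSetBit(0) == 0);
```

Because these are ordinary functions, they inline like any other D code and still work on targets that lack the instructions.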
Copyright © 1999-2021 by the D Language Foundation