December 26, 2015 CT Information about target CPU and Related cross-compile

Hi all,

I will write std.blas and it will be heavily optimised for LDC. Can these features be added to LDC?

1. Basic compile time information about target CPU such as L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, AVX, AVX2, AVX512.

2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.

Links:
std.blas announcement: http://forum.dlang.org/thread/nilhvnqbsgqhxdshpqfl@forum.dlang.org
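The run-time selection described in item 2 can be sketched with core.cpuid alone. This is only an illustration under stated assumptions: `dotImpl`, `DotFn` and `selectDot` are hypothetical names, and every instantiation below compiles to the same scalar loop. Building each one for a different instruction set is exactly the compiler support the post asks for. AVX512 is left out because the sketch relies only on the `sse2`, `avx` and `avx2` properties of core.cpuid.

```d
import core.cpuid : avx, avx2, sse2;

// Hypothetical kernel, parameterised by the instruction set it is meant for.
// In this sketch every instantiation is the same scalar loop.
double dotImpl(string isa)(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length);
    double sum = 0;
    foreach (i; 0 .. a.length)
        sum += a[i] * b[i];
    return sum;
}

alias DotFn = double function(const(double)[], const(double)[]);

// Hypothetical helper: query the CPU once and return the kernel
// for the widest instruction set it reports.
DotFn selectDot()
{
    if (avx2) return &dotImpl!"avx2";
    if (avx)  return &dotImpl!"avx";
    if (sse2) return &dotImpl!"sse2";
    return &dotImpl!"generic";
}

unittest
{
    auto dot = selectDot();
    assert(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]) == 32.0);
}
```

Calling `selectDot` once at start-up and storing the returned pointer keeps the CPU check out of the hot path.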
December 27, 2015 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Ilya Yaroshenko
On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko wrote:
> Hi all,
>
> I will write std.blas and it will be heavily optimised for LDC.

jay! :-)

> Can these features be added to LDC?
>
> 1. Basic compile time information about target CPU such as L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, AVX, AVX2, AVX512.

Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add.
I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?

> 2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.

Something like this?
https://gcc.gnu.org/wiki/FunctionMultiVersioning
December 27, 2015 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Johan Engelen
> On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko wrote:
>> Hi all,
>>
>> 2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.

An LLVM presentation I found on the topic:
http://llvm.org/devmtg/2014-10/Slides/Christopher-Function%20Multiversioning%20Talk.pdf
(perhaps mostly a reminder to self ;)
December 27, 2015 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Johan Engelen
On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen wrote:
> On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko wrote:
>> Hi all,
>>
>> I will write std.blas and it will be heavily optimised for LDC.
>
> jay! :-)
>
>> Can these features be added to LDC?
>>
>> 1. Basic compile time information about target CPU such as L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, AVX, AVX2, AVX512.
>
> Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add.
> I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?

I have found that core.cpuid can provide runtime information about cache sizes, it is enough. However amount of SIMD registers and their sizes should be known at compile time.

What do you mean with "set of function names / version IDs"?

>> 2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.
>
> Something like this?
> https://gcc.gnu.org/wiki/FunctionMultiVersioning

Yes! Or runtime check at least.

Ilya
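For the cache sizes that core.cpuid reports at run time, a minimal sketch might look as follows. `blockEdge` and `l1BlockEdge` are hypothetical helpers, and the "three square tiles must fit in L1" rule is only an illustrative assumption, not something proposed in this thread. The sketch uses `core.cpuid.dataCaches`, whose entries give the size in kilobytes and use `size_t.max` for cache levels that do not exist.

```d
import core.cpuid : dataCaches;

// Hypothetical heuristic: the largest n such that three n-by-n tiles of
// elemSize-byte elements fit into a cache of cacheKiB kilobytes.
size_t blockEdge(size_t cacheKiB, size_t elemSize = double.sizeof)
{
    immutable budget = cacheKiB * 1024 / (3 * elemSize); // elements per tile
    size_t n = 0;
    while ((n + 1) * (n + 1) <= budget)
        ++n;
    return n;
}

// Reads the L1 data cache size at run time. The 32 KiB fallback for a
// missing or unreported level is an assumption of this sketch.
size_t l1BlockEdge()
{
    immutable kib = dataCaches()[0].size;
    return blockEdge(kib == 0 || kib == size_t.max ? 32 : kib);
}

unittest
{
    assert(blockEdge(32) == 36); // 3 * 36 * 36 * 8 bytes fit in 32 KiB, 37 does not
    assert(l1BlockEdge() > 0);
}
```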
December 30, 2015 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Ilya Yaroshenko
On Sunday, 27 December 2015 at 23:47:41 UTC, Ilya Yaroshenko wrote:
> On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen wrote:
>> On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko wrote:
>>
>>> Can these features be added to LDC?
>>>
>>> 1. Basic compile time information about target CPU such as L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, AVX, AVX2, AVX512.
>>
>> Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add.
>> I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
>
> I have found that core.cpuid can provide runtime information about cache sizes, it is enough. However amount of SIMD registers and their sizes should be known at compile time.
>
> What do you mean with "set of function names / version IDs"?

(I am pretty new to D, etc.)
Can you give me a sample of code showing what "API" you expect for this stuff?

>>> 2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.
>>
>> Something like this?
>> https://gcc.gnu.org/wiki/FunctionMultiVersioning
>
> Yes! Or runtime check at least.

I had been thinking about implementing function multiversioning before. It's great that someone wants it :-)
December 30, 2015 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Johan Engelen
On Wednesday, 30 December 2015 at 15:20:35 UTC, Johan Engelen wrote:
> On Sunday, 27 December 2015 at 23:47:41 UTC, Ilya Yaroshenko wrote:
>> On Sunday, 27 December 2015 at 17:34:26 UTC, Johan Engelen wrote:
>>> On Saturday, 26 December 2015 at 20:47:39 UTC, Ilya Yaroshenko wrote:
>>>
>>>> Can these features be added to LDC?
>>>>
>>>> 1. Basic compile time information about target CPU such as L1/L2/L3 cache sizes and available instructions set, e.g. SSE2, AVX, AVX2, AVX512.
>>>
>>> Do you have a proposal for a set of function names / version IDs / ...? This sounds like a simple thing to add.
>>> I'm not sure about cache sizes: is it currently possible to specify the target microarchitecture on the cmdline?
>>
>> I have found that core.cpuid can provide runtime information about cache sizes, it is enough. However amount of SIMD registers and their sizes should be known at compile time.
>>
>> What do you mean with "set of function names / version IDs"?
>
> (I am pretty new to D, etc.)
> Can you give me a sample of code showing what "API" you expect for this stuff?

Dispatching example:

@target("default") // used for CTFE code
int foo() {
    // The default version of foo.
    return 0;
}

@target("sse4.2")
int foo() {
    // foo version for SSE4.2 if compiler is LDC
    return 1;
}

@target("arch=atom,+sse2")
int foo() {
    // foo version for the Intel ATOM processor with SSE2 support
    return 2;
}

Compile time features example:

version(LDC) {
    enum bool a = __target(has, "avx2");
    enum bool b = __target(compatible, "core-avx2");
    enum bool c = __target("broadwell");
}
else version(GNU) {
    ...
}

>>>> 2. Related cross-compile. For example: target is x86_64; AVX support can be checked at runtime using core.cpuid; so I want to force LDC to compile three versions of BLAS for SSE, AVX and AVX512, and choose better in runtime.
>>>
>>> Something like this?
>>> https://gcc.gnu.org/wiki/FunctionMultiVersioning
>>
>> Yes! Or runtime check at least.
>
> I had been thinking about implementing function multiversioning before. It's great that someone wants it :-)
January 02, 2016 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Ilya
On Wednesday, 30 December 2015 at 20:07:02 UTC, Ilya wrote:
>
> @target("sse4.2")
> int foo() {
> // foo version for SSE4.2 if compiler is LDC
> return 1;
> }
I'm working on (a rudimentary version of) @target at the moment.
I assume you build LDC yourself and you are happy to help with some testing and give feedback? :)
cheers,
Johan
January 03, 2016 Re: CT Information about target CPU and Related cross-compile
Posted in reply to JohanEngelen
On Saturday, 2 January 2016 at 23:27:16 UTC, JohanEngelen wrote:
> On Wednesday, 30 December 2015 at 20:07:02 UTC, Ilya wrote:
>>
>> @target("sse4.2")
>> int foo() {
>> // foo version for SSE4.2 if compiler is LDC
>> return 1;
>> }
>
> I'm working on (a rudimentary version of) @target at the moment.
> I assume you build LDC yourself and you are happy to help with some testing and give feedback? :)
>
> cheers,
> Johan
Yes! You can count on me ;) --Ilya
January 03, 2016 Re: CT Information about target CPU and Related cross-compile
Posted in reply to Ilya Yaroshenko
On Sunday, 3 January 2016 at 05:16:36 UTC, Ilya Yaroshenko wrote:
> On Saturday, 2 January 2016 at 23:27:16 UTC, JohanEngelen wrote:
>>
>> I'm working on (a rudimentary version of) @target at the moment.
>> I assume you build LDC yourself and you are happy to help with some testing and give feedback? :)
>>
>> cheers,
>> Johan
>
> Yes! You can count on me ;) --Ilya

Great, thanks :)

The branch is ready:
https://github.com/JohanEngelen/ldc/tree/attr_target
(make sure git correctly fetches the druntime branch with @ldc.attributes.target in it)

Usage examples can be found in the test file: tests/ir/attr_target_x86.d

It'd be great if you can run the IR tests (and can help improve the tests):
cd tests/ir
python runlit.py -v .

I myself often modify a test file locally and rerun the test to quickly see if things are working or not (inspect output .ll and .s).

cheers,
Johan
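A hedged sketch of how `@ldc.attributes.target` might be combined with a core.cpuid check. It assumes an LDC build for x86 in which `ldc.attributes` exports `target`; the function names are hypothetical. Unlike the overload-style proposal earlier in the thread, each variant here has its own name and the dispatch is written by hand.

```d
import core.cpuid : sse42;

version (LDC)
{
    import ldc.attributes : target;

    // Asks LDC to enable SSE4.2 code generation for this one function only.
    @target("sse4.2")
    int fooSse42() { return 1; }
}
else
{
    // Other compilers: a plain function, so the sketch still builds.
    int fooSse42() { return 1; }
}

// Baseline version, safe on any CPU the binary targets.
int fooDefault() { return 0; }

// Hand-written dispatcher: the SSE4.2 version is called only if the
// CPU reports support for it at run time.
int foo()
{
    return sse42 ? fooSse42() : fooDefault();
}

unittest
{
    assert(foo() == (sse42 ? 1 : 0));
}
```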
January 03, 2016 Re: CT Information about target CPU and Related cross-compile
Posted in reply to JohanEngelen
On Sunday, 3 January 2016 at 13:11:55 UTC, JohanEngelen wrote:
>
> The branch is ready:

See: https://github.com/ldc-developers/ldc/pull/1244