Thread overview
macOS Big Sur status report
Sep 13, 2020
Guillaume Piolat
Sep 15, 2020
Jacob Carlborg
Sep 15, 2020
Guillaume Piolat
Oct 25, 2020
Guillaume Piolat
September 13, 2020
Hello,

I'd like to bring you up to date on the upcoming support for macOS Big Sur.



---------------------------

1. Context


Future Macs will come with "Apple Silicon" CPUs that are natively arm64.

On these systems (due out at the end of 2020), you can run both x86_64 and arm64 executables.

As of today, a D program built for x86_64 works unchanged on these machines under macOS Big Sur (11.0). It runs under emulation. The new processors are really fast and it's surprisingly painless.

So there is no real urgency to transition, apart from getting native performance. For maximum compatibility one would distribute "fat binaries": an arm64 executable stitched to an x86_64 executable with the "lipo" tool.


-------------------

2. druntime support

The triple is arm64-apple-macos, and it currently needs tweaks to work.

A druntime built for macOS 11.0 doesn't exist yet, but one is in the works.
druntime needs some adaptation for Big Sur, mostly for C symbols that used to have OSX-specific names; this requires extra care.

Thankfully, with LDC you can work around the current lack of a runtime by linking against the druntime built for iOS.

See progress here: https://github.com/ldc-developers/ldc/issues/3559

I'm just unaware which druntime needs to be touched: the LDC one or upstream?


--------------

3. DUB support

It appears nothing needs to be done, since DUB accepts a triple in the -a flag.

$ dub -a arm64-apple-macos


---------------------------

4. intel-intrinsics support

The idea is to keep the x86 semantics of intel-intrinsics and get it to work on arm64, at the cost of a slight performance loss. The extent of the impedance mismatch is not yet known.

This is a similar idea to the "simde" library on GitHub.

So far, the goal has been to get the same semantics through the generic intrinsic implementations, and to win back performance later.

The first step was to find a way to respect the rounding semantics.

ARM has no MXCSR; the rounding mode and flush-to-zero mode are kept in ARM's floating-point control register instead (FPCR on AArch64, FPSCR on 32-bit ARM).
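
To make the difference concrete, here is a minimal C sketch (not the intel-intrinsics implementation) that translates the rounding-control and flush-to-zero bits of an x86 MXCSR value into the corresponding AArch64 control-register bits. The names `mxcsr_to_fpcr` and `mxcsr_to_fpcr_demo` are hypothetical; the bit positions are the ones documented in the Intel and Arm architecture manuals. Exception flags and masks are deliberately ignored, and the two architectures order the directed rounding modes differently, which is why a small lookup table is needed.

```c
#include <assert.h>
#include <stdint.h>

/* x86 MXCSR fields */
#define MXCSR_RC_SHIFT   13          /* rounding control, bits 14:13 */
#define MXCSR_FTZ        (1u << 15)  /* flush-to-zero */

/* AArch64 FPCR fields */
#define FPCR_RMODE_SHIFT 22          /* rounding mode, bits 23:22 */
#define FPCR_FZ          (1u << 24)  /* flush-to-zero */

/* Hypothetical helper: map MXCSR rounding/FTZ bits to FPCR bits.
   Everything else in MXCSR (exception flags and masks) is dropped. */
uint32_t mxcsr_to_fpcr(uint32_t mxcsr) {
    /* x86 order: nearest, down, up, toward zero.
       ARM order: nearest, up, down, toward zero. */
    static const uint32_t rc_to_rmode[4] = { 0u, 2u, 1u, 3u };
    uint32_t rc = (mxcsr >> MXCSR_RC_SHIFT) & 3u;
    uint32_t fpcr = rc_to_rmode[rc] << FPCR_RMODE_SHIFT;
    if (mxcsr & MXCSR_FTZ) {
        fpcr |= FPCR_FZ;
    }
    return fpcr;
}

/* Usage example: the x86 default MXCSR (0x1F80) maps to an all-zero FPCR. */
void mxcsr_to_fpcr_demo(void) {
    assert(mxcsr_to_fpcr(0x1F80u) == 0u);           /* nearest, no FTZ */
    assert(mxcsr_to_fpcr(0x2000u) == (2u << 22));   /* round down */
    assert(mxcsr_to_fpcr(0x4000u) == (1u << 22));   /* round up */
    assert(mxcsr_to_fpcr(0x8000u) == FPCR_FZ);      /* FTZ only */
}
```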

As there is no new ARM instruction to round using the current rounding mode (which is definitely a good design), we dispatch after reading that control word.
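
The dispatch idea can be sketched in C as follows. This is an illustrative model under stated assumptions, not the code in intel-intrinsics: `read_control_word`, `convert_lane` and `convert_lane_demo` are hypothetical names, the input is assumed finite and within int32 range, and the rounding-mode field is taken from bits 23:22 of the AArch64 FPCR. On other architectures the reader function is a placeholder that returns 0 (round to nearest) so the sketch still compiles.

```c
#include <assert.h>
#include <stdint.h>

/* Read the floating-point control word. Only AArch64 has FPCR;
   elsewhere this placeholder returns 0 (round to nearest). */
uint64_t read_control_word(void) {
#if defined(__aarch64__)
    uint64_t v;
    __asm__ volatile("mrs %0, fpcr" : "=r"(v));
    return v;
#else
    return 0;
#endif
}

/* Hypothetical helper: convert one float lane to int32 using the
   rounding mode encoded in bits 23:22 of `fpcr`.
   Assumes x is finite and fits in int32. */
int32_t convert_lane(float x, uint64_t fpcr) {
    int32_t t = (int32_t)x;        /* a C cast truncates toward zero */
    float frac = x - (float)t;     /* signed fractional part */
    switch ((fpcr >> 22) & 3u) {
    case 3:  return t;                           /* toward zero */
    case 2:  return frac < 0.0f ? t - 1 : t;     /* toward minus infinity */
    case 1:  return frac > 0.0f ? t + 1 : t;     /* toward plus infinity */
    default:                                     /* nearest, ties to even */
        if (frac > 0.5f || (frac == 0.5f && (t & 1))) return t + 1;
        if (frac < -0.5f || (frac == -0.5f && (t & 1))) return t - 1;
        return t;
    }
}

/* Usage example with explicit control-word values. */
void convert_lane_demo(void) {
    assert(convert_lane(2.5f, 0) == 2);                    /* tie goes to even */
    assert(convert_lane(3.5f, 0) == 4);
    assert(convert_lane(1.2f, (uint64_t)1 << 22) == 2);    /* round up */
    assert(convert_lane(-1.2f, (uint64_t)2 << 22) == -2);  /* round down */
    assert(convert_lane(-1.7f, (uint64_t)3 << 22) == -1);  /* toward zero */
}
```

In real use one would call `convert_lane(x, read_control_word())` for each lane; passing the control word as a parameter keeps the sketch testable on any machine.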

What is unsupported or unoptimized for now:
- FP exception flags and FP exception masks
- performance: most intrinsics are probably really slow. It will take detailed work to ensure the fastest implementation of each and every x86 intrinsic.


Thanks for reading and consider donating to LDC!

September 15, 2020
On Sunday, 13 September 2020 at 19:51:14 UTC, Guillaume Piolat wrote:
> Hello,
>
> I'd like to bring you up to date on the upcoming support for macOS Big Sur.

Thanks for looking into this.

> 1. Context
>
>
> Future Macs will come with "Apple Silicon" CPUs that are natively arm64.
>
> On these systems (due out at the end of 2020), you can run both x86_64 and arm64 executables.
>
> As of today, a D program built for x86_64 works unchanged on these machines under macOS Big Sur (11.0). It runs under emulation. The new processors are really fast and it's surprisingly painless.

I would like to add, for clarity, that macOS Big Sur runs natively on existing Macs, which run x86-64. D works out of the box on these systems, without any problems. This post is specifically about the new Apple Silicon, ARM64 (AArch64).

> I'm just unaware which druntime needs to be touched: the LDC one or upstream?

As much as possible upstream. Hopefully it shouldn't be that much since it works on iOS. There might be some required changes to Phobos as well.

--
/Jacob Carlborg


September 15, 2020
On 9/15/20 6:08 AM, Jacob Carlborg wrote:
> On Sunday, 13 September 2020 at 19:51:14 UTC, Guillaume Piolat wrote:
>> I'm just unaware which druntime needs to be touched: the LDC one or upstream?
> 
> As much as possible upstream. Hopefully it shouldn't be that much since it works on iOS. There might be some required changes to Phobos as well.

I suspect DMD will never build binaries for the ARM architecture. But I second Jacob's view: we do not want diverging versions of druntime if we can help it. I know LDC already has slightly different code, but we should try to minimize that, and keep them in sync as much as possible using version statements.

-Steve
September 15, 2020
On Tuesday, 15 September 2020 at 10:08:05 UTC, Jacob Carlborg wrote:
>
> Thanks for looking into this.
>

My pleasure.


> As much as possible upstream. Hopefully it shouldn't be that much since it works on iOS. There might be some required changes to Phobos as well.

Indeed, the real work was done for iOS. I'm just piggy-backing.

Update:

`intel-intrinsics` now passes unittests on AArch64. There are about 20 intrinsics that will need further optimization, for example _mm_cvtps_epi32() or _mm_avg_epu16(). Thanks to LLVM, a lot of intrinsics written for x86 are already optimal on ARM.
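
As an illustration of the semantics such an intrinsic has to preserve on ARM, below is a scalar C model of the 16-bit unsigned averaging operation (`_mm_avg_epu16` in Intel's naming): each lane computes (a + b + 1) >> 1 in a wider type so the sum cannot overflow. The helper names are hypothetical, and this is a reference model of the behavior, not code from intel-intrinsics. A hand-optimized ARM version would presumably map this onto a single NEON rounding halving add rather than widen and narrow each lane.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane: unsigned average, rounding up on ties.
   The sum is formed in 32 bits so that 65535 + 65535 + 1 does not wrap. */
uint16_t avg_epu16_lane(uint16_t a, uint16_t b) {
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Apply the lane operation to all 8 lanes of a 128-bit vector,
   represented here as plain arrays. */
void avg_epu16(const uint16_t a[8], const uint16_t b[8], uint16_t out[8]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = avg_epu16_lane(a[i], b[i]);
    }
}

/* Usage example. */
void avg_epu16_demo(void) {
    const uint16_t a[8] = { 0, 1, 2, 65535, 10, 100, 1000, 7 };
    const uint16_t b[8] = { 0, 2, 2, 65535, 11, 101, 1001, 8 };
    uint16_t out[8];
    avg_epu16(a, b, out);
    assert(out[0] == 0);
    assert(out[1] == 2);       /* (1 + 2 + 1) >> 1 */
    assert(out[3] == 65535);   /* no overflow at the top of the range */
    assert(out[7] == 8);       /* (7 + 8 + 1) >> 1 */
}
```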
October 25, 2020
On Sunday, 13 September 2020 at 19:51:14 UTC, Guillaume Piolat wrote:
>
> I'd like to bring you up to date on the upcoming support for macOS Big Sur.

Update on Big Sur support:

- LDC 1.24.0 has been released with experimental support for targeting macOS on 64-bit ARM (https://forum.dlang.org/post/gsmunhwpulpwjphnuoed@forum.dlang.org).
  On Big Sur, a lot of things work, such as building programs with `-mtriple=arm64-apple-macos`, building Universal Binaries with lipo, Obj-C bindings...
  Future work includes identifying the remaining ABI problems.

- `intel-intrinsics`, when used with LDC, will produce optimized code for arm64 (Big Sur), and working but slow code for ARM32 (Raspberry Pi). Pi support is a by-product of supporting ARM64. Updated table in: https://github.com/AuburnSounds/intel-intrinsics
   There IS some impedance mismatch between SSE and Neon at times. ARM is really quite a different beast.