ARM first & default LDC

On Wednesday, 30 December 2020 at 13:30:58 UTC, Iain Buclaw wrote:
> On Tuesday, 29 December 2020 at 21:25:37 UTC, Laeeth Isharc wrote:
>>
>> I wonder what amount of funding would be needed for a first version of an ARM back end for DMD.
>>
>
> I guess you'd first need to define what is ARM for DMD? Is it ARM or Apple? Treating both as a distinction would be really important, as the former has more multilib and ABI combinations than there are moons in the solar system.

The thread is focusing on Apple ARM though.

December 30, 2020

Re: ARM first & default LDC

Posted by Max Haughton
in reply to Laeeth Isharc

On Tuesday, 29 December 2020 at 21:25:37 UTC, Laeeth Isharc wrote:
> On Saturday, 19 December 2020 at 19:51:37 UTC, IGotD- wrote:
>> On Saturday, 19 December 2020 at 19:38:10 UTC, Ola Fosheim Grøstad wrote:
>>>
>>> You realise that creating a high quality backend that is 100% bug free takes a lot of effort and that you have to maintain it in perpetuity? It isn't sufficient that someone "just does it", it also has to be correct, efficient and updated.
>>>
>>
>> This post made me think.
>>
>> https://forum.dlang.org/post/umfqzznlrongfrvictus@forum.dlang.org
>>
>> Read point number 3.
>>
>>
>> Basically, there are problems with DMD, sometimes crashes and has codegen bugs. Codegen bugs is definitely not something you want in a big project because they are time consuming.
>>
>> Thank you for posting real commercial experience with D. If I was a hard headed boss of the D project I would probably demand that the DMD backend would be scrapped and efforts should focus on stability with the support of GCC and LLVM backends.
>
> Ilya has been working on a certain very important subset of work on behalf of Symmetry.  That mostly had to be @nogc because it needs to be usable as an Excel plugin as well as from our DSL.  And we decided to do something that conventionally speaking you should never do - porting a critical library written in a hurry in C++ and still evolving to an emerging programming language (D).  C++ mangling and ABI interoperability isn't perfect now but is a lot better than the situation when we started. We did finish it, by the way, and achieved our technical and commercial goals.
>
> I think we only build that codebase using LDC now but more generally we use LDC for release builds and dmd for development and it is not perfect, but it is overall fine. We build with dmd nightly as well to stay ahead of breakages coming down the line.
>
> I love the existence of gdc and the incredible range of targets it supports.  I am also in awe at what Iain has been able to accomplish.  We don't use gdc currently only because for pragmatic reasons we need to have a more current front end - although gdc is much more up to date these days.
>
> I wonder what amount of funding would be needed for a first version of an ARM back end for DMD.
>
> More generally I don't think D is growing too slowly.  If anything if you zoom out the time horizon, I think many of the causes of frustration amongst some users in fact reflects the fact that it can take time for organisation to catch up with growth. It takes time, energy and committed and talented people to build organisation and these things take the time they take, particularly with an open source endeavour.

Re: cost of a DMD backend for ARM: the existing backend is loaded with implementation details from the Pentium (P5) and Pentium Pro (P6) era, and is generally not very nice to read or write. It would probably be easier to write a basic retargetable code generator from scratch, while keeping the existing backend for x86 in the meantime.

For a material estimate of size, the Cranelift backend that Rust has is about 87k lines (inc. tests IIRC), so somewhere on that order (and we can generate huge amounts of code for free because D). I think our backend is a bit bigger than that.

Cranelift already has basic ARM support too; I can't comment on the quality of code generated.

This could also kill a few birds with one stone as it's an effective route to a modern JIT which hypothetically (I believe Stefan's work showed it's not as simple as that, but still) could help with CTFE.

On Wednesday, 30 December 2020 at 15:56:49 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 30 December 2020 at 15:00:15 UTC, Max Haughton wrote:
>> Cranelift already has basic ARM support too; I can't comment on the quality of code generated.
>
> Are you thinking porting?

No, just guessing how much work it would be. I would quite like to get a basic backend going but it's much easier said than done (i.e. most optimisations are fairly simple but generating proper code and debug info at the end takes ages to test let alone write)

On Wednesday, 30 December 2020 at 16:03:56 UTC, Max Haughton wrote:
> No, just guessing how much work it would be. I would quite like to get a basic backend going but it's much easier said than done (i.e. most optimisations are fairly simple but generating proper code and debug info at the end takes ages to test let alone write)

*nods*

I actually like the idea of a fast non-optimizing backend.

What could be fun is a space-optimizing compiler for WASM and embedded: save as much space as possible.

On Wednesday, 30 December 2020 at 16:13:14 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 30 December 2020 at 16:03:56 UTC, Max Haughton wrote:
>> No, just guessing how much work it would be. I would quite like to get a basic backend going but it's much easier said than done (i.e. most optimisations are fairly simple but generating proper code and debug info at the end takes ages to test let alone write)
>
> *nods*
>
> I actually like the idea of a fast non-optimizing backend.
>
> What could be fun is a space-optimizing compiler for WASM and embedded: save as much space as possible.

Space optimizing still requires quite a lot of time spent eliminating work, although LLVM is still pretty bad at it specifically - if you give a recursive factorial implementation to LLVM, it can't see that the overflow is going to happen, so it will (at O3) give you about 100 SIMD instructions rather than a simple loop.
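For readers who want to try the factorial case themselves, here is a minimal sketch in C of the two forms being compared. The function names are made up for illustration, and the result type is assumed to be a 32-bit unsigned integer so that wraparound is well defined. The instruction count quoted in the post is the poster's figure; what a given compiler emits depends on its version, flags and target.

```c
#include <stdint.h>

/* Recursive form, the shape the post describes handing to the optimiser.
 * With a 32-bit result only n <= 12 yields the true factorial; from
 * n = 13 onwards the product wraps modulo 2^32. */
uint32_t fact_rec(uint32_t n)
{
    return n <= 1 ? 1u : n * fact_rec(n - 1);
}

/* The "simple loop" a size-minded backend could emit instead.
 * It computes exactly the same values, including the wrapped ones. */
uint32_t fact_loop(uint32_t n)
{
    uint32_t result = 1;
    for (uint32_t i = 2; i <= n; ++i)
        result *= i;
    return result;
}
```

Because the useful input range is so small, a wide vectorised loop buys nothing here, which is the point being made about optimising for space. Compiling both functions at `-O3` and at `-Os` and comparing the assembly is one way to check the behaviour on a particular toolchain.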

On Wednesday, 30 December 2020 at 16:18:34 UTC, Max Haughton wrote:
> Space optimizing still requires quite a lot of time eliminating work, although LLVM is still pretty bad at it specifically - if you give a recursive factorial implementation to LLVM, it can't see the overflow's going to happen to so it will (at O3) give you about 100 SIMD instructions rather than a simple loop.

:-D I wasn't aware of that.

Yes, but I guess one could start with a non-optimizing SSA based backend with the intent of improving it later, but a focus on space optimization. At least it could be competitive in a niche rather than a lesser version of llvm...

On Wednesday, 30 December 2020 at 16:22:27 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 30 December 2020 at 16:18:34 UTC, Max Haughton wrote:
>> Space optimizing still requires quite a lot of time eliminating work, although LLVM is still pretty bad at it specifically - if you give a recursive factorial implementation to LLVM, it can't see the overflow's going to happen to so it will (at O3) give you about 100 SIMD instructions rather than a simple loop.
>
> :-D I wasn't aware of that.
>
> Yes, but I guess one could start with a non-optimizing SSA based backend with the intent of improving it later, but a focus on space optimization. At least it could be competitive in a niche rather than a lesser version of llvm...

Is it plausible to make a new backend that takes LLVM IR, so it uses the same glue layer as LDC? Or would that inhibit fast compilation?

On Wednesday, 30 December 2020 at 16:25:08 UTC, claptrap wrote:
> Is it plausible to make a new backend that takes LLVM IR, so it uses the same glue layer as LDC? Or would that inhibit fast compilation?

I think LDC uses some LLVM provided utility code for building the initial SSA, but you probably could do that. I don't think the glue layer is the most work though, so you would probably be better off creating a simpler SSA than LLVM? I haven't given this much thought, though.

On Wednesday, 30 December 2020 at 16:03:56 UTC, Max Haughton wrote:
> On Wednesday, 30 December 2020 at 15:56:49 UTC, Ola Fosheim Grøstad wrote:
>> On Wednesday, 30 December 2020 at 15:00:15 UTC, Max Haughton wrote:
>>> Cranelift already has basic ARM support too; I can't comment on the quality of code generated.
>>
>> Are you thinking porting?
>
> No, just guessing how much work it would be. I would quite like to get a basic backend going but it's much easier said than done (i.e. most optimisations are fairly simple but generating proper code and debug info at the end takes ages to test let alone write)

The dmd back-end is more segregated than you might think: there are a bunch of entrypoint methods you can override (class Obj if I remember correctly), and from there, modules can act as their own encapsulation for emitting code for different CPUs - just swap the x86 modules for ARM modules in a hypothetical build.

Someone already did 70% of the work in untangling the back-end and made a toy ARMv4 backend several years ago. Though it may not be in a salvageable state now, especially if the ultimate aim is to generate code for the Apple M1.
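The "override the entrypoints, swap the CPU modules" idea can be sketched roughly as follows. This is not dmd's code: the `Target` struct, the function names and the single `emit_ret_const` entry point are all hypothetical, and assembly text stands in for real object-file emission. It only illustrates how the driver can stay the same while the per-CPU module changes.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table of per-target entry points (not dmd's real interface).
 * A real backend would have many more hooks: prologue, epilogue, relocations,
 * debug info and so on. */
typedef struct Target {
    const char *name;
    /* Writes code for "return <value>" into buf; returns the text length. */
    size_t (*emit_ret_const)(char *buf, size_t cap, int value);
} Target;

/* x86 "module": result in eax. */
static size_t x86_emit_ret_const(char *buf, size_t cap, int value)
{
    return (size_t)snprintf(buf, cap, "mov eax, %d\nret\n", value);
}

/* 32-bit ARM "module": result in r0. Assumes the value fits an ARM
 * immediate encoding, which holds for small constants such as 42. */
static size_t arm_emit_ret_const(char *buf, size_t cap, int value)
{
    return (size_t)snprintf(buf, cap, "mov r0, #%d\nbx lr\n", value);
}

const Target target_x86 = { "x86", x86_emit_ret_const };
const Target target_arm = { "arm", arm_emit_ret_const };

/* Target-independent driver: it never mentions a specific CPU,
 * so swapping the Target swaps the generated code. */
size_t compile_ret_const(const Target *t, int value, char *buf, size_t cap)
{
    return t->emit_ret_const(buf, cap, value);
}
```

Calling `compile_ret_const(&target_arm, 42, buf, sizeof buf)` fills `buf` with the ARM sequence, while passing `&target_x86` yields the x86 one. The hard part described in the thread, correct code and debug info for a real ABI, lives behind those hooks and is not shown here.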
