On Thursday, 13 May 2021 at 01:59:15 UTC, Andrei Alexandrescu wrote:
>Integral division is the strongest arithmetic operation.
I have a friend who knows some M1 internals. He said it's really Star Trek stuff.
This will seriously challenge other CPU producers.
What perspectives do we have to run the compiler on M1 and produce M1 code?
It's already winning, let alone challenging, although consider just how fucking enormous the transistor budget is on the M1 on a per-core basis. From what is publicly known, the M1 doesn't really have that much magic to it: it is an extremely wide (where it really matters) iteration of what already works elsewhere in the industry, combined with no x86 tax on desktop for the first time. Intel's process engineers completely dropped the ball, so the M1 is on a process something like 4-5x denser than Intel 14nm.
Someone mentioned on Hacker News that Intel also improved the integer division capabilities of ThisXeon + 1, which would be worth benchmarking, although expecting monster SPECint numbers from a 28-core Xeon is probably missing the point.
Someone on the Discord has an M1, and D apparently already works fine on it. I'm aiming to get a blog post out of it.
The GCC project has M1 hardware and should apparently be getting support soon-ish. Apple don't like upstreaming their backends from what I can tell, so it could be a while before they get tuned much.
Apple also haven't published anything along the lines of an optimization manual for the M1, so I guess we'll find out via osmosis what it's really capable of as time goes on. I think it's more likely that Apple get the Microsoft hidden-API treatment than that they actually go public on some of the extensions they have made to the ARM ISA, both in new instructions and in the form of an old trick SPARC had, which basically turns TSO on underneath a program to aid x86 emulation.