How is it possible that countUntil() generates a jump-table when the hayStack is a compile time array?
October 01, 2022

Hello,

I just wanted to optimize a byte -> index lookup by using a 256-element table instead of [1, 2, 3].countUntil(x), and I was amazed by what I found.
My lookup[x] solution was not faster at all, because LDC2 had amazingly optimized the linear search into a jump table.

Can anyone please tell me how this is possible?

I looked inside the countUntil() template and I see no static ifs for compile-time optimizations.

Is it true that the compiler notices there is a while loop over a static compile-time array, is clever enough to do all the calculations at compile time, and generates a completely unrolled loop? And then the optimizer transforms that into a jump table and dispatches with "jmp rax"?

If I'm right, this is not just std library magic; it works even with my own template functions.
I'm also getting crazy long compile times, so this has got to be the case :D

Thank you in advance.
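For concreteness, the comparison described above can be sketched like this (a minimal reproduction; the haystack values and function names are made up for illustration, not taken from the original code):

```d
import std.algorithm.searching : countUntil;

// Small compile-time-known haystack (illustrative values).
static immutable ubyte[] hayStack = [1, 2, 3, 5, 8, 13, 21, 34];

// Linear search over the static array -- at -O, LDC can lower this
// whole function to a jump table.
ptrdiff_t viaCountUntil(ubyte x)
{
    return hayStack.countUntil(x);
}

// The hand-rolled 256-entry lookup table it was benchmarked against,
// filled at compile time via CTFE.
static immutable byte[256] lookup = ()
{
    byte[256] t = -1;            // -1 means "not found"
    foreach (i, b; hayStack)
        t[b] = cast(byte) i;
    return t;
}();

ptrdiff_t viaTable(ubyte x)
{
    return lookup[x];
}
```

Both functions return the index of `x` in the haystack, or -1 if it is absent; the surprise reported above is that the optimizer makes the first one just as fast as the second.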

October 01, 2022
On Sat, Oct 01, 2022 at 01:20:08PM +0000, realhet via Digitalmars-d-learn wrote:
> Hello,
> 
> I just wanted to optimize a byte -> index lookup, by using a 256
> element table instead of using [1, 2, 3].countUntil(x) and I was
> amazed what I've found.
> My solution lookup[x] was not faster at all, because LDC2 amazingly
> optimized the linear search to a jump table.
> 
> Anyone please can tell me, how it is possible?

Because it's LDC. ;-) Or more specifically, the D front-end tells the LLVM back-end that the table size is statically known, and the back-end pattern-matches linear searches through fixed-size tables into a jump table.

Generally, I wouldn't bother optimizing minutiae like this unless the profiler indicates the hotspot is there. Just let the optimizer do its job; quite often it can do a better job than a human can.


> I looked inside the countUntil() template and I see no static ifs for
> compile time optimizations.

It has nothing to do with the Phobos code, it's a pattern recognized by LDC's optimizer.


> Is it true that: The compiler notices that there is a while loop on a static compile time array and it is clever enough to solve all the calculations in compile time and generate a completely unrolled loop? Then the optimizer transforms it to the jump table and do it with "jmp rax" ?

This is quite likely the case. I've personally seen LDC optimize an entire benchmark into a single instruction, because it figured out that it could run the function at compile-time and replace the entire function body with an instruction that just loads the final value.
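One way to see this for yourself is to dump LDC's optimized assembly and look for the indirect jump (a sketch; the file name is hypothetical and the exact flag spelling may vary between LDC versions):

```shell
# Emit optimized assembly instead of an object file.
ldc2 -O3 -release -output-s lookup.d

# Look for the jump-table dispatch in the generated assembly.
grep -n "jmp" lookup.s
```

If the optimization fired, the search function contains an indirect `jmp` through a register rather than a compare-and-branch loop.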


> If I'm right, this is not just std library magic, it works even with
> my own template functions.
> I'm also getting crazy long compile times, so this gotta be the case
> :D
[...]

Yep.


T

-- 
If it tastes good, it's probably bad for you.
October 01, 2022
On Saturday, 1 October 2022 at 13:49:12 UTC, H. S. Teoh wrote:
> On Sat, Oct 01, 2022 at 01:20:08PM +0000, realhet via Digitalmars-d-learn wrote:

That is very good to know. Thank you for the confirmation. Indeed, it is really clever.

I wrote a parser just for the structural elements of Dlang, using template string[] parameters to feed tokens to the specialized parser functions recursively(!). And it unrolled everything into jump tables. Insane :D

The only successful manual optimization I made was using the PCMPESTRI instruction to skip 16 bytes of text at a time when there is no match in a compile-time-constant 16-element character set.
I expected something like a 10x speedup from PCMPESTRI, but I only got 2x. I thought I was competing with poor linear searches only, but little did I know...