Fastest Prime Sieve, in D

Jun 28, 2019

Jun 29, 2019

This is a continuation of a previous thread on the Segmented Sieve of Zakiya (SSoZ): https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe@forum.dlang.org to present the D version of the current state-of-art of the work. The D version source code also performs the Fastest Primes Sieve for Twin Primes. https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61 It's a direct translation of the (original) Nim (www.nim-lang.org) version. https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e For those interested, here are two places to get my current paper on it. I need to update it (again) to incorporate this new algorithm, and results. https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth I compiled the D source using the latest LDC 1.16.0. When comparing times to Nim it's comparable but becomes slower as inputs sizes increase. Here are some times for the Nim version compared to `primesieve` -- https://github.com/kimwalisch/primesieve ``` ➜ ~ inxi -SCM System: Host: localhost.localdomain Kernel: 5.1.11-pclos1 x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 2019 Machine: Type: Laptop System: System76 product: Gazelle v: gaze10 serial: <root required> Mobo: System76 model: Gazelle v: gaze10 serial: <root required> UEFI [Legacy]: American Megatrends v: 1.05.08 date: 03/31/2016 CPU: Topology: Quad Core model: Intel Core i7-6700HQ bits: 64 type: MT MCP L2 cache: 6144 KiB Speed: 1185 MHz min/max: 800/3500 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800 ``` ``` N | twinprimes_ssoz | primesieve | Ratio | -------------------|-----------------|---------------|-------| 100_000_000_000 | 4.628 | 5.837 | 1.26 | 500_000_000_000 | 23.308 | 32.765 | 1.41 | 1_000_000_000_000 | 47.966 | 69.705 | 1.45 | 5_000_000_000_000 | 265.746 | 388.233 | 1.46 | 10_000_000_000_000 | 572.803 | 821.755 | 1.43 | 50_000_000_000_000 | 3004.487 | 4620.525 | 1.54 | 100_000_000_000_000 | 6307.521 | 9724.568 | 1.54 | 200_000_000_000_000 | 13603.531 | 20279.547 | 1.49 | ``` I'm hoping people can help improve it and make it faster than it currently is. Thanks. Jabari

June 29, 2019

Re: Fastest Prime Sieve, in D

Posted by Jabari Zakiya
in reply to Jabari Zakiya

Permalink

Jabari Zakiya

Posted in reply to Jabari Zakiya

Permalink

On Friday, 28 June 2019 at 19:32:41 UTC, Jabari Zakiya wrote:
> This is a continuation of a previous thread on the Segmented Sieve of Zakiya (SSoZ):
>
> https://forum.dlang.org/post/ozwapetxgcxkbzjjslqe@forum.dlang.org
>
> to present the D version of the current state-of-art of the work.
>
> The D version source code also performs the Fastest Primes Sieve for Twin Primes.
>
> https://gist.github.com/jzakiya/ae93bfa03dbc8b25ccc7f97ff8ad0f61
>
> It's a direct translation of the (original) Nim (www.nim-lang.org) version.
>
> https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e
>
> For those interested, here are two places to get my current paper on it.
> I need to update it (again) to incorporate this new algorithm, and results.
>
> https://www.academia.edu/37952623/The_Use_of_Prime_Generators_to_Implement_Fast_Twin_Primes_Sieve_of_Zakiya_SoZ_Applications_to_Number_Theory_and_Implications_for_the_Riemann_Hypotheses
>
> https://www.scribd.com/document/395415391/The-Use-of-Prime-Generators-to-Implement-Fast-Twin-Primes-Sieve-Of-Zakiya-SoZ-Applications-to-Number-Theory-and-Implications-for-the-Riemann-Hypoth
>
> I compiled the D source using the latest LDC 1.16.0. When comparing times to Nim it's comparable but becomes slower as inputs sizes increase. Here are some times for the Nim version compared to `primesieve` -- https://github.com/kimwalisch/primesieve
>
> ```
> ➜  ~ inxi -SCM
> System:    Host: localhost.localdomain Kernel: 5.1.11-pclos1 x86_64 bits: 64 Desktop: KDE Plasma 5.16.1 Distro: PCLinuxOS 2019
> Machine:   Type: Laptop System: System76 product: Gazelle v: gaze10 serial: <root required>
>            Mobo: System76 model: Gazelle v: gaze10 serial: <root required> UEFI [Legacy]: American Megatrends v: 1.05.08
>            date: 03/31/2016
> CPU:       Topology: Quad Core model: Intel Core i7-6700HQ bits: 64 type: MT MCP L2 cache: 6144 KiB
>            Speed: 1185 MHz min/max: 800/3500 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800
> ```
>
> ```
>             N          | twinprimes_ssoz |   primesieve  | Ratio |
>     -------------------|-----------------|---------------|-------|
>        100_000_000_000 |       4.628     |      5.837    |  1.26 |
>        500_000_000_000 |      23.308     |     32.765    |  1.41 |
>      1_000_000_000_000 |      47.966     |     69.705    |  1.45 |
>      5_000_000_000_000 |     265.746     |    388.233    |  1.46 |
>     10_000_000_000_000 |     572.803     |    821.755    |  1.43 |
>     50_000_000_000_000 |    3004.487     |   4620.525    |  1.54 |
>    100_000_000_000_000 |    6307.521     |   9724.568    |  1.54 |
>    200_000_000_000_000 |   13603.531     |  20279.547    |  1.49 |
> ```
> I'm hoping people can help improve it and make it faster than it currently is.
>
> Thanks.
>
> Jabari

Timing comparisons on my system. Nim code compiled with GCC 8.3.1 (system has LLVM 8.0.0).

```
            N          |       Nim       |       D       | Ratio |
    -------------------|-----------------|---------------|-------|
       100_000_000_000 |       4.628     |      4.619    |  1.00 |
       500_000_000_000 |      23.308     |     24.071    |  1.03 |
     1_000_000_000_000 |      47.966     |     49.703    |  1.04 |
     5_000_000_000_000 |     265.746     |    276.847    |  1.04 |
    10_000_000_000_000 |     572.803     |    590.881    |  1.03 |
    50_000_000_000_000 |    3009.485     |   3046.913    |  1.01 |
   100_000_000_000_000 |    6307.521     |   6594.283    |  1.04 |
   200_000_000_000_000 |   13603.531     |  14706.743    |  1.08 |
```

On Saturday, 29 June 2019 at 12:28:35 UTC, Jabari Zakiya wrote: > Timing comparisons on my system. Nim code compiled with GCC 8.3.1 (system has LLVM 8.0.0). > > ``` > N | Nim | D | Ratio | > -------------------|-----------------|---------------|-------| > 100_000_000_000 | 4.628 | 4.619 | 1.00 | > 500_000_000_000 | 23.308 | 24.071 | 1.03 | > 1_000_000_000_000 | 47.966 | 49.703 | 1.04 | > 5_000_000_000_000 | 265.746 | 276.847 | 1.04 | > 10_000_000_000_000 | 572.803 | 590.881 | 1.03 | > 50_000_000_000_000 | 3009.485 | 3046.913 | 1.01 | > 100_000_000_000_000 | 6307.521 | 6594.283 | 1.04 | > 200_000_000_000_000 | 13603.531 | 14706.743 | 1.08 | > ``` Could be the difference between NIM's GC and D's, seeing as you are continually allocating new arrays. Could just be that the GCC backend for NIM generates better assembly than LLVM. Could profile the two and look at the assembly to see where the additional time is being spent.

Forums