August 02, 2004
In article <celkst$4e7$1@digitaldaemon.com>, parabolis says...

>I poked around in primes.d and glanced over the Int docs. I have a couple of notes about the primes.d implementation. I was quite impressed by the fact that your lookup table is half the size of a full binary Eratosthenes sieve. That is probably the smallest table which contains all primes from 0...2^16 and can still be searched without any multiplication or division (excluding shifts). I also liked this loop:
>
>     for (n=(n|1)-2; n!=1; n-=2)
>
>However I could not help playing with the loop in isPrime() over all non-short values from 0x10000 to 0xFFFFFFFF.

I agree that could be speeded up.

>I eliminated the function calls,

I assumed the compiler would do that by inlining where it can.

>unrolled the loop to take better advantage of the pipeline

Again, I assumed the compiler would do that. I kinda worked under the assumption that DMD would do a better job of optimizing than I. We're told often enough on this forum that "premature optimization is bad". However, the actual mechanics of compiler optimization is a black art to me. I have no idea whether this sort of thing ends up being faster or slower. When I really need to make things zip, I go for inline assembler. But of course, you are more than welcome to make things go faster, if your changes do actually achieve that.

> and check almost half the numbers you were
>checking (now 8 in 30 instead of 15 in 30 - for fun figure out why I can do this).

That's neat. I assume you eliminate powers of 3, 5, 7, ...?

>If you (meaning anybody reading this) want to view or use the modifications then send me an email and I will send the new version to you.

I'm interested. Why not post it on the dsource Demios forum? I'll be getting back to Int in a week or so, so I can implement your changes then.


>The reason I asked about the primality was because I have a passing fancy in primality implementations and I was kind of assuming (or hoping) you would be implementing the new "Primes is in P" algorithm

It's not on my "to do" list, no. You see, my motivation is cryptography, and for cryptography one needs to generate primes /fast/. The new algorithm may be polynomial time, but it's still just not fast enough for practical crypto applications. The probabilistic test is much, much faster. It is true that there is a small probability of isProbablyPrime() getting it wrong, but in practice, one can make the probability of that happening as small as one likes.
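For anyone following along, here is a rough sketch of one round of that kind of probabilistic test, restricted to uint so that the modular arithmetic fits comfortably in a ulong. It is emphatically /not/ the Int implementation, and the function name is made up:

    // One Miller-Rabin round: returns false only if n is definitely composite.
    // Assumes the base a satisfies 1 < a < n - 1.
    bool millerRabinRound(uint n, uint a)
    {
        if (n < 4) return n == 2 || n == 3;
        if ((n & 1) == 0) return false;

        // write n - 1 as d * 2^s with d odd
        uint d = n - 1;
        int s = 0;
        while ((d & 1) == 0) { d >>= 1; s++; }

        // x = a^d mod n, by square-and-multiply
        ulong x = 1;
        ulong base = a % n;
        for (uint e = d; e != 0; e >>= 1)
        {
            if (e & 1) x = (x * base) % n;
            base = (base * base) % n;
        }

        if (x == 1 || x == n - 1) return true;
        for (int i = 1; i < s; i++)
        {
            x = (x * x) % n;
            if (x == n - 1) return true;
        }
        return false;   // a is a witness: n is composite for certain
    }

Repeat it with several independently chosen bases a: a composite n survives any single round with probability at most 1/4, so k rounds push the chance of a wrong "probably prime" answer below (1/4)^k.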

That said, I would of course have absolutely no objection were you or anyone else to implement it, purely for the coolness of it. !(g)



>along with a linear time perfect power detector

That's planned for the future. But again, I have no objection if anyone else would care to implement it before I get around to it.


>and Knuth's new Fast-Fourier Multiplication.

Again, that's planned for the future. I have an outstanding bug report relating to the Karatsuba multiplication algorithm, so I should probably fix that first. FFM is kinda like an extension of Karatsuba to more than two fragments, so it's probably not that hard, but you only really see any speed improvement for really /big/ numbers. (Bigger than one is likely to need in cryptography, so, again, it wasn't a priority for me).
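As an aside for anyone reading, the Karatsuba trick in miniature - three multiplications where the schoolbook method needs four - can be shown on the 16-bit halves of a uint. This is purely illustrative (Int works on arrays of digits, and the function name here is invented):

    // multiply two 32-bit numbers using three 32-bit multiplies instead of four
    ulong karatsuba32(uint a, uint b)
    {
        // split each operand into 16-bit halves
        ulong aHi = a >> 16, aLo = a & 0xFFFF;
        ulong bHi = b >> 16, bLo = b & 0xFFFF;

        ulong hi  = aHi * bHi;                            // first multiply
        ulong lo  = aLo * bLo;                            // second multiply
        ulong mid = (aHi + aLo) * (bHi + bLo) - hi - lo;  // third multiply replaces two

        return (hi << 32) + (mid << 16) + lo;             // recombine
    }

The same splitting applied recursively to big-integer halves gives the usual Karatsuba bound of about O(n^1.58); FFT-based multiplication generalises the idea to many fragments.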


>I am considering playing around with an implementation of this for isPrime(ulong).

How would FFM help with isPrime(ulong)?

Hell, don't bother to answer that question (unless it's an easy question to answer) - just do the implementation and I'll read the source code!


>On a somewhat related note... Another reason I was interested in what you are doing with the Int class is that my first impressions of Phobos were being underwhelmed and disdain at the (possibly misleading) appearance that things are just being tossed in without attempting to guess where they will likely clash with future additions. At the time I mapped out this bit for the math library:
>
>  std.math.big
>  std.math.complex
>  std.math.imaginary
>  std.math.integer
>  std.math.fixed
>  std.math.real
>  std.math.symbolic
>
>Ever since then I have been curious where the functions to fill those libraries will come from.

I have no influence on what goes into std. Deimos currently has "etc" as its only root. Deimos is also a little untidy in that regard, right now - which is why I plan to move etc.prime into etc.math.prime (imported from etc.math.math). I'd like to keep the root as "clean" as possible.

Arcane Jill



August 02, 2004
Arcane Jill wrote:

> Again, I assumed the compiler would do that. I kinda worked under the assumption
> that DMD would do a better job of optimizing than I. We're told often enough on
> this forum that "premature optimization is bad". However, the actual mechanics
> of compiler optimization is a black art to me. I have no idea whether this sort
> of thing ends up being faster or slower. When I really need to make things zip,
> I go for inline assembler. But of course, you are more than welcome to make
> things go faster, if your changes do actually achieve that.

Yes, that is understandable. However, the unrolling was largely justified by the fact that it almost doubles the algorithm's speed over intervals of 30. The elimination of function calls was also there to ensure no compiler could defeat the optimization. Compilers seem to be rather iffy about inlining recursive function calls. DMD may do it, but D hopes for other compiler implementations and there is no guarantee they will get it right...

> 
>>and check almost half the numbers you were checking (now 8 in 30 instead of 15 in 30 - for fun figure out why I can do this).
> 
> 
> That's neat. I assume you eliminate powers of 3, 5, 7, ...?

Almost correct. Think a bit harder about what makes 30 special
and the rest should be fairly obvious. Also think about whether or not there is a larger/smaller number I could have chosen and, if so, why I did not. :)

> 
>>If you (meaning anybody reading this) want to view or use the modifications then send me an email and I will send the new version to you.
> 
> 
> I'm interested. Why not post it on the dsource Demios forum? I'll be getting
> back to Int in a week or so, so I can implement your changes then.

I just emailed it to you. You have an account there I am guessing? :P

> 
>>The reason I asked about the primality was because I have a passing fancy in primality implementations and I was kind of assuming (or hoping) you would be implementing the new "Primes is in P" algorithm
> 
> 
> It's not on my "to do" list, no. You see, my motivation is cryptography, and for
> cryptography one needs to generate primes /fast/. The new algorithm may be

Usually. Generating primes for RSA can take a fortnight and still not be too much of a bother in some cases.

> 
> That said, I would of course have absolutely no objection were you or anyone
> else to implement it, purely for the coolness of it. !(g)
> 
<snip>
> 
> That's planned for the future. But again, I have no objection if anyone else
> would care to implement it before I get around to it.
> 
<snip>
> 
> Again, that's planned for the future. I have an outstanding bug report relating
> to the Karatsuba multiplication algorithm, so I should probably fix that first.
> FFM is kinda like an extension of Karatsuba to more than two fragments, so it's
> probably not that hard, but you only really see any speed improvement for really
> /big/ numbers. (Bigger than one is likely to need in cryptography, so, again, it
> wasn't a priority for me).

I probably already spend too much time playing with these things. For anything more complex than a ulong domain, the time involved is more than I could ever hope to find. I am looking forward to your implementations, however.

> 
>>I am considering playing around with an implementation of this for isPrime(ulong).
> 
> 
> How would FFM help with isPrime(ulong)?
> 
> Hell, don't bother to answer that question (unless it's an easy question to
> answer) - just do the implementation and I'll read the source code!

Actually I would not use FFM but rather normal hardware multiplication. The cost is lg^12n instead of lg^6n, which I can live with given the simplicity of implementing these things over a ulong domain.

> 
>>On a somewhat related note... Another reason I was interested in what you are doing with the Int class is that my first impressions of Phobos were being underwhelmed and disdain at the (possibly misleading) appearance that things are just being tossed in without attempting to guess where they will likely clash with future additions. At the time I mapped out this bit for the math library:
>>
>> std.math.big
>> std.math.complex
>> std.math.imaginary
>> std.math.integer
>> std.math.fixed
>> std.math.real
>> std.math.symbolic
>>
>>Ever since then I have been curious where the functions to fill those libraries will come from.
> 
> 
> I have no influence on what goes into std. Deimos currently has "etc" as its
> only root. Deimos is also a little untidy in that regard, right now - which is
> why I plan to move etc.prime into etc.math.prime (imported from etc.math.math).
> I'd like to keep the root as "clean" as possible.
> 

I only mentioned it because I was somewhat hoping either you or someone else would see the comment and tell me why I am wrong to think such things and I would learn something :)
August 02, 2004
In article <celtam$8ju$1@digitaldaemon.com>, parabolis says...
>
>Arcane Jill wrote:
>

>Almost correct. Think a bit harder about what makes 30 special and the rest should be fairly obvious. Also think about whether or not there is a larger/smaller number I could have chosen and, if so, why I did not. :)

Hmmm. Well, 30 = 2 * 3 * 5, so I'm guessing you could equally well have used 6
(= 2 * 3) or 210 (2 * 3 * 5 * 7). Eliminating multiples of 2 (0, 2, 4, 6, 8, 10,
12, 14, 16, 18, 20, 22, 24, 26, 28), multiples of 3 (3, 9, 15, 21, 27) and
multiples of 5 (5, 15) leaves you with only remainders of (1, 7, 11, 13, 17, 19,
23, 29) to test.

I agree that's a snazzy speed-up!
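Spelled out as code, the idea looks something like this. It is only a sketch of the wheel, not anything from primes.d, and the name is made up:

    bool isPrimeWheel30(ulong n)
    {
        if (n == 2 || n == 3 || n == 5) return true;
        if (n < 2 || n % 2 == 0 || n % 3 == 0 || n % 5 == 0) return false;

        // gaps between successive candidates 7, 11, 13, 17, 19, 23, 29, 31, 37, ...
        static int[8] step = [ 4, 2, 4, 2, 4, 6, 2, 6 ];

        // trial division by the 8-in-30 candidates only
        // (d <= n / d rather than d * d <= n, to avoid overflow near ulong.max)
        int i = 0;
        for (ulong d = 7; d <= n / d; d += step[i], i = (i + 1) & 7)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

That tests 8 candidates per 30 instead of 15 per 30, which is where the near-doubling comes from.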


>I just emailed it to you. You have an account there I am guessing? :P

My account name on dsource is "Arcane Jill". I'll check out my dsource messages when I restart work on Int (but I have some Unicode work to finish off first).


>Usually. Generating primes for RSA can take a fortnight and still not be too much of a bother in some cases.

Doesn't change anything. If you can do the same thing in minutes, who's going to use the slow algorithm? The "Primes is in P" algorithm is of academic interest, sure, and I certainly wouldn't complain if someone (other than I) were to implement it, but it's hard to think of any real, practical uses.


>Actually I would not use FFM but rather normal hardware multiplication. The cost is lg^12n instead of lg^6n which I can live with given the simplicity of implementing these things over a ulong domain.

Cool. I didn't understand "lg^12n" or "lg^6n", but it's still cool.


>I only mentioned it because I was somewhat hoping either you or someone else would see the comment and tell me why I am wrong to think such things and I would learn something :)

Walter has said in the past that tidying up the root of std was a good idea, but he's /rather/ busy at the mo, and I don't see it as being high on the priority list.

Arcane Jill


August 02, 2004
Arcane Jill wrote:

> In article <celtam$8ju$1@digitaldaemon.com>, parabolis says...
> 
>>Almost correct. Think a bit harder about what makes 30 special
>>and the rest should be fairly obvious. Also think about whether or not there is a larger/smaller number I could have chosen and, if so, why I did not. :)
> 
> Hmmm. Well, 30 = 2 * 3 * 5, so I'm guessing you could equally well have used 6
> (= 2 * 3) or 210 (2 * 3 * 5 * 7). Eliminating multiples of 2 (0, 2, 4, 6, 8, 10,
> 12, 14, 16, 18, 20, 22, 24, 26, 28), multiples of 3 (3, 9, 15, 21, 27) and
> multiples of 5 (5, 15) leaves you with only remainders of (1, 7, 11, 13, 17, 19,
> 23, 29) to test.
> 
> I agree that's a snazzy speed-up!

That is exactly correct (assuming the 15 in the multiples-of-5 list is a typo for 25). Each prime p you include in the wheel leaves only a fraction of the candidates standing:

    1/2
    2/3
    4/5
    6/7
    ...
    (p-1)/p

So going from a 30 loop to a 210 loop only shrinks the candidates by a further factor of 6/7, while using only a 6 loop means you miss out on the 4/5 factor (close to a 1/4 saving).
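Putting rough numbers on that (my arithmetic, just to make the comparison concrete): a 30 loop keeps 30 x 1/2 x 2/3 x 4/5 = 8 candidates per 30; a 210 loop keeps 210 x 1/2 x 2/3 x 4/5 x 6/7 = 48 per 210, which is 6/7 of the 8-per-30 rate; and a 6 loop keeps 2 per 6 = 10 per 30, i.e. about a quarter more work than the 30 loop.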

August 04, 2004
Arcane Jill wrote:

> Hmmm. Well, 30 = 2 * 3 * 5, so I'm guessing you could equally well have used 6
> (= 2 * 3) or 210 (2 * 3 * 5 * 7). Eliminating multiples of 2 (0, 2, 4, 6, 8, 10,
> 12, 14, 16, 18, 20, 22, 24, 26, 28), multiples of 3 (3, 9, 15, 21, 27) and
> multiples of 5 (5, 15) leaves you with only remainders of (1, 7, 11, 13, 17, 19,
> 23, 29) to test.
> 
> I agree that's a snazzy speed-up!

I was just looking at mango's primes.d and figured out that the longest stretch of non-primes in a ushort is 52. I also gave some thought to a way to speed up the primeGreaterEqual and related functions and realized that a clever switch statement would allow the same speed up for these functions.

(The problem being that starting from 0 it is simple to maintain a window of 30, however it is trickier to fall into a window of 30 in the general case)

So, given that no two ushort primes are more than 60 apart, you can guarantee that the next prime will be found within 16 attempts. In mango's implementation there is a table of 1000 shorts and the next prime is found by a binary search, which has a worst case of around 10 steps; but the distribution of cases is the opposite of the distribution of prime gaps - the majority of bsearch cases are worst cases.
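(The 16 comes straight from the window: any stretch of 60 consecutive numbers contains exactly 2 x 8 = 16 values in the residue classes {1, 7, 11, 13, 17, 19, 23, 29} mod 30, so a gap of at most 60 means at most 16 candidates to try.)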



August 04, 2004
In article <cepcdp$1pmt$1@digitaldaemon.com>, parabolis says...

>(The problem being that starting from 0 it is simple to maintain a window of 30, however it is trickier to fall into a window of 30 in the general case)

I think the % operator should do the job. In fact, with a bit of assembler, you could do / and % at the same time, in a single instruction.
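Something like this, say - DMD inline assembler, 32-bit only, and just an illustration rather than anything from Int (the function name is invented):

    // one x86 DIV yields both the quotient (EAX) and the remainder (EDX)
    uint divmod30(uint n, out uint remainder)
    {
        uint q, r;
        asm
        {
            mov EAX, n;
            xor EDX, EDX;
            mov ECX, 30;
            div ECX;        // EAX = n / 30, EDX = n % 30
            mov q, EAX;
            mov r, EDX;
        }
        remainder = r;
        return q;
    }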

Jill


August 04, 2004
Arcane Jill wrote:

> In article <cepcdp$1pmt$1@digitaldaemon.com>, parabolis says...
> 
> 
>>(The problem being that starting from 0 it is simple to maintain a window of 30, however it is trickier to fall into a window of 30 in the general case)
> 
> 
> I think the % operator should do the job. In fact, with a bit of assembler, you
> could do / and % at the same time, in a single instruction.
> 

Yes, I was thinking of using %, but it costs a /.
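What I have in mind is roughly this - a small table indexed by n % 30 that jumps straight to the next 8-in-30 candidate. Illustrative only: the names are made up, it is untested, and the primes 2, 3 and 5 would still need special handling:

    // jump[r] = distance from a number with n % 30 == r up to the next
    // residue in {1, 7, 11, 13, 17, 19, 23, 29}
    const int[30] jump =
    [ 1, 0, 5, 4, 3, 2, 1, 0, 3, 2,
      1, 0, 1, 0, 3, 2, 1, 0, 1, 0,
      3, 2, 1, 0, 5, 4, 3, 2, 1, 0 ];

    // first number >= n that falls in the 8-in-30 pattern
    ulong firstCandidate(ulong n)
    {
        return n + jump[cast(uint)(n % 30)];
    }

From there the search can step through the usual 4, 2, 4, 2, 4, 6, 2, 6 gaps without any further division.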