Thread overview
Looking for something similar to Python's bisect_right
Oct 30, 2013
Zeke
Oct 30, 2013
Ali Çehreli
Oct 30, 2013
Zeke
Oct 30, 2013
qznc
Oct 30, 2013
Zeke
Oct 30, 2013
qznc
October 30, 2013
Hello, I'm on day 2 of learning D. I've learned C, C++, Java, Python, Ruby in University, but I wanted to broaden my palette by picking up D.

This project is a basic implementation of Project Euler problem 10. I build an array of primes, and for each new test number I check if the test is divisible by any prime from 2 to sqrt(test) (which cuts down on number of iterations). Trouble is, I can't find a method that's in the libraries to calculate the index of where sqrt(test) would exist or would be inserted into a sorted array.

I tried casting the primes array to a SortedRange using assumeSorted() but it didn't work for me. countUntil() returns -1 if the value doesn't exist. and I didn't find anything in the std.algorithm, std.array, or std.container.

Here's my current working implementation of is_prime and the functions it uses:

pure nothrow @safe bool is_prime(in ulong[] primes, ulong num) {
    // Need to find a more elegant way to find the index of where a number's
    // square root would be in a sorted array.
    auto upper = insind(primes, cast(ulong)sqrt(cast(double)num)) + 1;
    // Check each prime in the array up to the square root of our current test
    // number to see if it is prime.
    foreach(ulong prime; primes[0..upper]) {
        if (num % prime == 0) {
            return false;
        }
    }
    return true;
}

/**
 * Friendly overload so I don't have to give the bounds of the array for the
 * recursive step.
 */
pure nothrow @safe ulong
insind(in ulong[] arr, ulong key)
{
    return insind(arr, key, 0, arr.length);
}

/**
 * Using tail-recursion (so we don't build up a huge call stack), binary search
 * the array for the index where the key would be found if it existed or were
 * to be inserted into the array.
 */
pure nothrow @safe ulong
insind(in ulong[] arr, ulong key, ulong lower, ulong upper)
{
    if (lower >= upper) {
        return lower;
    } else {
        ulong mid = (lower + upper) / 2;
        if (arr[mid] > key) {
            return insind(arr, key, lower, mid - 1);
        } else if (arr[mid] < key) {
            return insind(arr, key, mid + 1, upper);
        } else {
            return mid;
        }
    }
}

I would also appreciate tips to help my "D-ness" coding style develop.
October 30, 2013
On 10/29/2013 11:04 PM, Zeke wrote:

lowerBound and friends are related:

  http://dlang.org/phobos/std_range.html#.lowerBound

Ali

October 30, 2013
On Wednesday, 30 October 2013 at 06:04:36 UTC, Zeke wrote:
> Hello, I'm on day 2 of learning D. I've learned C, C++, Java, Python, Ruby in University, but I wanted to broaden my palette by picking up D.
>
> This project is a basic implementation of Project Euler problem 10. I build an array of primes, and for each new test number I check if the test is divisible by any prime from 2 to sqrt(test) (which cuts down on number of iterations). Trouble is, I can't find a method that's in the libraries to calculate the index of where sqrt(test) would exist or would be inserted into a sorted array.
>
> I tried casting the primes array to a SortedRange using assumeSorted() but it didn't work for me. countUntil() returns -1 if the value doesn't exist. and I didn't find anything in the std.algorithm, std.array, or std.container.
>
> Here's my current working implementation of is_prime and the functions it uses:
>
> pure nothrow @safe bool is_prime(in ulong[] primes, ulong num) {
>     // Need to find a more elegant way to find the index of where a number's
>     // square root would be in a sorted array.
>     auto upper = insind(primes, cast(ulong)sqrt(cast(double)num)) + 1;
>     // Check each prime in the array up to the square root of our current test
>     // number to see if it is prime.
>     foreach(ulong prime; primes[0..upper]) {
>         if (num % prime == 0) {
>             return false;
>         }
>     }
>     return true;
> }

Why do you want to find the exact prime first? Just check against sqrt(num) in the loop.

auto upper = cast(ulong)sqrt(cast(double)num)) + 1;
foreach(ulong prime; primes) {
  if (prime > upper) return true;
  if (num % prime == 0) return false;
}
assert (false); // should be unreachable?
October 30, 2013
On Wednesday, 30 October 2013 at 14:17:22 UTC, qznc wrote:
> Why do you want to find the exact prime first? Just check against sqrt(num) in the loop.
>
> auto upper = cast(ulong)sqrt(cast(double)num)) + 1;
> foreach(ulong prime; primes) {
>   if (prime > upper) return true;
>   if (num % prime == 0) return false;
> }
> assert (false); // should be unreachable?

Because having a branch inside the foreach causes a branch prediction error when you've found a prime number. Simply iterating up to the sqrt and then terminating the loop has no branch prediction error. Try it for yourself.
October 30, 2013
On Wednesday, 30 October 2013 at 06:10:59 UTC, Ali Çehreli wrote:
> On 10/29/2013 11:04 PM, Zeke wrote:
>
> lowerBound and friends are related:
>
>   http://dlang.org/phobos/std_range.html#.lowerBound
>
> Ali

lowerBound returns a range with the last value being the sqrt, so I can't directly iterate over it because the squares of primes get into the primes array. My code looks like this now:

...
auto sqrt = cast(ulong)sqrt(cast(double)num);
auto sorted = assumeSorted(primes);
ulong upper = sorted.lowerBound(sqrt).length + 1;
foreach(ulong prime; sorted[0..upper]) {
...

Which has no speed improvement, but is using a standard library, so that accomplishes that.

I'd like to refactor so that the primes array is always a SortedRange, so I don't have to cast it for every call of is_prime(). What type would I need to change is_prime(ulong[] primes, ulong num) to so that I could give it the SortedRange?

Also how would I insert a found prime onto the back of a SortedRange?

Zeke
October 30, 2013
On Wednesday, 30 October 2013 at 18:56:42 UTC, Zeke wrote:
> On Wednesday, 30 October 2013 at 14:17:22 UTC, qznc wrote:
>> Why do you want to find the exact prime first? Just check against sqrt(num) in the loop.
>>
>> auto upper = cast(ulong)sqrt(cast(double)num)) + 1;
>> foreach(ulong prime; primes) {
>>  if (prime > upper) return true;
>>  if (num % prime == 0) return false;
>> }
>> assert (false); // should be unreachable?
>
> Because having a branch inside the foreach causes a branch prediction error when you've found a prime number. Simply iterating up to the sqrt and then terminating the loop has no branch prediction error. Try it for yourself.

Interesting thought.

In your code, there are two conditional branches in each iteration: The check for the end of foreach range and the prime check. Why is one more condition for the upper check so bad?

I admit, my code includes the superfluous foreach range check. Hence this improved version:

ulong prime;
int i = 0;
assert (primes[$] > upper);
do {
  prime = primes[i];
  if (num % prime == 0) return false;
  i+=1;
} while (prime < upper);
return true;

In this version there is still an implicit bounds check for the array access. For dmd "-noboundscheck" disables that branching.

Optimizing for branch prediction sounds premature, since you are using a slow algorithm anyways. ;)