Thread overview
Implementing Sparse Vectors With Associative Arrays/Compiler Bug?
Mar 07, 2013
Ed
Mar 07, 2013
jerro
Mar 07, 2013
Rene Zwanenburg
Mar 10, 2013
Nick B
Mar 10, 2013
Danny Arends
March 07, 2013
I'm new to D and am trying to implement simple sparse vectors using associative arrays, but I'm getting fairly large floating point errors. Example code for sparse dot product:

import std.stdio;
import std.math;
import std.random;
static import std.datetime;

int main(string[] args) {
  double[int] v1;
  double[int] v2;
  Random gen;
  gen.seed(cast(uint)std.datetime.Clock.currTime().stdTime());

  double accum = 0;
  double val;
  foreach(i;1 .. 1000) {
    val = uniform(-1000.0,1000.0,gen);
    accum += val * val;
    v1[i] = val;
    v2[i] = val;
  }

  double accum2 = 0;
  double v2Val;
  foreach(k;v1.byKey()) {
    v2Val= v2.get(k,0);
    if(v2Val != 0) {
      accum2 += v1.get(k,0) * v2Val;
    }
  }

  writefln("accum - accum2 = %e", accum - accum2);

  return 0;
}

This outputs values such as:
accum - accum2 = -4.172325e-07
accum - accum2 = 2.384186e-07
accum - accum2 = 4.172325e-07

Are errors of this magnitude to be expected using doubles, or is this a compiler bug?
March 07, 2013
> Are errors of this magnitude to be expected using doubles, or is this a compiler bug?

Errors of this magnitude are to be expected. the value of accum in your example is somewhere around 3e+08, so the relative error is around 1e-15, and double.epsilon is 2.22045e-16.

By the way, you can use unpredictableSeed to get an unpredictable seed.
March 07, 2013
On Thursday, 7 March 2013 at 09:43:21 UTC, jerro wrote:
>> Are errors of this magnitude to be expected using doubles, or is this a compiler bug?
>
> Errors of this magnitude are to be expected. the value of accum in your example is somewhere around 3e+08, so the relative error is around 1e-15, and double.epsilon is 2.22045e-16.

This.

You can use reals to store the intermediary results. A real has the largest hardware supported size, which is 80 bits for x87. It's not a silver bullet but can be useful in cases like this.
March 10, 2013
On Thursday, 7 March 2013 at 07:03:04 UTC, Ed wrote:
> I'm new to D and am trying to implement simple sparse vectors using associative arrays, but I'm getting fairly large floating point errors. Example code for sparse dot product:
>
> import std.stdio;
> import std.math;
> import std.random;
> static import std.datetime;
>
> int main(string[] args) {
>   double[int] v1;
>   double[int] v2;
>   Random gen;
>   gen.seed(cast(uint)std.datetime.Clock.currTime().stdTime());
>
>   double accum = 0;
>   double val;
>   foreach(i;1 .. 1000) {
>     val = uniform(-1000.0,1000.0,gen);
>     accum += val * val;
>     v1[i] = val;
>     v2[i] = val;
>   }
>
>   double accum2 = 0;
>   double v2Val;
>   foreach(k;v1.byKey()) {
>     v2Val= v2.get(k,0);
>     if(v2Val != 0) {
>       accum2 += v1.get(k,0) * v2Val;
>     }
>   }
>
>   writefln("accum - accum2 = %e", accum - accum2);
>
>   return 0;
> }
>
> This outputs values such as:
> accum - accum2 = -4.172325e-07
> accum - accum2 = 2.384186e-07
> accum - accum2 = 4.172325e-07
>
> Are errors of this magnitude to be expected using doubles, or is this a compiler bug?

Hi Ed

I also interested in simple sparse vectors.  Any chance this code could be published or put in a library ?

Nick
March 10, 2013
You could also try to use a Kahan Accumulator to 'fix' this problem.

See wikipedia: http://en.wikipedia.org/wiki/Kahan_summation_algorithm

Its pretty straight forward to implement.

Gr,
Danny Arends
http://www.dannyarends.nl