pfft 0.1

Jul 20, 2012

jerro

Jul 20, 2012

Jul 21, 2012

Jul 21, 2012

Jul 21, 2012

Jul 22, 2012

Jul 24, 2012

Jul 24, 2012

Jul 24, 2012

Jul 22, 2012

Jul 24, 2012

Jul 24, 2012

I'm announcing the release of pfft, a fast FFT written in D. Code: https://github.com/jerro/pfft/ Downloads: https://github.com/jerro/pfft/downloads Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html Benchmarks: http://jerro.github.com/pfft/benchmarks/

jerro: > I'm announcing the release of pfft, a fast FFT written in D. Everything looks nice. Are you using "in", pure/nothrow/immutable/const enough? Is it worth changing the Phobos API to something similar to this API? Bye, bearophile

On Friday, 20 July 2012 at 23:36:37 UTC, bearophile wrote: > jerro: > >> I'm announcing the release of pfft, a fast FFT written in D. > > Everything looks nice. Are you using "in", pure/nothrow/immutable/const enough? Not yet, but I plan to add those. > Is it worth changing the Phobos API to something similar to this API? Pfft already includes an API that is compatible with the Phobos one (pfft.stdapi). But I don't think it makes any sense to change the Phobos API to something like pfft.pfft. The main difference between the two is that std.numeric.Fft uses interleaved format for sequences of complex numbers and pfft.pfft uses a split format. Interleaved format is more convenient in most cases and I think it's more intuitive too. The FFT in Phobos also works on ranges, while pfft.pfft only works on aligned arrays. The only reason the pfft.pfft works the way it does is that it is supposed to be an interface to the underlying implementation with no loss of performance and that is how the implementation works. Even if the implementation in Phobos did use split format, I think it would still be better to use interleaved format for the API. It's just about 30% slower and I think that a nice API is more important for the standard library than a 30% difference in speed. There is one more important difference in the API between std.numeric.Fft and pfft.pfft. With Phobos you can use one instance of Fft for all sizes and types. With pfft.pfft, the Fft class is parametrized with a floating point type and you need a different instance for each size. I think what Phobos does in this case is wrong. The fact that one class works on all floating point types results in very poor precision when the data consists of doubles or reals. I guess that precomputed tables are stored as floats. You could only fix that by saving them as reals, but that would probably hurt performance quite a lot. The fact that one instance can be used for multiple sizes would be a problem if we wanted to change an implementation since not all FFT implementation can use one precomputed table for different data sizes. > Bye, > bearophile

jerro: Are all your benchmarks done on a 64 bit system? > I think what Phobos does in > this case is wrong. The fact that one class works on all > floating point types results in very poor precision when > the data consists of doubles or reals. I guess that precomputed > tables are stored as floats. You could only fix that by saving > them as reals, but that would probably hurt performance quite > a lot. The fact that one instance can be used for multiple sizes > would be a problem if we wanted to change an implementation since > not all FFT implementation can use one precomputed table for > different data sizes. If you are right and this is a problem, are Andrei and others accepting to change this little part of Phobos? If the answer is positive, are you interested in creating a GIT patch that changes that? Bye, bearophile

> Are all your benchmarks done on a 64 bit system? They are. Here's one comparison with 32 bit (for complex single precision transform using sse)if you are interested in that: http://imgur.com/CTuCD . The 32 bit executable is slower, probably because there are less general purpose and SSE registers on x86 than on x86_64. > If you are right and this is a problem, are Andrei and others accepting to change this little part of Phobos? If the answer is positive, are you interested in creating a GIT patch that changes that? I think that poor precision is a problem. I have checked now and std.numeric.Fft does indeed use floats for the lookup table. The precision problem could be solved by either changing that to real or by changing Fft to a template and using whatever the type parameter is. Just changing float to real doesn't require changing the API, but would probably result in worse performance. I haven't talked to Andrei or others about changing it, but I am willing to write a patch that changes the API, if it would be decided that would be the best thing to do. If it would be decided that it's best to just change the type of the lookup table that's trivial anyway, since it's just one typedef.

On 21/07/2012 01:26, jerro wrote: > I'm announcing the release of pfft, a fast FFT written in D. > > Code: https://github.com/jerro/pfft/ > > Downloads: https://github.com/jerro/pfft/downloads > > Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html > > Benchmarks: http://jerro.github.com/pfft/benchmarks/ A signal processing module in phobos sound like a lot of fun.

> I think that poor precision is a problem. I have checked now and > std.numeric.Fft does indeed use floats for the lookup table. > The precision problem could be solved by either changing that to > real or by changing Fft to a template and using whatever the type > parameter is. Just changing float to real doesn't require changing > the API, but would probably result in worse performance. I forgot to mention one more solution (the one that pfft.stdapi currently uses). The class could also use lazy initialization. When the fft is called, check if the lookup table for that type is already created, create it if it isn't, store it for later and then compute the fft. This would require casting away const in the fft() method, though.

jerro: > I haven't talked to Andrei or others about changing it, but I am > willing to write a patch that changes the API, if it would be > decided that would be the best thing to do. Given your willingness to work, and the days since you wrote that, maybe we have to write this little proposal again in the main D newsgroup. Bye, bearophile

On Friday, 20 July 2012 at 23:26:46 UTC, jerro wrote: > I'm announcing the release of pfft, a fast FFT written in D. > > Code: https://github.com/jerro/pfft/ > > Downloads: https://github.com/jerro/pfft/downloads > > Documentation: http://jerro.github.com/pfft/doc/pfft.pfft.html > > Benchmarks: http://jerro.github.com/pfft/benchmarks/ Nice work. Just did the usual advertising in HN and Reddit, http://news.ycombinator.com/item?id=4286257 http://www.reddit.com/r/programming/comments/x2tt8/pfft_a_fast_fourier_transform_written_in_d/ -- Paulo

July 24, 2012

Re: pfft 0.1

Posted by Andrei Alexandrescu
in reply to jerro

Permalink

Andrei Alexandrescu

Posted in reply to jerro

Permalink

On 7/20/12 8:02 PM, jerro wrote:
> On Friday, 20 July 2012 at 23:36:37 UTC, bearophile wrote:
>> Is it worth changing the Phobos API to something similar to this API?
>
> Pfft already includes an API that is compatible with the
> Phobos one (pfft.stdapi). But I don't think it makes any sense
> to change the Phobos API to something like pfft.pfft. The main
> difference between the two is that std.numeric.Fft uses
> interleaved format for sequences of complex numbers and
> pfft.pfft uses a split format. Interleaved format is more
> convenient in most cases and I think it's more intuitive too.
> The FFT in Phobos also works on ranges, while pfft.pfft only
> works on aligned arrays.
>
> The only reason the pfft.pfft works the way it does is that it
> is supposed to be an interface to the underlying implementation
> with no loss of performance and that is how the implementation
> works. Even if the implementation in Phobos did use split
> format, I think it would still be better to use interleaved
> format for the API. It's just about 30% slower and I think that
> a nice API is more important for the standard library than a 30%
> difference in speed.

This all is very interesting. Even considering all potential issues and incompatibilities, it might be great to look into integrating Fft into Phobos. Do you think it would work to add it as a "high-speed alternative with different format and alignment requirements"? Offhand I'd think someone using the FFT is generally interested in speed, and a 30% margin is sensible since the algorithm might be most of some applications' run time.

> There is one more important difference in the API between
> std.numeric.Fft and pfft.pfft. With Phobos you can use one
> instance of Fft for all sizes and types. With pfft.pfft, the Fft
> class is parametrized with a floating point type and you need a
> different instance for each size. I think what Phobos does in
> this case is wrong. The fact that one class works on all
> floating point types results in very poor precision when
> the data consists of doubles or reals. I guess that precomputed
> tables are stored as floats. You could only fix that by saving
> them as reals, but that would probably hurt performance quite
> a lot. The fact that one instance can be used for multiple sizes
> would be a problem if we wanted to change an implementation since
> not all FFT implementation can use one precomputed table for
> different data sizes.

Do you think Phobos can be fixed in a backwards-compatible way (e.g. provide a template and alias it to the old name)?


Andrei

Forums