Jump to page: 1 2 3
Thread overview
[phobos] FFT
Aug 01, 2010
David Simcha
Aug 01, 2010
Walter Bright
Aug 01, 2010
David Simcha
Aug 01, 2010
David Simcha
Aug 02, 2010
Don Clugston
Aug 02, 2010
David Simcha
Aug 02, 2010
Don Clugston
Aug 02, 2010
David Simcha
Aug 02, 2010
David Simcha
Aug 02, 2010
David Simcha
Aug 02, 2010
David Simcha
Aug 02, 2010
Don Clugston
Aug 02, 2010
David Simcha
Aug 03, 2010
Walter Bright
Aug 03, 2010
Walter Bright
Aug 03, 2010
David Simcha
Aug 03, 2010
Walter Bright
Aug 03, 2010
David Simcha
Aug 03, 2010
Walter Bright
Aug 03, 2010
David Simcha
Aug 04, 2010
David Simcha
Aug 04, 2010
Sean Kelly
Aug 02, 2010
Leandro Lucarella
August 01, 2010
So I've changed my mind about the design of my kernel density estimation lib that needed an FFT.  I now no longer have an immediate use for an FFT in D.  I also gave up on trying to translate kiss_fft because the code is so difficult to follow that if it doesn't work I'll be at a loss to debug it.

However, in the process of prototyping I wrote a very basic FFT lib.  It only handles power of two sizes.  It handles generic random-access ranges of complex inputs using std.complex, or pure real inputs of any builtin numeric type.  It does not include any cache optimizations and takes about 1.7 seconds to do a double[] -> Complex!double FFT of size 2^20 on my machine Athlon 64 X2 5200+.  (I haven't benchmarked other libs so I have no idea how fast or slow this really is).  Support for inverse FFTs can be trivially added, and definitely will be if this is contributed to Phobos.

As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list.  Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful?  If it sounds worth including I'll clean up/document the code and send it to the mailing list for review.  If it sounds too primitive, I'll just scrap it.
August 01, 2010
David Simcha wrote:
>
> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful?  If it sounds worth including I'll clean up/document the code and send it to the mailing list for review.  If it sounds too primitive, I'll just scrap it.
>

I don't know enough about FFT's to make any sort of informed comment.
August 01, 2010
Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x.  It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT in about 880 milliseconds.

On 8/1/2010 3:36 PM, Walter Bright wrote:
> David Simcha wrote:
>>
>> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful?  If it sounds worth including I'll clean up/document the code and send it to the mailing list for review.  If it sounds too primitive, I'll just scrap it.
>>
>
> I don't know enough about FFT's to make any sort of informed comment.
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>

August 01, 2010
Sounds interesting. BTW I don't need FFT, but I think such a package would be a good addition to Phobos. Would be great if we could generalize the code a bit to work with general ranges (random-access I suppose, worth exploring to see if less powerful ranges would be enough).

Andrei

On 08/01/2010 03:32 PM, David Simcha wrote:
> Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT in about 880 milliseconds.
>
> On 8/1/2010 3:36 PM, Walter Bright wrote:
>> David Simcha wrote:
>>>
>>> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful? If it sounds worth including I'll clean up/document the code and send it to the mailing list for review. If it sounds too primitive, I'll just scrap it.
>>>
>>
>> I don't know enough about FFT's to make any sort of informed comment.
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
August 01, 2010
Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform.  In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.

On 8/1/2010 5:21 PM, Jason Spencer wrote:
> I don't have much of an opinion on adding to Phobos, but I'd definitely be interested in seeing the code.  I'm starting to do some scientific computing using D, and that would be a great learning reference for me.
>
> Jason
>
>
> ----- Original Message ----
> 
>> From: David Simcha<dsimcha at gmail.com>
>> To: Discuss the phobos library for D<phobos at puremagic.com>
>> Sent: Sun, August 1, 2010 1:32:38 PM
>> Subject: Re: [phobos] FFT
>>
>> Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x.  It now does 1 size 2 ^^ 20 double[] ->  Complex!double FFT in  about 880 milliseconds.
>>
>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>> 
>>>   David Simcha wrote:
>>> 
>>>> As I am no longer going to use FFTs  in my kernel density lib, improving
>>>> 
>> this FFT code will be bumped down my hacking  to-do list.  Does what I have now
>> sound better than nothing by a large  enough margin to warrant inclusion in
>> std.numeric, or does it sound too  primitive to be widely useful?  If it sounds
>> worth including I'll clean  up/document the code and send it to the mailing list
>> for review.  If it  sounds too primitive, I'll just scrap it.
>> 
>>>> 
>>> I don't  know enough about FFT's to make any sort of informed comment.
>>>   _______________________________________________
>>> phobos mailing  list
>>> phobos at puremagic.com
>>>   http://lists.puremagic.com/mailman/listinfo/phobos
>>>
>>> 
>> _______________________________________________
>> phobos mailing  list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>> 
> 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fft3.d
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100801/28545d5d/attachment.ksh>
August 01, 2010
It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".

Andrei

On 08/01/2010 06:31 PM, David Simcha wrote:
> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.
>
> On 8/1/2010 5:21 PM, Jason Spencer wrote:
>> I don't have much of an opinion on adding to Phobos, but I'd
>> definitely be
>> interested in seeing the code. I'm starting to do some scientific
>> computing
>> using D, and that would be a great learning reference for me.
>>
>> Jason
>>
>>
>> ----- Original Message ----
>>> From: David Simcha<dsimcha at gmail.com>
>>> To: Discuss the phobos library for D<phobos at puremagic.com>
>>> Sent: Sun, August 1, 2010 1:32:38 PM
>>> Subject: Re: [phobos] FFT
>>>
>>> Speed-wise, I've just been goofing around for the past hour or so and
>>> I've sped
>>> it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT
>>> in about
>>> 880 milliseconds.
>>>
>>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>>>> David Simcha wrote:
>>>>> As I am no longer going to use FFTs in my kernel density lib, improving
>>> this FFT code will be bumped down my hacking to-do list. Does what I
>>> have now
>>> sound better than nothing by a large enough margin to warrant
>>> inclusion in
>>> std.numeric, or does it sound too primitive to be widely useful? If
>>> it sounds
>>> worth including I'll clean up/document the code and send it to the
>>> mailing list
>>> for review. If it sounds too primitive, I'll just scrap it.
>>>> I don't know enough about FFT's to make any sort of informed comment.
>>>> _______________________________________________
>>>> phobos mailing list
>>>> phobos at puremagic.com
>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
August 02, 2010
On 2 August 2010 06:10, Andrei Alexandrescu <andrei at erdani.com> wrote:
> It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".
>
> Andrei

I agree with that, although I think there are FFT use cases which rate
convenience over speed, especially if the speed is only a factor of
three slower than worlds' best practice.
We can assume that any usage of FFTs which is really speed critical
will be using FFTW (or a similar library). So a Phobos FFT should not
be aimed at power users. It might even be used for testing (do I need
an FFT here?) which is basically what you've ended up doing.
(My philosophy for std.bigint is the same; power users will use GMP,
but for casual usage, GMP is overkill and a nuisance to set up).
OTOH I think we need to be careful, it still needs to be much better
than something which users could write themselves in a few hours...

>
> On 08/01/2010 06:31 PM, David Simcha wrote:
>>
>> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.
>>
>> On 8/1/2010 5:21 PM, Jason Spencer wrote:
>>>
>>> I don't have much of an opinion on adding to Phobos, but I'd
>>> definitely be
>>> interested in seeing the code. I'm starting to do some scientific
>>> computing
>>> using D, and that would be a great learning reference for me.
>>>
>>> Jason
>>>
>>>
>>> ----- Original Message ----
>>>>
>>>> From: David Simcha<dsimcha at gmail.com>
>>>> To: Discuss the phobos library for D<phobos at puremagic.com>
>>>> Sent: Sun, August 1, 2010 1:32:38 PM
>>>> Subject: Re: [phobos] FFT
>>>>
>>>> Speed-wise, I've just been goofing around for the past hour or so and
>>>> I've sped
>>>> it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT
>>>> in about
>>>> 880 milliseconds.
>>>>
>>>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>>>>>
>>>>> David Simcha wrote:
>>>>>>
>>>>>> As I am no longer going to use FFTs in my kernel density lib, improving
>>>>
>>>> this FFT code will be bumped down my hacking to-do list. Does what I
>>>> have now
>>>> sound better than nothing by a large enough margin to warrant
>>>> inclusion in
>>>> std.numeric, or does it sound too primitive to be widely useful? If
>>>> it sounds
>>>> worth including I'll clean up/document the code and send it to the
>>>> mailing list
>>>> for review. If it sounds too primitive, I'll just scrap it.
>>>>>
>>>>> I don't know enough about FFT's to make any sort of informed comment.
>>>>> _______________________________________________
>>>>> phobos mailing list
>>>>> phobos at puremagic.com
>>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>>
>>>> _______________________________________________
>>>> phobos mailing list
>>>> phobos at puremagic.com
>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>
>>
>>
>>
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
August 02, 2010
Unfortunately I just downloaded the benchmark program for FFTW and my FFT is a ton slower, depending on how you look at it.  Using size 2^20 as my benchmark, FFTW takes about 131 seconds to create its plan, even using -oestimate, the fastest planner.  However, the plan can be reused if performing multiple FFTs of the same size, and once the plan is created, it can do an FFT of size 2^20 in about 53 milliseconds (which I find almost unbelievable because even sorting an array of size 2^20 using a well-optimized quick sort takes almost that long, and FFT seems like it should should have a much larger constant than quick sort), compared to my latest improvements to around 730 on the hardware I'm benchmarking on.

On the other hand, for one-off use cases, the lack of needing to create a plan is a big win, both from a speed and a simplicity of API point of view.  Benchmarking against R, which doesn't appear to use plans, making the comparison somewhat more relevant, things look better for my FFT:  R takes about 610 milliseconds for a size 2^20 pure real FFT.

On 8/2/2010 5:30 AM, Don Clugston wrote:
> On 2 August 2010 06:10, Andrei Alexandrescu<andrei at erdani.com>  wrote:
> 
>> It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".
>>
>> Andrei
>> 
> I agree with that, although I think there are FFT use cases which rate
> convenience over speed, especially if the speed is only a factor of
> three slower than worlds' best practice.
> We can assume that any usage of FFTs which is really speed critical
> will be using FFTW (or a similar library). So a Phobos FFT should not
> be aimed at power users. It might even be used for testing (do I need
> an FFT here?) which is basically what you've ended up doing.
> (My philosophy for std.bigint is the same; power users will use GMP,
> but for casual usage, GMP is overkill and a nuisance to set up).
> OTOH I think we need to be careful, it still needs to be much better
> than something which users could write themselves in a few hours...
>
> 
>> On 08/01/2010 06:31 PM, David Simcha wrote:
>> 
>>> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] ->  Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.
>>>
>>> On 8/1/2010 5:21 PM, Jason Spencer wrote:
>>> 
>>>> I don't have much of an opinion on adding to Phobos, but I'd
>>>> definitely be
>>>> interested in seeing the code. I'm starting to do some scientific
>>>> computing
>>>> using D, and that would be a great learning reference for me.
>>>>
>>>> Jason
>>>>
>>>>
>>>> ----- Original Message ----
>>>> 
>>>>> From: David Simcha<dsimcha at gmail.com>
>>>>> To: Discuss the phobos library for D<phobos at puremagic.com>
>>>>> Sent: Sun, August 1, 2010 1:32:38 PM
>>>>> Subject: Re: [phobos] FFT
>>>>>
>>>>> Speed-wise, I've just been goofing around for the past hour or so and
>>>>> I've sped
>>>>> it up 2x. It now does 1 size 2 ^^ 20 double[] ->  Complex!double FFT
>>>>> in about
>>>>> 880 milliseconds.
>>>>>
>>>>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>>>>> 
>>>>>> David Simcha wrote:
>>>>>> 
>>>>>>> As I am no longer going to use FFTs in my kernel density lib,
>>>>>>> improving
>>>>>>> 
>>>>> this FFT code will be bumped down my hacking to-do list. Does what I
>>>>> have now
>>>>> sound better than nothing by a large enough margin to warrant
>>>>> inclusion in
>>>>> std.numeric, or does it sound too primitive to be widely useful? If
>>>>> it sounds
>>>>> worth including I'll clean up/document the code and send it to the
>>>>> mailing list
>>>>> for review. If it sounds too primitive, I'll just scrap it.
>>>>> 
>>>>>> I don't know enough about FFT's to make any sort of informed comment.
>>>>>> _______________________________________________
>>>>>> phobos mailing list
>>>>>> phobos at puremagic.com
>>>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>>>
>>>>>> 
>>>>> _______________________________________________
>>>>> phobos mailing list
>>>>> phobos at puremagic.com
>>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>>
>>>>> 
>>>
>>>
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>> 
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>> 
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
> 

August 02, 2010
Andrei Alexandrescu, el  1 de agosto a las 23:10 me escribiste:
> It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".

Something like this? http://permalink.gmane.org/gmane.comp.lang.d.phobos/1090

:)

(it's just a joke, that post came automatically into my mind when I read this, don't bother replying :)

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
<o_O> parakenotengobarraespaciadora
<o_O> aver
<o_O> estoyarreglandolabarraporkeserompiounapatita
August 02, 2010
On 2 August 2010 15:41, David Simcha <dsimcha at gmail.com> wrote:
> Unfortunately I just downloaded the benchmark program for FFTW and my FFT is a ton slower, depending on how you look at it. ?Using size 2^20 as my benchmark, FFTW takes about 131 seconds to create its plan, even using -oestimate, the fastest planner. ?However, the plan can be reused if performing multiple FFTs of the same size, and once the plan is created, it can do an FFT of size 2^20 in about 53 milliseconds (which I find almost unbelievable because even sorting an array of size 2^20 using a well-optimized quick sort takes almost that long, and FFT seems like it should should have a much larger constant than quick sort), compared to my latest improvements to around 730 on the hardware I'm benchmarking on.
>
> On the other hand, for one-off use cases, the lack of needing to create a plan is a big win, both from a speed and a simplicity of API point of view. ?Benchmarking against R, which doesn't appear to use plans, making the comparison somewhat more relevant, things look better for my FFT: ?R takes about 610 milliseconds for a size 2^20 pure real FFT.

All you're seeing is the L2 cache. Did you see the link I posted to
the NG about the internals of FFTW? The graph at the top shows FFTW is
40 times faster than the 'numerical recipes' code for 2^^20. So your
factor of 13 isn't so bad. Based on that graph, if you reduce it to
say 2^^15, the factor should drop significantly. Adding a little bit
of cache awareness (using core.cpuid) should be able to avoid the
performance cliff.
Also, DMD's floating point optimiser is so primitive, you lose up to a
factor of two immediately.
« First   ‹ Prev
1 2 3