Thread overview | |||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
So I've changed my mind about the design of my kernel density estimation lib that needed an FFT. I now no longer have an immediate use for an FFT in D. I also gave up on trying to translate kiss_fft because the code is so difficult to follow that if it doesn't work I'll be at a loss to debug it. However, in the process of prototyping I wrote a very basic FFT lib. It only handles power of two sizes. It handles generic random-access ranges of complex inputs using std.complex, or pure real inputs of any builtin numeric type. It does not include any cache optimizations and takes about 1.7 seconds to do a double[] -> Complex!double FFT of size 2^20 on my machine Athlon 64 X2 5200+. (I haven't benchmarked other libs so I have no idea how fast or slow this really is). Support for inverse FFTs can be trivially added, and definitely will be if this is contributed to Phobos. As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful? If it sounds worth including I'll clean up/document the code and send it to the mailing list for review. If it sounds too primitive, I'll just scrap it. |
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | David Simcha wrote:
>
> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful? If it sounds worth including I'll clean up/document the code and send it to the mailing list for review. If it sounds too primitive, I'll just scrap it.
>
I don't know enough about FFT's to make any sort of informed comment.
|
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT in about 880 milliseconds.
On 8/1/2010 3:36 PM, Walter Bright wrote:
> David Simcha wrote:
>>
>> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful? If it sounds worth including I'll clean up/document the code and send it to the mailing list for review. If it sounds too primitive, I'll just scrap it.
>>
>
> I don't know enough about FFT's to make any sort of informed comment.
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
|
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | Sounds interesting. BTW I don't need FFT, but I think such a package would be a good addition to Phobos. Would be great if we could generalize the code a bit to work with general ranges (random-access I suppose, worth exploring to see if less powerful ranges would be enough).
Andrei
On 08/01/2010 03:32 PM, David Simcha wrote:
> Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT in about 880 milliseconds.
>
> On 8/1/2010 3:36 PM, Walter Bright wrote:
>> David Simcha wrote:
>>>
>>> As I am no longer going to use FFTs in my kernel density lib, improving this FFT code will be bumped down my hacking to-do list. Does what I have now sound better than nothing by a large enough margin to warrant inclusion in std.numeric, or does it sound too primitive to be widely useful? If it sounds worth including I'll clean up/document the code and send it to the mailing list for review. If it sounds too primitive, I'll just scrap it.
>>>
>>
>> I don't know enough about FFT's to make any sort of informed comment.
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
|
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow. On 8/1/2010 5:21 PM, Jason Spencer wrote: > I don't have much of an opinion on adding to Phobos, but I'd definitely be interested in seeing the code. I'm starting to do some scientific computing using D, and that would be a great learning reference for me. > > Jason > > > ----- Original Message ---- > >> From: David Simcha<dsimcha at gmail.com> >> To: Discuss the phobos library for D<phobos at puremagic.com> >> Sent: Sun, August 1, 2010 1:32:38 PM >> Subject: Re: [phobos] FFT >> >> Speed-wise, I've just been goofing around for the past hour or so and I've sped it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT in about 880 milliseconds. >> >> On 8/1/2010 3:36 PM, Walter Bright wrote: >> >>> David Simcha wrote: >>> >>>> As I am no longer going to use FFTs in my kernel density lib, improving >>>> >> this FFT code will be bumped down my hacking to-do list. Does what I have now >> sound better than nothing by a large enough margin to warrant inclusion in >> std.numeric, or does it sound too primitive to be widely useful? If it sounds >> worth including I'll clean up/document the code and send it to the mailing list >> for review. If it sounds too primitive, I'll just scrap it. >> >>>> >>> I don't know enough about FFT's to make any sort of informed comment. >>> _______________________________________________ >>> phobos mailing list >>> phobos at puremagic.com >>> http://lists.puremagic.com/mailman/listinfo/phobos >>> >>> >> _______________________________________________ >> phobos mailing list >> phobos at puremagic.com >> http://lists.puremagic.com/mailman/listinfo/phobos >> >> > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fft3.d URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100801/28545d5d/attachment.ksh> |
August 01, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".
Andrei
On 08/01/2010 06:31 PM, David Simcha wrote:
> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.
>
> On 8/1/2010 5:21 PM, Jason Spencer wrote:
>> I don't have much of an opinion on adding to Phobos, but I'd
>> definitely be
>> interested in seeing the code. I'm starting to do some scientific
>> computing
>> using D, and that would be a great learning reference for me.
>>
>> Jason
>>
>>
>> ----- Original Message ----
>>> From: David Simcha<dsimcha at gmail.com>
>>> To: Discuss the phobos library for D<phobos at puremagic.com>
>>> Sent: Sun, August 1, 2010 1:32:38 PM
>>> Subject: Re: [phobos] FFT
>>>
>>> Speed-wise, I've just been goofing around for the past hour or so and
>>> I've sped
>>> it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT
>>> in about
>>> 880 milliseconds.
>>>
>>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>>>> David Simcha wrote:
>>>>> As I am no longer going to use FFTs in my kernel density lib, improving
>>> this FFT code will be bumped down my hacking to-do list. Does what I
>>> have now
>>> sound better than nothing by a large enough margin to warrant
>>> inclusion in
>>> std.numeric, or does it sound too primitive to be widely useful? If
>>> it sounds
>>> worth including I'll clean up/document the code and send it to the
>>> mailing list
>>> for review. If it sounds too primitive, I'll just scrap it.
>>>> I don't know enough about FFT's to make any sort of informed comment.
>>>> _______________________________________________
>>>> phobos mailing list
>>>> phobos at puremagic.com
>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
|
August 02, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 2 August 2010 06:10, Andrei Alexandrescu <andrei at erdani.com> wrote: > It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it". > > Andrei I agree with that, although I think there are FFT use cases which rate convenience over speed, especially if the speed is only a factor of three slower than worlds' best practice. We can assume that any usage of FFTs which is really speed critical will be using FFTW (or a similar library). So a Phobos FFT should not be aimed at power users. It might even be used for testing (do I need an FFT here?) which is basically what you've ended up doing. (My philosophy for std.bigint is the same; power users will use GMP, but for casual usage, GMP is overkill and a nuisance to set up). OTOH I think we need to be careful, it still needs to be much better than something which users could write themselves in a few hours... > > On 08/01/2010 06:31 PM, David Simcha wrote: >> >> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow. >> >> On 8/1/2010 5:21 PM, Jason Spencer wrote: >>> >>> I don't have much of an opinion on adding to Phobos, but I'd >>> definitely be >>> interested in seeing the code. I'm starting to do some scientific >>> computing >>> using D, and that would be a great learning reference for me. >>> >>> Jason >>> >>> >>> ----- Original Message ---- >>>> >>>> From: David Simcha<dsimcha at gmail.com> >>>> To: Discuss the phobos library for D<phobos at puremagic.com> >>>> Sent: Sun, August 1, 2010 1:32:38 PM >>>> Subject: Re: [phobos] FFT >>>> >>>> Speed-wise, I've just been goofing around for the past hour or so and >>>> I've sped >>>> it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT >>>> in about >>>> 880 milliseconds. >>>> >>>> On 8/1/2010 3:36 PM, Walter Bright wrote: >>>>> >>>>> David Simcha wrote: >>>>>> >>>>>> As I am no longer going to use FFTs in my kernel density lib, improving >>>> >>>> this FFT code will be bumped down my hacking to-do list. Does what I >>>> have now >>>> sound better than nothing by a large enough margin to warrant >>>> inclusion in >>>> std.numeric, or does it sound too primitive to be widely useful? If >>>> it sounds >>>> worth including I'll clean up/document the code and send it to the >>>> mailing list >>>> for review. If it sounds too primitive, I'll just scrap it. >>>>> >>>>> I don't know enough about FFT's to make any sort of informed comment. >>>>> _______________________________________________ >>>>> phobos mailing list >>>>> phobos at puremagic.com >>>>> http://lists.puremagic.com/mailman/listinfo/phobos >>>>> >>>> _______________________________________________ >>>> phobos mailing list >>>> phobos at puremagic.com >>>> http://lists.puremagic.com/mailman/listinfo/phobos >>>> >> >> >> >> _______________________________________________ >> phobos mailing list >> phobos at puremagic.com >> http://lists.puremagic.com/mailman/listinfo/phobos > > _______________________________________________ > phobos mailing list > phobos at puremagic.com > http://lists.puremagic.com/mailman/listinfo/phobos > |
August 02, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don Clugston | Unfortunately I just downloaded the benchmark program for FFTW and my FFT is a ton slower, depending on how you look at it. Using size 2^20 as my benchmark, FFTW takes about 131 seconds to create its plan, even using -oestimate, the fastest planner. However, the plan can be reused if performing multiple FFTs of the same size, and once the plan is created, it can do an FFT of size 2^20 in about 53 milliseconds (which I find almost unbelievable because even sorting an array of size 2^20 using a well-optimized quick sort takes almost that long, and FFT seems like it should should have a much larger constant than quick sort), compared to my latest improvements to around 730 on the hardware I'm benchmarking on.
On the other hand, for one-off use cases, the lack of needing to create a plan is a big win, both from a speed and a simplicity of API point of view. Benchmarking against R, which doesn't appear to use plans, making the comparison somewhat more relevant, things look better for my FFT: R takes about 610 milliseconds for a size 2^20 pure real FFT.
On 8/2/2010 5:30 AM, Don Clugston wrote:
> On 2 August 2010 06:10, Andrei Alexandrescu<andrei at erdani.com> wrote:
>
>> It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it".
>>
>> Andrei
>>
> I agree with that, although I think there are FFT use cases which rate
> convenience over speed, especially if the speed is only a factor of
> three slower than worlds' best practice.
> We can assume that any usage of FFTs which is really speed critical
> will be using FFTW (or a similar library). So a Phobos FFT should not
> be aimed at power users. It might even be used for testing (do I need
> an FFT here?) which is basically what you've ended up doing.
> (My philosophy for std.bigint is the same; power users will use GMP,
> but for casual usage, GMP is overkill and a nuisance to set up).
> OTOH I think we need to be careful, it still needs to be much better
> than something which users could write themselves in a few hours...
>
>
>> On 08/01/2010 06:31 PM, David Simcha wrote:
>>
>>> Ok, here's the code, I've got it down to about 800 milliseconds for a 2^20 double[] -> Complex!double[] transform. In comparison, the R statistics package, which is probably very heavily optimized and uses (I think) GCC as its compiler can do a similar transform in ~610 milliseconds, so my code isn't blindingly fast, but it's not unreasonably slow.
>>>
>>> On 8/1/2010 5:21 PM, Jason Spencer wrote:
>>>
>>>> I don't have much of an opinion on adding to Phobos, but I'd
>>>> definitely be
>>>> interested in seeing the code. I'm starting to do some scientific
>>>> computing
>>>> using D, and that would be a great learning reference for me.
>>>>
>>>> Jason
>>>>
>>>>
>>>> ----- Original Message ----
>>>>
>>>>> From: David Simcha<dsimcha at gmail.com>
>>>>> To: Discuss the phobos library for D<phobos at puremagic.com>
>>>>> Sent: Sun, August 1, 2010 1:32:38 PM
>>>>> Subject: Re: [phobos] FFT
>>>>>
>>>>> Speed-wise, I've just been goofing around for the past hour or so and
>>>>> I've sped
>>>>> it up 2x. It now does 1 size 2 ^^ 20 double[] -> Complex!double FFT
>>>>> in about
>>>>> 880 milliseconds.
>>>>>
>>>>> On 8/1/2010 3:36 PM, Walter Bright wrote:
>>>>>
>>>>>> David Simcha wrote:
>>>>>>
>>>>>>> As I am no longer going to use FFTs in my kernel density lib,
>>>>>>> improving
>>>>>>>
>>>>> this FFT code will be bumped down my hacking to-do list. Does what I
>>>>> have now
>>>>> sound better than nothing by a large enough margin to warrant
>>>>> inclusion in
>>>>> std.numeric, or does it sound too primitive to be widely useful? If
>>>>> it sounds
>>>>> worth including I'll clean up/document the code and send it to the
>>>>> mailing list
>>>>> for review. If it sounds too primitive, I'll just scrap it.
>>>>>
>>>>>> I don't know enough about FFT's to make any sort of informed comment.
>>>>>> _______________________________________________
>>>>>> phobos mailing list
>>>>>> phobos at puremagic.com
>>>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> phobos mailing list
>>>>> phobos at puremagic.com
>>>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>>>
>>>>>
>>>
>>>
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
>
|
August 02, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu, el 1 de agosto a las 23:10 me escribiste: > It would be useful to compare this against the other FFTs that have been talked about. I don't mean to discourage! I think the typical user of FFT would put no price on readability and would be interested exclusively in speed. If your implementation is within the ballpark, then great. If it's considerably slower, then I think people won't say "well it's slower but I like the elegance inside it". Something like this? http://permalink.gmane.org/gmane.comp.lang.d.phobos/1090 :) (it's just a joke, that post came automatically into my mind when I read this, don't bother replying :) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- <o_O> parakenotengobarraespaciadora <o_O> aver <o_O> estoyarreglandolabarraporkeserompiounapatita |
August 02, 2010 [phobos] FFT | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On 2 August 2010 15:41, David Simcha <dsimcha at gmail.com> wrote:
> Unfortunately I just downloaded the benchmark program for FFTW and my FFT is a ton slower, depending on how you look at it. ?Using size 2^20 as my benchmark, FFTW takes about 131 seconds to create its plan, even using -oestimate, the fastest planner. ?However, the plan can be reused if performing multiple FFTs of the same size, and once the plan is created, it can do an FFT of size 2^20 in about 53 milliseconds (which I find almost unbelievable because even sorting an array of size 2^20 using a well-optimized quick sort takes almost that long, and FFT seems like it should should have a much larger constant than quick sort), compared to my latest improvements to around 730 on the hardware I'm benchmarking on.
>
> On the other hand, for one-off use cases, the lack of needing to create a plan is a big win, both from a speed and a simplicity of API point of view. ?Benchmarking against R, which doesn't appear to use plans, making the comparison somewhat more relevant, things look better for my FFT: ?R takes about 610 milliseconds for a size 2^20 pure real FFT.
All you're seeing is the L2 cache. Did you see the link I posted to
the NG about the internals of FFTW? The graph at the top shows FFTW is
40 times faster than the 'numerical recipes' code for 2^^20. So your
factor of 13 isn't so bad. Based on that graph, if you reduce it to
say 2^^15, the factor should drop significantly. Adding a little bit
of cache awareness (using core.cpuid) should be able to avoid the
performance cliff.
Also, DMD's floating point optimiser is so primitive, you lose up to a
factor of two immediately.
|
Copyright © 1999-2021 by the D Language Foundation