July 02, 2009
BCS wrote:
> Hello Don,
> 
>> Size. Since modern CPUs are memory-bandwidth limited, it's always
>> going to be MUCH faster to use float[] instead of real[] once the
>> array size gets too big to fit in the cache. Maybe around 2000
>> elements or so.
> 
> I was under the impression that the memory bus could feed the CPU at least as fast as the CPU could process data, but just with huge latency. Based on that, it's not how much data is loaded (bandwidth) but how many places it's loaded from. Is my initial assumption wrong or am I just nitpicking?
> 
Intel Core2 can perform only one load per cycle, but it can also do one floating-point add per cycle.
So in something like a[] += b[], which needs two loads (a[i] and b[i]) for every add, you're limited by memory bandwidth even when everything is in the L1 cache.
But in practice, performance is usually dominated by cache misses.
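
To make the counting concrete, here is a rough sketch in C rather than D, with long double standing in for real (an assumption that only roughly holds on x86, where sizeof(long double) is platform-dependent). The names add_float, add_real and working_set are made up for illustration. The loops are what a[] += b[] boils down to, and working_set just counts the bytes one pass over both arrays has to touch.

```c
#include <stddef.h>

/* Element-wise a[i] += b[i], the C spelling of D's a[] += b[].
   Each element costs two loads (a[i] and b[i]), one add and one store,
   so with one load per cycle the add unit is idle half the time. */
void add_float(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}

/* Same loop on the wider type. The operation count is identical,
   but every load and store moves more bytes. */
void add_real(long double *a, const long double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}

/* Bytes touched by one pass over two arrays of n elements each. */
size_t working_set(size_t n, size_t elem_size)
{
    return 2 * n * elem_size;
}
```

For 2000 elements, working_set(2000, sizeof(float)) is 16000 bytes, which fits in a 32 KB L1 data cache. If sizeof(long double) is 12 or 16, the same call gives 48000 or 64000 bytes, which does not fit, and that is where the cache misses start to dominate.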