April 07, 2006
Re: Ceci n'est pas une char
Thomas Kuehne wrote:
> Jari-Matti wrote:
>>> That's very true. A "normal" hard drive reads 60 MB/s. So,
>>> reading a 4 MB file takes at least 66 ms and a 1 MB UTF-8-file (only
>>> ASCII-characters) is read in 17 ms (well, I'm a bit optimistic here :).
>>> A modern processor executes 3 000 000 000 operations in a
>>> second. Going through the UTF-8 stream takes 1 000 000 * 10 (perhaps?)
>>> operations and thus costs 3 ms. So it's actually faster to read UTF-8.
> 
> 1) your sample: English (consider Chinese)
> 2) magic word: seek

Yes, I know. This was just an optimistic, tongue-in-cheek analysis :)
A real-world example would naturally have a lot of non-ASCII characters
too, but the point is that reading large amounts of uncompressed UTF-32
data will usually be slower than reading UTF-8 if we are also checking
for text corruption. I wonder if it's any faster to read
UTF-32 files from a transparently compressed reiser4 drive?
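
For anyone who wants to play with the numbers, here is a rough sketch of the
estimate in Python. Everything in it is an assumption taken from the quoted
figures: 60 MB/s disk throughput, 3 000 000 000 operations per second, and a
guessed 10 operations per UTF-8 byte. The function names are made up for
illustration, seek time is ignored completely, and UTF-32 is charged nothing
for validation, so treat the result as a toy model and not a benchmark.

```python
# Back-of-the-envelope model; all constants are assumptions from the thread.
DISK_BYTES_PER_SEC = 60_000_000    # "normal" hard drive, 60 MB/s
CPU_OPS_PER_SEC = 3_000_000_000    # operations per second
OPS_PER_UTF8_BYTE = 10             # guessed cost of walking one UTF-8 byte


def read_ms(size_bytes):
    """Time to stream size_bytes from disk, in milliseconds (no seeks)."""
    return size_bytes / DISK_BYTES_PER_SEC * 1000


def decode_ms(size_bytes):
    """Time to walk a UTF-8 stream of size_bytes, in milliseconds."""
    return size_bytes * OPS_PER_UTF8_BYTE / CPU_OPS_PER_SEC * 1000


def estimate(text):
    """Return (utf8_ms, utf32_ms) for reading text in each encoding."""
    utf8_bytes = len(text.encode("utf-8"))
    utf32_bytes = 4 * len(text)    # 4 bytes per code point, no BOM
    utf8_ms = read_ms(utf8_bytes) + decode_ms(utf8_bytes)
    utf32_ms = read_ms(utf32_bytes)  # assumption: no validation cost
    return utf8_ms, utf32_ms


if __name__ == "__main__":
    for label, ch in (("ASCII", "a"), ("Chinese", "\u4e2d")):
        utf8_ms, utf32_ms = estimate(ch * 1_000_000)
        print(f"{label}: UTF-8 {utf8_ms:.1f} ms, UTF-32 {utf32_ms:.1f} ms")
```

With one million ASCII characters this reproduces the 17 ms + 3 ms versus
66 ms from the quoted text. With one million Chinese characters (3 bytes each
in UTF-8) the gap nearly closes, which is Thomas's first point, and a single
seek would swamp either figure, which is his second.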

-- 
Jari-Matti