April 07, 2006 Re: Ceci n'est pas une char
Posted in reply to Thomas Kuehne

Thomas Kuehne wrote:
> Jari-Matti wrote:
>>> That's very true. A "normal" hard drive reads 60 MB/s. So,
>>> reading a 4 MB file takes at least 66 ms and a 1 MB UTF-8-file (only
>>> ASCII-characters) is read in 17 ms (well, I'm a bit optimistic here :).
>>> A modern processor executes 3 000 000 000 operations in a
>>> second. Going through the UTF-8 stream takes 1 000 000 * 10 (perhaps?)
>>> operations and thus costs 3 ms. So it's actually faster to read UTF-8.
>
> 1) your sample: English (consider Chinese)
> 2) magic word: seek

Yes, I know. This was just an optimistic tongue-in-cheek analysis :) A real-world example would naturally have a lot of non-ASCII characters too, but the point is that reading huge loads of uncompressed UTF-32 data will usually be slower than reading UTF-8 if we are also checking for text corruption.

I wonder if it's any faster to read UTF-32 files from a transparently compressed reiser4 drive?

--
Jari-Matti
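The back-of-envelope figures in the quoted text can be reproduced with a few lines of arithmetic. The following is only a sketch in Python, not something posted in the thread: the constants are the rough guesses from the quote (60 MB/s disk, 3 000 000 000 operations per second, about 10 operations per UTF-8 byte), 1 MB is assumed to mean 1 000 000 bytes, and the names `read_ms` and `decode_ms` are made up for illustration. It also ignores seeking and non-ASCII text, which is exactly what the two objections above point out.

```python
# Rough figures taken from the quoted post; all of them are guesses.
DISK_MB_PER_S = 60               # "normal" hard drive throughput
CPU_OPS_PER_S = 3_000_000_000    # operations per second
OPS_PER_BYTE = 10                # guessed cost of checking one UTF-8 byte
CHARS = 1_000_000                # ASCII-only text, one million characters
BYTES_PER_MB = 1_000_000         # assumption: decimal megabytes


def read_ms(n_bytes):
    """Time to read n_bytes sequentially from disk, in milliseconds."""
    return n_bytes / BYTES_PER_MB / DISK_MB_PER_S * 1000


def decode_ms(n_bytes):
    """Time to walk through and check n_bytes of UTF-8, in milliseconds."""
    return n_bytes * OPS_PER_BYTE / CPU_OPS_PER_S * 1000


utf8_bytes = CHARS * 1    # ASCII characters take 1 byte each in UTF-8
utf32_bytes = CHARS * 4   # every character takes 4 bytes in UTF-32

utf8_total_ms = read_ms(utf8_bytes) + decode_ms(utf8_bytes)
utf32_read_ms = read_ms(utf32_bytes)  # no decoding cost counted at all

print(f"UTF-8  read + check: {utf8_total_ms:.1f} ms")
print(f"UTF-32 read only:    {utf32_read_ms:.1f} ms")
```

Changing `utf8_bytes` to `CHARS * 3` gives a crude picture of the "consider Chinese" case, where most characters take three bytes in UTF-8 and the gap narrows considerably.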