May 27, 2013 Re: New UTF-8 stride function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dmitry Olshansky | On 05/26/2013 10:49 PM, Dmitry Olshansky wrote:
> If anything came out of the UTF-8 discussion, it is that I decided
> to dust off my experimental implementation of a UTF-8 stride function.
> Just for fun.
>
> The key difference vs std is in handling the non-ASCII case.
> I'm replacing the bsr intrinsic with what I call an "in-register lookup
> table" (neat stuff that is a piece of cake, thx to CTFE).
>
> See unittest/benchmark here:
> https://gist.github.com/blackwhale/5653927
>
Looks promising.
> Test files I used:
> https://github.com/blackwhale/gsoc-bench-2012/blob/master/arwiki-latest-all-titles-in-ns0
> https://github.com/blackwhale/gsoc-bench-2012/blob/master/dewiki-latest-all-titles-in-ns0
> https://github.com/blackwhale/gsoc-bench-2012/blob/master/ruwiki-latest-all-titles-in-ns0
>
These are huge, and most likely the performance is limited by memory bandwidth.
|
May 27, 2013 Re: New UTF-8 stride function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | 27-May-2013 23:21, Martin Nowak wrote:
> On 05/26/2013 10:49 PM, Dmitry Olshansky wrote:
> > If anything came out of the UTF-8 discussion, it is that I decided
> > to dust off my experimental implementation of a UTF-8 stride function.
> > Just for fun.
> >
> > The key difference vs std is in handling the non-ASCII case.
> > I'm replacing the bsr intrinsic with what I call an "in-register lookup
> > table" (neat stuff that is a piece of cake, thx to CTFE).
> >
> > See unittest/benchmark here:
> > https://gist.github.com/blackwhale/5653927
> >
> Looks promising.
Cool, I'm not alone in this :)
The only definitive result so far is that it takes fewer cycles on 32-bit. For me, AMD CodeAnalyst confirms this: up to 33% fewer cycles when looping over smaller samples. The ASCII-only case seems to stay more or less the same (at least cycle-wise, though not in time...), saving my sanity.
> These are huge and most likely the performance is limited by the memory
> bandwidth.
>
That could be it. I'll be taking measurements on smaller samples of said files and spinning on them. More tests to come tomorrow.
--
Dmitry Olshansky
|
May 27, 2013 Re: New UTF-8 stride function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On 05/27/2013 09:21 PM, Martin Nowak wrote:
> > See unittest/benchmark here:
> > https://gist.github.com/blackwhale/5653927
> >
> Looks promising.
This will not detect 0xFF as an invalid UTF-8 sequence.
For 5- or 6-byte sequences, which are not used in Unicode, it will return a stride of 4.
|
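To illustrate the observation above, here is a minimal sketch in C (not the code from the gist) of an in-register lookup table that indexes on only three bits of the lead byte. The 3-bit index, the table layout, the use of 0 as an "invalid" marker, and the names `STRIDE_TABLE3`, `stride3_nonascii` and `stride3` are all assumptions made for illustration; the actual D implementation may differ. With only bits 6..4 in the index, every byte from 0xF0 to 0xFF lands in the same slot, so 0xF8, 0xFC and 0xFF all come back with a stride of 4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3-bit table: 8 slots of 4 bits each, packed into one
   32-bit constant. Slot 0 occupies the lowest 4 bits.
   A slot value of 0 marks a byte that cannot start a sequence. */
static const uint32_t STRIDE_TABLE3 =
      (0u << 0)  | (0u << 4) | (0u << 8) | (0u << 12) /* 1000..1011: continuation bytes */
    | (2u << 16) | (2u << 20)                         /* 1100, 1101: 2-byte lead        */
    | (3u << 24)                                      /* 1110: 3-byte lead              */
    | (4u << 28);                                     /* 1111: 0xF0..0xFF all share it  */

/* Non-ASCII path: the caller must already know the high bit is set. */
static unsigned stride3_nonascii(uint8_t c)
{
    assert(c & 0x80);
    unsigned slot = (c >> 4) & 7u;              /* bits 6..4 select the slot */
    return (STRIDE_TABLE3 >> (slot * 4)) & 0xFu;
}

/* Returns the sequence length for lead byte c, or 0 if c is a
   continuation byte. It cannot reject 0xF8..0xFF. */
static unsigned stride3(uint8_t c)
{
    if (c < 0x80)
        return 1;                               /* ASCII fast path */
    return stride3_nonascii(c);
}
```

In this sketch `stride3(0xF0)` and `stride3(0xFF)` are indistinguishable, because the bit that separates `11110xxx` from `11111xxx` is not part of the index.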
May 28, 2013 Re: New UTF-8 stride function | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | 28-May-2013 00:42, Martin Nowak wrote:
> On 05/27/2013 09:21 PM, Martin Nowak wrote:
>> > See unittest/benchmark here:
>> > https://gist.github.com/blackwhale/5653927
>> >
>> Looks promising.
>
> This will not detect 0xFF as an invalid UTF-8 sequence.
> For 5- or 6-byte sequences, which are not used in Unicode, it will
> return a stride of 4.
>
First of all, there is a minor bug in std.utf in the sense that it accepts sequences of 5 and 6 bytes. They are explicitly not defined by the Unicode standard and should be rejected as invalid UTF as well.
OK, I just need to consider the next bit, making the whole mask 4 bits wide. Thus I need 16 slots in a register. The 64-bit version will fit just fine in a register: 4*16 = 64. The 32-bit version will have to pack 2 bits per slot and do +1 afterwards.
Here is an updated version that I'm testing again:
https://github.com/blackwhale/gsoc-bench-2012/blob/master/fast_stride.d
--
Dmitry Olshansky
|
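The 4-bit mask described above can be sketched roughly as follows. This is an illustration in C, not the contents of fast_stride.d: the table constants, the names `STRIDE_TABLE64`, `STRIDE_TABLE32`, `stride64` and `stride32`, and the way invalid bytes are signalled are assumptions. In particular, the 2-bit variant has no spare value for "invalid", so this sketch stores stride minus one and lets invalid lead bytes come back as 1, a value no valid non-ASCII lead byte produces.

```c
#include <assert.h>
#include <stdint.h>

/* 16 slots indexed by bits 6..3 of a non-ASCII byte (slot 0 is lowest).
   slots 0..7   10xxxxxx  continuation  -> invalid
   slots 8..11  110xxxxx  2-byte lead
   slots 12,13  1110xxxx  3-byte lead
   slot  14     11110xxx  4-byte lead
   slot  15     11111xxx  5/6-byte lead, 0xFE, 0xFF -> invalid */

/* 64-bit variant: 4 bits per slot, 4*16 = 64. Value 0 means invalid. */
static const uint64_t STRIDE_TABLE64 = UINT64_C(0x0433222200000000);

/* 32-bit variant: 2 bits per slot holding (stride - 1), 2*16 = 32.
   Invalid slots hold 0, so they decode to 1 after the +1. */
static const uint32_t STRIDE_TABLE32 = UINT32_C(0x3A550000);

/* Returns 1..4, or 0 for a byte that cannot start a sequence. */
static unsigned stride64(uint8_t c)
{
    if (c < 0x80)
        return 1;                               /* ASCII fast path */
    unsigned slot = (c >> 3) & 0xFu;
    assert(slot < 16);
    return (unsigned)((STRIDE_TABLE64 >> (slot * 4)) & 0xFu);
}

/* Returns 1..4; a result of 1 for a byte >= 0x80 means invalid. */
static unsigned stride32(uint8_t c)
{
    if (c < 0x80)
        return 1;                               /* ASCII fast path */
    unsigned slot = (c >> 3) & 0xFu;
    assert(slot < 16);
    return ((STRIDE_TABLE32 >> (slot * 2)) & 3u) + 1u;
}
```

With the extra bit in the index, 0xF0..0xF7 still map to a stride of 4, while 0xF8..0xFF fall into the last slot and are flagged: as 0 in the 64-bit sketch and as 1 in the 32-bit sketch.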