Of possible interest: fast UTF8 validation

On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/

I re-implemented some common string functionality at Remedy using SSE 4.2 instructions. Pretty handy. Except we had to turn that code off for released products since nowhere near enough people are running SSE 4.2 capable hardware.

The code linked doesn't seem to use any instructions newer than SSE2, so it's perfectly safe to run on any x64 processor. Could probably be sped up with newer SSE instructions if you're only ever running internally on hardware you control.

On 05/16/2018 08:47 AM, Ethan Watson wrote:
> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
>
> I re-implemented some common string functionality at Remedy using SSE 4.2 instructions. Pretty handy. Except we had to turn that code off for released products since nowhere near enough people are running SSE 4.2 capable hardware.

Is it workable to have a runtime-initialized flag that controls using SSE vs. conservative?

> The code linked doesn't seem to use any instructions newer than SSE2, so it's perfectly safe to run on any x64 processor. Could probably be sped up with newer SSE instructions if you're only ever running internally on hardware you control.

Even better! Contributions would be very welcome.

Andrei

On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/

D doesn't seem to have C definitions for the x86 SIMD intrinsics, which is a bummer: https://issues.dlang.org/show_bug.cgi?id=18865

It's too bad that nothing came of std.simd.

On Wednesday, 16 May 2018 at 13:54:05 UTC, Andrei Alexandrescu wrote:
> Is it workable to have a runtime-initialized flag that controls using SSE vs. conservative?

Sure, it's workable with this kind of speed gain. Although the conservative code path ends up being slightly worse off - an extra fetch, compare and branch get introduced.

My preferred method though is to just build multiple sets of binaries as DLLs/SOs/DYNLIBs, then load in the correct libraries dependent on the CPUID test at program initialisation. Current Xbox/Playstation hardware is pretty terrible when it comes to branching, so compiling with minimal branching and deploying the exact binaries for the hardware capabilities is the way I generally approach things.

We never got around to setting something like that up for the PC release of Quantum Break, although we definitely talked about it.
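
For illustration, a minimal D sketch of the runtime-initialized flag discussed in this exchange. It assumes druntime's `core.cpuid.sse42` for the CPUID test; `useSse42`, `countSpaces` and the two implementations are hypothetical names, and the "SSE 4.2" path is only a placeholder that reuses the scalar loop.

```d
import core.cpuid : sse42;

// Set once at program start, before main() runs, then never changes.
immutable bool useSse42;

shared static this()
{
    useSse42 = sse42; // CPUID test performed by druntime
}

// Conservative path: plain scalar loop, runs on any CPU.
size_t countSpacesScalar(const(char)[] s)
{
    size_t n;
    foreach (c; s)
        if (c == ' ')
            ++n;
    return n;
}

// Placeholder for the fast path (assumption: a real version would use
// SSE 4.2 string instructions here instead of calling the scalar code).
size_t countSpacesSse42(const(char)[] s)
{
    return countSpacesScalar(s);
}

// Public entry point: one flag load, compare and branch per call.
size_t countSpaces(const(char)[] s)
{
    return useSse42 ? countSpacesSse42(s) : countSpacesScalar(s);
}
```

The ternary in `countSpaces` is the extra fetch, compare and branch mentioned in the reply, and it is paid on every call regardless of which path is taken.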

On Wednesday, 16 May 2018 at 14:25:07 UTC, Jack Stouffer wrote:
> D doesn't seem to have C definitions for the x86 SIMD intrinsics, which is a bummer

Replying to highlight this. There's core.simd, which doesn't look anything like the SSE/AVX intrinsics at all and looks a lot more like a wrapper for writing assembly instructions directly. And even better - LDC doesn't support core.simd and has its own intrinsics that don't match the SSE/AVX intrinsics API published by Intel.

And since I'm a multi-platform developer, the "What about NEON intrinsics?" question always sits in the back of my mind.

I ended up implementing my own SIMD primitives in Binderoo, but they're all versioned out for LDC at the moment until I look into it and complete the implementation.

On Wednesday, 16 May 2018 at 15:48:09 UTC, Joakim wrote:
> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
>
> Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all.

Validating UTF-8 is super common, most text protocols and files these days would use it, others would have an option to do so.

I’d like our validateUtf to be fast, since right now we do validation every time we decode a string. And THAT is slow. Trying to not validate on decode means most things should be validated on input...
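
One way to read "validate on input" is to check text once where it enters the program and then pass along a type that records the check. A minimal sketch, assuming Phobos' `std.utf.validate` (which throws `UTFException` on malformed input) as the validator; `ValidatedText` and `tryAccept` are hypothetical names, not existing library API.

```d
import std.utf : validate, UTFException;

/// Hypothetical wrapper: holds text that has already passed validation.
struct ValidatedText
{
    private string data;

    /// Read-only access to the checked text.
    string text() const { return data; }
}

/// Validates raw bytes once at the input boundary.
/// Returns false (leaving `result` empty) if the bytes are not valid UTF-8.
bool tryAccept(const(ubyte)[] raw, out ValidatedText result)
{
    // Copy the bytes so the stored string cannot change afterwards.
    auto s = cast(string) raw.idup;
    try
    {
        validate(s); // throws UTFException on malformed input
    }
    catch (UTFException)
    {
        return false;
    }
    result = ValidatedText(s);
    return true;
}
```

Code further down the pipeline can then take a `ValidatedText` and rely on the check having happened once, which is the case where a faster validator pays off most.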

May 16, 2018

Re: Of possible interest: fast UTF8 validation

Posted by Walter Bright
in reply to Ethan Watson


On 5/16/2018 7:38 AM, Ethan Watson wrote:
> My preferred method though is to just build multiple sets of binaries as DLLs/SOs/DYNLIBs, then load in the correct libraries dependent on the CPUID test at program initialisation.
I used to do things like that a simpler way. 3 functions would be created:

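  void FeatureInHardware();        // uses the hardware feature
  void EmulateFeature();           // software fallback
  void Select();                   // picks one of the two and stores it in doIt
  void function() doIt = &Select;  // initially points at the selector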

I.e. the first time doIt is called, it calls the Select function which then resets doIt to either FeatureInHardware() or EmulateFeature().

It costs an indirect call, but if you move it up the call hierarchy a bit so it isn't in the hot loops, the indirect function call cost is negligible.

The advantage is there was only one binary.
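
A sketch of this self-replacing function pointer in D, under stated assumptions: the CPUID check uses `core.cpuid.sse42` purely as an example feature, the two implementations are stubs, and `lastPath` is a hypothetical variable that exists only to make the chosen path observable.

```d
import core.cpuid : sse42;

// Illustration only: records which implementation ran last.
int lastPath;

// Stub for the version that would use the hardware feature.
void featureInHardware() { lastPath = 1; }

// Stub for the software fallback.
void emulateFeature() { lastPath = 2; }

// Callers always go through doIt; it starts out pointing at select.
void function() doIt = &select;

// Runs on the first call only: picks an implementation, rebinds doIt
// to it, then forwards the call so the first caller still gets its result.
void select()
{
    doIt = sse42 ? &featureInHardware : &emulateFeature;
    doIt();
}
```

After the first call, `doIt()` goes straight to the selected implementation through one indirect call. Note that module-level variables in D are thread-local by default, so in this sketch each thread performs the selection once on its own first call.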

----

The PDP-11 had an optional chipset to do floating point. The compiler generated function calls that emulated the floating point:

    call FPADD
    call FPSUB
    ...

Those functions would check to see if the FPU existed. If it did, it would in-place patch the binary to replace the calls with FPU instructions! Of course, that won't work these days because of protected code pages.

----

In the bad old DOS days, emulator calls were written out by the compiler. Special relocation fixup records were emitted for them. The emulator or the FPU library was then linked in, and included special relocation fixup values which tricked the linker fixup mechanism into patching those instructions with either emulator calls or FPU instructions. It was just brilliant!

On Wednesday, 16 May 2018 at 16:48:28 UTC, Dmitry Olshansky wrote:
> On Wednesday, 16 May 2018 at 15:48:09 UTC, Joakim wrote:
>> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
>>
>> Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all.
>
> Validating UTF-8 is super common, most text protocols and files these days would use it, others would have an option to do so.
>
> I’d like our validateUtf to be fast, since right now we do validation every time we decode a string. And THAT is slow. Trying to not validate on decode means most things should be validated on input...

I think you know what I'm referring to, which is that UTF-8 is a badly designed format, not that input validation shouldn't be done.
