Thread overview
Of possible interest: fast UTF8 validation
May 16, 2018
Ethan Watson
May 16, 2018
Ethan Watson
May 16, 2018
Walter Bright
May 16, 2018
Ethan
May 16, 2018
Walter Bright
May 16, 2018
rikki cattermole
May 17, 2018
Ethan
May 16, 2018
xenon325
May 16, 2018
Walter Bright
May 16, 2018
Jack Stouffer
May 16, 2018
Ethan Watson
May 19, 2018
David Nadlinger
May 16, 2018
Joakim
May 16, 2018
Dmitry Olshansky
May 16, 2018
Joakim
May 16, 2018
Jack Stouffer
May 16, 2018
Walter Bright
May 16, 2018
Jonathan M Davis
May 17, 2018
Joakim
May 17, 2018
Patrick Schluter
May 17, 2018
Joakim
May 17, 2018
Patrick Schluter
May 17, 2018
H. S. Teoh
May 18, 2018
Patrick Schluter
May 18, 2018
Neia Neutuladh
May 17, 2018
Patrick Schluter
May 17, 2018
Walter Bright
May 17, 2018
Dmitry Olshansky
May 17, 2018
Ethan
May 18, 2018
Joakim
May 18, 2018
Nemanja Boric
May 17, 2018
Joakim
May 17, 2018
H. S. Teoh
May 16, 2018
https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
May 16, 2018
On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/

I re-implemented some common string functionality at Remedy using SSE 4.2 instructions. Pretty handy. Except we had to turn that code off for released products since nowhere near enough people are running SSE 4.2 capable hardware.

The code linked doesn't seem to use any instructions newer than SSE2, so it's perfectly safe to run on any x64 processor. Could probably be sped up with newer SSE instructions if you're only ever running internally on hardware you control.
May 16, 2018
On 05/16/2018 08:47 AM, Ethan Watson wrote:
> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/ 
>>
> 
> I re-implemented some common string functionality at Remedy using SSE 4.2 instructions. Pretty handy. Except we had to turn that code off for released products since nowhere near enough people are running SSE 4.2 capable hardware.

Is it workable to have a runtime-initialized flag that controls using SSE vs. conservative?

> The code linked doesn't seem to use any instructions newer than SSE2, so it's perfectly safe to run on any x64 processor. Could probably be sped up with newer SSE instructions if you're only ever running internally on hardware you control.

Even better!

Contributions would be very welcome.


Andrei
May 16, 2018
On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/

D doesn't seem to have C definitions for the x86 SIMD intrinsics, which is a bummer.

https://issues.dlang.org/show_bug.cgi?id=18865

It's too bad that nothing came of std.simd.
May 16, 2018
On Wednesday, 16 May 2018 at 13:54:05 UTC, Andrei Alexandrescu wrote:
> Is it workable to have a runtime-initialized flag that controls using SSE vs. conservative?

Sure, it's workable with these kinds of speed gains, although the conservative code path ends up slightly worse off: an extra fetch, compare and branch get introduced.
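
A minimal sketch of such a runtime-initialized flag in D, assuming druntime's `core.cpuid` is used for detection. The `countSpaces*` functions and the `useSse42` flag are hypothetical names for illustration, and the "SSE 4.2" variant here just forwards to the scalar one so the example stays portable.

```d
import core.cpuid : sse42;

// Set once at program start-up and immutable afterwards,
// so it can be read from any thread without synchronisation.
private immutable bool useSse42;

shared static this()
{
    useSse42 = sse42; // CPUID result as reported by druntime
}

// Hypothetical stand-in: real code would use SSE 4.2 string instructions here.
size_t countSpacesSse42(const(char)[] s)
{
    return countSpacesScalar(s);
}

// Conservative path that runs on any hardware.
size_t countSpacesScalar(const(char)[] s)
{
    size_t n;
    foreach (char c; s)
        if (c == ' ')
            ++n;
    return n;
}

size_t countSpaces(const(char)[] s)
{
    // This is the extra fetch, compare and branch on every call.
    return useSse42 ? countSpacesSse42(s) : countSpacesScalar(s);
}

// Usage; run with: rdmd -main -unittest thisfile.d
unittest
{
    assert(countSpaces("a b c") == 2);
}
```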

My preferred method though is to just build multiple sets of binaries as DLLs/SOs/DYNLIBs, then load in the correct libraries depending on the CPUID test at program initialisation. Current Xbox/Playstation hardware is pretty terrible when it comes to branching, so compiling with minimal branching and deploying the exact binaries for the hardware capabilities is the way I generally approach things.

We never got around to setting something like that up for the PC release of Quantum Break, although we definitely talked about it.
May 16, 2018
On Wednesday, 16 May 2018 at 14:25:07 UTC, Jack Stouffer wrote:
> D doesn't seem to have C definitions for the x86 SIMD intrinsics, which is a bummer.

Replying to highlight this.

There's core.simd which doesn't look anything like SSE/AVX intrinsics at all, and looks a lot more like a wrapper for writing assembly instructions directly.

And even better - LDC doesn't support core.simd and has its own intrinsics that don't match the SSE/AVX intrinsics API published by Intel.

And since I'm a multi-platform developer, the "What about NEON intrinsics?" question always sits in the back of my mind.

I ended up implementing my own SIMD primitives in Binderoo, but they're all versioned out for LDC at the moment until I look into it and complete the implementation.
May 16, 2018
On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/

Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all.
May 16, 2018
On Wednesday, 16 May 2018 at 15:48:09 UTC, Joakim wrote:
> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
>
> Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all.

Validating UTF-8 is super common: most text protocols and files these days use it, and others have an option to do so.

I’d like our validateUtf to be fast, since right now we do validation every time we decode a string. And THAT is slow. Trying to not validate on decode means most things should be validated on input...
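
One way to picture "validate on input, then skip validation later" is the sketch below. It assumes Phobos' `std.utf.validate` for the one-time check; the `ValidUtf8` wrapper, `acceptInput` and `codePointCount` are hypothetical names invented for illustration, not existing library API.

```d
import std.utf : validate, UTFException;

/// Hypothetical wrapper: holds text that has already passed UTF-8 validation.
struct ValidUtf8
{
    private string data;
    string text() const { return data; }
}

/// Validate once at the input boundary; throws UTFException on malformed input.
ValidUtf8 acceptInput(string raw)
{
    validate(raw);
    return ValidUtf8(raw);
}

/// Counts code points without re-validating. Every byte that is not a
/// continuation byte (10xxxxxx) starts a code point, which is only a
/// correct count because the input was validated beforehand.
size_t codePointCount(ValidUtf8 v)
{
    size_t n;
    foreach (char c; v.text)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}

// Usage; run with: rdmd -main -unittest thisfile.d
unittest
{
    import std.exception : assertThrown;

    // "h\u00E9llo" is 6 bytes but 5 code points.
    assert(codePointCount(acceptInput("h\u00E9llo")) == 5);

    // 0xFF can never appear in well-formed UTF-8.
    immutable(ubyte)[] bad = [0xFF, 0xFE];
    assertThrown!UTFException(acceptInput(cast(string) bad));
}
```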



May 16, 2018
On 5/16/2018 7:38 AM, Ethan Watson wrote:
> My preferred method though is to just build multiple sets of binaries as DLLs/SOs/DYNLIBs, then load in the correct libraries depending on the CPUID test at program initialisation.

I used to do things like that in a simpler way. Three functions and a function pointer would be created:

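  void FeatureInHardware();        // uses the hardware feature directly
  void EmulateFeature();           // software fallback
  void Select();                   // picks one of the two at run time
  void function() doIt = &Select;  // starts out pointing at Select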

I.e. the first time doIt is called, it calls the Select function which then resets doIt to either FeatureInHardware() or EmulateFeature().

It costs an indirect call, but if you move it up the call hierarchy a bit so it isn't in the hot loops, the indirect function call cost is negligible.

The advantage is there was only one binary.
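
A minimal runnable sketch of that pattern in D. The function names follow the declarations above in lower-case form; using `core.cpuid.sse42` as the feature test is an assumption made for illustration, and the counters exist only so the behaviour can be observed.

```d
import core.cpuid : sse42;

// Counters are only here so the unittest can observe what happened.
int hardwareCalls, emulatedCalls, selectCalls;

void featureInHardware() { ++hardwareCalls; } // would use the CPU feature
void emulateFeature()    { ++emulatedCalls; } // portable fallback

// Runs on the first call only: picks an implementation,
// rebinds doIt to it, then forwards the call.
void select()
{
    ++selectCalls;
    doIt = sse42 ? &featureInHardware : &emulateFeature;
    doIt();
}

// Module-level variables are thread-local in D,
// so each thread performs the selection once for itself.
void function() doIt = &select;

// Usage; run with: rdmd -main -unittest thisfile.d
unittest
{
    doIt(); // first call goes through select()
    doIt(); // later calls go straight to the chosen implementation
    assert(selectCalls == 1);
    assert(hardwareCalls + emulatedCalls == 2);
}
```

Callers pay one indirect call through `doIt` and never branch on the CPU feature themselves.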

----

The PDP-11 had an optional chipset to do floating point. The compiler generated function calls that emulated the floating point:

    call FPADD
    call FPSUB
    ...

Those functions would check to see if the FPU existed. If it did, it would in-place patch the binary to replace the calls with FPU instructions! Of course, that won't work these days because of protected code pages.

----

In the bad old DOS days, emulator calls were written out by the compiler. Special relocation fixup records were emitted for them. The emulator or the FPU library was then linked in, and included special relocation fixup values which tricked the linker fixup mechanism into patching those instructions with either emulator calls or FPU instructions. It was just brilliant!


May 16, 2018
On Wednesday, 16 May 2018 at 16:48:28 UTC, Dmitry Olshansky wrote:
> On Wednesday, 16 May 2018 at 15:48:09 UTC, Joakim wrote:
>> On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote:
>>> https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/
>>
>> Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all.
>
> Validating UTF-8 is super common: most text protocols and files these days use it, and others have an option to do so.
>
> I’d like our validateUtf to be fast, since right now we do validation every time we decode a string. And THAT is slow. Trying to not validate on decode means most things should be validated on input...

I think you know what I'm referring to, which is that UTF-8 is a badly designed format, not that input validation shouldn't be done.