April 07, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On Monday, 7 April 2014 at 11:35:37 UTC, Martin Nowak wrote:
> On 04/01/2014 08:35 PM, Walter Bright wrote:
>> Try this benchmark comparing various classification schemes:
>
> Counter example:
> https://github.com/D-Programming-Language/phobos/commit/8599d6b4acde0eff886876a56f54ecd8e6b528e9
It's all a matter of how complicated the "non-table" implementation is I guess. table-lookup has a "steep entry cost", but is constant, regardless of complexity.
Table lookups were "recently" removed from std.ascii too. The implementation walter is benching "against" is particularly complicated here.
|
April 07, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ola Fosheim Grøstad | On 4/7/2014 2:41 AM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com>" wrote: >> on. Unfortunately for your small program, gcc is too smart and >> optimises everything away. > > Haha, that's funny. But why don't you put the data in a separate compilation unit? True, it's pretty easy to defeat that optimization. |
April 07, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to monarch_dodra | On 4/7/2014 7:11 AM, monarch_dodra wrote:
> It's all a matter of how complicated the "non-table" implementation is I guess.
Yup. Memory lookups can be expensive. Once it's in the cache, it's cheap.
|
April 17, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | I added a lookup scheme of my own, its not as fast as Walters (in fact its the slowest without -inline - release -O) but it uses 1 bit per entry in the table instead of a whole byte so you can have lots and lots of different tables. I'm even reasonably sure that it works correctly! =============================== import core.stdc.stdlib; import core.stdc.string; import std.algorithm; import std.array; import std.ascii; import std.datetime; import std.range; import std.stdio; import std.traits; bool isIdentifierChar0(ubyte c) { return isAlphaNum(c) || c == '_' || c == '$'; } bool isIdentifierChar1(ubyte c) { return ((c >= '0' || c == '$') && (c <= '9' || c >= 'A') && (c <= 'Z' || c >= 'a' || c == '_') && (c <= 'z')); } immutable bool[256] tab2; static this() { for (size_t u = 0; u < 0x100; ++u) { tab2[u] = isIdentifierChar0(cast(ubyte)u); } } bool isIdentifierChar2(ubyte c) { return tab2[c]; } immutable ulong[4] tab3; static this() { for (size_t u = 0; u < 0x100; ++u) { if (isIdentifierChar0(cast(ubyte)u)) { auto sub = u >>> 6; auto b = u & 0x3f; auto mask = 0x01L << b; tab3[sub] |= mask; } } } bool isIdentifierChar3(ubyte c) { auto sub = c >>> 6; c &= 0x3f; auto mask = 0x01L << c; return (tab3[sub] & mask) > 0; } int f0() { int x; for (uint u = 0; u < 0x100; ++u) { x += isIdentifierChar0(cast(ubyte)u); } return x; } int f1() { int x; for (uint u = 0; u < 0x100; ++u) { x += isIdentifierChar1(cast(ubyte)u); } return x; } int f2() { int x; for (uint u = 0; u < 0x100; ++u) { x += isIdentifierChar2(cast(ubyte)u); } return x; } int f3() { int x; for (uint u = 0; u < 0x100; ++u) { x += isIdentifierChar3(cast(ubyte)u); } return x; } void main() { auto r = benchmark!(f0, f1, f2, f3)(10_000); writefln("Milliseconds %s %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs, r[3].msecs); } |
April 17, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alix Pexton | On Thursday, 17 April 2014 at 16:27:26 UTC, Alix Pexton wrote:
> I added a lookup scheme of my own, its not as fast as Walters (in fact its the slowest without -inline - release -O) but it uses 1 bit per entry in the table instead of a whole byte so you can have lots and lots of different tables. I'm even reasonably sure that it works correctly!
>
>
> ===============================
> import core.stdc.stdlib;
> import core.stdc.string;
>
> import std.algorithm;
> import std.array;
> import std.ascii;
> import std.datetime;
> import std.range;
> import std.stdio;
> import std.traits;
>
> bool isIdentifierChar0(ubyte c)
> {
> return isAlphaNum(c) || c == '_' || c == '$';
> }
>
> bool isIdentifierChar1(ubyte c)
> {
> return ((c >= '0' || c == '$') &&
> (c <= '9' || c >= 'A') &&
> (c <= 'Z' || c >= 'a' || c == '_') &&
> (c <= 'z'));
> }
>
> immutable bool[256] tab2;
> static this()
> {
> for (size_t u = 0; u < 0x100; ++u)
> {
> tab2[u] = isIdentifierChar0(cast(ubyte)u);
> }
> }
>
> bool isIdentifierChar2(ubyte c)
> {
> return tab2[c];
> }
>
> immutable ulong[4] tab3;
> static this()
> {
> for (size_t u = 0; u < 0x100; ++u)
> {
> if (isIdentifierChar0(cast(ubyte)u))
> {
> auto sub = u >>> 6;
> auto b = u & 0x3f;
> auto mask = 0x01L << b;
> tab3[sub] |= mask;
> }
> }
> }
>
> bool isIdentifierChar3(ubyte c)
> {
> auto sub = c >>> 6;
> c &= 0x3f;
> auto mask = 0x01L << c;
> return (tab3[sub] & mask) > 0;
> }
>
> int f0()
> {
> int x;
> for (uint u = 0; u < 0x100; ++u)
> {
> x += isIdentifierChar0(cast(ubyte)u);
> }
> return x;
> }
>
> int f1()
> {
> int x;
> for (uint u = 0; u < 0x100; ++u)
> {
> x += isIdentifierChar1(cast(ubyte)u);
> }
> return x;
> }
>
> int f2()
> {
> int x;
> for (uint u = 0; u < 0x100; ++u)
> {
> x += isIdentifierChar2(cast(ubyte)u);
> }
> return x;
> }
>
> int f3()
> {
> int x;
> for (uint u = 0; u < 0x100; ++u)
> {
> x += isIdentifierChar3(cast(ubyte)u);
> }
> return x;
> }
>
> void main()
> {
> auto r = benchmark!(f0, f1, f2, f3)(10_000);
> writefln("Milliseconds %s %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs, r[3].msecs);
> }
I feel like there must be a way of making a fast bit look up but my version is only moderate in speed. You can get all the bits you need on two 64 bit registers or one SSE register. I haven't tried bt, does that work with a 64 bit register?
|
April 17, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | On Thursday, 17 April 2014 at 18:07:24 UTC, ixid wrote: > I feel like there must be a way of making a fast bit look up but my version is only moderate in speed. You can get all the bits you need on two 64 bit registers or one SSE register. I haven't tried bt, does that work with a 64 bit register? http://dlang.org/phobos/core_bitop.html#.bt ? Note it can be applied to the table in general, rather than the byte themselves. EG: ubyte[256] buf; auto b = bt(buf.ptr, 428); |
April 18, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to monarch_dodra | On 17/04/2014 8:41 PM, monarch_dodra wrote:
> On Thursday, 17 April 2014 at 18:07:24 UTC, ixid wrote:
>> I feel like there must be a way of making a fast bit look up but my
>> version is only moderate in speed. You can get all the bits you need
>> on two 64 bit registers or one SSE register. I haven't tried bt, does
>> that work with a 64 bit register?
>
> http://dlang.org/phobos/core_bitop.html#.bt
>
> ?
>
> Note it can be applied to the table in general, rather than the byte
> themselves. EG:
>
> ubyte[256] buf;
> auto b = bt(buf.ptr, 428);
I didn't think to look in core.bitop for a faster way to check bits, I'll check if it closes the gap to Walter's version.
A...
|
April 18, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alix Pexton | I tested this against the others... (with "-inline -release -O" of course) === uint[8] tab4; // bitop function only work on uints static this() { for (size_t u = 0; u < 0x100; ++u) { if (isIdentifierChar0(cast(ubyte)u)) { bts(tab4.ptr, u); } } } bool isIdentifierChar4(ubyte c) { return bt(tab4.ptr, c) != 0; } === it takes about 3 times as long (about the same as the original isIdentifierChar1, that used lots of &&s and ||s). So 2 shifts, 2 ands and a compare to zero, beats core.bitop. for clarity, the output of my benchmark for all 5 version was Milliseconds 21 15 2 5 15 the 2 (which on some runs was a 3) is Walter's table of bools, the 5 is my table of ulongs and the 15 on the end is core.bitop. Short of delving into the asm, which I'm trying to avoid, any other suggestions for speed-ups? A... |
April 18, 2014 Re: Table lookups - this is pretty definitive | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alix Pexton | PS I discovered that core.bitop.bts() won't accept a pointer to immutable even when inside a static this. Not sure if that counts as a bug or not >< |
Copyright © 1999-2021 by the D Language Foundation