Table lookups - this is pretty definitive - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Table lookups - this is pretty definitive

Thread overview

Table lookups - this is pretty definitive
Apr 01, 2014 Walter Bright
Apr 01, 2014 w0rp
Apr 01, 2014 w0rp
Apr 01, 2014 Denis Shelomovskij
Apr 01, 2014 Denis Shelomovskij
Apr 01, 2014 Walter Bright
Apr 01, 2014 Walter Bright
Apr 01, 2014 John Colvin
Apr 01, 2014 Walter Bright
Apr 01, 2014 Peter Alexander
Apr 01, 2014 Walter Bright
Apr 01, 2014 JR
Apr 01, 2014 Artur Skawina
Apr 01, 2014 bearophile
Apr 01, 2014 Andrej Mitrovic
Apr 01, 2014 Walter Bright
Apr 01, 2014 bearophile
Apr 02, 2014 Brad Anderson
Apr 02, 2014 bearophile
Apr 03, 2014 Brad Anderson
Apr 01, 2014 Walter Bright
Apr 01, 2014 bearophile
Apr 01, 2014 Tove
Apr 01, 2014 Dmitry Olshansky
Apr 01, 2014 Walter Bright
Apr 02, 2014 Dmitry Olshansky
Apr 01, 2014 Ivan Kazmenko
Apr 01, 2014 Ivan Kazmenko
Apr 01, 2014 Vladimir Panteleev
Apr 01, 2014 Walter Bright
Apr 02, 2014 Marco Leise
Apr 02, 2014 Daniel N
Apr 02, 2014 Dmitry Olshansky
Apr 02, 2014 bearophile
Apr 02, 2014 Dmitry Olshansky
Apr 02, 2014 Walter Bright
Apr 02, 2014 Dmitry Olshansky
Apr 02, 2014 Walter Bright
Apr 02, 2014 bearophile
Apr 03, 2014 Walter Bright
Apr 03, 2014 bearophile
Apr 03, 2014 Walter Bright
Apr 03, 2014 bearophile
Apr 03, 2014 Artur Skawina
Apr 03, 2014 Dmitry Olshansky
Apr 06, 2014 Marco Leise
Apr 03, 2014 Dmitry Olshansky
Apr 02, 2014 monarch_dodra
Apr 02, 2014 Daniel N
Apr 02, 2014 Dmitry Olshansky
Apr 02, 2014 Walter Bright
Apr 02, 2014 Dmitry Olshansky
Apr 03, 2014 monarch_dodra
Apr 03, 2014 Andrei Alexandrescu
Apr 03, 2014 monarch_dodra
Apr 03, 2014 Dmitry Olshansky
Apr 02, 2014 Walter Bright
Apr 07, 2014 Iain Buclaw
Apr 07, 2014 Ola Fosheim Grøstad
Apr 07, 2014 Walter Bright
Apr 07, 2014 Martin Nowak
Apr 07, 2014 monarch_dodra
Apr 07, 2014 Walter Bright
Apr 17, 2014 Alix Pexton
Apr 17, 2014 ixid
Apr 17, 2014 monarch_dodra
Apr 18, 2014 Alix Pexton
Apr 18, 2014 Alix Pexton
Apr 18, 2014 Alix Pexton

April 01, 2014

Table lookups - this is pretty definitive

Posted by Walter Bright

Walter Bright

Try this benchmark comparing various classification schemes:
---------------------------------
import core.stdc.stdlib;
import core.stdc.string;

import std.algorithm;
import std.array;
import std.ascii;
import std.datetime;
import std.range;
import std.stdio;
import std.traits;

bool isIdentifierChar0(ubyte c)
{
    return isAlphaNum(c) || c == '_' || c == '$';
}

bool isIdentifierChar1(ubyte c)
{
    return ((c >= '0' || c == '$') &&
            (c <= '9' || c >= 'A')  &&
            (c <= 'Z' || c >= 'a' || c == '_') &&
            (c <= 'z'));
}

immutable bool[256] tab2;
static this()
{
    for (size_t u = 0; u < 0x100; ++u)
    {
        tab2[u] = isIdentifierChar0(cast(ubyte)u);
    }
}

bool isIdentifierChar2(ubyte c)
{
    return tab2[c];
}

/*********************************************/

int f0()
{
    int x;
    for (uint u = 0; u < 0x100; ++u)
    {
        x += isIdentifierChar0(cast(ubyte)u);
    }
    return x;
}

int f1()
{
    int x;
    for (uint u = 0; u < 0x100; ++u)
    {
        x += isIdentifierChar1(cast(ubyte)u);
    }
    return x;
}

int f2()
{
    int x;
    for (uint u = 0; u < 0x100; ++u)
    {
        x += isIdentifierChar2(cast(ubyte)u);
    }
    return x;
}

void main()
{
    auto r = benchmark!(f0, f1, f2)(10_000);
    writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs);
}

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by w0rp
in reply to Walter Bright

w0rp

Posted in reply to Walter Bright

>Milliseconds 22 14 10

Yeah, table lookup is the way to go for ASCII, it seems.

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by w0rp
in reply to w0rp

w0rp

Posted in reply to w0rp

On Tuesday, 1 April 2014 at 18:41:03 UTC, w0rp wrote:
>>Milliseconds 22 14 10
>
> Yeah, table lookup is the way to go for ASCII, it seems.

> Milliseconds 13 8 2

After I remember to use -inline -release -O

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by Denis Shelomovskij
in reply to Walter Bright

Denis Shelomovskij

Posted in reply to Walter Bright

01.04.2014 22:35, Walter Bright пишет:
> Try this benchmark comparing various classification schemes:
> ---------------------------------
> import core.stdc.stdlib;
> import core.stdc.string;
>
> import std.algorithm;
> import std.array;
> import std.ascii;
> import std.datetime;
> import std.range;
> import std.stdio;
> import std.traits;
>
> bool isIdentifierChar0(ubyte c)
> {
>      return isAlphaNum(c) || c == '_' || c == '$';
> }
>
> bool isIdentifierChar1(ubyte c)
> {
>      return ((c >= '0' || c == '$') &&
>              (c <= '9' || c >= 'A')  &&
>              (c <= 'Z' || c >= 'a' || c == '_') &&
>              (c <= 'z'));
> }
>
> immutable bool[256] tab2;
> static this()
> {
>      for (size_t u = 0; u < 0x100; ++u)
>      {
>          tab2[u] = isIdentifierChar0(cast(ubyte)u);
>      }
> }
>
> bool isIdentifierChar2(ubyte c)
> {
>      return tab2[c];
> }
>
> /*********************************************/
>
> int f0()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar0(cast(ubyte)u);
>      }
>      return x;
> }
>
> int f1()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar1(cast(ubyte)u);
>      }
>      return x;
> }
>
> int f2()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar2(cast(ubyte)u);
>      }
>      return x;
> }
>
> void main()
> {
>      auto r = benchmark!(f0, f1, f2)(10_000);
>      writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs);
> }

Some regular benchmark notes:

1. The first one passed to `benchmark` is always slower (e.g. pass `f2` and see).

2. Unexpected program behaviour changes:

Let's use `benchmark!(f1, f1, f1)(1_000_000)`:
Milliseconds 928 889 888

Then copy `isAlphaNum` in file (benchmark still shows same results) and change `dchar c` to `ubyte c`, result changes:
Milliseconds 849 815 827
The difference is sufficient but `isAlphaNum` not called in benchmark.

Also `f0` is faster than `f1`, benchmark!(f0, f0, f0)(1_000_000):
Milliseconds 731 694 702

And `f2` wins, it's the only obvious thing, benchmark!(f2, f2, f2)(1_000_000):
Milliseconds 227 184 184


Compiler used: dmd -O -inline -release

-- 
Денис В. Шеломовский
Denis V. Shelomovskij

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by John Colvin
in reply to Walter Bright

John Colvin

Posted in reply to Walter Bright

On Tuesday, 1 April 2014 at 18:35:50 UTC, Walter Bright wrote:
> Try this benchmark comparing various classification schemes:
> ---------------------------------
> import core.stdc.stdlib;
> import core.stdc.string;
>
> import std.algorithm;
> import std.array;
> import std.ascii;
> import std.datetime;
> import std.range;
> import std.stdio;
> import std.traits;
>
> bool isIdentifierChar0(ubyte c)
> {
>     return isAlphaNum(c) || c == '_' || c == '$';
> }
>
> bool isIdentifierChar1(ubyte c)
> {
>     return ((c >= '0' || c == '$') &&
>             (c <= '9' || c >= 'A')  &&
>             (c <= 'Z' || c >= 'a' || c == '_') &&
>             (c <= 'z'));
> }
>
> immutable bool[256] tab2;
> static this()
> {
>     for (size_t u = 0; u < 0x100; ++u)
>     {
>         tab2[u] = isIdentifierChar0(cast(ubyte)u);
>     }
> }
>
> bool isIdentifierChar2(ubyte c)
> {
>     return tab2[c];
> }
>
> /*********************************************/
>
> int f0()
> {
>     int x;
>     for (uint u = 0; u < 0x100; ++u)
>     {
>         x += isIdentifierChar0(cast(ubyte)u);
>     }
>     return x;
> }
>
> int f1()
> {
>     int x;
>     for (uint u = 0; u < 0x100; ++u)
>     {
>         x += isIdentifierChar1(cast(ubyte)u);
>     }
>     return x;
> }
>
> int f2()
> {
>     int x;
>     for (uint u = 0; u < 0x100; ++u)
>     {
>         x += isIdentifierChar2(cast(ubyte)u);
>     }
>     return x;
> }
>
> void main()
> {
>     auto r = benchmark!(f0, f1, f2)(10_000);
>     writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs);
> }

It's not really possible to tell anything from that benchmark, especially with fancy modern optimisers and branch predictors.

1) ldc and gdc both optimise away some/all of the tests respectively.
2) once isIdentifierCharX is inlined, the compiler can determine what the results will be before-hand.
3) Even without 1) and 2), the results are unrealistically (very!) correlated, leading to a branch-prediction bonanza. I'm sure you know how significant that can be on modern CPUs. It is also very friendly to pre-fetching of the lookup table*

Perhaps have the same benchmark, but working on realistic data from a file.

*In my experience, the cost of using lookup tables often only appears in real code, where cache pressure and predictability becomes worse. This is especially true of lazy, range-based code. If several layers of a range use different lookup tables you can quickly end up ruining cache performance compared to an eager approach.

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by JR
in reply to Walter Bright

JR

Posted in reply to Walter Bright

On Tuesday, 1 April 2014 at 18:35:50 UTC, Walter Bright wrote:
> immutable bool[256] tab2;
> static this()
> {
>     for (size_t u = 0; u < 0x100; ++u)
>     {
>         tab2[u] = isIdentifierChar0(cast(ubyte)u);
>     }
> }

Table lookups are awesome.

While tab2 there lends itself to being populated in a static this() it would really lower the bar if we could define static immutable AA literals at compile-time. Or at least CTFE-fill them. Having to rely on static this() for such is unfortunate and hacky.

Relevant is http://forum.dlang.org/thread/oxnwtojtdymopgvanbwz@forum.dlang.org in which I get a speed increase of >3.5x by doing string-to-enum translation/mapping using someEnum[string] AA lookups, instead of using someString.to!someEnum.

(Given CtAAs, std.conv.to could then generate such for various translations where cheap -- assuming we still like AAs?)

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by Denis Shelomovskij
in reply to Denis Shelomovskij

Denis Shelomovskij

Posted in reply to Denis Shelomovskij

01.04.2014 23:22, Denis Shelomovskij пишет:
> 01.04.2014 22:35, Walter Bright пишет:
>> Try this benchmark comparing various classification schemes:
>> ---------------------------------
>> import core.stdc.stdlib;
>> import core.stdc.string;
>>
>> import std.algorithm;
>> import std.array;
>> import std.ascii;
>> import std.datetime;
>> import std.range;
>> import std.stdio;
>> import std.traits;
>>
>> bool isIdentifierChar0(ubyte c)
>> {
>>      return isAlphaNum(c) || c == '_' || c == '$';
>> }
>>
>> bool isIdentifierChar1(ubyte c)
>> {
>>      return ((c >= '0' || c == '$') &&
>>              (c <= '9' || c >= 'A')  &&
>>              (c <= 'Z' || c >= 'a' || c == '_') &&
>>              (c <= 'z'));
>> }
>>
>> immutable bool[256] tab2;
>> static this()
>> {
>>      for (size_t u = 0; u < 0x100; ++u)
>>      {
>>          tab2[u] = isIdentifierChar0(cast(ubyte)u);
>>      }
>> }
>>
>> bool isIdentifierChar2(ubyte c)
>> {
>>      return tab2[c];
>> }
>>
>> /*********************************************/
>>
>> int f0()
>> {
>>      int x;
>>      for (uint u = 0; u < 0x100; ++u)
>>      {
>>          x += isIdentifierChar0(cast(ubyte)u);
>>      }
>>      return x;
>> }
>>
>> int f1()
>> {
>>      int x;
>>      for (uint u = 0; u < 0x100; ++u)
>>      {
>>          x += isIdentifierChar1(cast(ubyte)u);
>>      }
>>      return x;
>> }
>>
>> int f2()
>> {
>>      int x;
>>      for (uint u = 0; u < 0x100; ++u)
>>      {
>>          x += isIdentifierChar2(cast(ubyte)u);
>>      }
>>      return x;
>> }
>>
>> void main()
>> {
>>      auto r = benchmark!(f0, f1, f2)(10_000);
>>      writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs,
>> r[2].msecs);
>> }
>
> Some regular benchmark notes:
>
> 1. The first one passed to `benchmark` is always slower (e.g. pass `f2`
> and see).
>
> 2. Unexpected program behaviour changes:
>
> Let's use `benchmark!(f1, f1, f1)(1_000_000)`:
> Milliseconds 928 889 888
>
> Then copy `isAlphaNum` in file (benchmark still shows same results) and
> change `dchar c` to `ubyte c`, result changes:
> Milliseconds 849 815 827
> The difference is sufficient but `isAlphaNum` not called in benchmark.
>
> Also `f0` is faster than `f1`, benchmark!(f0, f0, f0)(1_000_000):
> Milliseconds 731 694 702
>
> And `f2` wins, it's the only obvious thing, benchmark!(f2, f2,
> f2)(1_000_000):
> Milliseconds 227 184 184
>
>
> Compiler used: dmd -O -inline -release
>

Few more words about `dmd`:

"Hey, silly, you still use `== x` to compare with `x`? Use table lookups! And never cast `size_t` to `ubyte`! See:
---
import std.datetime;
import std.stdio;


immutable bool[256] tab2;

static this()
{
    foreach(size_t u; 0 .. 0x100)
        tab2[u] = u == '_';
}

/*********************************************/

int f0()
{
    int x;
    foreach(uint u; 0 .. 0x100)
        x += u == '_';
    return x;
}

int f2()
{
    int x;
    foreach(uint u; 0 .. 0x100)
        x += tab2[cast(ubyte)u];
    return x;
}

int f2plus()
{
    int x;
    foreach(size_t u; 0 .. 0x100)
        x += tab2[u];
    return x;
}

void main()
{
    auto r = benchmark!(f0, f0, f0)(1_000_000);
    writefln("Milliseconds %s %s %s (f0)", r[0].msecs, r[1].msecs, r[2].msecs);

    r = benchmark!(f2, f2, f2)(1_000_000);
    writefln("Milliseconds %s %s %s (f2)", r[0].msecs, r[1].msecs, r[2].msecs);

    r = benchmark!(f2plus, f2plus, f2plus)(1_000_000);
    writefln("Milliseconds %s %s %s (f2plus)", r[0].msecs, r[1].msecs, r[2].msecs);
}
---
Milliseconds 323 274 281 (f0)
Milliseconds 185 185 185 (f2)
Milliseconds 105 97 105 (f2plus)
---


P.S.
I don't want to say anything with these results. I just have no idea what happens here and I don't want to investigate dmd's way of optimizing things now.


-- 
Денис В. Шеломовский
Denis V. Shelomovskij

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by Walter Bright
in reply to Denis Shelomovskij

Walter Bright

Posted in reply to Denis Shelomovskij

On 4/1/2014 12:22 PM, Denis Shelomovskij wrote:
> Compiler used: dmd -O -inline -release

For me, rewriting main() as:

void main()
{
  {
    auto r = benchmark!(f0, f1, f2)(100_000);
    writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs);
  }
  {
    auto r = benchmark!(f0, f1, f2)(100_000);
    writefln("Milliseconds %s %s %s", r[0].msecs, r[1].msecs, r[2].msecs);
  }
}

Gives:

Milliseconds 139 99 15
Milliseconds 122 93 13

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by Tove
in reply to Walter Bright

Tove

Posted in reply to Walter Bright

On Tuesday, 1 April 2014 at 18:35:50 UTC, Walter Bright wrote:
> Try this benchmark comparing various classification schemes:
> ---------------------------------

Original program... 3 4 1
10x as many iterations... 36 47 19

I think this benchmark is flawed...

1) Apparently there are rounding issues... assuming linear scaling, 1.9 ms yields ~1ms? Probably better to use usecs...

2) Table lookups will nearly always win in isolated 1 function tests... but not necessarily in real apps.

April 01, 2014

Re: Table lookups - this is pretty definitive

Posted by Dmitry Olshansky
in reply to Walter Bright

Dmitry Olshansky

Posted in reply to Walter Bright

01-Apr-2014 22:35, Walter Bright пишет:
> Try this benchmark comparing various classification schemes:

> int f0()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar0(cast(ubyte)u);
>      }
>      return x;
> }
>
> int f1()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar1(cast(ubyte)u);
>      }
>      return x;
> }
>
> int f2()
> {
>      int x;
>      for (uint u = 0; u < 0x100; ++u)
>      {
>          x += isIdentifierChar2(cast(ubyte)u);
>      }
>      return x;
> }

Would strongly suggest to use 2 kinds of data - randomized and some text. And, of course, it must be loaded at run-time.

-- 
Dmitry Olshansky

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation