Jump to page: 1 26  
Page
Thread overview
Can you shrink it further?
Oct 09, 2016
Stefan Koch
Oct 10, 2016
safety0ff
Oct 10, 2016
Stefan Koch
Oct 10, 2016
Stefan Koch
Oct 10, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Lurker
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Matthias Bentrup
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Temtaime
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Temtaime
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Ethan Watson
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Matthias Bentrup
Oct 11, 2016
Stefan Koch
Oct 12, 2016
Matthias Bentrup
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Matthias Bentrup
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Stefan Koch
Oct 12, 2016
Stefan Koch
Oct 11, 2016
David Nadlinger
Oct 11, 2016
Stefan Koch
Oct 11, 2016
Stefan Koch
Oct 12, 2016
safety0ff
Oct 12, 2016
safety0ff
Oct 14, 2016
Stefan Koch
Oct 14, 2016
Stefan Koch
October 09, 2016
Had a bit of a good time with popFront for UTF, got to this: https://github.com/dlang/phobos/pull/4848. I suspect there are cleverer things that can be done. Can anyone make it tighter? Take a look at the disassembly referred to in the pull request. Thanks! -- Andrei
October 09, 2016
On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
> Had a bit of a good time with popFront for UTF, got to this: https://github.com/dlang/phobos/pull/4848. I suspect there are cleverer things that can be done. Can anyone make it tighter? Take a look at the disassembly referred to in the pull request. Thanks! -- Andrei

With -Oz both functions look almost the same on ldc and gdc
October 09, 2016
On 10/09/2016 07:05 PM, Stefan Koch wrote:
> On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
>> Had a bit of a good time with popFront for UTF, got to this:
>> https://github.com/dlang/phobos/pull/4848. I suspect there are
>> cleverer things that can be done. Can anyone make it tighter? Take a
>> look at the disassembly referred to in the pull request. Thanks! --
>> Andrei
>
> With -Oz both functions look almost the same on ldc and gdc

The big ldc link I posted:

http://ldc.acomirei.ru/#compilers:!((compiler:ldc,options:'-release+-O3+-boundscheck%3Doff',sourcez:G4ewlgJgBADiMDEBOIB2AXAjACiQUwDMoBjACwEMkBtAXSgGcBKKAAXSQFd709oYP8UVCHSkUAdygBvAFBQoYALaKO6cgCMANnhJQAvAyoAGGgG45CotmJQAPFCMAPABxHms%2BfPr6GAOhjsVJhQvr5%2B2qgA5qJmFgC%2BUHia9DoenkpwSOgkIPi%2B6mDo8OaeUBxgGAo%2BAOwcUAC0UOr0SNgAfjYAPlCYHIwl6YqZ2dwQvuSakbmFpIoD8mBWYFAAfFAAbH1VBpjzDD70/oGKFdhgADTheFGizKFXN6Sx8nEyrzKgkLDwyGjoAEy4QgkCjUOjcJDMNicbi8WACHTCUQSaQWJQqNRaHQ2AwQ4zPSxQax2BwuNwWNLyAD0VICSAU3i4cKKUHIn2gHFQXLwxDw9HolAAnk0QJyIN4yDyANYVSK%2BCxedgHdhHajBe4Q3wRaJPAaveRJFKE4l6AxOAgERgUhVQGlNFplVAQQgVOEEXIOG0Q5VIVVBEJhTXamJ6iyGvDW0oZXLZYi5PD5QrwKAALntSD25FUICginozRqDXT7WI/RtiyJ2DzBfs/2Y3Sr%2Be8a3WjCtpUpnhpAElUMAJl8AKoAFQQ9WcNvk1e8Oz2%2BsGwwY6DGEymSBmcy9StxKrpVBOqEbzUuQeuOrugZVwd18TeMiAA)),filterAsm:(commentOnly:!t,directives:!t,labels:!t),version:3

uses -O3, not -Oz. Changing the flag to -Oz seems to cause no change in the generated code. I see different code generated for popFront1 (proposed) and popFront2 (candidate), e.g. popFront1 has 27 asm lines whereas popFront2 has 34. (BTW: is there an easy way to see the size of the generated functions?)

Under what conditions did you notice almost identical code?


Thanks,

Andrei
October 10, 2016
On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
> I suspect there are cleverer things that can be done.

Less clever seemed to work for me:

void popFront1(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  if (c < 0x80 || c >= 0xFE) {
    s = s.ptr[1 .. s.length];
  } else {
    import core.bitop;
    size_t i = 7u - bsr(~c); // N.B. changed uint to size_t
    import std.algorithm;
    s = s.ptr[min(i, s.length) .. s.length];
  }
}
October 09, 2016
On 10/09/2016 10:55 PM, safety0ff wrote:
> On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
>> I suspect there are cleverer things that can be done.
>
> Less clever seemed to work for me:
>
> void popFront1(ref char[] s) @trusted pure nothrow {
>   immutable c = s[0];
>   if (c < 0x80 || c >= 0xFE) {
>     s = s.ptr[1 .. s.length];
>   } else {
>     import core.bitop;
>     size_t i = 7u - bsr(~c); // N.B. changed uint to size_t
>     import std.algorithm;
>     s = s.ptr[min(i, s.length) .. s.length];
>   }
> }

Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei
October 10, 2016
On Monday, 10 October 2016 at 03:55:17 UTC, Andrei Alexandrescu wrote:
>
> Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei

void popFront1(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  size_t char_length = 1;
  if (!(c & 0x80)) {
    goto Lend;
  } else {
    import core.bitop;
    uint i = 7u - bsr(~c | 1u);
    import std.algorithm;
    if (i > 6u) goto Lend;
    char_length = min(i, s.length);
  }
  Lend :
  s = s.ptr[char_length .. s.length];
}


This one removes one unnecessary ret.
It will also probably be better for the branch-predictor.
October 10, 2016
On Monday, 10 October 2016 at 15:17:05 UTC, Stefan Koch wrote:
> On Monday, 10 October 2016 at 03:55:17 UTC, Andrei Alexandrescu wrote:
>>
>> Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei

Since in this case stability of min is concern, you can shave of another 2 instructions by writing the comparison by hand

void popFront1(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  size_t char_length = 1;
  if (!(c & 0x80)) {
    goto Lend;
  } else {
    import core.bitop;
    uint i = 7u - bsr(~c | 1u);
    import std.algorithm;
    if (i > 6u) goto Lend;
    char_length = i < s.length ? i : s.length;
  }
  Lend :
  s = s.ptr[char_length .. s.length];
}
October 10, 2016
On Monday, 10 October 2016 at 15:37:09 UTC, Stefan Koch wrote:
> Since in this case stability of min is concern, you can shave of another 2 instructions by writing the comparison by hand

In this case the stability of min is +NO+ concern.
October 11, 2016
This version has 24 instructions but these have a smaller encoding then and are generally inexpensive
With inline asm and conditional moves it would be possible to reduce this even further
to ~20 instructions.

void popFront1(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  size_t char_length = 1;
  if (c < 127)
  {
    goto Lend;
  } else {
    if ((c & 0b1100_0000) == 0b1000_0000)
    {
      // This is invalid as a first char
      goto Lerror;
    }
    if (c < 192)
    {
      char_length = 2;
      goto Lend;
    }
    if (c < 240) {
      char_length = 3;
      goto Lend;
    }
    if (c < 248) {
       char_length = 4;
      goto Lend;
    }

    //These characters are also no longer valid
    Lerror : assert(0);
  }
  Lend :
  s = s.ptr[char_length .. s.length];
}

October 10, 2016
That looks good. I'm just worried about the jump forward - ideally the case c < 127 would simply entail a quick return. I tried a fix, but it didn't do what I wanted in ldc. We shouldn't assert(0) if wrong - just skip one byte. Also, are we right to not worry about 5- and 6-byte sequences? The docs keep on threatening with it, and then immediately mention those are not valid.

void popFront3(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  uint char_length = 1;
  if (c < 127)
  {
  Lend :
    s = s.ptr[char_length .. s.length];
  } else {
    if (c < 192)
    {
      char_length = 2;
      goto Lend;
    }
    if (c < 240) {
      char_length = 3;
      goto Lend;
    }
    if (c < 248) {
       char_length = 4;
    }
    goto Lend;
  }
}


Andrei

On 10/10/16 9:39 PM, Stefan Koch wrote:
> This version has 24 instructions but these have a smaller encoding then
> and are generally inexpensive
> With inline asm and conditional moves it would be possible to reduce
> this even further
> to ~20 instructions.
>
> void popFront1(ref char[] s) @trusted pure nothrow {
>   immutable c = s[0];
>   size_t char_length = 1;
>   if (c < 127)
>   {
>     goto Lend;
>   } else {
>     if ((c & 0b1100_0000) == 0b1000_0000)
>     {
>       // This is invalid as a first char
>       goto Lerror;
>     }
>     if (c < 192)
>     {
>       char_length = 2;
>       goto Lend;
>     }
>     if (c < 240) {
>       char_length = 3;
>       goto Lend;
>     }
>     if (c < 248) {
>        char_length = 4;
>       goto Lend;
>     }
>
>     //These characters are also no longer valid
>     Lerror : assert(0);
>   }
>   Lend :
>   s = s.ptr[char_length .. s.length];
> }
>

« First   ‹ Prev
1 2 3 4 5 6