October 09, 2016 Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Had a bit of a good time with popFront for UTF, got to this: https://github.com/dlang/phobos/pull/4848. I suspect there are cleverer things that can be done. Can anyone make it tighter? Take a look at the disassembly referred to in the pull request. Thanks! -- Andrei |
October 09, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
> Had a bit of a good time with popFront for UTF, got to this: https://github.com/dlang/phobos/pull/4848. I suspect there are cleverer things that can be done. Can anyone make it tighter? Take a look at the disassembly referred to in the pull request. Thanks! -- Andrei
With -Oz both functions look almost the same on ldc and gdc
|
October 09, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On 10/09/2016 07:05 PM, Stefan Koch wrote: > On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote: >> Had a bit of a good time with popFront for UTF, got to this: >> https://github.com/dlang/phobos/pull/4848. I suspect there are >> cleverer things that can be done. Can anyone make it tighter? Take a >> look at the disassembly referred to in the pull request. Thanks! -- >> Andrei > > With -Oz both functions look almost the same on ldc and gdc The big ldc link I posted: http://ldc.acomirei.ru/#compilers:!((compiler:ldc,options:'-release+-O3+-boundscheck%3Doff',sourcez:G4ewlgJgBADiMDEBOIB2AXAjACiQUwDMoBjACwEMkBtAXSgGcBKKAAXSQFd709oYP8UVCHSkUAdygBvAFBQoYALaKO6cgCMANnhJQAvAyoAGGgG45CotmJQAPFCMAPABxHms%2BfPr6GAOhjsVJhQvr5%2B2qgA5qJmFgC%2BUHia9DoenkpwSOgkIPi%2B6mDo8OaeUBxgGAo%2BAOwcUAC0UOr0SNgAfjYAPlCYHIwl6YqZ2dwQvuSakbmFpIoD8mBWYFAAfFAAbH1VBpjzDD70/oGKFdhgADTheFGizKFXN6Sx8nEyrzKgkLDwyGjoAEy4QgkCjUOjcJDMNicbi8WACHTCUQSaQWJQqNRaHQ2AwQ4zPSxQax2BwuNwWNLyAD0VICSAU3i4cKKUHIn2gHFQXLwxDw9HolAAnk0QJyIN4yDyANYVSK%2BCxedgHdhHajBe4Q3wRaJPAaveRJFKE4l6AxOAgERgUhVQGlNFplVAQQgVOEEXIOG0Q5VIVVBEJhTXamJ6iyGvDW0oZXLZYi5PD5QrwKAALntSD25FUICginozRqDXT7WI/RtiyJ2DzBfs/2Y3Sr%2Be8a3WjCtpUpnhpAElUMAJl8AKoAFQQ9WcNvk1e8Oz2%2BsGwwY6DGEymSBmcy9StxKrpVBOqEbzUuQeuOrugZVwd18TeMiAA)),filterAsm:(commentOnly:!t,directives:!t,labels:!t),version:3 uses -O3, not -Oz. Changing the flag to -Oz seems to cause no change in the generated code. I see different code generated for popFront1 (proposed) and popFront2 (candidate), e.g. popFront1 has 27 asm lines whereas popFront2 has 34. (BTW: is there an easy way to see the size of the generated functions?) Under what conditions did you notice almost identical code? Thanks, Andrei |
October 10, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
> I suspect there are cleverer things that can be done.
Less clever seemed to work for me:
void popFront1(ref char[] s) @trusted pure nothrow {
immutable c = s[0];
if (c < 0x80 || c >= 0xFE) {
s = s.ptr[1 .. s.length];
} else {
import core.bitop;
size_t i = 7u - bsr(~c); // N.B. changed uint to size_t
import std.algorithm;
s = s.ptr[min(i, s.length) .. s.length];
}
}
|
October 09, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to safety0ff | On 10/09/2016 10:55 PM, safety0ff wrote:
> On Sunday, 9 October 2016 at 22:11:50 UTC, Andrei Alexandrescu wrote:
>> I suspect there are cleverer things that can be done.
>
> Less clever seemed to work for me:
>
> void popFront1(ref char[] s) @trusted pure nothrow {
> immutable c = s[0];
> if (c < 0x80 || c >= 0xFE) {
> s = s.ptr[1 .. s.length];
> } else {
> import core.bitop;
> size_t i = 7u - bsr(~c); // N.B. changed uint to size_t
> import std.algorithm;
> s = s.ptr[min(i, s.length) .. s.length];
> }
> }
Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei
|
October 10, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Monday, 10 October 2016 at 03:55:17 UTC, Andrei Alexandrescu wrote:
>
> Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei
void popFront1(ref char[] s) @trusted pure nothrow {
immutable c = s[0];
size_t char_length = 1;
if (!(c & 0x80)) {
goto Lend;
} else {
import core.bitop;
uint i = 7u - bsr(~c | 1u);
import std.algorithm;
if (i > 6u) goto Lend;
char_length = min(i, s.length);
}
Lend :
s = s.ptr[char_length .. s.length];
}
This one removes one unnecessary ret.
It will also probably be better for the branch-predictor.
|
October 10, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Monday, 10 October 2016 at 15:17:05 UTC, Stefan Koch wrote:
> On Monday, 10 October 2016 at 03:55:17 UTC, Andrei Alexandrescu wrote:
>>
>> Oh, forgot to mention: the initial/short path should only check for ASCII, i.e. c < 0x80. -- Andrei
Since in this case stability of min is concern, you can shave of another 2 instructions by writing the comparison by hand
void popFront1(ref char[] s) @trusted pure nothrow {
immutable c = s[0];
size_t char_length = 1;
if (!(c & 0x80)) {
goto Lend;
} else {
import core.bitop;
uint i = 7u - bsr(~c | 1u);
import std.algorithm;
if (i > 6u) goto Lend;
char_length = i < s.length ? i : s.length;
}
Lend :
s = s.ptr[char_length .. s.length];
}
|
October 10, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Monday, 10 October 2016 at 15:37:09 UTC, Stefan Koch wrote:
> Since in this case stability of min is concern, you can shave of another 2 instructions by writing the comparison by hand
In this case the stability of min is +NO+ concern.
|
October 11, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | This version has 24 instructions but these have a smaller encoding then and are generally inexpensive With inline asm and conditional moves it would be possible to reduce this even further to ~20 instructions. void popFront1(ref char[] s) @trusted pure nothrow { immutable c = s[0]; size_t char_length = 1; if (c < 127) { goto Lend; } else { if ((c & 0b1100_0000) == 0b1000_0000) { // This is invalid as a first char goto Lerror; } if (c < 192) { char_length = 2; goto Lend; } if (c < 240) { char_length = 3; goto Lend; } if (c < 248) { char_length = 4; goto Lend; } //These characters are also no longer valid Lerror : assert(0); } Lend : s = s.ptr[char_length .. s.length]; } |
October 10, 2016 Re: Can you shrink it further? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | That looks good. I'm just worried about the jump forward - ideally the case c < 127 would simply entail a quick return. I tried a fix, but it didn't do what I wanted in ldc. We shouldn't assert(0) if wrong - just skip one byte. Also, are we right to not worry about 5- and 6-byte sequences? The docs keep on threatening with it, and then immediately mention those are not valid.
void popFront3(ref char[] s) @trusted pure nothrow {
immutable c = s[0];
uint char_length = 1;
if (c < 127)
{
Lend :
s = s.ptr[char_length .. s.length];
} else {
if (c < 192)
{
char_length = 2;
goto Lend;
}
if (c < 240) {
char_length = 3;
goto Lend;
}
if (c < 248) {
char_length = 4;
}
goto Lend;
}
}
Andrei
On 10/10/16 9:39 PM, Stefan Koch wrote:
> This version has 24 instructions but these have a smaller encoding then
> and are generally inexpensive
> With inline asm and conditional moves it would be possible to reduce
> this even further
> to ~20 instructions.
>
> void popFront1(ref char[] s) @trusted pure nothrow {
> immutable c = s[0];
> size_t char_length = 1;
> if (c < 127)
> {
> goto Lend;
> } else {
> if ((c & 0b1100_0000) == 0b1000_0000)
> {
> // This is invalid as a first char
> goto Lerror;
> }
> if (c < 192)
> {
> char_length = 2;
> goto Lend;
> }
> if (c < 240) {
> char_length = 3;
> goto Lend;
> }
> if (c < 248) {
> char_length = 4;
> goto Lend;
> }
>
> //These characters are also no longer valid
> Lerror : assert(0);
> }
> Lend :
> s = s.ptr[char_length .. s.length];
> }
>
|
Copyright © 1999-2021 by the D Language Foundation