October 13, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 2016-10-13 03:26, Andrei Alexandrescu wrote:
> Yah, shouldn't go in object.d as it's fairly niche. On the other hand defining a new module for two functions seems excessive unless we have a good theme. On the third hand we may find an existing module that's topically close. Thoughts? -- Andrei
I think it should be a new module. I think core.intrinsics, as Stefan suggested, sounds like a good idea. I don't think having a module with only two functions is a problem, assuming we expect more of these functions. We already have that case with core.attribute [1], which only has _one_ attribute defined.
[1] https://github.com/dlang/druntime/blob/master/src/core/attribute.d#L54
-- /Jacob Carlborg
|
October 13, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to safety0ff | On Wednesday, 12 October 2016 at 20:24:54 UTC, safety0ff wrote:
> Code: http://pastebin.com/CFCpUftW
Line 25 doesn't look trusted: reads past the end of an empty string.
|
October 13, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On Thursday, 13 October 2016 at 14:51:50 UTC, Kagamin wrote:
> On Wednesday, 12 October 2016 at 20:24:54 UTC, safety0ff wrote:
>> Code: http://pastebin.com/CFCpUftW
>
> Line 25 doesn't look trusted: reads past the end of an empty string.
Length is checked in the loop that calls this function.
In Phobos, length is only checked with an assertion.
|
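The point above about where the length check lives can be sketched roughly as follows. This is only an illustration, not the pastebin code or the Phobos source: `firstUnit` and `sumUnits` are hypothetical names, and the sketch assumes the callee documents its precondition with an `assert` while the calling loop performs the actual emptiness test.

```d
/// Hypothetical helper: reads the first code unit without a bounds check.
/// The assertion documents the precondition; it is compiled out with -release.
char firstUnit(const(char)[] str) pure nothrow @nogc @trusted
{
    assert(str.length > 0, "caller must pass a non-empty string");
    return str.ptr[0]; // unchecked read, valid only under the precondition
}

/// Hypothetical caller: the loop condition is the real length check.
uint sumUnits(const(char)[] str) pure nothrow @nogc @safe
{
    uint sum;
    while (str.length > 0)
    {
        sum += firstUnit(str);
        str = str[1 .. $];
    }
    return sum;
}

unittest
{
    assert(sumUnits("") == 0); // the loop never calls firstUnit on empty input
    assert(sumUnits("ab") == 'a' + 'b');
}
```

Whether the `@trusted` annotation is justified then depends on every caller honouring the precondition, which is the concern raised about line 25.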
October 13, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Thursday, 13 October 2016 at 01:36:44 UTC, Andrei Alexandrescu wrote:
>
> Oh ok, so it's that checksum in particular that got optimized. Bad benchmark! Bad! -- Andrei
Also, I suspect a benchmark with a larger loop body might not benefit as significantly from branch hints as this one.
|
October 14, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to safety0ff | On Thursday, 13 October 2016 at 21:49:22 UTC, safety0ff wrote:
>> Bad benchmark! Bad! -- Andrei
>
> Also, I suspect a benchmark with a larger loop body might not benefit as significantly from branch hints as this one.
I disagree: in longer loops, code compactness is as important as in small ones.
This is about the smallest inline version of decode I could come up with :
__gshared static immutable ubyte[] charWidthTab = [
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 1, 1
];

dchar myFront(ref char[] str) pure nothrow
{
    dchar c = cast(dchar) str[0];
    if ((c & 128))
    {
        if (c & 64)
            final switch(charWidthTab[c - 192])
            {
                case 2 :
                    c |= ((str[1] & 0x80) >> 5);
                    break;
                case 3 :
                    c |= ((str[1] & 0x80) >> 4);
                    c |= ((str[2] & 0x80) >> 10);
                    break;
                case 4 :
                    c |= ((str[1] & 0x80) >> 3);
                    c |= ((str[2] & 0x80) >> 9);
                    c |= ((str[3] & 0x80) >> 15);
                    break;
                case 5,6,1 :
                    goto Linvalid;
            }
        else
        Linvalid :
            c = dchar.init;
    }
    return c;
}
|
October 15, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Friday, 14 October 2016 at 20:47:39 UTC, Stefan Koch wrote:
> On Thursday, 13 October 2016 at 21:49:22 UTC, safety0ff wrote:
>>> Bad benchmark! Bad! -- Andrei
>>
>> Also, I suspect a benchmark with a larger loop body might not benefit as significantly from branch hints as this one.
>
> I disagree: in longer loops, code compactness is as important as in small ones.
>
> This is about the smallest inline version of decode I could come up with :
Disregard all that code.
It is horribly wrong!
This is more correct : (Though for some reason it does not pass the unittests)
dchar myFront(ref char[] str) pure
{
    dchar c = cast(dchar) str.ptr[0];
    if (c & 128)
    {
        if (c & 64)
        {
            auto l = charWidthTab.ptr[c - 192];
            if (str.length < l)
                goto Linvalid;

            final switch (l)
            {
                case 2:
                    c = ((c & ~(64 | 128)) << 6);
                    c |= (str.ptr[1] & ~0x80);
                    break;
                case 3:
                    c = ((c & ~(32 | 64 | 128)) << 12);
                    c |= ((str.ptr[1] & ~0x80) << 6);
                    c |= ((str.ptr[2] & ~0x80));
                    break;
                case 4:
                    c = ((c & ~(16 | 32 | 64 | 128)) << 18);
                    c |= ((str.ptr[1] & ~0x80) << 12);
                    c |= ((str.ptr[2] & ~0x80) << 6);
                    c |= ((str.ptr[3] & ~0x80));
                    break;
                case 5, 6, 1:
                    goto Linvalid;
            }
        }
        else
        Linvalid : throw new Exception("yadayada");
    }
    return c;
}
|
October 15, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Friday, 14 October 2016 at 20:47:39 UTC, Stefan Koch wrote:
> On Thursday, 13 October 2016 at 21:49:22 UTC, safety0ff wrote:
>>> Bad benchmark! Bad! -- Andrei
>>
>> Also, I suspect a benchmark with a larger loop body might not benefit as significantly from branch hints as this one.
>
> I disagree: in longer loops, code compactness is as important as in small ones.
You must have misunderstood: my thought was simply that with a larger loop body, LLVM might not make such a dramatic rearrangement of the basic blocks.
Take your straw man elsewhere :-/
> This is more correct : (Though for some reason it does not pass the unittests)
You're only validating the first byte; the current code validates all of them.
|
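The remark about validating only the first byte can be illustrated with a small sketch. This is not the Phobos implementation and not a fix of the code quoted above: `isContinuation` and `checkedFront` are hypothetical names, and the sketch assumes that rejecting malformed continuation bytes with a plain `Exception` is enough for illustration. It deliberately does not reject overlong encodings or surrogate code points.

```d
/// True if `b` has the form 10xxxxxx, i.e. is a UTF-8 continuation byte.
bool isContinuation(char b) pure nothrow @nogc @safe
{
    return (b & 0xC0) == 0x80;
}

/// Hypothetical checked decoder for the first code point of `str`.
/// Every trailing byte is validated, not just the lead byte.
dchar checkedFront(const(char)[] str) pure @safe
{
    if (str.length == 0)
        throw new Exception("empty input");

    immutable uint lead = str[0];
    if (lead < 0x80)
        return cast(dchar) lead; // plain ASCII

    size_t len; // total sequence length implied by the lead byte
    uint c;     // accumulated code point bits
    if ((lead & 0xE0) == 0xC0)      { len = 2; c = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; c = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; c = lead & 0x07; }
    else
        throw new Exception("invalid lead byte");

    if (str.length < len)
        throw new Exception("truncated sequence");

    foreach (b; str[1 .. len])
    {
        if (!isContinuation(b))
            throw new Exception("invalid continuation byte");
        c = (c << 6) | (b & 0x3F); // keep the low six payload bits
    }
    return cast(dchar) c;
}

unittest
{
    assert(checkedFront("a") == 'a');
    assert(checkedFront("\u00E9") == '\u00E9');

    // 0xC3 announces a two-byte sequence, but 0x28 is not a continuation byte.
    const char[2] bad = [0xC3, 0x28];
    bool threw;
    try
        checkedFront(bad[]);
    catch (Exception)
        threw = true;
    assert(threw);
}
```

Masking trailing bytes with `~0x80`, as in the quoted code, would let the `0x28` byte through unnoticed, which may be one reason the unittests fail.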
October 15, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Saturday, 15 October 2016 at 00:50:08 UTC, Stefan Koch wrote:
> On Friday, 14 October 2016 at 20:47:39 UTC, Stefan Koch wrote:
>> On Thursday, 13 October 2016 at 21:49:22 UTC, safety0ff wrote:
>>>> Bad benchmark! Bad! -- Andrei
>>>
>>> Also, I suspect a benchmark with a larger loop body might not benefit as significantly from branch hints as this one.
>>
>> I disagree: in longer loops, code compactness is as important as in small ones.
>>
>> This is about the smallest inline version of decode I could come up with :
>
> Disregard all that code.
> It is horribly wrong!
>
> This is more correct : (Though for some reason it does not pass the unittests)
>
> dchar myFront(ref char[] str) pure
> {
>     dchar c = cast(dchar) str.ptr[0];
>     if (c & 128)
>     {
>         if (c & 64)
>         {
>             auto l = charWidthTab.ptr[c - 192];
>             if (str.length < l)
>                 goto Linvalid;
>
>             final switch (l)
>             {
>                 case 2:
>                     c = ((c & ~(64 | 128)) << 6);
>                     c |= (str.ptr[1] & ~0x80);
>                     break;
>                 case 3:
>                     c = ((c & ~(32 | 64 | 128)) << 12);
>                     c |= ((str.ptr[1] & ~0x80) << 6);
>                     c |= ((str.ptr[2] & ~0x80));
>                     break;
>                 case 4:
>                     c = ((c & ~(16 | 32 | 64 | 128)) << 18);
>                     c |= ((str.ptr[1] & ~0x80) << 12);
>                     c |= ((str.ptr[2] & ~0x80) << 6);
>                     c |= ((str.ptr[3] & ~0x80));
>                     break;
>                 case 5, 6, 1:
>                     goto Linvalid;
>             }
>         }
>         else
>         Linvalid : throw new Exception("yadayada");
>
>     }
>     return c;
> }
Looks very verbose to me. I had found a very clever UTF-8 conversion function in C in the BSD codebase; maybe it can be used here. Sorry that I cannot participate in the testing, as I don't have a proper compilation environment here at home. Here is the routine I use at work (it's in C); I put it here for inspiration.
DEFINE_INLINE uint_t xctomb(char *r, wchar_t wc)
{
    uint_t u8l = utf8len(wc);
    switch(u8l) {
    /* Note: code falls through cases! */
    case 4: r[3] = 0x80 | (wc & 0x3f); wc >>= 6; wc |= 0x10000;
    case 3: r[2] = 0x80 | (wc & 0x3f); wc >>= 6; wc |= 0x800;
    case 2: r[1] = 0x80 | (wc & 0x3f); wc >>= 6; wc |= 0xc0;
    case 1: r[0] = wc;
    }
    return u8l;
}
utf8len being
DEFINE_INLINE uint_t utf8len(wchar_t wc)
{
    if(wc < 0x80)
        return 1;
    else if(wc < 0x800)
        return 2;
    else if(wc < 0x10000)
        return 3;
    else
        return 4;
}
The code generated on SPARC with gcc 3.4.6 was really good. On x86_64 with gcc 5.1 it was also not bad. I have not tried many alternatives, as UTF-8 encoding is not a bottleneck in our project. There is also no check for lengths 5 and 6, as they are not possible on our system, but here it would have to be added. (The DEFINE_INLINE macro is either extern inline or inline, depending on some macro magic that is not important here.)
|
October 15, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Patrick Schluter | Oooops, I should not post after drinking 2 glasses of Châteauneuf-du-Pape. That function does exactly the opposite of what popFront does. It converts from dchar to multibyte, not from multibyte to dchar as yours did. Sorry for the inconvenience.
|
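To make the two directions concrete, here is a minimal sketch using the existing Phobos functions `std.utf.encode`, `std.utf.decode`, and `std.utf.codeLength`. It is an illustration only: `roundTrip` is a hypothetical helper, and the mapping of `encode` to `xctomb` and `codeLength` to `utf8len` is a loose analogy, not a claim that they are implemented the same way.

```d
import std.utf : codeLength, decode, encode;

/// Hypothetical helper: encodes `c` to UTF-8 and decodes it again.
dchar roundTrip(dchar c)
{
    char[4] buf;
    immutable len = encode(buf, c);    // dchar -> UTF-8, the direction of xctomb
    assert(len == codeLength!char(c)); // codeLength plays the role of utf8len

    size_t index = 0;
    immutable dchar back = decode(buf[0 .. len], index); // UTF-8 -> dchar
    assert(index == len);              // decode advanced past the whole sequence
    return back;
}

unittest
{
    assert(roundTrip('a') == 'a');
    assert(roundTrip('\u00E9') == '\u00E9');
    assert(roundTrip('\U0001F600') == '\U0001F600');
    assert(codeLength!char('\u20AC') == 3);
}
```

The autodecoding cost discussed in this thread is on the `decode` side; the encoding side is shown only to make the contrast with the C routine explicit.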
October 15, 2016 Re: Reducing the cost of autodecoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Patrick Schluter | On 10/15/2016 12:42 PM, Patrick Schluter wrote:
> Sorry if I do not participate on the testing as I don't have a proper compilation environment here at home.
https://ldc.acomirei.ru
Andrei
|