Challenge: write a really really small front() for UTF8 (page 2)

On 2014-03-23 21:22:58 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > Here's a baseline: http://goo.gl/91vIGc. Destroy! Optimizing for smallest assembly size: dchar front(char[] s) { size_t bytesize; dchar result; switch (s[0]) { case 0b00000000: .. case 0b01111111: return s[0]; case 0b11000000: .. case 0b11011111: return ((s[0] & 0b00011111) << 6) | (s[1] & 0b00011111); case 0b11100000: .. case 0b11101111: result = s[0] & 0b00001111; bytesize = 3; break; case 0b11110000: .. case 0b11110111: result = s[0] & 0b00000111; bytesize = 4; default: return dchar.init; } foreach (i; 1..bytesize) result = (result << 6) | (s[i] & 0b00111111); return result; } -- Michel Fortin michel.fortin@michelf.ca http://michelf.ca

On 2014-03-23 22:58:32 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > Array bounds checking takes care of that. If you want the smallest assembly size with array bound checking, make the function @safe and see the disaster it causes to the assembly size. That's what you have to optimize. If you're going to optimize while looking at the assembly, better check for bounds manually: dchar front(char[] s) { if (s.length < 1) return dchar.init; size_t bytesize; dchar result; switch (s[0]) { case 0b00000000: .. case 0b01111111: return s[0]; case 0b11000000: .. case 0b11011111: result = s[0] & 0b00011111; bytesize = 2; break; case 0b11100000: .. case 0b11101111: result = s[0] & 0b00001111; bytesize = 3; break; case 0b11110000: .. case 0b11110111: result = s[0] & 0b00000111; bytesize = 4; default: return dchar.init; } if (s.length < bytesize) return dchar.init; foreach (i; 1..bytesize) result = (result << 6) | (s[i] & 0b00111111); return result; } -- Michel Fortin michel.fortin@michelf.ca http://michelf.ca

On 3/23/14, 6:53 PM, Michel Fortin wrote: > On 2014-03-23 21:22:58 +0000, Andrei Alexandrescu > <SeeWebsiteForEmail@erdani.org> said: > >> Here's a baseline: http://goo.gl/91vIGc. Destroy! > > Optimizing for smallest assembly size: > > dchar front(char[] s) > { > size_t bytesize; > dchar result; > switch (s[0]) > { > case 0b00000000: .. case 0b01111111: > return s[0]; > case 0b11000000: .. case 0b11011111: > return ((s[0] & 0b00011111) << 6) | (s[1] & 0b00011111); > case 0b11100000: .. case 0b11101111: > result = s[0] & 0b00001111; > bytesize = 3; > break; > case 0b11110000: .. case 0b11110111: > result = s[0] & 0b00000111; > bytesize = 4; > default: > return dchar.init; > } > foreach (i; 1..bytesize) > result = (result << 6) | (s[i] & 0b00111111); > return result; > } > Nice, thanks! I'd hope for a short path for the ASCII subset, could you achieve that? Andrei

On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote: > Here's a baseline: http://goo.gl/91vIGc. Destroy! > > Andrei Here's my attempt: http://goo.gl/alJ9JS I interpreted the challenge slightly differently, I went for fewer lines of code and fewer branches. I verified the output for all valid inputs.

On 3/23/14, 8:36 PM, safety0ff wrote: > On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote: >> Here's a baseline: http://goo.gl/91vIGc. Destroy! >> >> Andrei > > Here's my attempt: http://goo.gl/alJ9JS > I interpreted the challenge slightly differently, I went for fewer lines > of code and fewer branches. > > I verified the output for all valid inputs. Very interesting! -- Andrei

On 2014-03-24 02:25:17 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > On 3/23/14, 6:53 PM, Michel Fortin wrote: >> On 2014-03-23 21:22:58 +0000, Andrei Alexandrescu >> <SeeWebsiteForEmail@erdani.org> said: >> >>> Here's a baseline: http://goo.gl/91vIGc. Destroy! >> >> Optimizing for smallest assembly size: >> >> dchar front(char[] s) >> { >> size_t bytesize; >> dchar result; >> switch (s[0]) >> { >> case 0b00000000: .. case 0b01111111: >> return s[0]; >> case 0b11000000: .. case 0b11011111: >> return ((s[0] & 0b00011111) << 6) | (s[1] & 0b00011111); >> case 0b11100000: .. case 0b11101111: >> result = s[0] & 0b00001111; >> bytesize = 3; >> break; >> case 0b11110000: .. case 0b11110111: >> result = s[0] & 0b00000111; >> bytesize = 4; >> default: >> return dchar.init; >> } >> foreach (i; 1..bytesize) >> result = (result << 6) | (s[i] & 0b00111111); >> return result; >> } > > Nice, thanks! I'd hope for a short path for the ASCII subset, could you achieve that? Unfortunately, there's a bug in the above. A missing "break" results in a fallthrough to default case. That's why the optimizer is so good, it just omits the four-byte branch entirely. I noticed something was wrong by looking at the assembly. I really wish D had no implicit fallthrough. But try this instead, the result is even shorter: dchar front(char[] s) { if (s[0] < 0b1000000) return s[0]; // ASCII // pattern indicator tailLength // 0b100xxxxx 0b00 (0) 1 // 0b101xxxxx 0b01 (1) 1 == indicator // 0b110xxxxx 0b10 (2) 2 == indicator // 0b111xxxxx 0b11 (3) 3 == indicator // note: undefined result for illegal 0b1111xxxx case auto indicator = (s[0] >> 5) & 0b11; auto tailLength = indicator ? indicator : 1; dchar result = s[0] & (0b00111111 >> tailLength); foreach (i; 0..tailLength) result = (result << 6) | (s[1+i] & 0b00111111); return result; } (Disclaimer: not tested, but I did check that all the expected code paths are present in the assembly this time.) -- Michel Fortin michel.fortin@michelf.ca http://michelf.ca

On 2014-03-24 04:39:08 +0000, "bearophile" <bearophileHUGS@lycos.com> said: > Michel Fortin: > >> I really wish D had no implicit fallthrough. > > Isn't your wish already granted? Maybe. I don't have a D compiler installed at the moment to check. I'm just playing with d.godbolt.org and it accepts implicit fallthrough. -- Michel Fortin michel.fortin@michelf.ca http://michelf.ca

March 24, 2014

Re: Challenge: write a really really small front() for UTF8

Posted by Andrei Alexandrescu
in reply to Michel Fortin

Permalink

Andrei Alexandrescu

Posted in reply to Michel Fortin

Permalink

On 3/23/14, 9:37 PM, Michel Fortin wrote:
> On 2014-03-24 02:25:17 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> said:
>
>> On 3/23/14, 6:53 PM, Michel Fortin wrote:
>>> On 2014-03-23 21:22:58 +0000, Andrei Alexandrescu
>>> <SeeWebsiteForEmail@erdani.org> said:
>>>
>>>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>>
>>> Optimizing for smallest assembly size:
>>>
>>> dchar front(char[] s)
>>> {
>>>   size_t bytesize;
>>>   dchar result;
>>>   switch (s[0])
>>>   {
>>>     case 0b00000000: .. case 0b01111111:
>>>         return s[0];
>>>     case 0b11000000: .. case 0b11011111:
>>>         return ((s[0] & 0b00011111) << 6) | (s[1] & 0b00011111);
>>>     case 0b11100000: .. case 0b11101111:
>>>         result = s[0] & 0b00001111;
>>>         bytesize = 3;
>>>         break;
>>>     case 0b11110000: .. case 0b11110111:
>>>         result = s[0] & 0b00000111;
>>>         bytesize = 4;
>>>     default:
>>>        return dchar.init;
>>>   }
>>>   foreach (i; 1..bytesize)
>>>       result = (result << 6) | (s[i] & 0b00111111);
>>>   return result;
>>> }
>>
>> Nice, thanks! I'd hope for a short path for the ASCII subset, could
>> you achieve that?
>
> Unfortunately, there's a bug in the above. A missing "break" results in
> a fallthrough to default case. That's why the optimizer is so good, it
> just omits the four-byte branch entirely. I noticed something was wrong
> by looking at the assembly. I really wish D had no implicit fallthrough.

-w

Probably time to move that from a warning to an error. In the short time we've used D at Facebook (forgot to add "-w" to the implicit flags) we've already have not one, but two bugs related to switch's implicit fallthrough.

> But try this instead, the result is even shorter:
>
> dchar front(char[] s)
> {
>   if (s[0] < 0b1000000)
>     return s[0]; // ASCII
>
>   // pattern     indicator  tailLength
>   // 0b100xxxxx  0b00 (0)   1
>   // 0b101xxxxx  0b01 (1)   1 == indicator
>   // 0b110xxxxx  0b10 (2)   2 == indicator
>   // 0b111xxxxx  0b11 (3)   3 == indicator
>   // note: undefined result for illegal 0b1111xxxx case
>
>   auto indicator = (s[0] >> 5) & 0b11;
>   auto tailLength = indicator ? indicator : 1;
>
>   dchar result = s[0] & (0b00111111 >> tailLength);
>   foreach (i; 0..tailLength)
>       result = (result << 6) | (s[1+i] & 0b00111111);
>   return result;
> }
>
> (Disclaimer: not tested, but I did check that all the expected code
> paths are present in the assembly this time.)

Nicely done. I collected what we have at http://goo.gl/W9V6E7.


Andrei

On 2014-03-24 04:37:22 +0000, Michel Fortin <michel.fortin@michelf.ca> said: > But try this instead, the result is even shorter: Oops, messed up my patterns. Here's a hopefully fixed front(): dchar front(char[] s) { if (s[0] < 0b1000000) return s[0]; // ASCII // pattern indicator tailLength // 0b1100xxxx 0b00 (0) 1 // 0b1101xxxx 0b01 (1) 1 == indicator // 0b1110xxxx 0b10 (2) 2 == indicator // 0b1111xxxx 0b11 (3) 3 == indicator // note: undefined result for illegal 0b11111xxx case auto indicator = (s[0] >> 4) & 0b11; auto tailLength = indicator ? indicator : 1; dchar result = s[0] & (0b00111111 >> tailLength); foreach (i; 0..tailLength) result = (result << 6) | (s[1+i] & 0b00111111); return result; } -- Michel Fortin michel.fortin@michelf.ca http://michelf.ca

Forums