March 24, 2014
On 2014-03-24 06:08:30 +0000, "safety0ff" <safety0ff.dev@gmail.com> said:

> Everything seems to work in your corrected versions:
> http://dpaste.dzfl.pl/3b32535559ba
> 
> Andrei's version doesn't compile on recent compiler versions due to goto skipping initialization.

Ok, so let's check what happens in actual usage. In other words, what happens when front is inlined and used in a loop. I'm counting the number of assembly lines of this main function for looping through each possible input using the various implementations:

http://goo.gl/QjnA2k

implementation   line count of main in assembly
andralex         285 -  9 labels = 276 instructions
MFortin1         135 - 13 labels = 122 instructions
MFortin2         150 - 14 labels = 136 instructions
safety0ff         -- not inlined (too big?) --
dnspies          161 - 15 labels = 146 instructions
CWilliams        233 - 25 labels = 208 instructions

For compactness, my first implementation seems to be the winner, both with and without inlining.

That said, I think front should only assume a non-empty char[] and thus should check the length before accessing anything after the first byte to protect against incomplete multi-byte sequences such as [0x1000_0000]. So I added this line to my two implementations:

   if (1+tailLength >= s.length) return dchar.init;

Now lets see what happens with those changes:

http://goo.gl/XPCGYE

implementation   line count of main in assembly
MFortin1Check    103 - 11 labels =  92 instructions
MFortin2Check    135 - 13 labels = 122 instructions

Now I'm baffled. Adding a test makes the code shorter? It actually make the standalone functions longer by 8 instructions (as expected), but the inlined version is shorter. My guess is that the optimizer creates something entirely different and it turns out that this different version optimises better after inlining.

That said, my test "main" function isn't very representative of the general case because the length is statically known by the optimizer. Let's see what it does when the length is not statically known:

http://goo.gl/E2Q0Yu

implementation   line count of main in assembly
andralex         384 -  9 labels = 375 instructions
MFortin1         173 - 19 labels = 154 instructions
MFortin2         182 - 18 labels = 164 instructions
safety0ff         -- not inlined --
dnspies           -- not inlined --
CWilliams        229 - 23 labels = 206 instructions
MFortin1Check    211 - 24 labels = 187 instructions
MFortin2Check    218 - 21 labels = 197 instructions

So again MFortin1 is the winner for compactness. Still, I maintain that we ought to at least check the length of the array for multi-byte sequences to protect against incomplete sequences that could lay at the end of the string, so I'd favor MFortin1Check.

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

March 24, 2014
On 2014-03-24 13:39:45 +0000, Michel Fortin <michel.fortin@michelf.ca> said:

> That said, my test "main" function isn't very representative of the general case because the length is statically known by the optimizer. Let's see what it does when the length is not statically known:

Oops, wrong link, was from an intermediary test.
This one will give you the results in the table:
http://goo.gl/lwU4hv

> implementation   line count of main in assembly
> andralex         384 -  9 labels = 375 instructions
> MFortin1         173 - 19 labels = 154 instructions
> MFortin2         182 - 18 labels = 164 instructions
> safety0ff         -- not inlined --
> dnspies           -- not inlined --
> CWilliams        229 - 23 labels = 206 instructions
> MFortin1Check    211 - 24 labels = 187 instructions
> MFortin2Check    218 - 21 labels = 197 instructions


-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

March 24, 2014
On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>
> Andrei

I've probably missed something, but here's 15 instructions: http://goo.gl/d3Mgj0


Also, I contacted the owner of d.godbolt.org and hopefully we'll have a more up-to-date gdc on there soon. Perhaps ldc as well :)
March 24, 2014
On Monday, 24 March 2014 at 15:32:09 UTC, John Colvin wrote:
> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>
>> Andrei
>
> I've probably missed something, but here's 15 instructions: http://goo.gl/d3Mgj0

woops yeah, that's completely broken.
March 24, 2014
On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu
wrote:
> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>
> Andrei

http://forum.dlang.org/post/fsgdviklnbugdzjsgfsk@forum.dlang.org
March 24, 2014
On 3/24/14, 12:53 AM, Chris Williams wrote:
> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>
>> Andrei
>
> http://goo.gl/TaZTNB

Nice! Why assert(ret != 0)?

Andrei
March 24, 2014
On 3/24/14, 1:06 AM, dnspies wrote:
> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>
>> Andrei
>
> Here's mine (the loop gets unrolled):
>
> dchar front(const char[] s) {
>    assert(s.length > 0);
>    byte c = s[0];
>    dchar res = cast(ubyte)c;
>    if(c >= 0) {
>      return res;
>    }
>    c <<= 1;
>    assert(c < 0);
>    for(int i = 1; i < 4; i++) {
>      assert(i < s.length);
>      ubyte d = s[i];
>      assert((d >> 6) == 0b10);
>      res = (res << 8) | d;
>      c <<= 1;
>      if(c >= 0)
>        return res;
>    }
>    assert(false);
> }

http://goo.gl/HBSv5T - looks nice! For minimum size replace the assert(false) with one inside the loop: http://goo.gl/eWs0ft

Andrei

March 24, 2014
On 3/24/14, 1:56 AM, dnspies wrote:
> On Monday, 24 March 2014 at 08:06:53 UTC, dnspies wrote:
>> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>>
>>> Andrei
>>
>> Here's mine (the loop gets unrolled):
>>
>> dchar front(const char[] s) {
>>   assert(s.length > 0);
>>   byte c = s[0];
>>   dchar res = cast(ubyte)c;
>>   if(c >= 0) {
>>     return res;
>>   }
>>   c <<= 1;
>>   assert(c < 0);
>>   for(int i = 1; i < 4; i++) {
>>     assert(i < s.length);
>>     ubyte d = s[i];
>>     assert((d >> 6) == 0b10);
>>     res = (res << 8) | d;
>>     c <<= 1;
>>     if(c >= 0)
>>       return res;
>>   }
>>   assert(false);
>> }
>
> Sorry, I misunderstood.  We only want the x's in the output. Here's my
> revised solution
>
> http://goo.gl/PL729J

That turned out quite large. -- Andrei

March 24, 2014
On 3/24/14, 2:02 AM, monarch_dodra wrote:
> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>
>> Andrei
>
> Before we roll this out, could we discuss a strategy/guideline in
> regards to detecting and handling invalid UTF sequences?

I think std.array.front should return the invalid dchar on error, and popFront should attempt to resync on error.

Andrei

March 24, 2014
On 3/24/14, 3:30 AM, dnspies wrote:
> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>
>> Andrei
>
> Ok, I managed to make it smaller.  I think this is the smallest so far
> with only 23 instructions (no loop unrolling in this one):
>
> http://goo.gl/RKF5Vm

This is nice because even the checks are relatively cheap. It's quite difficult to understand tho :o). -- Andrei