[Issue 12923] UTF exception in stride even though passes validate.
June 14, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

Timothee Cour <timothee.cour2@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timothee.cour2@gmail.com

--- Comment #1 from Timothee Cour <timothee.cour2@gmail.com> ---
(In reply to Timothee Cour from comment #0)
> import std.utf;
> void main(){
>   char[3]a=[167, 133, 175];
>   validate(a);
>   //passes
> 
>   auto k=stride(a,0);
>   /+
>   std.utf.UTFException@std/utf.d(199): Invalid UTF-8 sequence (at index 0)
>   pure @safe uint std.utf.stride!(char[3]).stride(ref char[3], ulong) + 141
>   +/
> }
> 
> This happens even after applying the fix https://github.com/D-Programming-Language/phobos/pull/2038

Additionally, each of the following also throws an error:
foreach (i, dchar c; a){} //src/rt/util/utf.d:290 Invalid UTF-8 sequence
foreach_reverse (i, dchar c; a){} //src/rt/aApplyR.d:511 Invalid UTF-8 sequence

So perhaps std.utf.validate accepts some invalid UTF sequences.
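
To see why stride rejects index 0, it helps to look at the bit patterns. In UTF-8, bytes of the form 10xxxxxx are continuation bytes and cannot start a sequence. The following is a minimal sketch; the helper name isContinuationByte is made up for illustration and is not part of std.utf:

```d
import std.stdio : writefln;

/// Hypothetical helper: true if `b` has the bit pattern 10xxxxxx,
/// i.e. it is a UTF-8 continuation byte.
bool isContinuationByte(ubyte b) pure nothrow @nogc @safe
{
    return (b & 0xC0) == 0x80;
}

void main()
{
    // The three bytes from the test case in comment #0.
    immutable ubyte[3] bytes = [167, 133, 175];
    foreach (b; bytes)
        writefln("%3d = 0x%02X = 0b%08b, continuation: %s",
                 b, b, b, isContinuationByte(b));
}
```

All three bytes (0xA7, 0x85, 0xAF) match 10xxxxxx, so none of them is a valid first code unit. That is consistent with stride and foreach throwing, and suggests validate is the one that is too lenient.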

--
June 14, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

--- Comment #2 from Timothee Cour <timothee.cour2@gmail.com> ---
(In reply to Timothee Cour from comment #1)

Here's one possible fix:

in decodeImpl:
----
UTFException invalidUTF(){...}

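// Insert this check so the first code unit is validated as in strideImpl:
import core.bitop;
// msbs = number of leading 1 bits in the first code unit
immutable msbs = 7 - bsr(~fst);
// 1 leading bit is a continuation byte; more than 6 is never a valid lead byte
if (msbs < 2 || msbs > 6) throw invalidUTF();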

UTFException outOfBounds() {...}
----

This gives decodeImpl the same behavior as strideImpl.
But is that correct, or is the behavior of strideImpl itself wrong?
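
For reference, the check can be tried in isolation. This is only a sketch: leadByteLength is a hypothetical helper, not a Phobos function, and it returns 0 instead of throwing. It masks the complement to 8 bits so that the result does not depend on whether ~ is applied to an 8-bit value or to a promoted 32-bit one, and it keeps the 2..6 bounds from the snippet above (RFC 3629 itself only allows sequences of up to 4 bytes).

```d
import core.bitop : bsr;
import std.stdio : writefln;

/// Hypothetical helper: sequence length implied by the first code unit,
/// or 0 if the byte cannot start a UTF-8 sequence.
uint leadByteLength(ubyte fst) pure nothrow @nogc @safe
{
    if (fst < 0x80)
        return 1; // plain ASCII

    // Invert within 8 bits; the highest set bit of the result marks
    // the first 0 bit of `fst`, which gives the count of leading 1 bits.
    immutable inverted = (~uint(fst)) & 0xFF;
    if (inverted == 0)
        return 0; // 0xFF: bsr(0) is undefined, and 0xFF is never valid

    immutable msbs = 7 - bsr(inverted);
    if (msbs < 2 || msbs > 6)
        return 0; // continuation byte (1) or 0xFE (7)
    return msbs;
}

void main()
{
    // 167 is the first byte of the test case; the others are ordinary lead bytes.
    foreach (ubyte b; [ubyte(167), ubyte(0x41), ubyte(0xC3), ubyte(0xE2), ubyte(0xF0)])
        writefln("0x%02X -> %s", b, leadByteLength(b));
}
```

With this check, 167 (0b10100111) has exactly one leading 1 bit, so it is rejected as a first code unit, which matches what strideImpl does.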

--
July 27, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://github.com/D-Programming-Language/phobos/pull/2376
                 CC|                            |dmitry.olsh@gmail.com
           Assignee|nobody@puremagic.com        |dmitry.olsh@gmail.com

--- Comment #3 from Dmitry Olshansky <dmitry.olsh@gmail.com> ---
(In reply to Timothee Cour from comment #2)

--
July 27, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

github-bugzilla@puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--
July 27, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

--- Comment #4 from github-bugzilla@puremagic.com ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/9afbbaf056641edf139e2f94a7d0c4a8f86bb5b3 Fix issue 12923

UTF exception in stride even though passes validate.
The root cause is that decode has very lax checking of the first
code unit.

https://github.com/D-Programming-Language/phobos/commit/cdd26e309d9b8ade1082330c8b06868523ec1a90 Merge pull request #2376 from DmitryOlshansky/issue-12923

Fix issue 12923
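
A small regression check along the following lines can confirm that validate and stride agree on the input from comment #0. It is a sketch that assumes a Phobos containing the fix described in the commit message above; slices (a[]) are passed so the calls do not depend on how static arrays are handled by the templates.

```d
import std.exception : assertThrown;
import std.utf : UTFException, stride, validate;

void main()
{
    // Bytes from comment #0: all three are UTF-8 continuation bytes.
    char[3] a = [167, 133, 175];

    // With stricter checking of the first code unit in decode,
    // validate is expected to reject the input instead of passing it.
    assertThrown!UTFException(validate(a[]));

    // stride already rejected index 0 before the fix.
    assertThrown!UTFException(stride(a[], 0));
}
```

If the first assertion fails, validate is still accepting a continuation byte as the start of a sequence.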

--
July 31, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

--- Comment #5 from github-bugzilla@puremagic.com ---
Commit pushed to 2.066 at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/888897030c1587c36adbd19542a1e431f965e480 Merge pull request #2376 from DmitryOlshansky/issue-12923

Fix issue 12923

--
August 21, 2014
https://issues.dlang.org/show_bug.cgi?id=12923

--- Comment #6 from github-bugzilla@puremagic.com ---
Commit pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/888897030c1587c36adbd19542a1e431f965e480 Merge pull request #2376 from DmitryOlshansky/issue-12923

--