Created attachment 70
bugfix

As we are rapidly fixing bugs, here is another one. I reported it to mainline
DMD and it still has not been fixed there, but it can be fixed in GDC.

EncodingSchemeUtf16Native.decode() and EncodingSchemeUtf32Native.decode()
should take the size of the character type into account when slicing the
decoded bytes off the input buffer. A patch is attached. Here is a test case:

import std.encoding;


void testUTF16 () {
  version(LittleEndian) {
    auto efrom = EncodingScheme.create("utf-16le");
    ubyte[6] sample = [154,1, 155,1, 156,1];
  }
  version(BigEndian) {
    auto efrom = EncodingScheme.create("utf-16be");
    ubyte[6] sample = [1,154, 1,155, 1,156];
  }
  const(ubyte)[] ub = cast(const(ubyte)[])sample;
  dchar dc = efrom.safeDecode(ub);
  assert(dc == 410);
  assert(ub.length == 4);
}


void testUTF32 () {
  version(LittleEndian) {
    auto efrom = EncodingScheme.create("utf-32le");
    ubyte[12] sample = [154,1,0,0, 155,1,0,0, 156,1,0,0];
  }
  version(BigEndian) {
    auto efrom = EncodingScheme.create("utf-32be");
    ubyte[12] sample = [0,0,1,154, 0,0,1,155, 0,0,1,156];
  }
  const(ubyte)[] ub = cast(const(ubyte)[])sample;
  dchar dc = efrom.safeDecode(ub);
  assert(dc == 410);
  assert(ub.length == 8);
}


void main () {
  testUTF16();
  testUTF32();
}
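
For illustration, here is a minimal sketch of the kind of slicing the report describes. It is not the attached patch and not the Phobos source: decodeUtf16Native is a hypothetical helper, and the assumption is that the input is advanced by a count of code units that has to be scaled by the character size to get bytes.

```d
import std.encoding : decode;

// Hypothetical helper (an assumption, not library code): decodes one code
// point from a native-endian UTF-16 byte buffer and advances the buffer.
dchar decodeUtf16Native(ref const(ubyte)[] s)
{
    auto t = cast(const(wchar)[]) s;  // view the bytes as UTF-16 code units
    dchar c = decode(t);              // advances t, counted in code units
    // t.length is in wchar code units, so scale it by wchar.sizeof to get
    // the number of bytes that remain in s.
    s = s[$ - t.length * wchar.sizeof .. $];
    return c;
}

void main()
{
    // Three copies of U+019A (decimal 410) in native byte order.
    immutable(wchar)[] text = "\u019A\u019A\u019A"w;
    const(ubyte)[] bytes = cast(const(ubyte)[]) text;
    assert(bytes.length == 6);

    dchar c = decodeUtf16Native(bytes);
    assert(c == 410);
    // One 2-byte code unit consumed; without the scaling only 2 bytes
    // would be left instead of 4.
    assert(bytes.length == 4);
}
```

The UTF-32 case would follow the same pattern with dchar and dchar.sizeof in place of wchar and wchar.sizeof.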

Bug ID	138
Summary	std.encoding: EncodingSchemeUtf16Native and EncodingSchemeUtf32Native invalid splicing
Product	GDC
Version	development
Hardware	All
OS	All
Status	NEW
Severity	normal
Priority	Normal
Component	libgphobos
Assignee	ibuclaw@gdcproject.org
Reporter	ketmar@ketmar.no-ip.org