regex on binary data
Thread overview:
Dec 31, 2014: Darrell
Dec 31, 2014: Tobias Pankrath
Dec 31, 2014: ketmar
December 31, 2014
So far, attempts to run regex on binary data cause
"Invalid UTF-8 sequence".

Attempts to pass ubyte also didn't work out.

December 31, 2014
On Wednesday, 31 December 2014 at 15:36:19 UTC, Darrell wrote:
> So far, attempts to run regex on binary data cause
> "Invalid UTF-8 sequence".
>
> Attempts to pass ubyte also didn't work out.


I doubt using anything except (d,w)string is supported or possible.
December 31, 2014
On Wed, 31 Dec 2014 15:36:16 +0000
Darrell via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> So far, attempts to run regex on binary data cause
> "Invalid UTF-8 sequence".
> 
> Attempts to pass ubyte also didn't work out.

The current regex engine assumes that you are using UTF-8 encoded text. I really want the regex engine to support user-supplied input ranges instead, so that decoding can be done by the range (and the regex engine can work on anything, not only on strings), but I'm not ready for that challenge yet. Maybe I'll try to do something with it in 2015. ;-)
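
Until something like that exists, one possible workaround (not proposed by anyone in this thread, just a sketch) is to widen every byte to a dchar before matching. Each value 0..255 is then a valid code point, so the decoder has nothing to reject, and character index i in the widened text is the same as byte index i in the original data. The helper names `widen` and `findPattern` are made up for this illustration, and the sample bytes are arbitrary.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: turns each byte into one dchar (U+0000..U+00FF),
/// so positions in the result line up with byte offsets in the input.
dstring widen(const(ubyte)[] data)
{
    return data.map!(b => cast(dchar) b).array.idup;
}

/// Hypothetical helper: byte offset of the first match, or -1 if none.
ptrdiff_t findPattern(const(ubyte)[] data, dstring pattern)
{
    auto m = matchFirst(widen(data), regex(pattern));
    // m.pre is the part of the input before the match; its length
    // counts dchars, which here equals the number of bytes skipped.
    return m.empty ? -1 : cast(ptrdiff_t) m.pre.length;
}

void main()
{
    // 0xFF and a lone 0x89 are not valid UTF-8 on their own.
    // 0x50 0x4E 0x47 are the ASCII letters "PNG".
    ubyte[] data = [0x00, 0xFF, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A];

    // \xXX in the pattern names a single value by its hex code.
    writeln(findPattern(data, `\x89PNG`d));
}
```

The trade-offs: the widened copy takes four times the memory of the input, and classes such as `\w` or case-insensitive matching will treat values 0x80..0xFF as Latin-1 characters rather than as opaque bytes, so explicit `\xXX` escapes and ranges are the safer way to write patterns here.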