January 29, 2005
Currently the lexer allows invalid UTF-8 in comments.
Here is a patch to make it check all comments as well.

It's just a crude copy-and-paste, since that seems
to be the convention in the current lexer.c source code. :-)

The check is added in all three places where the lexer skips over comment characters:
//  (line comments)
/*  (block comments)
/+  (nesting comments)
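
For readers without the patch at hand, here is a rough sketch of the kind of check meant above. It is not the actual patch and not code from lexer.c: the function names utf8_seq_len and comment_is_valid_utf8 are made up for illustration, and the sketch assumes the comment text is available as a byte buffer with a known length.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from lexer.c.
   Returns the length (1 to 4) of the well-formed UTF-8 sequence that
   starts at s, or 0 if the bytes are not well-formed. n is the number
   of bytes available starting at s. */
size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned char c;
    unsigned long cp;
    size_t len, i;

    if (n == 0)
        return 0;
    c = s[0];
    if (c < 0x80)
        return 1;                       /* plain ASCII */
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
    else
        return 0;                       /* stray continuation byte, or 0xF8..0xFF */

    if (n < len)
        return 0;                       /* sequence is cut off */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }

    /* Reject overlong encodings, surrogates, and out-of-range values. */
    if (len == 2 && cp < 0x80)    return 0;
    if (len == 3 && cp < 0x800)   return 0;
    if (len == 4 && cp < 0x10000) return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp > 0x10FFFF) return 0;
    return len;
}

/* Hypothetical helper: returns 1 if all n bytes of the comment text
   are well-formed UTF-8, 0 otherwise. */
int comment_is_valid_utf8(const char *text, size_t n)
{
    const unsigned char *p = (const unsigned char *)text;
    size_t i = 0;

    while (i < n) {
        size_t len = utf8_seq_len(p + i, n - i);
        if (len == 0)
            return 0;
        i += len;
    }
    return 1;
}
```

A lexer would presumably run a check like this at each of the three skip loops and report an error instead of returning 0; how the real patch reports it is not shown here.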

It seems to be working OK with GDC, as far as I can tell (I haven't run the regression suite yet).

--anders

PS. Walter, here are some neat tools:
     http://unxutils.sourceforge.net/




February 01, 2005
Anders F Björklund wrote on Sat, 29 Jan 2005 23:08:39 +0100:
> Currently the lexer allows invalid UTF-8 in comments.

I've added a bunch of invalid UTF tests to DStress:

http://dstress.kuehne.cn/nocompile/invalid_utf_01.d
...
http://dstress.kuehne.cn/nocompile/invalid_utf_43.d

Note: The invalid UTF tests aren't complete yet.

Thomas

