February 22, 2017
On Wednesday, 22 February 2017 at 20:01:57 UTC, Adam D. Ruppe wrote:
> On Wednesday, 22 February 2017 at 19:26:15 UTC, berni wrote:
>> Therefore I'd like to make sure that the string the program read is only made up of ascii characters.
>
> Easiest:
>
> foreach(char ch; postscript)
>   if(ch > 127) throw new Exception("non-ascii detected");

:)
February 23, 2017
On Wednesday, 22 February 2017 at 21:23:45 UTC, H. S. Teoh wrote:
> 	enforce(!s.any!"a > 127");

Phew, that's a lot of possibilities to choose from now... I thought of something like the foreach loop but wasn't sure if that is correct for all UTF encodings. All in all, I think I'll take the any approach, because it feels a little more like looking at the string as a whole, and I like to use enforce.
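
On the encoding worry: in UTF-8 every byte of a multi-byte sequence has its high bit set, so comparing individual code units against 127 is enough to detect any non-ASCII character. Here is a minimal sketch of the any/enforce variant that looks at code units directly. The helper name `enforceAscii` is made up for illustration, and using `byCodeUnit` is my own assumption about how to sidestep auto-decoding; as far as I know, plain `s.any!"a > 127"` decodes the string first and may throw a UTFException on invalid input before the check itself fires.

```d
import std.algorithm.searching : any;
import std.exception : enforce;
import std.utf : byCodeUnit;

/// Hypothetical helper (not from the thread): throws an Exception
/// if `s` contains any code unit outside the 7-bit ASCII range.
void enforceAscii(string s)
{
    // byCodeUnit walks the raw UTF-8 code units without decoding them,
    // so each element is a single char (one byte).
    enforce(!s.byCodeUnit.any!(c => c > 127), "non-ascii detected");
}

void main()
{
    enforceAscii("%!PS-Adobe-3.0"); // plain ASCII, passes silently
    // enforceAscii("caf\u00e9");   // would throw: 'é' is two bytes >= 0x80
}
```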

Thanks for all your answers!

February 23, 2017
On Thursday, 23 February 2017 at 08:34:53 UTC, berni wrote:
> On Wednesday, 22 February 2017 at 21:23:45 UTC, H. S. Teoh wrote:
>> 	enforce(!s.any!"a > 127");
>
> Phew, that's a lot of possibilities to choose from now... I thought of something like the foreach loop but wasn't sure if that is correct for all UTF encodings. All in all, I think I'll take the any approach, because it feels a little more like looking at the string as a whole, and I like to use enforce.
>
> Thanks for all your answers!

All the examples given here are very nice.
But alas this will not work with postscript files as found in the wild.

> In my program, I read a postscript file. Normal postscript files should only be composed of ascii characters, but one never knows what users give us. Therefore I'd like to make sure that the string the program read is only made up of ascii characters.

Generally postscript files may contain binary data.
Think of included images or font data.
So in postscript files there should normally be no utf-8 encoded text, but binary data are quite usual.
Think of postscript files as a sequence of ubytes.
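
To illustrate the "sequence of ubytes" view, here is a rough sketch that reads a file as raw bytes and checks whether all of them are in the 7-bit range. The name `isSevenBit` and the command-line handling are only assumptions for the example, not anything from the original program.

```d
import std.algorithm.searching : all;
import std.file : read;
import std.stdio : writeln;

/// Hypothetical helper: true if every byte is below 0x80 (7-bit clean).
/// An empty buffer counts as clean.
bool isSevenBit(const(ubyte)[] data)
{
    // ubyte[] is not a string type, so no UTF decoding happens here.
    return data.all!(b => b < 0x80);
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: check7bit <file>");
        return;
    }
    // std.file.read returns void[]; view it as raw bytes.
    auto bytes = cast(const(ubyte)[]) read(args[1]);
    writeln(isSevenBit(bytes) ? "7-bit clean" : "contains bytes >= 0x80");
}
```

Since the helper takes a slice, it can also be applied to just the part of the file that is supposed to be text.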
February 23, 2017
On Thursday, 23 February 2017 at 17:44:05 UTC, HeiHon wrote:
> Generally postscript files may contain binary data.
> Think of included images or font data.
> So in postscript files there should normally be no utf-8 encoded text, but binary data are quite usual.
> Think of postscript files as a sequence of ubytes.

As far as I know, images and font data have to be Clean7Bit too (they are not human readable, though). But PostScript files can contain preview images, which can be binary. I know about this. I just tried to keep my question simple -- and actually I'm only testing the part of the PostScript file where I know that binary data must not occur.