October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #11 from Sum Proxy <sum.proxy@gmail.com> ---
I tried the new version of the compiler with the issue you referred to, but alas - no luck. Please see https://issues.dlang.org/show_bug.cgi?id=1448#c12

SetConsoleCP(65001) and SetConsoleOutputCP(65001) didn't help either.

Thanks.

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

Vladimir Panteleev <thecybershadow@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|x86_64                      |All

--- Comment #12 from Vladimir Panteleev <thecybershadow@gmail.com> ---
Indeed. Happens with both DMC and MSVC runtime.

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #13 from Vladimir Panteleev <thecybershadow@gmail.com> ---
"scanf" misbehaves in the same way. Not a D bug, I think.

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #14 from Sum Proxy <sum.proxy@gmail.com> ---
Do you find it necessary to report the issue elsewhere, or will the guys in charge of https://issues.dlang.org/show_bug.cgi?id=1448 do it?

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #15 from Vladimir Panteleev <thecybershadow@gmail.com> ---
Report it where? To Microsoft?

Figuring out why scanf is failing would probably be the next step to resolving this.

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #16 from Sum Proxy <sum.proxy@gmail.com> ---
Are you referring to C's scanf? Is it consistently reproducible in a small chunk of C code?

October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #17 from Vladimir Panteleev <thecybershadow@gmail.com> ---
Yep:

/////////// test.c ///////////
#include <stdio.h>
#include <string.h>
#include <windows.h>

int main(void)
{
    char buf[1024];
    /* Switch console input and output to UTF-8 (code page 65001). */
    SetConsoleCP(65001);
    SetConsoleOutputCP(65001);
    scanf("%s", buf);
    /* Print the number of bytes scanf stored in buf. */
    printf("%d", (int)strlen(buf));
    return 0;
}
//////////////////////////////

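Editorial note: the repro in comment #17 only prints a length, so it does not show which bytes scanf actually stored. A minimal sketch of a byte dump that could be added for diagnosis follows. The helper name `hex_dump` and its interface are hypothetical and not from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): formats every byte of the
   NUL-terminated string `s` as two uppercase hex digits, separated by
   single spaces, into `out`. Returns the number of bytes formatted, or
   -1 if `out` is too small to hold the result. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);
    /* Each byte needs "XX" plus one separator or the final NUL. */
    size_t needed = (len == 0) ? 1 : len * 3;

    if (out_size < needed)
        return -1;

    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(out + i * 3, 3, "%02X", (unsigned char)s[i]);
        out[i * 3 + 2] = (i + 1 < len) ? ' ' : '\0';
    }
    return (int)len;
}
```

Calling `hex_dump(buf, out, sizeof out)` after the `scanf` and printing `out` would show the raw bytes. For reference, an intact UTF-8 encoding of the Cyrillic letter "й" (U+0439) is the two bytes `D0 B9`.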
October 25, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #18 from Sum Proxy <sum.proxy@gmail.com> ---
From what I know, this program will work incorrectly for any non-ASCII Unicode input, which I have confirmed through simple tests.

scanf and strlen rely on '\0' to indicate string termination, but I don't think this goes well with Unicode strings.

I believe the right way to do something similar (without buffer length) is this:

#include <stdio.h>
#include <fcntl.h>
#include <io.h>

int main( void )
{
    wchar_t buf[1024];

    _setmode( _fileno( stdin ), _O_U16TEXT );
    _setmode( _fileno( stdout ), _O_U16TEXT );

    wscanf( L"%ls", buf );
    wprintf( L"%s", buf );
}

For further info please refer to http://www.siao2.com/2008/03/18/8306597.aspx and http://msdn.microsoft.com/en-us/library/tw4k6df8%28v=vs.120%29.aspx

HTH, Thanks.

October 26, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #19 from Vladimir Panteleev <thecybershadow@gmail.com> ---
(In reply to Sum Proxy from comment #18)
> scanf and strlen rely on '\0' to indicate string termination, but I don't
> think this goes well with Unicode strings.

Not true. At least, not true with UTF-8, which is what we set the CP to.

> I believe the right way to do something similar (without buffer length) is
> this:

I would not say that's the "right" way. That's the way to read wchar_t text, but we need UTF-8 text.

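Editorial note: the point in comment #19 can be illustrated with a small sketch. In UTF-8, every byte of a multi-byte sequence has its high bit set, so a zero byte only ever encodes U+0000 and '\0' termination keeps working; `strlen` simply reports bytes rather than characters. The names `utf8_code_points` and `sample` below are hypothetical, and the counter assumes its input is valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Counts Unicode code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (bit pattern 10xxxxxx). Assumes the input is valid
   UTF-8; it performs no validation. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* The Russian word "привет" spelled out as UTF-8 bytes:
   six code points, two bytes each, and no zero byte before the terminator. */
static const char sample[] =
    "\xD0\xBF\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
```

Here `strlen(sample)` is 12 (bytes) while `utf8_code_points(sample)` is 6 (characters), which is the distinction between byte length and character count that the exchange above turns on.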
October 28, 2014 [Issue 12990] utf8 string not read/written to windows console

https://issues.dlang.org/show_bug.cgi?id=12990

--- Comment #20 from Sum Proxy <sum.proxy@gmail.com> ---
I believe the problem is that the default internal representation of Unicode in Windows is UTF-16, which implies that some sort of conversion would be necessary here. I haven't found a way to do it right yet.

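Editorial note: the conversion mentioned in comment #20 can be sketched in portable C. On Windows, the Win32 function `WideCharToMultiByte` with `CP_UTF8` performs this step, for example on text obtained from `ReadConsoleW`; the thread does not confirm that this route resolves the issue. The hand-written version below only shows what the conversion involves. The name `utf16_to_utf8` and its interface are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts `n` UTF-16 code units from `in` into a
   NUL-terminated UTF-8 string in `out`. Returns the number of bytes
   written (excluding the NUL), or (size_t)-1 on an unpaired surrogate
   or if `out` is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, char *out, size_t out_size)
{
    size_t o = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            return (size_t)-1;
        }

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + len + 1 > out_size)   /* keep room for the final NUL */
            return (size_t)-1;

        if (len == 1) {
            out[o++] = (char)cp;
        } else if (len == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    out[o] = '\0';
    return o;
}
```

For example, the UTF-16 code unit `0x0439` ("й") becomes the two bytes `D0 B9`, and the surrogate pair `0xD83D 0xDE00` (U+1F600) becomes the four bytes `F0 9F 98 80`.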