March 16, 2004 Could not believe it!

/*

Strange Bug:

This program produces erroneous outputs,
if the number of characters joined together
in the printf() function matches those in the
examples shown.


System:
dmd V0.81 on WinXP


Output:

Look at - this!
Look at               <<<<< joined array contents duplicated

Test2: Look + at      <<<<< correct - unless 1 char removed
this!


Now look at that#!    <<<<< control character inserted

*/

import std.string;

char[][] a;
char[][] b;

int main (char[][] args) {

    // This produces an incorrect output:

    a~="Look"; a~="at";
    printf("\n");
    printf(join(a," ") ~ " - this!\n");


    // This is fine, but when for example
    // 'Test2' is changed to 'Test' the
    // output will be incorrect

    printf("\n\n");
    printf("Test2: " ~ join(a," + ") ~ "\n");
    printf("this!\n");


    // This puts a control character between
    // 'that' and the exclamation mark

    printf("\n\n");
    b~="Now"; b~="look"; b~="at"; b~="that";
    printf(join(b," "));
    printf("!\n");

    return(0);
}
March 17, 2004 Re: Could not believe it!

Posted in reply to Bob W

Nothing strange. printf is a C function, and in C, strings are 0-terminated. In D, only string literals are 0-terminated (for compatibility); generic strings resulting from array operations are not.

See http://www.prowiki.org/wiki4d/wiki.cgi?HowTo/printf

You can use something like toStringz to make a null-terminated string from a D string.

In general, printf will either get "banned" (from public examples etc., not from the library) or we will put a D-specific version together. We haven't decided yet. Stream IO from Phobos is strongly recommended over C IO unless you know exactly what you are doing.

So welcome here and good luck.

-eye
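For readers trying this on a current compiler, here is a minimal sketch of the toStringz approach mentioned above. It assumes present-day D (the `string` type, `core.stdc.stdio`, `std.array.join`) rather than dmd 0.81, and `buildLine` is a hypothetical helper introduced only for this illustration.

```d
import core.stdc.stdio : printf;
import std.array : join;
import std.string : toStringz;

/// Hypothetical helper: joins the words with single spaces and appends
/// a suffix at run time, so the result is an ordinary D array with no
/// guaranteed 0 behind its last character.
string buildLine(string[] words, string suffix)
{
    return words.join(" ") ~ suffix;
}

void main()
{
    string line = buildLine(["Look", "at"], " - this!\n");

    // toStringz hands back a pointer to 0-terminated data, which is
    // what the C function expects for %s.
    printf("%s", line.toStringz);
}
```

Passing the text as an argument to a fixed "%s" format, instead of using it as the format string itself, also avoids surprises if the joined text ever contains a % character.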
March 17, 2004 Re: Could not believe it!

Posted in reply to Ilya Minkov

Thanks for your reply. If I understand you correctly, the following would happen:

// This works because a literal is used as parameter1
printf("Simple string 1\n");

// This will still work because 's' points to a literal
char[] s="Simple string 2\n";
printf(s);

// This is ok, because both literals are merged during compile-time
printf("Simple " ~ "string 3\n");

// Just this one is a problem because variable and literal seem
// to be merged at runtime, so a 'genuine' D string is created.
// Furthermore the end of the new string is sitting just before a
// 16-bytes boundary, which prevents eventual zero padding to
// come as a rescue.
char[] s="Simple ";
printf(s ~ "string 4\n");    // quick fix: add '\0' after '\n' ?
March 18, 2004 Re: Could not believe it!

Posted in reply to Bob W

Bob W wrote:
> Thanks for your reply.

You're welcome!

> If I understand you correctly the following would happen:

Everything is right except for:

> // This is ok, because both literals are merged during compile-time
>
> printf("Simple " ~ "string 3\n");

With DMD that's true, but I'm not sure it is defined whether this should be so or not. That is, the current compiler does it this way, but if we get popular and get rivals, this might mean a non-null-terminated array there. It is not even defined whether this concatenation happens at compile time or at execution time.

> // Just this one is a problem because variable and literal seem
> // to be merged at runtime, so a 'genuine' D string is created.
> // Furthermore the end of the new string is sitting just before a
> // 16-bytes boundary, which prevents eventual zero padding to
> // come as a rescue.
>
> char[] s="Simple ";
> printf(s ~ "string 4\n"); // quick fix: add '\0' after '\n' ?

True. But you cannot rely on zero padding to work either. Adding \0 works, but such a string only makes sense for C functions. In D functions this might cause problems, because when, e.g., you concatenate something else to the end, you get a string with an embedded 0!

The thing is, when constant strings are emitted, they are padded with zeroes at the end. At runtime, a slice (a slice is a value consisting of a data pointer and a length) into the constant area is assigned to the array variable. So there is a 0 right behind the array bound. As soon as any operation that increases the length is done, the array data is required to be copied. A new memory area is allocated. Thus the zeroes are lost. In fact, it is a convention to copy on any change except for slicing.

There are also other funny things which may happen, including:

* You slice into a string literal. The simplest case is that you have a string and decrease its length. You printf it, and the output does not stop at the string's real end but runs on to the 0 terminator, i.e. the original length.
* You printf using a format string which contains %s, and something afterwards. This something gets replaced by noise... %s is the wrong format, you should use ... can't remember, see the link.
* Functions can write into the arrays they get as input, but if you change the length, the change is not propagated back to the caller. In other words, the semantics are semi-constant: you have to make sure you either copy an array (array.dup) before changing it, if the change need not be propagated, or use the inout modifier.

I think this should be in some newbie FAQ; someone please add it if it's not. It's too late here.

-eye
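The last bullet above (element writes are visible to the caller, length changes are not) can be sketched as follows. This is an illustration under assumptions, not code from the thread: it targets present-day D, where `ref` plays the role that the `inout` parameter modifier had in early D, and the three function names are made up for the example.

```d
import std.stdio : writeln;

/// Appends to a by-value slice: only the local copy of the
/// (pointer, length) pair changes, so the caller's length stays put.
void appendByValue(char[] s)
{
    s ~= '!';
}

/// Appends through ref, so the caller sees the new length.
void appendByRef(ref char[] s)
{
    s ~= '!';
}

/// Writes into an element of a by-value slice: the caller sees this,
/// because both slices refer to the same memory.
void overwriteFirst(char[] s)
{
    if (s.length > 0)
        s[0] = 'X';
}

void main()
{
    char[] text = "that".dup;   // mutable copy of the literal

    appendByValue(text);
    writeln(text.length);       // 4

    overwriteFirst(text);
    writeln(text);              // Xhat

    appendByRef(text);
    writeln(text);              // Xhat!

    // Slicing only narrows the view; the characters behind it remain.
    char[] head = text[0 .. 2];
    writeln(head, " / ", text); // Xh / Xhat!
}
```

If a callee must not disturb the caller's data at all, taking a copy with `.dup` before writing, as the post suggests, is the other half of the convention.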
March 18, 2004 Re: Could not believe it!

Posted in reply to Ilya Minkov

Got the info that my last reply to this thread was misposted. I guess this one should work:

----- Original Message -----
From: "Ilya Minkov" <minkov@cs.tum.edu>
Newsgroups: D
Sent: Thursday, 18 March, 2004 01:02
Subject: Re: Could not believe it!

> Bob W wrote:
> > Thanks for your reply.
>
> You're welcome!
>
> > If I understand you correctly the following would happen:
>
> Everything is right except for:
>
> > // This is ok, because both literals are merged during compile-time
> >
> > printf("Simple " ~ "string 3\n");
>
> With DMD it's true, but I'm not sure it is defined whether this should be so or not. That is, the current compiler does it so, but if we get popular and get rivals, this might mean a non-null-terminated array there. It is not even defined whether this concatenation happens at compile time or execution time.

I am aware of this, but I just wanted to correctly understand the phenomena I've experienced.

> > // Just this one is a problem because variable and literal seem
> > // to be merged at runtime, so a 'genuine' D string is created.
> > // Furthermore the end of the new string is sitting just before a
> > // 16-bytes boundary, which prevents eventual zero padding to
> > // come as a rescue.
> >
> > char[] s="Simple ";
> > printf(s ~ "string 4\n"); // quick fix: add '\0' after '\n' ?
>
> True. But you cannot rely on zero padding to work either. Adding \0 works, but such a string only makes sense for C functions. In D functions, this might cause problems, because when, e.g., you concatenate something else to the end, you get a string with an embedded 0!

As long as the string is only used for display, I personally would not worry about a '\0' being added. Otherwise it is a potential pitfall, I agree.

> The thing is, when constant strings are emitted, they are padded with zeroes at the end. At runtime, a slice (a slice is a value consisting of a data pointer and a length) into the constant area is assigned to the array variable. So there is a 0 right behind the array bound. As soon as any operation that increases the length is done, the array data is required to be copied. A new memory area is allocated. Thus the zeroes are lost. In fact, it is a convention to copy on any change except for slicing.
>
> There are also other funny things which may happen, including:
>
> * You slice into a string literal. The simplest case is that you have a string and decrease its length. You printf it, and the output does not stop at the string's real end but runs on to the 0 terminator, i.e. the original length.
> * You printf using a format string which contains %s, and something afterwards. This something gets replaced by noise... %s is the wrong format, you should use ... can't remember, see the link.

If you are referring to the '%.*s' crutch, I was quite astonished that no other way has been found to get printf() to work with D strings. I know that D is still in the alpha stage, but offering something like '%t' instead of '%.*s' to handle D-type strings would help, because printf is something almost everyone will be using during an initial evaluation of D and beyond. Obscuring one of the most popular conversion specifiers for printf() probably does not really help in getting D promoted. Besides, a recent survey has shown that for some reason C++ is losing market share to good old C, so printf() is here to stay for the next 100 years or so anyway ..... : )

> * Functions can write into the arrays they get as input, but if you change the length, the change is not propagated back to the caller. In other words, the semantics are semi-constant: you have to make sure you either copy an array (array.dup) before changing it, if the change need not be propagated, or use the inout modifier.

That is good to know, thanks.

> I think this should be in some newbie FAQ; someone please add it if it's not. It's too late here.
>
> -eye
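As a rough sketch of how the '%.*s' specifier discussed in this post works: the '*' tells printf to read the maximum number of characters from an int argument, so no 0 terminator is needed. The code assumes a current D compiler, where the length and pointer are passed explicitly; early compilers such as dmd 0.81 reportedly let the array itself stand in for both arguments. `precisionOf` is a hypothetical helper, not part of any library.

```d
import core.stdc.stdio : printf;

/// Hypothetical helper: clamps a D array length to the int that the
/// '*' precision argument of printf expects.
int precisionOf(const(char)[] s)
{
    return s.length > int.max ? int.max : cast(int) s.length;
}

void main()
{
    string whole = "Now look at that";
    string part = whole[0 .. 8];   // "Now look", no 0 right behind it

    // %.*s consumes two arguments: an int precision, then a char
    // pointer. At most that many characters are printed.
    printf("[%.*s]\n", precisionOf(part), part.ptr);

    // By contrast, printf("[%s]\n", part.ptr) would keep reading
    // until it meets a 0, which here lies behind the whole literal.
}
```

The explicit clamp is only there because the precision is an int while D lengths are size_t; for ordinary strings a plain cast does the same job.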