Thread overview |
---|
October 14, 2008 [Issue 2418] New: Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=2418 Summary: Same-value string (char[]) literals get overwritten (unlike array literals) Product: D Version: 1.035 Platform: PC OS/Version: Windows Status: NEW Severity: normal Priority: P2 Component: DMD AssignedTo: bugzilla@digitalmars.com ReportedBy: business3@twistedpairgaming.com When a string is initialized with a string literal, and part of the string is then changed, all other string literals with the same value reflect the same change. For the following code: ------------------ module test; import tango.io.Stdout; class FooChar { char[] str; this(char[] _str) { this.str = _str; } } class FooInt { int[] ints; this(int[] _ints) { this.ints = _ints; } } void main() { FooChar a; FooChar b; FooChar c; FooChar d; a = new FooChar("AAA"); Stdout.formatln("a.str: {}", a.str); a.str[0] = '!'; Stdout.formatln("a.str: {}", a.str); a = new FooChar("AAA"); Stdout.formatln("a.str: {}", a.str); Stdout.formatln(""); b = new FooChar("AAA"); Stdout.formatln("b.str: {}", b.str); b.str[0] = '!'; Stdout.formatln("b.str: {}", b.str); b = new FooChar("AAA"); Stdout.formatln("b.str: {}", b.str); Stdout.formatln(""); c = new FooChar("AA1"); Stdout.formatln("c.str: {}", c.str); c.str[0] = '!'; Stdout.formatln("c.str: {}", c.str); c = new FooChar("AA1"); Stdout.formatln("c.str: {}", c.str); Stdout.formatln(""); const char[] dInit = "AA2"; d = new FooChar(dInit); Stdout.formatln("d.str: {}", d.str); d.str[0] = '!'; Stdout.formatln("d.str: {}", d.str); d = new FooChar(dInit); Stdout.formatln("d.str: {}", d.str); Stdout.formatln(""); FooInt i; i = new FooInt([1, 2]); Stdout.formatln("i.ints: {}", i.ints); i.ints[0] = 77; Stdout.formatln("i.ints: {}", i.ints); i = new FooInt([1, 2]); Stdout.formatln("i.ints: {}", i.ints); } ------------------ Expected output: a.str: AAA a.str: !AA a.str: AAA b.str: AAA b.str: !AA b.str: AAA c.str: AA1 c.str: !A1 c.str: AA1 d.str: AA2 d.str: !A2 d.str: AA2 i.ints: [1, 2] i.ints: [77, 2] i.ints: [1, 2] Actual output: a.str: AAA a.str: !AA a.str: !AA b.str: !AA b.str: !AA b.str: !AA c.str: AA1 c.str: !A1 c.str: !A1 d.str: AA2 d.str: !A2 d.str: !A2 i.ints: [1, 2] i.ints: [77, 2] i.ints: [1, 2] This might be a side-effect of the fix for #817: http://d.puremagic.com/issues/show_bug.cgi?id=817 -- |
October 14, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 ------- Comment #1 from business3@twistedpairgaming.com 2008-10-14 17:59 ------- Sorry, I guess those classes are not needed: --------------- module test; import tango.io.Stdout; void main() { char[] a; char[] b; char[] c; char[] d; a = "AAA"; Stdout.formatln("a: {}", a); a[0] = '!'; Stdout.formatln("a: {}", a); a = "AAA"; Stdout.formatln("a: {}", a); Stdout.formatln(""); b = "AAA"; Stdout.formatln("b: {}", b); b[0] = '!'; Stdout.formatln("b: {}", b); b = "AAA"; Stdout.formatln("b: {}", b); Stdout.formatln(""); c = "AA1"; Stdout.formatln("c: {}", c); c[0] = '!'; Stdout.formatln("c: {}", c); c = "AA1"; Stdout.formatln("c: {}", c); Stdout.formatln(""); const char[] dInit = "AA2"; d = dInit; Stdout.formatln("d: {}", d); d[0] = '!'; Stdout.formatln("d: {}", d); d = dInit; Stdout.formatln("d: {}", d); Stdout.formatln(""); int[] i; i = [1, 2]; Stdout.formatln("i: {}", i); i[0] = 77; Stdout.formatln("i: {}", i); i = [1, 2]; Stdout.formatln("i: {}", i); } --------------- Expected output: a: AAA a: !AA a: AAA b: AAA b: !AA b: AAA c: AA1 c: !A1 c: AA1 d: AA2 d: !A2 d: AA2 i: [1, 2] i: [77, 2] i: [1, 2] Actual output: a: AAA a: !AA a: !AA b: !AA b: !AA b: !AA c: AA1 c: !A1 c: !A1 d: AA2 d: !A2 d: !A2 i: [1, 2] i: [77, 2] i: [1, 2] -- |
October 14, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 2korden@gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |INVALID ------- Comment #2 from 2korden@gmail.com 2008-10-14 18:01 ------- The bug is at your site. You don't make string copies and work on a single string. char[] a = "AAA"; char[] b = a; // both point to the same location a[0] = '!'; assert(b[0] == '!'); // should be true since they both point to the same location Same example in C (this may help understand better): char* str = "AAA"; char* ptr = str; str[0] = '!'; assert(ptr[0] == '!'); The actual bug is that the following line should not compile: char[] a = "AAA"; Problem is, D1 doesn't have invariant type modifier and the spec is frozen, so this will never be fixed. D2, however, doesn't allow you to do this. Solution would be to make copies explicitly: a = new FooChar("AAA".dup); -- |
October 15, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 business3@twistedpairgaming.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED Resolution|INVALID | ------- Comment #3 from business3@twistedpairgaming.com 2008-10-14 20:34 ------- (In reply to comment #2) > The bug is at your site. You don't make string copies and work on a single string. > > char[] a = "AAA"; > char[] b = a; // both point to the same location > a[0] = '!'; > assert(b[0] == '!'); // should be true since they both point to the same > location > > Same example in C (this may help understand better): > char* str = "AAA"; > char* ptr = str; > str[0] = '!'; > assert(ptr[0] == '!'); > I don't think that's applicable (or at least, it shouldn't be). Here's a shortened example: char[] a = "AAA"; a[0] = '!'; char[] b = "AAA"; It's absurd that 'b' should become "!AA". It's clear that the compiler is "optimizing" both "AAA" string literals into the same reference: char[] _tmp1 = "AAA"; char[] a = _tmp1; a[0] = '!'; char[] b = _tmp1; But I'm saying that transformation should not be occurring. The two "AAA" literals should not share the same reference. In D2, array literals are immutable, so yes, in D2 it works out fine. But in D1, array literals are *not* immutable, so the compiler shouldn't assume that array literals with a equal *value* can safely share the same *reference*. > The actual bug is that the following line should not compile: char[] a = "AAA"; Again, this is for D1, so the idea of strings being immutable doesn't apply. Besides, at the very least there's still an inconsistency: char[] a = "AAA"; a[0] = '!'; char[] b = "AAA"; // b is "!AA" char[] c = ['A', 'A', 'A']; c[0] = '!'; char[] d = ['A', 'A', 'A']; // d is "AAA" The above shows that two occurrences of "AAA" share the same reference, but two occurrences of ['A', 'A', 'A'] have separate references. The former leads to odd side-effects, the latter does not. -- |
October 15, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 2korden@gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|REOPENED |RESOLVED Resolution| |INVALID ------- Comment #4 from 2korden@gmail.com 2008-10-15 00:02 ------- > ------- Comment #3 from business3@twistedpairgaming.com 2008-10-14 > > I don't think that's applicable (or at least, it shouldn't be). Here's a > shortened example: > > char[] a = "AAA"; > a[0] = '!'; > char[] b = "AAA"; > > It's absurd that 'b' should become "!AA". Well, that's what I expect, at least :) > It's clear that the compiler is > "optimizing" both "AAA" string literals into the same reference: > > char[] _tmp1 = "AAA"; > char[] a = _tmp1; > a[0] = '!'; > char[] b = _tmp1; Of course it does! This string literal is put into executable. "AAA" is just a pointer to it. Please, understand that char[] is nothing more than a pointer to a string and an associated length. When you change the string via pointer, everyone who has that pointer gets the change. Compare to object: class A { int i = 15; } A aaa = new A(); // aaa is an analog of "AAA" A a = aaa; writefln(a); // prints 15 a.i = 42; A b = aaa; writefln(b); // why the hell it prints 42???? This is because you don't make a copy of aaa. Here is another example: while (true) { char[] a = "AAA"; a[0] = '!'; // in the next iteration, should a be "AAA" again? } If answer is yes, then there should be a memory allocation in line 2 under the hood. You can't get new string every time without a memory allocation. This, however, definitely should not occur. If you want to make an allocation - do it yourself explicitely. Code with the behaviour you expect should be as follows: while (true) { char[] a = "AAA".dup; writefln(a); a[0] = '!'; writefln(a); } This is by design, it works as it should. It can't be 'fixed' because it is not broken. > Besides, at the very least there's still an inconsistency: > > char[] a = "AAA"; > a[0] = '!'; > char[] b = "AAA"; // b is "!AA" > > char[] c = ['A', 'A', 'A']; > c[0] = '!'; > char[] d = ['A', 'A', 'A']; // d is "AAA" I agree, this might be a bug, d really ought to be "!AAA"! Do you want to create a new bugreport so that the latter case would be 'fixed'? :) -- |
October 15, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 ------- Comment #5 from kamm-removethis@incasoftware.de 2008-10-15 02:36 ------- The D spec says "String literals are immutable (read only)." even in D1. So code like this char[] a = "ABC"; a[0] = '!'; is illegal (and will segfault on linux!). Array literals are allocated on the heap (the D spec explicitly says so) and will thus behave fundamentally different from string literals. -- |
October 15, 2008 [Issue 2418] Same-value string (char[]) literals get overwritten (unlike array literals) | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2418 ------- Comment #6 from business3@twistedpairgaming.com 2008-10-15 05:13 ------- (In reply to comment #5) > The D spec says "String literals are immutable (read only)." even in D1. So > code like this > > char[] a = "ABC"; > a[0] = '!'; > > is illegal (and will segfault on linux!). > > Array literals are allocated on the heap (the D spec explicitly says so) and will thus behave fundamentally different from string literals. > Alright, I see now. It does seem odd though that array literals would carry a type that lacks any sort of const/read-only/immutable/etc qualifier, but is still immutable anyway. Just another of D2's improvements I guess. Out of curiosity, any idea if the lack of a run-time error on windows is due to windows itself or just something DMD/OPTLINK does differently between the platforms? -- |
Copyright © 1999-2021 by the D Language Foundation