May 10, 2008 [Issue 2093] New: string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=2093

           Summary: string concatenation modifies original
           Product: D
           Version: 2.014
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: bartosz@relisoft.com

I will attach source code for this example. It's an XML parser. It should produce the following output:

    c:\D\Work>xml root child color=red Text=foo bar baz

Instead it produces this:

    c:\D\Work>xml root rootd rootd=red Text=rootdar baz

The problem is that strings are modified after being copied, when the original is appended to. The problem goes away if I idup strings:

    _name = name.idup;
    _value = value.idup;

or when I replace

    a ~= b;

with

    a = a ~ b;

-- |
May 10, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

------- Comment #1 from bartosz@relisoft.com 2008-05-10 14:31 -------
Created an attachment (id=256)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=256&action=view)
Test case

-- |
May 10, 2008 Re: [Issue 2093] New: string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | <d-bugmail@puremagic.com> wrote in message news:bug-2093-3@http.d.puremagic.com/issues/...
> or when I replace
> a ~= b;
> with
> a = a ~ b;

~ always creates a copy, but ~= will attempt to expand the array in-place. Now, if this is D2, and ~= is expanding an invariant(char)[] in-place, then _that_ is definitely an issue.
|
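The copy-on-concatenation half of that statement can be checked with a minimal sketch. The variable names below are illustrative and not taken from the attached parser; the sketch shows only that `a = a ~ b` leaves other references to the old data untouched, and deliberately makes no claim about what `~=` does in any particular compiler release.

```d
import std.stdio;

void main()
{
    string a = "foo";
    string saved = a;   // second reference to the same characters
    string b = "bar";

    a = a ~ b;          // ~ builds a new array; the old data is not written to

    assert(a == "foobar");
    assert(saved == "foo");        // the earlier reference still sees "foo"
    assert(a.ptr !is saved.ptr);   // a now points at freshly allocated storage

    writeln(a, " ", saved);
}
```

This is why the reporter's rewrite from `a ~= b` to `a = a ~ b` hides the symptom: the append never touches memory that another string may still be looking at.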
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg@iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg@iname.com

------- Comment #2 from smjg@iname.com 2008-11-21 19:09 -------
Welcome to the world of bug reporting.

The way to report a bug isn't to attach a 695-line program that contains some functionality somewhere that exhibits the problem. The correct manner is to post a small example that illustrates the problem, typically either by writing a test program from scratch or by simplifying little by little the program in which you found it. If done well, the result will be small enough to post straight into the bug report rather than attaching it.

DMD's code coverage analysis is a useful tool for identifying unused parts of a program in order to cut them out, among other things.

-- |
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg@iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code

------- Comment #3 from smjg@iname.com 2008-11-21 20:40 -------
I think I've finally managed to figure out what was going on.

----------
import std.stdio;

void main() {
    string s1, s2;
    s1 ~= "hello";
    s2 = s1;
    writefln(s1);
    writefln(s2);
    s1.length = 0;
    s1 ~= "Hi";
    writefln(s1);
    writefln(s2);
}
----------
hello
hello
Hi
Hillo
----------

This is the kind of testcase we like here. Walter is more likely to fix a bug if you make life easier for him by supplying something on which the cause can easily be seen.

-- |
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

------- Comment #4 from 2korden@gmail.com 2008-11-22 00:39 -------
This is a known bug and is a major array design flaw. Arrays have no determined owner (the only one who can grow without a reallocation if capacity permits):

import std.stdio;

void main()
{
    char[] s1, s2;
    s1.length = 100; // reserve the capacity
    s1.length = 0;

    s2 = s1; // both are pointing to an empty string with the capacity of 100

    s1 ~= "Hello"; // array is not reallocated, it is grown in-place
    writefln(s1);
    writefln(s2); // prints empty string. s2 still points to the same string (which is now "Hello") and carries length of 0

    s2 ~= "Hi"; // overwrites s1
    writefln(s2); // "Hi"
    writefln(s1); // "Hillo"
}

s1 is the array owner and s2 is a slice (even though it really points to the entire array), i.e. it should reallocate and take the ownership of the reallocated array on append, but it doesn't happen.

Currently an 'owner' is anyone who has a pointer to array's beginning:

    char[] s = "hello".dup;
    char[] s1 = s[0..4];
    s1 ~= "!";
    assert(s != s1); // fails, both are "hell!", s is overwritten

    s = "_hello".dup;
    char[] s2 = s[1..5];
    s2 ~= "!";
    assert(s != s1); // succeeds, s1 is not changed

-- |
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

------- Comment #5 from ddparnell@bigpond.com 2008-11-22 03:58 -------
I thought 'string' types were immutable and thus ...

    s1.length = 0;

should fail as it updates the string (truncates it to zero characters).

-- |
November 22, 2008 Re: [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | Wrote in their message on 22.11.08 at 12:58:
> http://d.puremagic.com/issues/show_bug.cgi?id=2093
>
> ------- Comment #5 from ddparnell@bigpond.com 2008-11-22 03:58 -------
>
> I thought 'string' types were immutable and thus ...
>
> s1.length = 0;
>
> should fail as it updates the string (truncates it to zero characters).
>

No, string is a mutable array of immutable chars:
string == invariant(char)[]
|
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

------- Comment #6 from 2korden@gmail.com 2008-11-22 06:45 -------
No, string is aliased to invariant(char)[], i.e. an array of invariant characters. You can change its length (usually, decreasing) but not contents.

-- |
November 22, 2008 [Issue 2093] string concatenation modifies original | ||||
---|---|---|---|---|
| ||||
Posted in reply to d-bugmail | http://d.puremagic.com/issues/show_bug.cgi?id=2093

------- Comment #7 from smjg@iname.com 2008-11-22 08:43 -------
(In reply to comment #4)
> Currently an 'owner' is anyone who has a pointer to array's beginning:
>
> char[] s = "hello".dup;
> char[] s1 = s[0..4];
> s1 ~= "!";
> assert(s != s1); // fails, both are "hell!", s is overwritten

A simple char[] is fully mutable, so that doesn't violate any established rule, but whether it's desirable is another matter.

With const(char)[] or invariant(char)[], obviously this isn't going to work, so ~= should always reallocate (unless the optimiser can be sure that no other reference to the data can possibly exist).

Alternatively, the GC could maintain a note of the actual length of every heap-allocated array. Ownership would be determined by matching in both start pointer and length. When the length is increased, whether by .length or ~=, either update this actual length (if it's the owner that we're extending, in which case all other references to the same data lose ownership) or reallocate the array.

-- |
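The alternative proposed in Comment #7 can be modelled in a few lines. What follows is only a toy sketch under stated assumptions: `Block`, `Slice`, `isOwner` and `append` are hypothetical names invented for illustration, the "note of the actual length" is kept in an ordinary class field rather than by the GC, and none of this is claimed to be how the D runtime is or was implemented.

```d
import std.stdio;

/// Hypothetical stand-in for a heap block that records how much of it is in use.
final class Block
{
    char[] storage;  // the full capacity of the allocation
    size_t used;     // the "actual length" noted for this allocation

    this(size_t capacity) { storage = new char[](capacity); }
}

/// Hypothetical slice type: a view of `length` chars starting at `start`.
struct Slice
{
    Block block;
    size_t start;
    size_t length;

    const(char)[] data() const { return block.storage[start .. start + length]; }

    /// Owner means: same start pointer AND same length as the block's record.
    bool isOwner() const { return start == 0 && length == block.used; }

    void append(const(char)[] tail)
    {
        immutable newLen = length + tail.length;
        if (isOwner() && newLen <= block.storage.length)
        {
            // The owner grows in place and updates the recorded length,
            // so every other reference to this block stops being an owner.
            block.storage[length .. newLen] = tail[];
            block.used = newLen;
        }
        else
        {
            // Anyone else gets a fresh block and becomes the owner of that.
            auto fresh = new Block(newLen * 2);
            fresh.storage[0 .. length] = block.storage[start .. start + length];
            fresh.storage[length .. newLen] = tail[];
            fresh.used = newLen;
            block = fresh;
            start = 0;
        }
        length = newLen;
    }
}

void main()
{
    // The test case from Comment #3, replayed against the model.
    auto s1 = Slice(new Block(16), 0, 0);
    s1.append("hello");
    auto s2 = s1;        // copies the view, shares the block
    s1.length = 0;       // shrinking: s1 no longer matches the recorded length
    s1.append("Hi");     // not the owner any more, so this reallocates

    assert(s1.data == "Hi");
    assert(s2.data == "hello");   // s2 is left intact
    writeln(s1.data, " ", s2.data);
}
```

Under this rule the `s[0..4]` case from Comment #4 also reallocates, because the slice's length (4) differs from the recorded length (5) even though the start pointers match.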