Thread overview
Problems using strings in D
Jan 24, 2006
Chris Miller
Jan 25, 2006
Walter Bright
Jan 25, 2006
James Dunne
Jan 25, 2006
Sean Kelly
Jan 25, 2006
Sean Kelly
Jan 26, 2006
Kris
Jan 26, 2006
Sean Kelly
Jan 26, 2006
Sean Kelly
Jan 26, 2006
Sean Kelly
Jan 26, 2006
Russ Lewis
Jan 26, 2006
Ameer Armaly
Jan 26, 2006
Walter Bright
Jan 26, 2006
Sean Kelly
Jan 29, 2006
Derek Parnell
January 24, 2006
I am trying to run this little program:

    import std.stdio;
    import std.path;

    int main()
    {
        char[] test_string = null;
        char[] original = "/home/.resource";
        test_string = getBaseName(original);
        test_string[2] = 'a';
        writefln("is %s like %s?", original, test_string);
        return 0;
    }

But I get a core dump. gdb points at the line where getBaseName is being called.

 (gdb) bt
 #0  0x0804c3cc in _Dmain () at hello.d:8
 #1  0x0804c4c2 in main () at internal/dmain2.d:72
 (gdb) f 0
 #0  0x0804c3cc in _Dmain () at hello.d:8
 8           test_string = getBaseName(original);

Why does this happen and how do I prevent this?

January 24, 2006
On Tue, 24 Jan 2006 17:18:04 -0500, Grzegorz Adam Hankiewicz <fake@dont.use> wrote:

> I am trying to run this little program:
>
>     import std.stdio;
>     import std.path;
>
>     int main()
>     {
>         char[] test_string = null;
>         char[] original = "/home/.resource";
>         test_string = getBaseName(original);
>         test_string[2] = 'a';
>         writefln("is %s like %s?", original, test_string);
>         return 0;
>     }
>
> But I get a core dump. gdb points at the line where getBaseName is
> being called.
>
>  (gdb) bt
>  #0  0x0804c3cc in _Dmain () at hello.d:8
>  #1  0x0804c4c2 in main () at internal/dmain2.d:72
>  (gdb) f 0
>  #0  0x0804c3cc in _Dmain () at hello.d:8
>  8           test_string = getBaseName(original);
>
> Why does this happen and how do I prevent this?
>

Use copy-on-write unless you know you are the sole owner of a string. getBaseName() returns a slice of original.

    test_string = getBaseName(original);
    test_string = test_string.dup; // Get my own copy.
    test_string[2] = 'a';
    test_string[3] = 'b'; // I'm still the sole owner.
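
A minimal sketch of the aliasing involved. It assumes a hypothetical helper, baseSlice, that stands in for getBaseName() (it is not part of Phobos), and it starts from a writable .dup of the literal so that the example itself never touches read-only data:

```d
import std.stdio;

// Hypothetical stand-in for getBaseName(): returns the part of `path`
// after the last '/', as a slice of the argument (no new allocation).
char[] baseSlice(char[] path)
{
    size_t i = path.length;
    while (i > 0 && path[i - 1] != '/')
        --i;
    return path[i .. $];
}

void main()
{
    // Writable heap copy of the literal, so this sketch cannot fault.
    char[] original = "/home/.resource".dup;

    // The result shares storage with `original`; nothing was copied.
    char[] base = baseSlice(original);
    assert(base.ptr == original.ptr + 6);

    // Copy before writing: `mine` has its own storage from here on.
    char[] mine = base.dup;
    mine[2] = 'a';

    assert(mine == ".rasource");
    assert(original == "/home/.resource"); // untouched
    writefln("is %s like %s?", original, mine);
}
```

The pointer comparison is only there to make the sharing visible; the write is safe because it goes to the copy rather than to the shared storage.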
January 24, 2006
"Grzegorz Adam Hankiewicz" <fake@dont.use> wrote in message news:pan.2006.01.24.22.18.02.498385@dont.use...
>I am trying to run this little program:
>
>    import std.stdio;
>    import std.path;
>
>    int main()
>    {
>        char[] test_string = null;
>        char[] original = "/home/.resource";
>        test_string = getBaseName(original);
>        test_string[2] = 'a';
>        writefln("is %s like %s?", original, test_string);
>        return 0;
>    }

The equivalent Windows code (changing / to \ in the path name) doesn't segfault. Try putting a .dup on the end of that string literal; I know there's a problem (?) in Linux where string literals are stored in a read-only segment, so trying to modify them (which is what your code will do) will cause a... problem. Maybe gdb or DMD got the line off by one, as I would expect the segfault to happen on line 9.


January 25, 2006
"Grzegorz Adam Hankiewicz" <fake@dont.use> wrote in message news:pan.2006.01.24.22.18.02.498385@dont.use...
>I am trying to run this little program:
>
>    import std.stdio;
>    import std.path;
>
>    int main()
>    {
>        char[] test_string = null;
>        char[] original = "/home/.resource";
>        test_string = getBaseName(original);
>        test_string[2] = 'a';
>        writefln("is %s like %s?", original, test_string);
>        return 0;
>    }
>
> But I get a core dump. gdb points at the line where getBaseName is being called.

String literals are read-only. std.path.getBaseName() is returning a slice of its argument, which will be into read-only data. The seg fault comes from attempting to write into that read-only data.

The COW (copy-on-write) fix to your code would be:

    test_string = getBaseName(original).dup;


January 25, 2006
Walter Bright wrote:
> "Grzegorz Adam Hankiewicz" <fake@dont.use> wrote in message news:pan.2006.01.24.22.18.02.498385@dont.use...
> 
>>I am trying to run this little program:
>>
>>   import std.stdio;
>>   import std.path;
>>
>>   int main()
>>   {
>>       char[] test_string = null;
>>       char[] original = "/home/.resource";
>>       test_string = getBaseName(original);
>>       test_string[2] = 'a';
>>       writefln("is %s like %s?", original, test_string);
>>       return 0;
>>   }
>>
>>But I get a core dump. gdb points at the line where getBaseName is
>>being called.
> 
> 
> String literals are read-only. std.path.getBaseName() is returning a slice of its argument, which will be into read-only data. The seg fault comes from attempting to write into that read-only data.
> 
> The COW (copy-on-write) fix to your code would be:
> 
>     test_string = getBaseName(original).dup; 
> 
> 

const might've told us this. =D
January 25, 2006
James Dunne wrote:
> Walter Bright wrote:
>> "Grzegorz Adam Hankiewicz" <fake@dont.use> wrote in message news:pan.2006.01.24.22.18.02.498385@dont.use...
>>
>>> I am trying to run this little program:
>>>
>>>   import std.stdio;
>>>   import std.path;
>>>
>>>   int main()
>>>   {
>>>       char[] test_string = null;
>>>       char[] original = "/home/.resource";
>>>       test_string = getBaseName(original);
>>>       test_string[2] = 'a';
>>>       writefln("is %s like %s?", original, test_string);
>>>       return 0;
>>>   }
>>>
>>> But I get a core dump. gdb points at the line where getBaseName is
>>> being called.
>>
>> String literals are read-only. std.path.getBaseName() is returning a slice of its argument, which will be into read-only data. The seg fault comes from attempting to write into that read-only data.
>>
>> The COW (copy-on-write) fix to your code would be:
>>
>>     test_string = getBaseName(original).dup;
> 
> const might've told us this. =D

The irritating thing is that the string literal is merely used for initialization in the above case.  This almost has me wishing such cases would always cause an allocation/memcpy instead of referencing the original string.  Perhaps this could be a rule when non-const arrays are initialized with const data?  What happens if a static initializer is used for an int[] array and then someone attempts an in-place modification?


Sean
January 25, 2006
Sean Kelly wrote:
> James Dunne wrote:
>> Walter Bright wrote:
>>> "Grzegorz Adam Hankiewicz" <fake@dont.use> wrote in message news:pan.2006.01.24.22.18.02.498385@dont.use...
>>>
>>>> I am trying to run this little program:
>>>>
>>>>   import std.stdio;
>>>>   import std.path;
>>>>
>>>>   int main()
>>>>   {
>>>>       char[] test_string = null;
>>>>       char[] original = "/home/.resource";
>>>>       test_string = getBaseName(original);
>>>>       test_string[2] = 'a';
>>>>       writefln("is %s like %s?", original, test_string);
>>>>       return 0;
>>>>   }
>>>>
>>>> But I get a core dump. gdb points at the line where getBaseName is
>>>> being called.
>>>
>>> String literals are read-only. std.path.getBaseName() is returning a slice of its argument, which will be into read-only data. The seg fault comes from attempting to write into that read-only data.
>>>
>>> The COW (copy-on-write) fix to your code would be:
>>>
>>>     test_string = getBaseName(original).dup;
>>
>> const might've told us this. =D
> 
> The irritating thing is that the string literal is merely used for initialization in the above case.  This almost has me wishing such cases would always cause an allocation/memcpy instead of referencing the original string.  Perhaps this could be a rule when non-const arrays are initialized with const data?  What happens if a static initializer is used for an int[] array and then someone attempts an in-place modification?

Alternately, perhaps it should be a popular D idiom to do the following:

    char[] original = "/home/.resource".dup;

This would allow for efficiency when it is desired (and eliminate the need for a language change), but should dramatically reduce the chance of such errors.
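
A rough sketch of what that idiom buys, assuming a hypothetical helper named patch that exists only for this illustration. Because the array owns its storage after the .dup, a slice of it can be modified in place with no further copy, and the change shows through the original:

```d
import std.stdio;

// Hypothetical helper: overwrites one character in place, without copying.
void patch(char[] s, size_t index, char c)
{
    s[index] = c;
}

void main()
{
    // The .dup gives `original` its own heap copy of the literal.
    char[] original = "/home/.resource".dup;

    // Still a slice of `original`, but now of writable memory.
    char[] tail = original[6 .. $];
    patch(tail, 2, 'a');

    // The in-place write is visible through both arrays.
    assert(tail == ".rasource");
    assert(original == "/home/.rasource");
    writefln("%s", original);
}
```

The cost is one allocation at initialization; whether to copy again later is then a matter of ownership rather than of avoiding a fault.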


Sean
January 26, 2006
"Sean Kelly" <sean@f4.ca> wrote ...
>> The irritating thing is that the string literal is merely used for initialization in the above case.  This almost has me wishing such cases would always cause an allocation/memcpy instead of referencing the original string.  Perhaps this could be a rule when non-const arrays are initialized with const data?  What happens if a static initializer is used for an int[] array and then someone attempts an in-place modification?
>
> Alternately, perhaps it should be a popular D idiom to do the following:


Alternatively, the compiler should support the notion that /some/ data is actually read-only; and report it as such. That would solve many problems.

CoW may very well look OK on paper ~ yet in my experience, when applying it to anything but trivialities, it's actually full of hollow promise. Reality rarely follows academic theory.

The true problem here is not convention per se. Instead it is the lack of compiler enforcement with respect to one convention or another. It's easy to say "Oh, one should follow the gentleman's agreement of copy upon write" ~ that's just cheap talk. It would be quite another thing if the compiler would enforce this. I rather suspect such enforcement would be more difficult than providing a limited, language-supported, read-only attribute.


January 26, 2006
Kris wrote:
> "Sean Kelly" <sean@f4.ca> wrote ...
>>> The irritating thing is that the string literal is merely used for initialization in the above case.  This almost has me wishing such cases would always cause an allocation/memcpy instead of referencing the original string.  Perhaps this could be a rule when non-const arrays are initialized with const data?  What happens if a static initializer is used for an int[] array and then someone attempts an in-place modification?
>> Alternately, perhaps it should be a popular D idiom to do the following:
> 
> 
> Alternatively, the compiler should support the notion that /some/ data is actually read-only; and report it as such. That would solve many problems.

Agreed :-)  And now that I think about it, the compiler should be able to detect such problems, as it does not seem terribly difficult to determine whether a write is being performed on something in the const data area vs. somewhere else.

> The true problem here is not convention per se. Instead it is the lack of compiler enforcement with respect to one convention or another. It's easy to say "Oh, one should follow the gentleman's agreement of copy upon write" ~ that's just cheap talk. It would be quite another thing if the compiler would enforce this. I rather suspect such enforcement would be more difficult than providing a limited, language-supported, read-only attribute. 

See above.  I think such a flag may not actually be necessary in this case, simply because code generation for const data tends to be somewhat distinct.  Perhaps some late stage analysis could be performed to detect this problem?  I'm kind of guessing here, but in the small amount of compiler work I've done in the past I think this would have been fairly simple to implement.


Sean
January 26, 2006
Sean Kelly wrote:
> 
> See above.  I think such a flag may not actually be necessary in this case, simply because code generation for const data tends to be somewhat distinct.  Perhaps some late stage analysis could be performed to detect this problem?  I'm kind of guessing here, but in the small amount of compiler work I've done in the past I think this would have been fairly simple to implement.

I take it back :-P.  Passing through an opaque function call as in the original example tosses the possibility of code analysis out the window.  But some detection might be better than none in this case.  Also, it would be nice if the system reported a meaningful error message if this occurs--perhaps something indicating that the segfault occurred from an attempted write to const data?  But once you're stuck with runtime detection, I don't really care if the problem is first noticed by a software flag or a hardware fault.  In fact, loading a core dump makes reproducing the problem fairly simple in most cases.


Sean