[Bug 91] String literals not always properly zero-terminated
January 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #1 from Iain Buclaw <ibuclaw@gdcproject.org> 2014-01-12 15:54:59 GMT ---
Hmm... dynamic D arrays aren't guaranteed to be zero-terminated.  So I'm not sure what the immediate problem is. :)

See StringExp::toElem for where to tweak the behaviour.

-- 
Configure bugmail: http://bugzilla.gdcproject.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.
January 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #2 from Iain Buclaw <ibuclaw@gdcproject.org> 2014-01-12 16:00:11 GMT ---
http://dlang.org/arrays.html

"""
Since strings, however, are not 0 terminated in D, when transferring a pointer
to a string to C, add a terminating 0:

str ~= "\0";

or use the function std.string.toStringz.
"""
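
To illustrate the two options quoted above, here is a minimal sketch, assuming a slice that does not cover a whole literal and therefore has no terminator of its own. The helper name `cLength` is hypothetical and only serves the example.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical helper: the length a C function would see
// after the D string has been converted with toStringz.
size_t cLength(const(char)[] s)
{
    return strlen(toStringz(s));
}

void main()
{
    string word = "Hello";
    string part = word[0 .. 4]; // "Hell": the next byte is 'o', not 0

    // Option 1: let std.string.toStringz produce a terminated pointer.
    assert(cLength(part) == 4);

    // Option 2: append the terminator by hand to a mutable copy.
    char[] buf = part.dup;
    buf ~= "\0";
    assert(strlen(buf.ptr) == 4);
}
```

Either way the terminator is added explicitly, so the code does not depend on what the compiler places after the string's storage.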

January 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #3 from Johannes Pfau <johannespfau@gmail.com> 2014-01-12 16:51:00 GMT ---
Yes, but string _literals_ must be zero-terminated IIRC. I can't find the relevant docs on dlang.org but see for example this site: http://dlang.org/interfaceToC.html "However, string literals in D are 0 terminated."

So doesn't this fall under the string-literal rule? "0x1.FFFFFFFFFFFFFFFEp-16382" is a string literal and should be zero-terminated. If we then assign this literal to s there's no copy, so I'd expect s.ptr to point to the literal's memory, and that memory should be zero-terminated.

Is there an error in this reasoning?

January 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #4 from Iain Buclaw <ibuclaw@gdcproject.org> 2014-01-12 17:03:57 GMT ---
(In reply to comment #3)
> Yes, but string _literals_ must be zero-terminated IIRC. I can't find the relevant docs on dlang.org but see for example this site: http://dlang.org/interfaceToC.html "However, string literals in D are 0 terminated."
> 

Yes, string literals are 0 terminated when interfacing to C char* types. :)

eg:

int printf(in char *, ...);
printf("Foo");


In StringExp::toElem, "Foo" is a Tpointer type, which is constructed as a zero-terminated string.
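
A minimal sketch of the pointer case described here, assuming only that a string literal converts implicitly to `const(char)*`. The function name `literalLengthViaPointer` is hypothetical.

```d
import core.stdc.string : strlen;

// Hypothetical helper: the literal is used directly in a pointer
// context, so the compiler emits it as a zero-terminated string.
size_t literalLengthViaPointer()
{
    const(char)* p = "Foo";
    return strlen(p);
}

void main()
{
    assert(literalLengthViaPointer() == 3);
}
```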

January 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

Johannes Pfau <johannespfau@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #5 from Johannes Pfau <johannespfau@gmail.com> 2014-01-12 18:23:06 GMT ---
OK, now I get it. Sorry for the noise ;-)

October 26, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

Peter Remmers <p.remmers@arcor.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |p.remmers@arcor.de

--- Comment #6 from Peter Remmers <p.remmers@arcor.de> ---
I must add that I also just stumbled upon this. I was about to file another bug but then found this.

I think string literals should always be zero terminated, not just when used as a parameter to a function that takes a char*.

Here is a quote from std/string.d (the toStringz() function):

     * Note that the compiler will put a 0 past the end of static
     * strings, and the storage allocator will put a 0 past the end
     * of newly allocated char[]'s.


This little test program works on DMD and LDC2, but fails on GDC:

int main(string[] argv)
{
    string s = "Hello"; // same with static string s = "Hello";
    assert(*(s.ptr + s.length - 1) == 'o'); // OK
    assert(*(s.ptr + s.length) == '\0');    // fails

    return 0;
}

I think it's a bug.

October 26, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #7 from Peter Remmers <p.remmers@arcor.de> ---
I might add, always adding a zero termination costs nothing apart from a few bytes in the data segment.

At the very least this is a performance issue, as GDC's toStringz(string) would always copy, and DMD's and LDC's would not.
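
One way to observe the copy, as a rough sketch: compare the pointer returned by `toStringz` with the original `.ptr`. The helper name `reusesStorage` is hypothetical, and the result depends on the compiler and on the Phobos version in use, so no particular output is claimed here.

```d
import std.stdio : writeln;
import std.string : toStringz;

// Hypothetical helper: true when toStringz handed back the
// string's own storage instead of allocating a terminated copy.
bool reusesStorage(string s)
{
    return toStringz(s) is s.ptr;
}

void main()
{
    string s = "Hello";
    writeln(reusesStorage(s)
        ? "toStringz reused the literal's storage"
        : "toStringz allocated a copy");
}
```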

October 26, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

Ketmar Dark <ketmar@ketmar.no-ip.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ketmar@ketmar.no-ip.org

--- Comment #8 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
i agree, 0-terminated string literals are handy sometimes. and i remember for sure that the D spec promises that a string literal is *always* 0-terminated (but i can't find that in the spec right now).

October 26, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

Marc Schütz <schuetzm@gmx.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |NEW
                 CC|                            |schuetzm@gmx.net
         Resolution|INVALID                     |---

--- Comment #9 from Marc Schütz <schuetzm@gmx.net> ---
Yes, see here:
http://dlang.org/expression#StringLiterals

"String literals have a 0 appended to them, which makes them easy to pass to C or C++ functions expecting a const char* string."

Specifically, it does _not_ say that the zero is only there when the literal appears in a `char*` context.

October 27, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=91

--- Comment #10 from Peter Remmers <p.remmers@arcor.de> ---
The more I read about this topic, the more I notice that D seems to have a long history of this popping up, dating back to as early as 2005.

For example: http://comments.gmane.org/gmane.comp.lang.d.general/97793

The idea of using char* usage as an indicator for C-style strings does not seem so bad a solution, given the limited possibilities and the consequences of other solutions that have been explored.

The problem is, this needs to be well-documented. Every (scarce) piece of current documentation on this says "string literals are 0-terminated", with no further constraints. So I would expect that initializing a string variable with a literal just copies the pointer and thus makes the string zero-terminated as well.

Zero termination popping in and out of existence depending upon the usage context is totally un-obvious, un-intuitive, and right now un-documented.

And it is also a surprising behavior in only one of the three major D compilers.

Then again, just stating that "literals are 0-terminated", and making sure they always are, is what D currently has settled on. And I wouldn't have noticed any problems if I hadn't tried GDC :)
