Thread overview
Phobos returned strings
Apr 06, 2004
Vathix
Apr 06, 2004
Ben Hinkle
Apr 06, 2004
C
April 06, 2004
Most of the functions in phobos that return strings automatically new the string returned. While it makes most things easier, it makes it harder to use phobos to build a string in your own buffer.
For example, say I want to read a file but I also want to put parentheses around its data. So I use "(" ~ std.file.read("foo.txt") ~ ")". How many cycles did I just waste? If I could specify a destination buffer, I could get the file size, add 2, and avoid copying the file data twice. Here's the code:

uint size = getSize("foo.txt");
char[] str = new char[size + 2];
str[0] = '(';
str[str.length - 1] = ')';
std.file.read("foo.txt", str[1 .. str.length - 1]);

Exactly what I am proposing:
Functions that return string data have an overloaded extra parameter for the destination buffer. (I know this isn't possible in all cases, but it is in most.) It would be up to the caller to supply a large enough buffer, and the filled portion slice would be returned. The convention has been to put the destination parameter first, as in memcpy(dest, src) but I don't think it should be in this case, because it's just an optional extra parameter. A good working example of this is in std.base64.
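
A minimal sketch of the convention proposed here, written in present-day D syntax. The function name toUpperAscii is a made-up illustration, not a Phobos function. The buffer overload takes the destination as the last parameter and returns the filled slice; the allocating overload is a thin wrapper around it.

```d
// Hypothetical example of the proposed overload pair; `toUpperAscii`
// is an illustrative name, not something that exists in Phobos.

/// Buffer overload: writes into `dest` and returns the filled slice.
/// The caller must supply a buffer of at least `src.length` chars.
char[] toUpperAscii(const(char)[] src, char[] dest)
{
    assert(dest.length >= src.length, "destination buffer too small");
    foreach (i, c; src)
        dest[i] = (c >= 'a' && c <= 'z') ? cast(char)(c - 'a' + 'A') : c;
    return dest[0 .. src.length];
}

/// Allocating overload: same result, but the function news the buffer.
char[] toUpperAscii(const(char)[] src)
{
    return toUpperAscii(src, new char[src.length]);
}
```

For the parentheses case, the caller would allocate src.length + 2 chars, set the first and last, and pass the inner slice as dest, so the data is written exactly once.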

If votes are for this, I'm willing to go through phobos and implement it wherever I can, and give the original authors full credit.


A bit off topic, what does everyone think of adding
alias char[] string;
to std.string? There are no name conflicts and I think it will appeal to more users (I think it looks nicer). The lowercase s indicates that it's a primitive value type (value that holds a reference to the string data) as opposed to a class object reference. This isn't a big deal, just thought I'd mention it. I've already noticed people doing it, hope it doesn't turn into the C bool thing ;)


-- 
Christopher E. Miller
April 06, 2004
> Exactly what I am proposing:
> Functions that return string data have an overloaded extra parameter for
> the destination buffer. (I know this isn't possible in all cases, but it
> is in most.) It would be up to the caller to supply a large enough
> buffer, and the filled portion slice would be returned. The convention
> has been to put the destination parameter first, as in memcpy(dest, src)
> but I don't think it should be in this case, because it's just an
> optional extra parameter. A good working example of this is in std.base64.
>
> If votes are for this, I'm willing to go through phobos and implement it wherever I can, and give the original authors full credit.

I agree this is generally needed. My updates to std.stream include the
overloaded function
  void readLine(inout char[] result)
as well as the existing
  char[] readLine()
The speedup from doing this was somewhere around 10% I think.
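
As an illustration of the same pattern (this is not the std.stream code mentioned above), present-day Phobos has a comparable pair on std.stdio.File: readln() returns a freshly allocated line, while readln(ref char[] buf) reuses the buffer it is given. The helper name longestLine below is made up for the example.

```d
import std.stdio : File;

/// Hypothetical helper: returns the length of the longest line in the
/// file (terminator included), reading every line into one reusable
/// buffer instead of allocating a new string per line.
size_t longestLine(string path)
{
    auto f = File(path, "r");
    char[] buf;               // handed to every readln call and reused
    size_t longest = 0;
    while (f.readln(buf) != 0) // returns 0 at end of file
    {
        if (buf.length > longest)
            longest = buf.length;
    }
    return longest;
}
```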

There are probably other functions in there that should also get an overloaded input buffer.

Note: when implementing something that takes a buffer, don't set the length to 0 unless you absolutely have to, because setting the length to 0 throws away the data pointer (see my earlier posts about reserving array space and length 0).
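
One way to follow this advice, sketched under the assumption that the buffer owner is free to track its own fill count: leave the array length alone and record how much of it is in use. ReusableBuffer and its members are hypothetical names for illustration.

```d
/// Hypothetical reusable buffer: `store` is never shrunk, and `used`
/// records how much of it currently holds data.
struct ReusableBuffer
{
    char[] store;
    size_t used;

    /// Forget the contents but keep the allocation; store.length is untouched.
    void clear() { used = 0; }

    /// Append `s`, growing the storage only when it is too small.
    void put(const(char)[] s)
    {
        immutable needed = used + s.length;
        if (needed > store.length)
            store.length = needed * 2;
        store[used .. needed] = s[];
        used = needed;
    }

    /// The filled portion, as a slice of the storage.
    char[] data() { return store[0 .. used]; }
}
```

After clear(), the next put() writes into the same memory as long as the new data fits.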

-Ben


April 06, 2004
> A bit off topic, what does everyone think of adding
> alias char[] string;
> to std.string?

I usually do this myself, but only because at some point I might want to change char[] to wchar[] or dchar[], so I'd say leave it up to the user. Or maybe provide a version for Unicode that changes string to wchar[] or dchar[] instead.
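
A rough sketch of the version idea, in present-day D syntax. The alias name Text and the version identifiers UseWideChars and UseUtf32Chars are invented for the example (they would be set with the compiler's -version= switch); Text is used instead of string only to keep the example free of name clashes.

```d
// Hypothetical sketch: `Text` and both version identifiers are made-up names.
version (UseWideChars)
    alias Text = wchar[];
else version (UseUtf32Chars)
    alias Text = dchar[];
else
    alias Text = char[];

/// Code written against `Text` stays the same when the alias changes.
size_t codeUnits(Text s)
{
    return s.length;
}
```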

> How many cycles did I just waste? If I could specify a destination buffer, I could get the file size, add 2, and avoid copying the file data twice.

How much faster would this be? I know MSVC has an optimization where

MyClass x = Function();

doesn't actually copy the MyClass; it just uses the local address of the variable for its creation. If we get a choice, I would vote to optimize return by value before offering a pass-by-reference addition to all functions in std.
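
For reference when weighing the two options, a small sketch of where copying does and does not happen with D arrays. Returning a char[] hands back only a pointer and a length, while concatenation builds a new array. The function names are hypothetical.

```d
/// Returns its argument: only the (pointer, length) pair is passed back,
/// so the character data itself is not copied.
char[] passThrough(char[] s)
{
    return s;
}

/// Concatenation allocates a new array and copies the operands into it.
char[] parenthesize(char[] s)
{
    return '(' ~ s ~ ')';
}
```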

Just my 2 pents,
C

On Tue, 06 Apr 2004 09:42:20 -0400, Vathix <vathix@dprogramming.com> wrote:

> Most of the functions in phobos that return strings automatically new the string returned. While it makes most things easier, it makes it harder to use phobos to build a string in your own buffer.
> For example, say I want to read a file but I also want to put parentheses around its data. So I use "(" ~ std.file.read("foo.txt") ~ ")". How many cycles did I just waste? If I could specify a destination buffer, I could get the file size, add 2, and avoid copying the file data twice. Here's the code:
>
> uint size = getSize("foo.txt");
> char[] str = new char[size + 2];
> str[0] = '(';
> str[str.length - 1] = ')';
> std.file.read("foo.txt", str[1 .. str.length - 1]);
>
> Exactly what I am proposing:
> Functions that return string data have an overloaded extra parameter for the destination buffer. (I know this isn't possible in all cases, but it is in most.) It would be up to the caller to supply a large enough buffer, and the filled portion slice would be returned. The convention has been to put the destination parameter first, as in memcpy(dest, src) but I don't think it should be in this case, because it's just an optional extra parameter. A good working example of this is in std.base64.
>
> If votes are for this, I'm willing to go through phobos and implement it wherever I can, and give the original authors full credit.
>
>
> A bit off topic, what does everyone think of adding
> alias char[] string;
> to std.string? There are no name conflicts and I think it will appeal to more users (I think it looks nicer). The lowercase s indicates that it's a primitive value type (value that holds a reference to the string data) as opposed to a class object reference. This isn't a big deal, just thought I'd mention it. I've already noticed people doing it, hope it doesn't turn into the C bool thing ;)
>
>



-- 
D Newsgroup.