string manipulation performance
June 12, 2011
I have a method like this:
===
public string repeat(string s, int num)
{
   string result = s;
   for (int  i=1; i<num; i++)
       result ~= s;
   return result;
}
===
Basically, it creates num strings, each a little longer than the last...
Is there a more efficient way to go about this?
Thanks! :)

June 12, 2011
On Sun, 12 Jun 2011 12:49:25 -0400, Lloyd Dupont <ld-REMOVE@galador.net> wrote:

> I have a method like that:
> ===
> public string repeat(string s, int num)
> {
>     string result = s;
>     for (int  i=1; i<num; i++)
>         result ~= s;
>     return result;
> }
> ===
> basically it will create num string, each a little longer...
> is there a more efficient way to go about that?
> thanks! :)
>

The runtime tries its best to avoid allocating a new string on each append.  Please read the manual on appending, and you also might want to check out an article I wrote about slices that deals with appending.  The runtime also provides functions to pre-allocate an array for appending.  For example:


 public string repeat(string s, int num)
 {
     string result = s;
     result.reserve(s.length * num); // ensure result can append all the repeated data without reallocating
     for (int  i=1; i<num; i++)
         result ~= s;
     return result;
 }


http://www.digitalmars.com/d/2.0/arrays.html#resize

http://www.digitalmars.com/d/2.0/phobos/object.html#reserve

http://www.dsource.org/projects/dcollections/wiki/ArrayArticle
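
For comparison, here is a minimal sketch of the same idea using std.array.Appender. This is an alternative to the reserve call above, not something suggested elsewhere in this thread. The name repeatWithAppender is hypothetical, chosen only to avoid clashing with library functions.

```d
import std.array : appender;

// Hypothetical helper (name is an assumption): builds the repeated
// string in an Appender that is sized once up front.
string repeatWithAppender(string s, size_t num)
{
    auto buf = appender!string();
    buf.reserve(s.length * num); // ask for room for all copies before appending
    foreach (i; 0 .. num)
        buf.put(s);              // append one copy of s
    return buf.data;             // the accumulated string
}

unittest
{
    assert(repeatWithAppender("ab", 3) == "ababab");
    assert(repeatWithAppender("ab", 0) == "");
}
```

Unlike the loop above, this version returns an empty string when num is 0, because it starts from an empty buffer instead of from s.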

-Steve
June 12, 2011
Also, std.string.repeat has been scheduled for deprecation. You should use std.array.replicate instead. It does the same thing but for all arrays instead of just strings.
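
A short sketch of what that replacement looks like in use, assuming a Phobos version that ships std.array.replicate:

```d
import std.array : replicate;

unittest
{
    // Works on strings...
    assert(replicate("ab", 3) == "ababab");
    // ...and on any other array type.
    assert(replicate([1, 2], 2) == [1, 2, 1, 2]);
    // Zero repetitions gives an empty result.
    assert(replicate("ab", 0) == "");
}
```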

- Jonathan M Davis
June 13, 2011
But... since strings are immutable, I don't see the point of allocating space for one.
Am I missing something?

June 13, 2011
Thanks!

"Jonathan M Davis"  wrote in message news:mailman.851.1307909610.14074.digitalmars-d-
Also, std.string.repeat has been scheduled for deprecation. You should use
std.array.replicate instead. It does the same thing but for all arrays instead
of just strings.

- Jonathan M Davis 

June 13, 2011
On 2011-06-12 18:02, Lloyd Dupont wrote:
> But... string being immutable I don't see the point of allocating some
> space for one..
> Am I missing something?

Just because it's immutable doesn't mean that it doesn't need to exist at runtime. All immutable means is that you can't change it. It could have been created entirely at runtime with values decided entirely at runtime. An immutable variable takes up just as much space as a mutable one. The differences are that you can never change it and that it's implicitly shared.

Now, manifest constants don't exist per se. e.g.

enum str = "hello world";

str doesn't exist at runtime, and you can't take its address. But that's because every place that you use str, it gets replaced with its value. So, every place that it's used takes up just as much space as if you'd put the value there directly.

immutable variables, however, very much exist just as much as mutable ones.
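
To make the distinction concrete, here is a small sketch. The helper name label is hypothetical and exists only for illustration.

```d
import std.conv : to;

// Hypothetical helper: the value is only known at run time,
// yet the resulting variable is immutable.
string label(size_t n)
{
    immutable string text = "count = " ~ to!string(n);
    return text;
}

unittest
{
    assert(label(3) == "count = 3");

    enum str = "hello world";       // manifest constant: substituted at each use
    immutable istr = "hello world"; // a real variable that exists at run time

    auto p = &istr;                 // fine: istr has an address
    assert(*p == str);
    // auto q = &str;               // rejected: a manifest constant has no address
}
```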

- Jonathan M Davis
June 13, 2011
On Sun, 12 Jun 2011 21:02:05 -0400, Lloyd Dupont <ld-REMOVE@galador.net> wrote:

> But... string being immutable I don't see the point of allocating some space for one..
> Am I missing something?

Reserving space for appending does not make that space immutable, yet.

As far as the runtime is concerned, that space is unallocated.  Although it can only ever end up holding immutable string data, nothing refers to it yet, so the runtime is free to write into it later.

Observe:

string s;
assert(s.ptr is null); // string is unallocated
s.reserve(100);

assert(s.length == 0); // reserve doesn't alter the actual string, it just sets up space for it to grow into
assert(s.ptr !is null); // but now it points to a memory block!
auto sptr = s.ptr; // save pointer for later proof...

foreach(i; 0..20) s ~= "hello"; // make a bunch of hellos
assert(s.length == 100); // yep, we added some data, but
assert(s.ptr is sptr); // it didn't move, so essentially, it "grew" into the existing memory block, that was previously unused.

The reason it works is because the unused space in the block has no references, therefore, even though it is potentially immutable, it doesn't matter that we change it because nobody else knows about it yet.

Note that without the reserve call, s.ptr would not be equal to sptr at the end of the operation, because the runtime would have chosen smaller memory blocks to begin with to store the string.
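
The same behaviour can be packaged as two small checks. This is only a sketch, assuming the default GC-backed runtime. The function names growsInPlace and sliceAppendCopies are hypothetical. The second check shows the other side of the rule: once part of the block is referenced by another slice, appending to a slice that ends inside that used region copies the data instead of overwriting it.

```d
// Hypothetical helper: true if n appends of "hello" stayed inside
// the block obtained from reserve.
bool growsInPlace(size_t n)
{
    string s;
    s.reserve(5 * n);          // room for n copies of "hello"
    auto start = s.ptr;        // remember where the block begins
    foreach (i; 0 .. n)
        s ~= "hello";
    return s.ptr is start && s.length == 5 * n;
}

// Hypothetical helper: true if appending to a slice whose end lies
// inside already-used data reallocates rather than stomping on it.
bool sliceAppendCopies()
{
    string s;
    s.reserve(100);
    foreach (i; 0 .. 20)
        s ~= "hello";
    string head = s[0 .. 5];   // ends before the used part of the block does
    head ~= "!";               // must not overwrite s[5]
    return head.ptr !is s.ptr && s[5] == 'h' && head == "hello!";
}

unittest
{
    assert(growsInPlace(20));
    assert(sliceAppendCopies());
}
```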

-Steve
June 13, 2011
Thanks Steven, that was very informative!