June 15, 2004
On Mon, 14 Jun 2004 17:02:25 -0700, Daniel Horn <hellcatv@hotmail.com> wrote:
> I don't know the length of my numbers...
> so slice assignment is painful... basically I have to have some sort of ftoa and itoa function (that need to avoid assigning memory, returning statically sized structs and lengths) then assign it to the slice, after keeping track of the last length and finding the next length.
>
> then I need a separate counter to see how much I've allocated...it's just a mess...and this is exactly what D was supposed to avoid.  This is the old fashioned "C buffer safe" way...and I'm not happy with it or I'd still be using C.

I think the bottom line is that efficient code is not always easy:
-to type
-to understand
-to implement
etc.
or even possible in all situations. If it were, everything would be efficient.

Perhaps the solution in your case is to write a string class. One that assigns a length to a char[] then uses memcpy to slice data into it.

I thought one of the design goals for D was to avoid this being necessary.

I do remember a discussion on arrays in general and how it would be nice to be able to set a property called 'reserve' which allocated the space indicated. This property could be independent of length and not affect the append operation, such that...

char[] line;

line.reserve = 1000;  // allocates space for 1000 chars
line ~= "boo";        // appends to string at length (which == 0)
                      // length is now 3, but string has 1000 chars allocated to it.

Not sure if this is a big change or not.
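
For readers trying this today: present-day D has a built-in reserve() function and a capacity property on dynamic arrays that behave much like the proposal above. The sketch below assumes a current D compiler rather than the 2004 one discussed in this thread, and the function name reserveThenAppend is only illustrative.

```d
// Reserve space up front, then append; length tracks the data,
// capacity tracks the space available without reallocating.
size_t[2] reserveThenAppend()
{
    char[] line;
    line.reserve(1000);   // ask the runtime for room for at least 1000 chars
    line ~= "boo";        // appends at length 0; no reallocation needed
    return [line.length, line.capacity];
}
```

The first element of the result is the length (3) and the second is the capacity, which the runtime may round up beyond the 1000 requested.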

> And the bottom line is that I don't want to print to a file, I want to keep it in a string.

That was my mistake, I thought you were printing to a file.

> the code for concat is so clear--but why is it so slow?

I'm not sure, given Vathix's statement:

"Actual allocations are the smallest power of 2 that holds the requested
size. I don't think they reallocate when shrinking because you could have
sliced that memory to use somewhere else."

If you could set a reasonable initial length AND append to the end of the data in the string rather than the end of the string length (as it currently does), then it would not need to reallocate, so it would be much faster.

The comment above suggests a string has a stored length and also a stored size in allocated memory, in which case my proposed change above is very easy.

> PS: make sure to use toStringz when using printf with %s.  toString is not guaranteed to zero terminate.

Or %.*s (though this relies on the implementation of char[])
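
A small sketch of the %.*s approach, written for present-day D, where the length and pointer are passed explicitly instead of relying on how char[] is laid out. The helper name formatSlice and the 64-byte scratch buffer are assumptions made for illustration only.

```d
import core.stdc.stdio : snprintf;

// Format a D slice through C's "%.*s": the precision argument is an int
// giving the number of chars to read, so no zero terminator is required.
string formatSlice(const(char)[] s)
{
    char[64] tmp;   // illustrative scratch size; longer output is truncated
    int n = snprintf(tmp.ptr, tmp.length, "[%.*s]", cast(int) s.length, s.ptr);
    if (n < 0)
        return null;                 // formatting error reported by the C library
    size_t len = cast(size_t) n;
    if (len >= tmp.length)
        len = tmp.length - 1;        // snprintf truncated the output
    return tmp[0 .. len].idup;
}
```

Passing a slice taken from the middle of a larger array works here, which is the case where plain %s would read past the end of the slice.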

Regan

> Regan Heath wrote:
>> On Mon, 14 Jun 2004 16:29:06 -0700, Daniel Horn <hellcatv@hotmail.com> wrote:
>>
>>> That uses a string
>>
>>
>> Yep. So did your example? What am I missing? pls explain..
>>
>>> and is not typesafe
>>
>>
>> Why not? change x from float[] to a double[] and it still works, change it to an int[] and it still works.. sorry.. correction.. the fprintf should have had %.*s in it. eg.
>>
>> fprintf(f,"f %.*s %.*s %.*s\n",toString(x[i]),toString(y[i]),toString(z[i]));
>>
>>> and what if I want to send it over the net.
>>
>>
>> The f on the start of the line says it's a float, you chop the string on spaces and parse accordingly. Isn't that what the f is there for?
>>
>>> Basically I want to do stringops internally :-) and I want to do it the "D" way.
>>
>>
>> I'm not sure I understand what you mean..
>>
>> You can set the length (if you know what you need) and you can assign to a slice i.e.



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
June 15, 2004
"Sean Kelly" <sean@f4.ca> wrote in message news:calekm$2r2h$1@digitaldaemon.com...
> In article <calcc2$2nq5$1@digitaldaemon.com>, Daniel Horn says...
> >
> >i.e. does a string always hold exactly .length (rounded up to some constant mallocable size) or does it really double the length when you overrun to aggregate decent performance out of things
>
> I've been meaning to ask this exact question :)  How much memory do dynamic arrays allocate when they grow and do they ever reallocate when they shrink?

AFAIK, which might be woefully out-of-date, they allocate exactly what they need, up to a rounding factor.
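
One way to check the growth question empirically is to count how often the array's storage moves while appending one char at a time. This is only a sketch for present-day D; it says nothing certain about the 2004 runtime discussed here, and countMoves is a made-up helper name.

```d
// Count how often appending one char at a time moves the array
// to new storage (detected by a change in the data pointer).
size_t countMoves(size_t n)
{
    char[] s;
    size_t moves = 0;
    const(char)* last = s.ptr;   // null for an empty array
    foreach (i; 0 .. n)
    {
        s ~= 'x';
        if (s.ptr !is last)
        {
            ++moves;
            last = s.ptr;
        }
    }
    return moves;
}
```

If the count comes out close to n, the runtime is reallocating on nearly every append; a count far below n suggests it over-allocates or extends blocks in place.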



June 15, 2004
Try this FastString class I just wrote quickly; it's very bare-bones so far.

You use it just like the string in your first example, except the constructor takes an initial string size which it sets the string length to.

Does it increase performance?


June 15, 2004
Try building the string with std.outbuffer. That way, you'll be avoiding the reallocations and copying.

"Daniel Horn" <hellcatv@hotmail.com> wrote in message news:cakk42$1fnr$1@digitaldaemon.com...
> I'm writing a program which spits out an .obj file.
> I'm doing
>
> char[] out;
> for (int i=0;i<mvert;i++)
>     out~="v"~" "~ftoa(x[i])~" "~ftoa(y[i])~" "~ftoa(z[i])~"\n";
>
> for (int i=0;i<mface;i++)
>     out~="f"~" "~itoa(a[i])~" "~itoa(b[i])~" "~itoa(c[i])~"\n";
>
> the performance is simply abysmal.  To write a 50 meg file takes upwards of 2 minutes.
>
> is there any way to write this to be fast without sacrificing readability;
> part of the problem seems to be the realloc per face (instead of
> intelligently doubling the ram allocated)
> but I also suspect allocating so many small strings with ftoa and itoa
> isn't helping.
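
A minimal sketch of how the quoted loops might look with std.outbuffer, assuming present-day D and Phobos. The helper name buildLines, the use of std.conv.to in place of ftoa/itoa, and the 32-bytes-per-line reserve estimate are all assumptions for illustration, not something taken from the original program.

```d
import std.conv : to;
import std.outbuffer : OutBuffer;

// Build lines such as "v 1.5 2.5 0.25\n" into one growing buffer
// instead of concatenating many temporary strings.
string buildLines(T)(string tag, const T[] a, const T[] b, const T[] c)
{
    assert(a.length == b.length && b.length == c.length);
    auto buf = new OutBuffer();
    buf.reserve(a.length * 32);   // rough guess at bytes per line
    foreach (i; 0 .. a.length)
    {
        buf.write(tag);
        buf.write(" ");
        buf.write(to!string(a[i]));
        buf.write(" ");
        buf.write(to!string(b[i]));
        buf.write(" ");
        buf.write(to!string(c[i]));
        buf.write("\n");
    }
    return buf.toString();
}
```

The same helper covers both loops: buildLines("v", x, y, z) for vertices and buildLines("f", a, b, c) for faces. The number conversions still allocate small strings, so this addresses only the reallocation of the output string.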


June 15, 2004
On Tue, 15 Jun 2004 16:05:50 +1200, Regan Heath wrote:

> module regan.faststring;
> 
> class FastString {
> 	char[] buffer;
> 	int blen;
> 
> 	this(int _length = 0) {
> 		buffer.length = _length;
> 		blen = 0;
> 	}
> 	this(FastString str) {
> 		buffer = str.buffer.dup;
> 		blen = str.blen;
> 	}
> 
> 	FastString opCat(char[] str) {
> 		FastString f = new FastString(this);
> 		f ~= str;
> 		return f;
> 	}
> 
> 	FastString opCatAssign(char[] str) {
> 		if (blen + str.length > buffer.length)
> 			buffer.length = buffer.length + (buffer.length/2);
> 		buffer[blen..blen+str.length] = str;
> 		blen += str.length;
> 		return this;
> 	}
> 
> 	char[] toString() {
> 		return buffer;
> 	}
> 
> 	uint size() {
> 		return buffer.length;
> 	}
> 
> 	uint length() {
> 		return blen;
> 	}
> }

LOL. I just created a class (struct actually) that did almost the same as this FastString. I noticed one small issue with your opCatAssign() function. When expanding the size of the buffer, you need to ensure that the expansion amount is at least able to hold the new data. So I just added the length of the new data as well as the extra 50% growth factor.

  if (blen + str.length > buffer.length)
    buffer.length = buffer.length + (buffer.length/2)
                     + str.length ;


You might also consider this routine, to clear the buffer.

 void clear()
 {
    blen = 0;
    buffer.length = 0;
 }

-- 
Derek
Melbourne, Australia
15/Jun/04 2:44:13 PM
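
For reference, here is a sketch of the same idea as a struct in present-day D, with the growth step including the incoming length as described above. The name AppendBuffer is hypothetical. Two choices here are assumptions of this sketch rather than anything from the posted class: clear() only resets the logical length, and data() returns just the written part instead of the whole buffer.

```d
struct AppendBuffer
{
    private char[] buffer;   // allocated storage, possibly larger than the data
    private size_t used;     // number of chars actually written

    this(size_t initialSize)
    {
        buffer.length = initialSize;
    }

    // Append str; when out of room, grow by 50% plus the incoming length
    // so the new data always fits, even when the buffer starts empty.
    void opOpAssign(string op : "~")(const(char)[] str)
    {
        if (used + str.length > buffer.length)
            buffer.length = buffer.length + buffer.length / 2 + str.length;
        buffer[used .. used + str.length] = str[];
        used += str.length;
    }

    // Only the written prefix, not the unused tail of the buffer.
    const(char)[] data() const { return buffer[0 .. used]; }

    size_t capacity() const { return buffer.length; }

    // Forget the contents but keep the allocation for reuse.
    void clear() { used = 0; }
}
```

Starting from AppendBuffer(0), appending "hello" grows the buffer to 5 chars, and appending ", world" then grows it to 5 + 2 + 7 = 14.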
June 15, 2004
"Walter" <newshound@digitalmars.com> wrote in message news:calt3r$evc$2@digitaldaemon.com...
> Try building the string with std.outbuffer. That way, you'll be avoiding the
> reallocations and copying.

Regarding OutBuffer: an InBuffer is mentioned. Does it exist?




June 15, 2004
On Tue, 15 Jun 2004 14:53:01 +1000, Derek Parnell <derek@psych.ward> wrote:
> On Tue, 15 Jun 2004 16:05:50 +1200, Regan Heath wrote:
>
>> module regan.faststring;
>>
>> class FastString {
>> 	char[] buffer;
>> 	int blen;
>>
>> 	this(int _length = 0) {
>> 		buffer.length = _length;
>> 		blen = 0;
>> 	}
>> 	this(FastString str) {
>> 		buffer = str.buffer.dup;
>> 		blen = str.blen;
>> 	}
>>
>> 	FastString opCat(char[] str) {
>> 		FastString f = new FastString(this);
>> 		f ~= str;
>> 		return f;
>> 	}
>>
>> 	FastString opCatAssign(char[] str) {
>> 		if (blen + str.length > buffer.length)
>> 			buffer.length = buffer.length + (buffer.length/2);
>> 		buffer[blen..blen+str.length] = str;
>> 		blen += str.length;
>> 		return this;
>> 	}
>>
>> 	char[] toString() {
>> 		return buffer;
>> 	}
>>
>> 	uint size() {
>> 		return buffer.length;
>> 	}
>>
>> 	uint length() {
>> 		return blen;
>> 	}
>> }
>
> LOL. I just created a class (struct actually) that did almost the same as
> this FastString. I noticed one small issue with your opCatAssign()
> function. When expanding the size of the buffer, you need to ensure that
> the expansion amount is at least able to hold the new data. So I just added
> the length of the new data as well as the extra 50% growth factor.

Thanks. In fact without this my code is really broken if buffer.length == 0.

>   if (blen + str.length > buffer.length)
>     buffer.length = buffer.length + (buffer.length/2)
>                      + str.length ;
>
>
> You might also consider this routine, to clear the buffer.
>
>  void clear()
>  {
>     blen = 0;
>     buffer.length = 0;
>  }

Good idea. I wouldn't set buffer.length to 0 though; that will free the memory associated with it (I believe), and we may as well keep it until we are deleted.

Once Daniel gets back to me about whether it is actually faster or not, I'll polish up the class (maybe make it a struct).

Regan.

June 15, 2004
I'm glad you're all looking at this...
I'll have to investigate both your struct and Walter's outbuffer idea.
Unfortunately I may not have time until Thursday...
I'll get back to you when I have numbers... but this has been an enlightening discussion.
I'm not convinced that the realloc idea actually does anything performance-wise... the allocation process itself (realloc) usually has to do some GC overhead and that could well cause problems, even if the memory were there. Of course I'll throw that into a benchmarking suite and let you know the numbers.

I do insist that the "C" way of doing things is difficult and error-prone. After writing many buffer-safe C classes to do just that, I can tell you that I moved to D to avoid just this. And don't get me started about snprintf. If I use snprintf and my buffer runs out of space, it's still a RUNTIME error and I don't get the correct information into my buffer. Sure, it's not an EXPLOIT... but it's hardly better to have my program crash or do incorrect and mysterious things because of an snprintf; we've all been there.
And the concat operator has been perfect aside from the performance hit.
--Daniel

June 15, 2004
If people are so concerned about the cost of saving the reserved amount...
if we always know that the reserved amount is going to be the next power of two higher than our current length, we could literally compute it each time instead of saving it (on current hardware it's usually better to recompute) :-)

That way we get the amortized cost of an append to be constant rather than (in this case) n :-/
i.e. if I concat n strings of size 1, I'm going to have n^2 time just in the reallocation.
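
The arithmetic behind that claim can be written out as follows, under the simplifying assumption that every reallocation copies all chars already in the string and that appends are one char each.

```latex
% Exact-fit growth: the k-th append first copies the k-1 existing chars.
\[
  T_{\text{exact}}(n) = \sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2} = O(n^2)
\]
% Doubling: a copy happens only when the length reaches a power of two.
\[
  T_{\text{double}}(n) \le \sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^{j} < 2n
  \quad\Longrightarrow\quad
  \frac{T_{\text{double}}(n)}{n} = O(1)
\]
```

So with exact-fit growth the copying alone is quadratic in n, while with doubling the total copying stays below 2n, which is constant per append on average.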

June 15, 2004
Why is everyone re-inventing the wheel?

Isn't this exactly what std.outbuffer.OutBuffer is for?

Jill

