June 14, 2004
string performance issues
I'm writing a program which spits out an .obj file.
I'm doing

char[] buf;  // renamed from "out", which is a reserved word in D
for (int i=0;i<mvert;i++)
   buf~="v"~" "~ftoa(x[i])~" "~ftoa(y[i])~" "~ftoa(z[i])~"\n";

for (int i=0;i<mface;i++)
   buf~="f"~" "~itoa(a[i])~" "~itoa(b[i])~" "~itoa(c[i])~"\n";

The performance is simply abysmal: writing a 50 MB file takes upwards
of 2 minutes.

Is there any way to make this fast without sacrificing readability?
Part of the problem seems to be the realloc per face (instead of
intelligently doubling the RAM allocated), but I also suspect that
allocating so many small strings with ftoa and itoa isn't helping.
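
A minimal sketch of one way to build the whole file in memory without a reallocation per line. It assumes a current D compiler and Phobos (`std.array.appender` and `std.format.formattedWrite`), which may differ from what was available when this thread was written. The function name `buildObj`, the parameter types, and the size guess passed to `reserve` are all hypothetical; only the array names come from the post.

```d
import std.array : appender;
import std.format : formattedWrite;

/// Builds the text of an .obj file in memory (hypothetical helper).
string buildObj(const double[] x, const double[] y, const double[] z,
                const int[] a, const int[] b, const int[] c)
{
    auto buf = appender!string();
    // Optional: reserve a rough guess up front to reduce regrowth.
    buf.reserve(x.length * 32 + a.length * 24);

    // %s formats each argument according to its actual type.
    foreach (i; 0 .. x.length)
        buf.formattedWrite("v %s %s %s\n", x[i], y[i], z[i]);

    foreach (i; 0 .. a.length)
        buf.formattedWrite("f %s %s %s\n", a[i], b[i], c[i]);

    return buf.data;
}
```

The appender grows its storage in larger steps rather than once per append, and the numbers are formatted straight into it, so no temporary string is created per value.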
June 14, 2004
Re: string performance issues
Instead of building one huge string in memory, how about
processing line by line:

char[128] buf; // 128 is max str len
for (int i=0;i<mface;i++) {
   sprintf(buf.ptr,"f %d %d %d\n",a[i],b[i],c[i]);
   ... do something with buf ...
}

If the end result is going to a file the temporary buffer
might not even be needed - the printf can go right to the file.

-Ben

June 14, 2004
Re: string performance issues
That's the code I was expecting to see, and exactly the code I was
hoping to avoid:
a) What if the type changes (say, double to real), so that the string
suddenly needs more space and overruns the buffer?
b) It's not type safe (what if I say %d but pass in a float?)
c) You still have to realloc for every face.
d) sprintf isn't part of D; it's a nasty hanging chad from C...
I'd like to see a clean solution entirely in D.

Ben Hinkle wrote:
> Instead of building one huge string in memory how about
> processing line by line:
June 14, 2004
Re: string performance issues
What about...

f = fopen("file.txt","w");
if (!f) ..barf..

for (int i = 0; i < mvert; i++) {
	fprintf(f,"f %s %s %s\n",toString(x[i]),toString(y[i]),toString(z[i]));
}

fclose(f);

June 14, 2004
Re: string performance issues
That uses a string and is not typesafe
and what if I want to send it over the net.

Basically I want to do stringops internally :-) and I want to do it the 
"D" way.

I'm debating whether there should be a struct String that does the
resizing appropriately so appends work fast... or a class String. (I'm
leaning towards a struct, since it would just be a wrapper around char[]
with an extra length field.) That way I could size it dynamically, with
each overrun multiplying the allocated length by some constant >= 1.5.
Walter, would this be a good idea? Or is there some magic you can pull so
that assigning .length or appending won't call realloc or some other slow
function?

I.e., does a string always hold exactly .length (rounded up to some
constant mallocable size), or does it really double the length when you
overrun, to get decent amortized performance out of things?
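
The struct described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the name `GrowString`, the starting capacity of 16, and the exact growth factor of 1.5 are hypothetical choices, and nothing here is claimed to match how the D runtime handles `~=` internally.

```d
/// Hypothetical growable string: a char[] plus a separate "used" count.
struct GrowString
{
    private char[] store;   // allocated storage; store.length is the capacity
    private size_t used;    // number of chars actually in use

    /// Append text, growing the storage geometrically when it runs out.
    void put(const(char)[] s)
    {
        immutable needed = used + s.length;
        if (needed > store.length)
        {
            size_t cap = store.length < 16 ? 16 : store.length;
            while (cap < needed)
                cap += cap / 2;      // grow by a factor of 1.5
            store.length = cap;      // may reallocate and copy
        }
        store[used .. needed] = s[]; // copy the new text into place
        used = needed;
    }

    /// The text written so far (a slice of the internal storage).
    const(char)[] data() const { return store[0 .. used]; }

    /// Current capacity in chars.
    size_t capacity() const { return store.length; }
}
```

Because the capacity grows by a constant factor, the number of reallocations is logarithmic in the final size, so the total copying stays proportional to the amount of text appended.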


Regan Heath wrote:
> What about...
June 14, 2004
Re: string performance issues
On Mon, 14 Jun 2004 16:29:06 -0700, Daniel Horn <hellcatv@hotmail.com> 
wrote:
> That uses a string

Yep. So did your example. What am I missing? Please explain.

> and is not typesafe

Why not? Change x from float[] to double[] and it still works; change it
to int[] and it still works. Sorry, correction: the fprintf should
have had %.*s in it, e.g.

fprintf(f,"f %.*s %.*s %.*s\n",toString(x[i]),toString(y[i]),toString(z[i]));

> and what if I want to send it over the net.

The f on the start of the line says it's a float, you chop the string on 
spaces and parse accordingly. Isn't that what the f is there for?

> Basically I want to do stringops internally :-) and I want to do it the 
> "D" way.

I'm not sure I understand what you mean.

You can set the length (if you know what you need) and you can assign to a
slice, e.g.

char[] test = "a guy named jones walked down the street".dup;
char[] foo = "regan".dup;

test[12..17] = foo[];

So, assuming your values always have a fixed length, you can set the length
of the string and then assign the data to the appropriate slices.

> I'm debating whether there should be a struct String that did the 
> resizing appropriately so appends work fast... or a class String (I'm 
> leaning towards struct since it would be a wrapper around char[] with an 
> xtra length field)
> that way I could dynamically size it appropriately (each overrun 
> multiplying allocated length by some constant >= 1.5)  walter would this 
> be a good idea? or is there some magic you can pull so that when you 
> assign .length or append it won't call realloc or some other slow 
> function.

If you set the length and then append, it actually appends after the end
of the newly set length, e.g.

char[] test = "regan";

test.length = 10;
test ~= "fred";
printf("%d:= ",test.length);
foreach(char c; test)
	printf("%02x ",c);

outputs

14:= 72 65 67 61 6e 00 00 00 00 00 66 72 65 64

June 15, 2004
Re: string performance issues
I don't know the length of my numbers, so slice assignment is painful.
Basically I'd need some sort of ftoa and itoa functions that avoid
allocating memory (returning statically sized structs plus lengths), then
assign each result to the slice, while keeping track of the last length
and finding the next one.

Then I need a separate counter to see how much I've allocated... it's
just a mess, and this is exactly what D was supposed to avoid. This is
the old-fashioned "C buffer safe" way, and I'm not happy with it, or I'd
still be using C.

And the bottom line is that I don't want to print to a file; I want to
keep it in a string. The code for concat is so clear, so why is it so
slow?

PS: make sure to use toStringz when using printf with %s. toString is
not guaranteed to zero-terminate.
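
A small sketch of the two safe ways to hand a D string to C's printf, assuming a current D compiler where `printf` lives in `core.stdc.stdio` and `toStringz` in `std.string`. The function name `printTwice` is hypothetical.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

/// Prints a D string through C's printf in two ways.
/// Returns the number of chars written by the first call.
int printTwice(string s)
{
    // %s expects a '\0'-terminated char*, so pass a terminated copy.
    int n = printf("%s\n", toStringz(s));

    // %.*s takes an int length followed by a char*, so no terminator is needed.
    printf("%.*s\n", cast(int) s.length, s.ptr);

    return n;
}
```

In current D the length and pointer have to be passed as two separate arguments for `%.*s`; passing the array itself, as in the earlier posts, relies on how arrays were passed to variadic functions at the time.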
June 15, 2004
Re: string performance issues
In article <calcc2$2nq5$1@digitaldaemon.com>, Daniel Horn says...
>
>i.e. does a string always hold exactly .length (rounded up to some 
>constant mallocable size) or does it really double the length when you 
>overrun to aggregate decent performance out of things

I've been meaning to ask this exact question :)  How much memory do dynamic
arrays allocate when they grow and do they ever reallocate when they shrink?


Sean
June 15, 2004
Re: string performance issues
"Sean Kelly" <sean@f4.ca> wrote in message
news:calekm$2r2h$1@digitaldaemon.com...
> In article <calcc2$2nq5$1@digitaldaemon.com>, Daniel Horn says...
> >
> >i.e. does a string always hold exactly .length (rounded up to some
> >constant mallocable size) or does it really double the length when you
> >overrun to aggregate decent performance out of things
>
> I've been meaning to ask this exact question :)  How much memory do dynamic
> arrays allocate when they grow and do they ever reallocate when they shrink?
>

Actual allocations are the smallest power of 2 that holds the requested
size. I don't think they reallocate when shrinking because you could have
sliced that memory to use somewhere else.
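
The two points above can be explored with a short sketch. It assumes a current D runtime, where dynamic arrays expose `reserve` and `capacity`; the exact capacity figures are up to the runtime and need not follow the power-of-2 rule described here. The function names `demoShrink` and `demoReserve` are hypothetical.

```d
/// Shrinking an array's length leaves the memory in place,
/// so other slices of the same memory stay valid.
int[] demoShrink()
{
    int[] a = [1, 2, 3, 4];
    int[] tail = a[2 .. 4];   // second view onto the same memory
    a.length = 2;             // shrink: nothing is freed or moved
    assert(a == [1, 2]);
    return tail;              // still refers to the elements 3 and 4
}

/// reserve asks for room up front; capacity reports how many
/// elements the array can hold before it has to reallocate.
size_t demoReserve()
{
    char[] buf;
    buf.reserve(1000);
    auto cap = buf.capacity;  // at least 1000; the exact value is runtime-defined
    foreach (i; 0 .. 1000)
        buf ~= 'x';           // stays within the reserved capacity
    return cap;
}
```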
June 15, 2004
Re: string performance issues
Daniel Horn wrote:

> 
> That's the code I was expecting to see and exactly the code I was
> wishing to avoid:

oh well. you could have warned me :-)

> a) what if the type changes (double to real and suddenly string takes
> too much memory and buffer overruns)

Another (possibly more common) case is switching to a template where the
type isn't known. Casting is a way out:
sprintf(buf,"f %g\n",cast(double)a[i]);
If casting is too ugly then sprintf probably isn't the way to go.
If overflow is a concern then snprintf is an option. Now that I think about
it, how about a D wrapper around the printf family that takes a dynamic
array as the candidate output buffer? If the string fits in the array, it
fills it and returns the slice holding the result; otherwise it allocates a
dynamic array and fills that. It would probably be a few lines of snprintf
and array allocation. The declaration would be
char[] sprintf(char[], char*, ...)
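
The wrapper proposed above might look something like this sketch. It is written against a current D compiler and assumes a C99-conforming `snprintf` (one that returns the length the full result would need). The name `bufPrintf` and the use of a template parameter list instead of a C-style `...` are choices made for this illustration, not part of the proposal.

```d
import core.stdc.stdio : snprintf;

/// Formats into buf if the result fits, otherwise into a freshly
/// allocated array. Returns the slice holding the formatted text
/// (without the trailing '\0').
char[] bufPrintf(Args...)(char[] buf, const(char)* fmt, Args args)
{
    // snprintf reports the length the full result needs, even when it truncates.
    int n = snprintf(buf.ptr, buf.length, fmt, args);
    if (n < 0)
        throw new Exception("snprintf failed");
    if (n >= buf.length)
    {
        buf = new char[n + 1];   // room for the text plus '\0'
        snprintf(buf.ptr, buf.length, fmt, args);
    }
    return buf[0 .. n];
}
```

A caller can keep a small stack buffer for the common case, for example `char[128] tmp; auto line = bufPrintf(tmp[], "f %d %d %d\n", a[i], b[i], c[i]);`, and only pays for a heap allocation when a line turns out longer than expected. The format string is still not type-checked against its arguments, so objection (b) applies to this sketch as well.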

> b) not type safe (what if I say %d but pass in a float)

Yup, true. As above, casting is an option if that is a concern.

> c) you still have to realloc every face

I'm not exactly sure what you mean here, but I'm now guessing you really do
want to concatenate all the strings into one huge 50 MB string in memory.
Preallocation could help here.

> d) sprintf isn't part of D--it's a nasty hanging chad from C...
> I'd like to see a clean solution in D entirely

It is a matter of personal preference. I use C functions whenever it makes
sense since I know them well and users reading my code will know them well. 
