June 14, 2004 string performance issues

I'm writing a program which spits out an .obj file. I'm doing

char[] out;
for (int i=0;i<mvert;i++)
    out~="v"~" "~ftoa(x[i])~" "~ftoa(y[i])~" "~ftoa(z[i])~"\n";

for (int i=0;i<mface;i++)
    out~="f"~" "~itoa(a[i])~" "~itoa(b[i])~" "~itoa(c[i])~"\n";

the performance is simply abysmal. To write a 50 meg file takes upwards of 2 minutes.

is there any way to write this to be fast without sacrificing readability; part of the problem seems to be the realloc per face (instead of intelligently doubling the ram allocated) but I also suspect allocating so many small strings with ftoa and itoa isn't helping.
June 14, 2004 Re: string performance issues

Posted in reply to Daniel Horn
Instead of building one huge string in memory how about processing line by line:

char[128] out; // 128 is max str len
for (int i=0;i<mface;i++) {
    sprintf(out,"f %d %d %d\n",a[i],b[i],c[i]);
    ... do something with out ...
}
If the end result is going to a file the temporary buffer might not even be needed - the printf can go right to the file.
-Ben
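
As a rough illustration of the "go right to the file" idea in current D (not the 2004 library the thread was written against), the sketch below writes one line per face through `std.stdio.File`, which buffers output internally. The function name `writeFaces` and the path `faces.obj` are made up for the example.

```d
import std.stdio : File;

/// Writes one "f a b c" line per face straight to an open file.
/// No large in-memory string is built; File does its own buffering.
void writeFaces(File f, const int[] a, const int[] b, const int[] c)
{
    foreach (i; 0 .. a.length)
        f.writefln("f %s %s %s", a[i], b[i], c[i]); // %s is checked against the argument type
}

void main()
{
    auto f = File("faces.obj", "w"); // hypothetical output path
    writeFaces(f, [1, 4], [2, 5], [3, 6]);
}
```

Because `%s` formats whatever type it is given, switching the index arrays to another numeric type would not need a format string change.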
June 14, 2004 Re: string performance issues

Posted in reply to Ben Hinkle
That's the code I was expecting to see and exactly the code I was wishing to avoid:
a) what if the type changes (double to real and suddenly string takes too much memory and buffer overruns)
b) not type safe (what if I say %d but pass in a float)
c) you still have to realloc every face
d) sprintf isn't part of D--it's a nasty hanging chad from C...
I'd like to see a clean solution in D entirely
June 14, 2004 Re: string performance issues

Posted in reply to Daniel Horn
What about...

f = fopen("file.txt","w");
if (!f) ..barf..

for (int i = 0; i < mvert; i++) {
    fprintf(f,"f %s %s %s\n",toString(x[i]),toString(y[i]),toString(z[i]));
}

fclose(f);
June 14, 2004 Re: string performance issues

Posted in reply to Regan Heath
That uses a string and is not typesafe
and what if I want to send it over the net.
Basically I want to do stringops internally :-) and I want to do it the "D" way.
I'm debating whether there should be a struct String that did the resizing appropriately so appends work fast... or a class String (I'm leaning towards struct since it would be a wrapper around char[] with an extra length field)
that way I could dynamically size it appropriately (each overrun multiplying allocated length by some constant >= 1.5). Walter, would this be a good idea? or is there some magic you can pull so that when you assign .length or append it won't call realloc or some other slow function.
i.e. does a string always hold exactly .length (rounded up to some constant mallocable size) or does it really double the length when you overrun to aggregate decent performance out of things
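
For readers on current D, something close to the growing String wrapper described above exists in the standard library as `std.array.Appender`. The sketch below is only an illustration under that assumption: it keeps the whole file in one string, over-allocates as it grows so that not every append reallocates, and uses `formattedWrite` so the numbers are formatted into the buffer without intermediate `ftoa`/`itoa` strings. The function name `buildObj` and the `reserve` estimate are invented for the example.

```d
import std.array : appender;
import std.format : formattedWrite;

/// Builds the .obj text in memory. %s picks the formatting from the
/// argument type, so changing float to double or real needs no edits.
string buildObj(const float[] x, const float[] y, const float[] z,
                const int[] a, const int[] b, const int[] c)
{
    auto buf = appender!string();
    buf.reserve(32 * (x.length + a.length)); // rough size guess, not a limit

    foreach (i; 0 .. x.length)
        buf.formattedWrite("v %s %s %s\n", x[i], y[i], z[i]);
    foreach (i; 0 .. a.length)
        buf.formattedWrite("f %s %s %s\n", a[i], b[i], c[i]);

    return buf.data;
}

void main()
{
    import std.stdio : write;
    write(buildObj([0.0f, 1.0f], [0.0f, 0.5f], [0.0f, 2.0f], [1], [2], [3]));
}
```

The resulting string can be written to a file or sent over a socket, which keeps the formatting step independent of the destination.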
June 14, 2004 Re: string performance issues

Posted in reply to Daniel Horn
On Mon, 14 Jun 2004 16:29:06 -0700, Daniel Horn <hellcatv@hotmail.com> wrote:

> That uses a string

Yep. So did your example? What am I missing? pls explain..

> and is not typesafe

Why not? change x from float[] to a double[] and it still works, change it to an int[] and it still works.. sorry.. correction.. the fprintf should have had %.*s in it. eg.

fprintf(f,"f %.*s %.*s %.*s\n",toString(x[i]),toString(y[i]),toString(z[i]));

> and what if I want to send it over the net.

The f on the start of the line says it's a float, you chop the string on spaces and parse accordingly. Isn't that what the f is there for?

> Basically I want to do stringops internally :-) and I want to do it the "D" way.

I'm not sure I understand what you mean..

You can set the length (if you know what you need) and you can assign to a slice i.e.

char[] test = "a guy named jones walked down the street";
char[] foo = "regan";
test[12..17] = foo[];

So assuming your values always have a set length you can set the length of the string, then assign to the appropriate slices the data.

> I'm debating whether there should be a struct String that did the resizing appropriately so appends work fast... or a class String (I'm leaning towards struct since it would be a wrapper around char[] with an xtra length field)
> that way I could dynamically size it appropriately (each overrun multiplying allocated length by some constant >= 1.5) walter would this be a good idea? or is there some magic you can pull so that when you assign .length or append it won't call realloc or some other slow function.

If you set the length then append, it actually appends to the end of the new allocated length eg.

char[] test = "regan";
test.length = 10;
test ~= "fred";
printf("%d:= ",test.length);
foreach(char c; test)
    printf("%02x ",c);

outputs

14:= 72 65 67 61 6e 00 00 00 00 00 66 72 65 64
June 15, 2004 Re: string performance issues

Posted in reply to Regan Heath
I don't know the length of my numbers...
so slice assignment is painful... basically I have to have some sort of ftoa and itoa function (that need to avoid assigning memory, returning statically sized structs and lengths) then assign it to the slice, after keeping track of the last length and finding the next length.
then I need a separate counter to see how much I've allocated...it's just a mess...and this is exactly what D was supposed to avoid. This is the old fashioned "C buffer safe" way...and I'm not happy with it or I'd still be using C.
And the bottom line is that I don't want to print to a file, I want to keep it in a string. the code for concat is so clear--but why is it so slow?
PS: make sure to use toStringz when using printf with %s. toString is not guaranteed to zero terminate.
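
The two ways of handing a D string to `printf` mentioned in this exchange can be sketched as follows. This assumes current D, where the length and pointer for `%.*s` have to be passed explicitly; the variable `s` is just sample data.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

void main()
{
    string s = "1.5";

    // %s expects a zero-terminated C string; toStringz provides one.
    printf("v %s\n", s.toStringz);

    // %.*s takes an explicit length (an int) followed by a pointer,
    // so the data does not need a terminator.
    printf("v %.*s\n", cast(int) s.length, s.ptr);
}
```

Both calls print the same line; the second avoids any copy that `toStringz` may need to make in order to add the terminator.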
June 15, 2004 Re: string performance issues

Posted in reply to Daniel Horn
In article <calcc2$2nq5$1@digitaldaemon.com>, Daniel Horn says...

>i.e. does a string always hold exactly .length (rounded up to some constant mallocable size) or does it really double the length when you overrun to aggregate decent performance out of things

I've been meaning to ask this exact question :) How much memory do dynamic arrays allocate when they grow and do they ever reallocate when they shrink?

Sean
June 15, 2004 Re: string performance issues

Posted in reply to Sean Kelly
"Sean Kelly" <sean@f4.ca> wrote in message news:calekm$2r2h$1@digitaldaemon.com...

> I've been meaning to ask this exact question :) How much memory do dynamic
> arrays allocate when they grow and do they ever reallocate when they shrink?

Actual allocations are the smallest power of 2 that holds the requested size. I don't think they reallocate when shrinking because you could have sliced that memory to use somewhere else.
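
The growth behaviour can be observed rather than assumed. The sketch below, written for current D, appends one element at a time and counts how often the array's `capacity` property changes. The exact growth policy belongs to the runtime and may differ from the power-of-2 rule described above, so no particular count is claimed; `countCapacityChanges` is a name invented for the example.

```d
import std.stdio : writeln;

/// Appends n ints one at a time and counts how often the reported
/// capacity changes, as a rough proxy for how often the block grows.
size_t countCapacityChanges(size_t n)
{
    int[] arr;
    size_t lastCap = 0;
    size_t changes = 0;
    foreach (i; 0 .. n)
    {
        arr ~= cast(int) i;
        if (arr.capacity != lastCap)
        {
            lastCap = arr.capacity;
            ++changes;
        }
    }
    return changes;
}

void main()
{
    writeln("capacity changes for 1_000_000 appends: ",
            countCapacityChanges(1_000_000));
}
```

If the count comes out far below the number of appends, the runtime is over-allocating on growth, which is what keeps repeated `~=` from reallocating on every call.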
June 15, 2004 Re: string performance issues

Posted in reply to Daniel Horn
Daniel Horn wrote:

> That's the code I was expecting to see and exactly the code I was wishing to avoid:

oh well. you could have warned me :-)

> a) what if the type changes (double to real and suddenly string takes
> too much memory and buffer overruns)

another (possibly more common) case is switching to a template where the type isn't known. Casting is a way out:

sprintf(buf,"f %g\n",cast(double)a[i]);

If casting is too ugly then sprintf probably isn't the way to go. If overflow is a concern then snprintf is an option. Now that I think about it how about a D wrapper around the printf family that takes a dynamic array as the candidate output buffer and if the string fits in the array then it fills it and returns the slice holding the result and otherwise it allocates a dynamic array and fills that. It would probably be a few lines of snprintf and array allocation. The declaration is

char[] sprintf(char[], char*, ...)

> b) not type safe (what if I say %d but pass in a float)

yup. true. as above casting is an option if that is a concern.

> c) you still have to realloc every face

I'm not exactly sure what you mean here but I'm now guessing you really do want to catenate all the strings up into one huge 50meg string in memory. Preallocation could help here.

> d) sprintf isn't part of D--it's a nasty hanging chad from C...
> I'd like to see a clean solution in D entirely

It is a matter of personal preference. I use C functions whenever it makes sense since I know them well and users reading my code will know them well.
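
A minimal sketch of the wrapper idea described in this post, written for current D and assuming a C99-conforming `snprintf` (one that returns the length the full output would need). The name `cformat` is hypothetical, and a variadic template stands in for the C-style `...` of the proposed declaration.

```d
import core.stdc.stdio : snprintf;

/// Formats into `buf` when the result fits, otherwise into a newly
/// allocated array. Returns the slice that holds the result.
char[] cformat(Args...)(char[] buf, const(char)* fmt, Args args)
{
    // snprintf never writes past buf.length and returns the length the
    // complete output needs, not counting the terminating '\0'.
    int n = snprintf(buf.ptr, buf.length, fmt, args);
    if (n < 0)
        return null; // snprintf reported an error
    if (cast(size_t) n < buf.length)
        return buf[0 .. n]; // it fit, terminator included

    auto big = new char[n + 1]; // room for the terminator
    snprintf(big.ptr, big.length, fmt, args);
    return big[0 .. n];
}

void main()
{
    import std.stdio : write;

    char[16] small;
    write(cformat(small[], "f %d %d %d\n", 1, 2, 3)); // fits in small
    write(cformat(small[], "f %d %d %d\n", 1_000_000, 2_000_000, 3_000_000)); // allocates
}
```

This addresses the overflow concern only. The format string is still unchecked against the argument types, so the type-safety objection raised earlier in the thread still applies.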
Copyright © 1999-2021 by the D Language Foundation