Thread overview
How to speed up file processing?
  May 01, 2007  Tyro
  May 01, 2007  Brad Anderson
  May 02, 2007  Tyro
  May 02, 2007  Brad Anderson
  May 02, 2007  Tyro
May 01, 2007
Good day all,

Running the code snippet below against an 800 KB file takes an ungodly amount of time (anywhere from 25 to 35 minutes). Is there anything I can do to speed the process up?

// Load ID Standards File if no db exists
char[] idfile = cast(char[])read("idfile.ids");

while(idfile.length)
{
	//Avoid utf-8 Error message. Ideal situation would be to
	//display the contents as it appears in the file but I'm
	// unsure how to accomplish that.
	if(!isascii(idfile[11])) {
		idfile = idfile[80..$];
		continue;
	}

	db.exec("INSERT INTO idstd
		values( null,
		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
	)");
	idfile = idfile[80..$];
}
May 01, 2007
Tyro wrote:
> Good day all,
> 
> Running the code snippet below against an 800 KB file takes an ungodly amount of time (anywhere from 25 to 35 minutes). Is there anything I can do to speed the process up?
> 
> // Load ID Standards File if no db exists
> char[] idfile = cast(char[])read("idfile.ids");
> 
> while(idfile.length)
> {
> 	//Avoid utf-8 Error message. Ideal situation would be to
> 	//display the contents as it appears in the file but I'm
> 	// unsure how to accomplish that.
> 	if(!isascii(idfile[11])) {
> 		idfile = idfile[80..$];
> 		continue;
> 	}
> 
> 	db.exec("INSERT INTO idstd
> 		values( null,
> 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
> 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
> 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
> 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
> 	)");
> 	idfile = idfile[80..$];
> }

You might be better off making a file tailored specifically for your db's bulk copy program and using that.  I suspect your issue is all the inserts, not the processing of your code.  Bulk copy usually turns off any logging during the one huge insert, so that helps as well.
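
Something along these lines could work as a starting point. This is a rough, untested sketch (idfile.tsv is just a made-up name for the intermediate file): dump the fixed-width fields out as a tab-separated file, then hand that file to whatever bulk loader your db provides (COPY for PGSQL, LOAD DATA for MySQL, bcp for MSSQL).

// Rough sketch: convert the fixed-width records into tab-separated lines.
// Field boundaries are the same slices used in the INSERT statement.
import std.file;

void main()
{
	char[] idfile = cast(char[])read("idfile.ids");
	char[] tsv;

	while(idfile.length >= 80)
	{
		// same "skip records with a non-ASCII byte 11" rule as the original loop
		if(idfile[11] >= 0x80) {
			idfile = idfile[80..$];
			continue;
		}

		tsv ~= idfile[0..6];   tsv ~= '\t';
		tsv ~= idfile[6..11];  tsv ~= '\t';
		tsv ~= idfile[11..24]; tsv ~= '\t';
		tsv ~= idfile[24..38]; tsv ~= '\t';
		tsv ~= idfile[38..40]; tsv ~= '\t';
		tsv ~= idfile[40..42]; tsv ~= '\t';
		tsv ~= idfile[42..43]; tsv ~= '\t';
		tsv ~= idfile[43..44]; tsv ~= '\n';

		idfile = idfile[80..$];
	}

	// One file write at the end; the bulk loader does the rest.
	write("idfile.tsv", tsv);
}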

BA
May 02, 2007
Brad Anderson Wrote:

> Tyro wrote:
> > Good day all,
> > 
> > Running the code snippet below against an 800 KB file takes an ungodly amount of time (anywhere from 25 to 35 minutes). Is there anything I can do to speed the process up?
> > 
> > // Load ID Standards File if no db exists
> > char[] idfile = cast(char[])read("idfile.ids");
> > 
> > while(idfile.length)
> > {
> > 	//Avoid utf-8 Error message. Ideal situation would be to
> > 	//display the contents as it appears in the file but I'm
> > 	// unsure how to accomplish that.
> > 	if(!isascii(idfile[11])) {
> > 		idfile = idfile[80..$];
> > 		continue;
> > 	}
> > 
> > 	db.exec("INSERT INTO idstd
> > 		values( null,
> > 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
> > 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
> > 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
> > 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
> > 	)");
> > 	idfile = idfile[80..$];
> > }
> 
> You might be better off making a file tailored specifically for your db's bulk copy program and using that.  I suspect your issue is all the inserts, not the processing of your code.  Bulk copy usually turns off any logging during the one huge insert, so that helps as well.
> 
> BA

Thanks Brad,

Following that explanation, I did some googling and came up with the solution below:

db.exec("BEGIN");
     // Insert while loop here
db.exec("COMMIT");

This led me to an ArrayBoundsError on line 199 of sqlite_oo, which happens to be in fetch(), originally implemented as:

public char[][] fetch()
{
	return data[rowCursor++];
}

After reimplementing it as:

public char[][] fetch()
{
	if(rowCursor < data.length)
		return data[rowCursor++];
	else
		return null;
}

The program compiles and runs correctly.
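
For reference, the whole loading loop now looks roughly like this (the same code as in my first post, just wrapped in the transaction):

// Load ID Standards File if no db exists, one transaction for all rows
char[] idfile = cast(char[])read("idfile.ids");

db.exec("BEGIN");
while(idfile.length)
{
	if(!isascii(idfile[11])) {
		idfile = idfile[80..$];
		continue;
	}

	db.exec("INSERT INTO idstd
		values( null,
		'"~ idfile[0..6]  ~"',  '"~ idfile[6..11]  ~"',
		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
	)");
	idfile = idfile[80..$];
}
db.exec("COMMIT");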

Again,
Thanks for your assistance.

Tyro

May 02, 2007
Tyro wrote:
> Brad Anderson Wrote:
> 
>> Tyro wrote:
>>> Good day all,
>>>
>>> Running the code snippet below against an 800 KB file takes an ungodly amount of time (anywhere from 25 to 35 minutes). Is there anything I can do to speed the process up?
>>>
>>> // Load ID Standards File if no db exists
>>> char[] idfile = cast(char[])read("idfile.ids");
>>>
>>> while(idfile.length)
>>> {
>>> 	//Avoid utf-8 Error message. Ideal situation would be to
>>> 	//display the contents as it appears in the file but I'm
>>> 	// unsure how to accomplish that.
>>> 	if(!isascii(idfile[11])) {
>>> 		idfile = idfile[80..$];
>>> 		continue;
>>> 	}
>>>
>>> 	db.exec("INSERT INTO idstd
>>> 		values( null,
>>> 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
>>> 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
>>> 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
>>> 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
>>> 	)");
>>> 	idfile = idfile[80..$];
>>> }
>> You might be better off making a file tailored specifically for your db's bulk copy program and using that.  I suspect your issue is all the inserts, not the processing of your code.  Bulk copy usually turns off any logging during the one huge insert, so that helps as well.
>>
>> BA
> 
> Thanks Brad,
> 
> Following that explanation, I did some googling and came up with the solution below:
> 
> db.exec("BEGIN");
>      // Insert while loop here
> db.exec("COMMIT");

ok, setting up a transaction.

> 
> This led me to an ArrayBoundsError on line 199 of sqlite_oo, which happens to be in fetch(), originally implemented as:

Hmm, SQLite... not sure of a bulk-load facility there.  I have more experience w/ MSSQL, PGSQL, and MySQL.

> 
> public char[][] fetch()
> {
> 	return data[rowCursor++];
> }
> 
> After reimplementing it as:
> 
> public char[][] fetch()
> {
> 	if(rowCursor < data.length)
> 		return data[rowCursor++];
> 	else
> 		return null;
> }
> 
> The program compiles and runs correctly.

Cool.  How fast?

BA
May 02, 2007
Brad Anderson Wrote:

> Tyro wrote:
> > Brad Anderson Wrote:
> > 
> >> Tyro wrote:
> >>> Good day all,
> >>>
> >>> Running the code snippet below against an 800 KB file takes an ungodly amount of time (anywhere from 25 to 35 minutes). Is there anything I can do to speed the process up?
> >>>
> >>> // Load ID Standards File if no db exists
> >>> char[] idfile = cast(char[])read("idfile.ids");
> >>>
> >>> while(idfile.length)
> >>> {
> >>> 	//Avoid utf-8 Error message. Ideal situation would be to
> >>> 	//display the contents as it appears in the file but I'm
> >>> 	// unsure how to accomplish that.
> >>> 	if(!isascii(idfile[11])) {
> >>> 		idfile = idfile[80..$];
> >>> 		continue;
> >>> 	}
> >>>
> >>> 	db.exec("INSERT INTO idstd
> >>> 		values( null,
> >>> 		'"~idfile[0..6]~"',   '"~ idfile[6..11] ~"',
> >>> 		'"~ idfile[11..24] ~"', '"~ idfile[24..38] ~"',
> >>> 		'"~ idfile[38..40] ~"', '"~ idfile[40..42] ~"',
> >>> 		'"~ idfile[42..43] ~"', '"~ idfile[43..44] ~"'
> >>> 	)");
> >>> 	idfile = idfile[80..$];
> >>> }
> >> You might be better off making a file tailored specifically for your db's bulk copy program and using that.  I suspect your issue is all the inserts, not the processing of your code.  Bulk copy usually turns off any logging during the one huge insert, so that helps as well.
> >>
> >> BA
> > 
> > Thanks Brad,
> > 
> > Following that explanation, I did some googling and came up with the solution below:
> > 
> > db.exec("BEGIN");
> >      // Insert while loop here
> > db.exec("COMMIT");
> 
> ok, setting up a transaction.
> 
> > 
> > This led me to an ArrayBoundsError on line 199 of sqlite_oo, which happens to be in fetch(), originally implemented as:
> 
> Hmm, SQLite... not sure of a bulk-load facility there.  I have more experience w/ MSSQL, PGSQL, and MySQL.
> 
> > 
> > public char[][] fetch()
> > {
> > 	return data[rowCursor++];
> > }
> > 
> > After reimplementing it as:
> > 
> > public char[][] fetch()
> > {
> > 	if(rowCursor < data.length)
> > 		return data[rowCursor++];
> > 	else
> > 		return null;
> > }
> > 
> > The program compiles and runs correctly.
> 
> Cool.  How fast?
> 
> BA

0.698 seconds to process 9286 records.

Tyro