Thread overview
filename.writeln() across network
Jun 21, 2012
Paul
Jun 21, 2012
Regan Heath
Jun 21, 2012
Jonathan M Davis
Jun 21, 2012
Danny Arends
Jun 26, 2012
Paul
June 21, 2012
I wrote a program that parses a text file and writes results as it processes the file (i.e. many writeln() calls).  On my local hard drive it works fine.  When I later used it on a file located on a file server, processing time went from 500 ms to 1 minute.

Is there a more efficient way to write out the results that would, say, only access the hard drive when it closes the connection, or some such?

Thanks for your assistance.
June 21, 2012
On Thu, 21 Jun 2012 14:56:37 +0100, Paul <phshaffer@gmail.com> wrote:

> I wrote a program that parses a text file and writes results as it is processing the file (i.e. many writeln()'s).  On my local harddrive it works fine.  When I later used it on a file located on a file server, it went from 500ms to 1 minute processing time.
>
> Is there a more efficient way to write out the results that would say maybe only access the harddrive as it closes the connection...or somesuch?
>
> Thanks for your assistance.

I imagine writeln is synchronous/non-overlapped IO.  Meaning, the call to writeln doesn't return until the write has "completed".  So, on every call you're basically waiting for the network IO to complete before you process something else locally.

What you want is asynchronous or overlapped IO where the write starts, and the function returns, and then you later get notified that the write has completed.  This lets you continue processing locally while the write happens in the background.

That's the theory; in practice I'm not sure what options you have in Phobos for overlapped IO.  If you're on Windows you can pull in the Win32 functions CreateFile, WriteFile, etc. and define the data structures required for overlapped IO.

R

June 21, 2012
On Thursday, June 21, 2012 18:14:26 Regan Heath wrote:
> On Thu, 21 Jun 2012 14:56:37 +0100, Paul <phshaffer@gmail.com> wrote:
> > I wrote a program that parses a text file and writes results as it is processing the file (i.e. many writeln()'s). On my local harddrive it works fine. When I later used it on a file located on a file server, it went from 500ms to 1 minute processing time.
> > 
> > Is there a more efficient way to write out the results that would say maybe only access the harddrive as it closes the connection...or somesuch?
> > 
> > Thanks for your assistance.
> 
> I imagine writeln is synchronous/non-overlapped IO. Meaning, the call to writeln doesn't return until the write has "completed". So, on every call you're basically waiting for the network IO to complete before you process something else locally.
> 
> What you want is asynchronous or overlapped IO where the write starts, and the function returns, and then you later get notified that the write has completed. This lets you continue processing locally while the write happens in the background.
> 
> That's the theory, in practice I'm not sure what options you have in phobos for overlapped IO. If you're on windows you can pull in the win32 functions CreateFile, WriteFile etc and define the data structures required for overlapped IO.

If he's pulling in data from the network and then writing it to disk, he could use std.concurrency to have the network stuff on one thread and the writing on another. It would probably mean copying the data to be able to pass it across threads (unless he's using different buffers every time he's reading the data from the network, in which case casting to immutable could do the trick), but it would disconnect the reading from the writing. IIRC, there's an example in TDPL's concurrency chapter on how to use std.concurrency to read and write files concurrently, which could be used as a starting point.

http://www.informit.com/articles/article.aspx?p=1609144
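
A minimal sketch of that split (not the TDPL example itself), assuming the results are plain strings: one thread owns the output file and the parsing thread only sends it lines. The names `writer`, `writeInBackground` and the path "results.txt" are made up for illustration. Since `string` is immutable, it can be passed between threads without copying.

```d
import std.concurrency : receive, receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : File;

// Writer thread: owns the output file and writes every line it is sent.
void writer(string path, Tid owner)
{
    auto f = File(path, "w");
    bool done = false;
    while (!done)
    {
        receive(
            (string line) { f.writeln(line); }, // a result line to write
            (bool stop)   { done = true; }      // any bool means "no more lines"
        );
    }
    f.close();
    send(owner, true); // report back once the file is closed
}

// Hands each line to the writer thread, then waits for it to finish.
void writeInBackground(string path, string[] lines)
{
    auto tid = spawn(&writer, path, thisTid);
    foreach (line; lines)
        send(tid, line); // returns immediately; the write happens on the other thread
    send(tid, true);     // signal the end of the data
    receiveOnly!bool();  // wait for the writer's "done" message
}

void main()
{
    // Placeholder path and data, for illustration only.
    writeInBackground("results.txt", ["first result", "second result"]);
}
```

This only hides the write latency behind the parsing; it does not reduce the number of writes, so it may help less than buffering if the per-write cost dominates.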

- Jonathan M Davis
June 21, 2012
On Thursday, 21 June 2012 at 17:14:34 UTC, Regan Heath wrote:
> On Thu, 21 Jun 2012 14:56:37 +0100, Paul <phshaffer@gmail.com> wrote:
>
>> I wrote a program that parses a text file and writes results as it is processing the file (i.e. many writeln()'s).  On my local harddrive it works fine.  When I later used it on a file located on a file server, it went from 500ms to 1 minute processing time.
>>
>> Is there a more efficient way to write out the results that would say maybe only access the harddrive as it closes the connection...or somesuch?
>>
>> Thanks for your assistance.
>
> I imagine writeln is synchronous/non-overlapped IO.  Meaning, the call to writeln doesn't return until the write has "completed".  So, on every call you're basically waiting for the network IO to complete before you process something else locally.

Isn't the simplest approach then to build up the whole file in
memory as a single string, using \n, and then do a single write
across the network?

like:

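import std.conv : to;
import std.stdio : File;

void main(string[] args){
  int x = 42; // placeholder value, only so the example compiles
  string filecontent = "";
  filecontent ~= "#include std.stdio\n";
  filecontent ~= "int x = " ~ to!string(x) ~ ";\n";
  //etc etc...

  // File is a struct, so no `new`; mode "w" opens it for writing
  auto f = File("X:\\MyNetworkDir\\file.txt", "w");
  f.write(filecontent); // write(), not writeln(): the content already ends in \n
  f.close();
}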

Haven't tested the code, but if network IO is the bottleneck, then just
increase the buffer.
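
A variation on the same idea, as a sketch only: `std.array.appender` avoids the repeated reallocation that `~=` on a long string can cause, and `std.file.write` writes the finished buffer in one call. The function name `buildOutput`, the "result N" line format and the path "results.txt" are placeholders, not anything from the original program.

```d
import std.array : appender;
import std.conv : to;
import std.file : write;

// Build the whole output in memory; nothing touches the file yet.
string buildOutput(int lineCount)
{
    auto buf = appender!string();
    foreach (i; 0 .. lineCount)
    {
        buf.put("result ");
        buf.put(to!string(i));
        buf.put("\n");
    }
    return buf.data;
}

void main()
{
    // One call writes the complete buffer; the path is a placeholder.
    write("results.txt", buildOutput(1000));
}
```

The trade-off is memory: the whole output has to fit in RAM before anything is written.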

June 26, 2012
On Thursday, 21 June 2012 at 19:52:26 UTC, Danny Arends wrote:
> On Thursday, 21 June 2012 at 17:14:34 UTC, Regan Heath wrote:
>> On Thu, 21 Jun 2012 14:56:37 +0100, Paul <phshaffer@gmail.com> wrote:
>>
>>> I wrote a program that parses a text file and writes results as it is processing the file (i.e. many writeln()'s).  On my local harddrive it works fine.  When I later used it on a file located on a file server, it went from 500ms to 1 minute processing time.
>>>
>>> Is there a more efficient way to write out the results that would say maybe only access the harddrive as it closes the connection...or somesuch?
>>>
>>> Thanks for your assistance.
>>
>> I imagine writeln is synchronous/non-overlapped IO.  Meaning, the call to writeln doesn't return until the write has "completed".  So, on every call you're basically waiting for the network IO to complete before you process something else locally.
>
> Isn't the most simple approach then to build up the whole file in
> memory as a single string, using \n and then do a single write
> across the network ?
>
> like:
>
> void main(string args[]){
>   string filecontent = "";
>   filecontent ~= "#include std.stdio\n";
>   filecontent ~= "int x = " ~ x ~ ";\n";
>   //etc etc...
>
>   auto f = new File("X:\\MyNetworkDir\\file.txt");
>   f.writeln(filecontent);
>   f.close();
> }
>
> Haven't tested the code, but if network IO is the wait, then just
> increase the buffer..

Thanks for the idea.  I thought maybe there would be a way to use writeln() to do what you illustrated: basically writing to a buffer and then writing the buffer to disk when needed.  It would be a nice feature that would allow a developer to change the character of his file writing quickly without changing much code.  Maybe open a file with File("filename.txt", "buffer") or some such, and then have a way to tell it to write to disk when you want.
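
Something close to this may already be possible, as far as I can tell: `std.stdio.File` wraps a C `FILE*`, and its `setvbuf` and `flush` methods let you pick the buffer size and decide when to push the buffer out. A rough sketch, where the 1 MiB size, the function name `writeResults` and the path are arbitrary choices for illustration:

```d
import core.stdc.stdio : _IOFBF;
import std.stdio : File;

// Writes `count` result lines through a large, fully buffered stream.
void writeResults(string path, int count)
{
    auto f = File(path, "w");
    // Ask the C runtime for a 1 MiB buffer with full buffering.
    // This has to be done before the first write to the file.
    f.setvbuf(1024 * 1024, _IOFBF);

    foreach (i; 0 .. count)
        f.writeln("result ", i); // goes into the buffer first

    f.flush(); // hand the buffered data to the OS at a point you choose
    f.close(); // close() flushes as well
}

void main()
{
    writeResults("results.txt", 1000); // placeholder path and count
}
```

The existing writeln() calls stay as they are; only the setvbuf line is new. Whether a bigger buffer actually cuts down the network round trips depends on the C runtime and on how the OS talks to the file server, so it is worth timing before relying on it.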