March 06, 2013
On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer <schveiguy@yahoo.com> wrote:


> a while ago (2008 or 09 I believe?), I was using Tango's Process object to execute programs on a remote agent, and forwarding all the resulting data back over the network.  On Linux, I used select to read data as it arrived.  On Windows, I think I had to spawn off a separate thread to wait for data/child processes.

More coming back to me now -- Windows pipes actually suck quite a bit.  You can't use the normal mechanisms to wait for data on them.

I also needed to spawn threads so I could combine the event-driven wait for socket data from the remote instance with the data from the pipes.  I seem to remember opening a socket to my own process in order to do this.

-Steve
March 06, 2013
On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
> On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>> By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:
>>
>> std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)
>
> I think Lars is on that.

I will be, but I don't know when.  It may be a few days.  So if you have the time and you feel like it, feel free to have a look at it. :)

Lars
March 06, 2013
On Wednesday, 6 March 2013 at 07:27:19 UTC, Lars T. Kyllingstad wrote:
> In principle, I like that idea.  In fact, you'll see that execute() and shell() are now both implemented using a function similar to (and inspired by) the collectOutput() method you suggested.  Furthermore, pipeProcess() and pipeShell() both forward to a pipeProcessImpl() function which takes a spawn function as a template parameter.

OK, this sounds reasonable. It's just that it's easy to get a little overwhelmed by the number of various functions at first, and we've seen some confusion regarding them already. Could we add a 2-by-3 table at the top of the module, to visualize how the various function flavors relate to each other?

> Now, the output and error streams are redirected into separate pipes.  But what if "foo" starts off by writing 1 MB of data to its error stream?

What's the problem here? If the goal is to collect both stdout and stderr, and the problem is pipe clogging, we should try to solve that. In fact, if we do come up with a correct collectOutput implementation, it would likely be useful to make the function public. It would be especially useful if the function could also correctly feed the subprocess input from a buffer (string), which could be passed as an optional parameter to collectOutput.

> Maybe I'll just have to bite the bullet and accept a different name. :(  It really seems to be the one thing that is preventing the two modules from being combined.  Suggestions, anyone?

runShell? executeShell?
March 06, 2013
06-Mar-2013 21:00, Steven Schveighoffer wrote:
> On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>
>
>> a while ago (2008 or 09 I believe?), I was using Tango's Process
>> object to execute programs on a remote agent, and forwarding all the
>> resulting data back over the network.  On Linux, I used select to read
>> data as it arrived.  On Windows, I think I had to spawn off a separate
>> thread to wait for data/child processes.
>
> More coming back to me now -- Windows pipes actually suck quite a bit.
> You can't use the normal mechanisms to wait for data on them.
>
> I also needed to spawn threads so I could combine the event-driven wait
> for socket data from the remote instance with the data from the pipes.
> I seem to remember opening a socket to my own process in order to do this.

There is async read/write on pipes.
Though the lack of a wait on pipes does suck.
>
> -Steve


-- 
Dmitry Olshansky
March 07, 2013
On Wed, 06 Mar 2013 16:57:39 -0500, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> 06-Mar-2013 21:00, Steven Schveighoffer wrote:
>> On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
>> <schveiguy@yahoo.com> wrote:
>>
>>
>>> a while ago (2008 or 09 I believe?), I was using Tango's Process
>>> object to execute programs on a remote agent, and forwarding all the
>>> resulting data back over the network.  On Linux, I used select to read
>>> data as it arrived.  On Windows, I think I had to spawn off a separate
>>> thread to wait for data/child processes.
>>
>> More coming back to me now -- Windows pipes actually suck quite a bit.
>> You can't use the normal mechanisms to wait for data on them.
>>
>> I also needed to spawn threads so I could combine the event-driven wait
>> for socket data from the remote instance with the data from the pipes.
>> I seem to remember opening a socket to my own process in order to do this.
>
> There is async read/write on pipes.
> Though the lack of a wait on pipes does suck.

Hm... I noted in the docs that async read/write is not supported:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365141(v=vs.85).aspx

"Asynchronous (overlapped) read and write operations are not supported by anonymous pipes. This means that you cannot use the ReadFileEx and WriteFileEx functions with anonymous pipes. In addition, the lpOverlapped parameter of ReadFile and WriteFile is ignored when these functions are used with anonymous pipes."

-Steve
March 07, 2013
07-Mar-2013 16:50, Steven Schveighoffer wrote:
> On Wed, 06 Mar 2013 16:57:39 -0500, Dmitry Olshansky
> <dmitry.olsh@gmail.com> wrote:
>
>> 06-Mar-2013 21:00, Steven Schveighoffer wrote:
>>> On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
>>> <schveiguy@yahoo.com> wrote:
>>>
>>>
>>>> a while ago (2008 or 09 I believe?), I was using Tango's Process
>>>> object to execute programs on a remote agent, and forwarding all the
>>>> resulting data back over the network.  On Linux, I used select to read
>>>> data as it arrived.  On Windows, I think I had to spawn off a separate
>>>> thread to wait for data/child processes.
>>>
>>> More coming back to me now -- Windows pipes actually suck quite a bit.
>>> You can't use the normal mechanisms to wait for data on them.
>>>
>>> I also needed to spawn threads so I could combine the event-driven wait
>>> for socket data from the remote instance with the data from the pipes.
>>> I seem to remember opening a socket to my own process in order to do
>>> this.
>>
>> There is async read/write on pipes.
>> Though the lack of a wait on pipes does suck.
>
> Hm... I noted in the docs that async read/write is not supported:
>
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365141(v=vs.85).aspx
>
>
> "Asynchronous (overlapped) read and write operations are not supported
> by anonymous pipes. This means that you cannot use the ReadFileEx and
> WriteFileEx functions with anonymous pipes. In addition, the
> lpOverlapped parameter of ReadFile and WriteFile is ignored when these
> functions are used with anonymous pipes."

Hm.. how shitty. Especially since:
"Anonymous pipes are implemented using a named pipe with a unique name. Therefore, you can often pass a handle to an anonymous pipe to a function that requires a handle to a named pipe."

And e.g. this (Named pipe using overlapped I/O):
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365603(v=vs.85).aspx


-- 
Dmitry Olshansky
March 09, 2013
On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
> On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>
>> By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:
>>
>> std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)
>
> I think Lars is on that.

I'm going to need some help with this one.  I only have Linux on my computer, and I can't reproduce the bug in Wine.

As a first step, could someone else try to run Vladimir's test case?


>> I've also initially tried writing a different program:
>>
>> [...]
>
> Linux should work here.  From what I can tell, you are doing it right.
>
> If I get some time, I'll try and debug this.

I think I know what the problem is, and it sucks bigtime. :(

Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process.  We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it.

In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.

I don't know how to solve this in a good way.  I can think of a few alternatives, and they all suck:

1. Make a "special" spawnProcess() function for pipe redirection.
2. Use the "process object" approach, like Tango and Qt.
3. After fork(), in the child process, loop over the full range of possible file descriptors and close the ones we don't want open.

The last one would let us keep the current API (and would have the added benefit of cleaning up unused FDs) but I have no idea how it would impact performance.

Lars
March 09, 2013
On Saturday, 9 March 2013 at 16:05:15 UTC, Lars T. Kyllingstad wrote:
> I think I know what the problem is, and it sucks bigtime. :(
>
> Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process.  We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it.
>
> In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.
>
> I don't know how to solve this in a good way.  I can think of a few alternatives, and they all suck:
>
> 1. Make a "special" spawnProcess() function for pipe redirection.
> 2. Use the "process object" approach, like Tango and Qt.
> 3. After fork(), in the child process, loop over the full range of possible file descriptors and close the ones we don't want open.
>
> The last one would let us keep the current API (and would have the added benefit of cleaning up unused FDs) but I have no idea how it would impact performance.

I have tried (3), and confirmed that it does indeed solve the problem.

Lars
March 09, 2013
On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
>> On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>
>>> I've also initially tried writing a different program:
>>>
>>> [...]
>>
>> Linux should work here.  From what I can tell, you are doing it right.
>>
>> If I get some time, I'll try and debug this.
>
> I think I know what the problem is, and it sucks bigtime. :(
>
> Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process.  We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it.
>
> In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.

Oh crap, that is bad.

Unlike Windows, where handle inheritance is opt-in, Unix is opt-out (via the FD_CLOEXEC flag).  For consistency, I think it would be good to close all the file descriptors before calling exec.

> I don't know how to solve this in a good way.  I can think of a few alternatives, and they all suck:
>
> 1. Make a "special" spawnProcess() function for pipe redirection.
> 2. Use the "process object" approach, like Tango and Qt.
> 3. After fork(), in the child process, loop over the full range of possible file descriptors and close the ones we don't want open.
>
> The last one would let us keep the current API (and would have the added benefit of cleaning up unused FDs) but I have no idea how it would impact performance.

I think 3 is the correct answer; it is consistent with Windows, and the most logical behavior.  For instance, if other threads have unrelated descriptors open (like network sockets), those will be inherited too!  We should close all file descriptors.

How do you loop over all open ones?  Just curious :)

-Steve
March 09, 2013
On Saturday, 9 March 2013 at 18:35:25 UTC, Steven Schveighoffer wrote:
> On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad <public@kyllingen.net> wrote:
>
>> On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
>>> On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>>
>>>> I've also initially tried writing a different program:
>>>>
>>>> [...]
>>>
>>> Linux should work here.  From what I can tell, you are doing it right.
>>>
>>> If I get some time, I'll try and debug this.
>>
>> I think I know what the problem is, and it sucks bigtime. :(
>>
>> Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process.  We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it.
>>
>> In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.
>
> Oh crap, that is bad.
>
> Unlike Windows, where handle inheritance is opt-in, Unix is opt-out (via the FD_CLOEXEC flag).  For consistency, I think it would be good to close all the file descriptors before calling exec.
>
>> I don't know how to solve this in a good way.  I can think of a few alternatives, and they all suck:
>>
>> 1. Make a "special" spawnProcess() function for pipe redirection.
>> 2. Use the "process object" approach, like Tango and Qt.
>> 3. After fork(), in the child process, loop over the full range of possible file descriptors and close the ones we don't want open.
>>
>> The last one would let us keep the current API (and would have the added benefit of cleaning up unused FDs) but I have no idea how it would impact performance.
>
> I think 3 is the correct answer; it is consistent with Windows, and the most logical behavior.  For instance, if other threads have unrelated descriptors open (like network sockets), those will be inherited too!  We should close all file descriptors.

I think so too.  In C, you have to know about these things, and they are specified in the documentation for fork() and exec().  In D you shouldn't have to know, things should "just work" the way you expect them to.


> How do you loop over all open ones?  Just curious :)

You don't.  That is why I said solution (3) sucks too. :)  You have to loop over all possible non-std file descriptors, i.e. from 3 to the maximum number of open files.  (On my Ubuntu installation, this is by default 1024, but may be as much as 4096.  I don't know about other *NIXes.)

Here is how to do it:

import core.sys.posix.unistd, core.sys.posix.sys.resource;
rlimit r;
getrlimit(RLIMIT_NOFILE, &r);
// Skip 0-2 (stdin/stdout/stderr); close everything else up to the soft limit.
for (int i = 3; i < r.rlim_cur; ++i)
    close(i);

Lars