March 05, 2013
On Tuesday, 5 March 2013 at 20:19:06 UTC, Lars T. Kyllingstad wrote:
> A special thanks to Vladimir P. for pointing out an egregious flaw in the original design.

But wait, there's more!

(please don't hurt me)

1. Typo: "plattform"

2. Is there any meaning in the idea of consolidating spawnProcess/pipeProcess/execute and spawnShell/pipeShell/shell? How about that collectOutput idea?

3. Where are we with compatibility with the old module? One idea I haven't seen mentioned yet is: perhaps we could make the return value of "shell" have a deprecated "alias this" to the output string, so that it's implicitly convertible to a string to preserve compatibility.

4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?

5. How about that Environment.opIn_r?

Great work so far otherwise!
March 05, 2013
On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:

> 4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?

I don't know how this could happen, can you elaborate?  Perhaps an example?

We are sort of stuck with File being the stream handler in phobos, which means we are currently stuck with FILE *.  I don't know if there is a way to do partial reads/writes on a FILE *, or checking to see if data is available.

-Steve
March 05, 2013
On Tuesday, 5 March 2013 at 21:55:24 UTC, Steven Schveighoffer wrote:
> On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>
>> 4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?
>
> I don't know how this could happen, can you elaborate?  Perhaps an example?

OK! Here's a program based off the pipeProcess/pipeShell example:

---
import std.process2;
import std.stdio;

void main()
{
    auto pipes = pipeProcess("./my_application",
        Redirect.stdout | Redirect.stderr);
    scope(exit) wait(pipes.pid);

    // Store lines of output.
    string[] output;
    foreach (line; pipes.stdout.byLine) output ~= line.idup;

    // Store lines of errors.
    string[] errors;
    foreach (line; pipes.stderr.byLine) errors ~= line.idup;

    writefln("%d lines of stdout, %d lines of stderr",
        output.length, errors.length);
}
---

And here is an accompanying my_application.d:

---
import std.stdio;

enum N = 100;

void main()
{
    foreach (n; 0..N)
    {
        stdout.writeln("stdout");
        stderr.writeln("stderr");
    }
}
---

Now, everything works just fine when N is small. However, if you increase it to 10000, both the test program and my_application get stuck with 0% CPU usage.

The reason for that is that the stderr pipe is clogged: my_application can't write to it, because nothing is reading from the other end. At the same time, the first program is blocked on reading from the stdout pipe, but nothing is coming out, because my_application is blocked on writing to stderr.

By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:

std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)

I've also initially tried writing a different program:

---
import std.file;
import std.process2;
import std.string;

/// Sort an array of strings using the Unix "sort" program.
string[] unixSort(string[] lines)
{
	auto pipes = pipeProcess("sort", Redirect.stdin | Redirect.stdout);
	scope(exit) wait(pipes.pid);

	foreach (line; lines)
		pipes.stdin.writeln(line);
	pipes.stdin.close();

	string[] sortedLines;
	foreach (line; pipes.stdout.byLine())
		sortedLines ~= line.idup;

	return sortedLines;
}

void main()
{
	// For the sake of example, pretend these lines came from
	// some intensive computation, and not actually a file.
	auto lines = readText("input.txt").splitLines();

	auto sortedLines = unixSort(lines);
}
---

However, I couldn't get it to work either on Windows (same exception) or on Linux (it just gets stuck, even with a very small input.txt). No idea whether I'm doing something wrong (maybe I need to indicate EOF in some way?) or the problem is elsewhere.

> We are sort of stuck with File being the stream handler in phobos, which means we are currently stuck with FILE *.  I don't know if there is a way to do partial reads/writes on a FILE *, or checking to see if data is available.

I guess you could always get the OS file handles/descriptors and query them directly, although there's also the matter of the internal FILE * buffers.
March 05, 2013
On Sunday, 24 February 2013 at 00:25:46 UTC, Jonathan M Davis wrote:
> On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
>> BTW, is "std.process2" just the temporary name, or are we seriously
>> going to put in a "std.process2" into Phobos? I'm hoping the former, as
>> the latter is unforgivably ugly.
>
> In previous discussions, it was agreed that future replacement modules would
> simply have a number appended to them like that (e.g. std.xml2 or
> std.random2). I don't think that that decision is irreversible, but unless
> someone can come up with a much better name, I'd expect it to stick, and it
> has the advantage of making it very clear that it's replacing the old one.
>
> - Jonathan M Davis

That is a really really really bad idea! There are much better versioning methods out there.

import "std.process"; // uses latest version by default or file name exact match
import "std.process"[3 > ver >= 2]; or
import "std.process"[hash == hashid];

would be a better way. Module file names could have attached versioning info similar to how MS does it.

process.hash.versionid.otherattributes

The attributes are matched only if they are used, else ignored. Hence

process.hash.version.otherattributes.d
process.d

would be logically identical (and would throw an error if both were in the same dir and no attribute matching was used), but one could specify attribute matching to narrow down the choice (or the latest version could be used by default and a warning thrown about multiple choices).


This allows one to keep multiple versions of the same module name in the same dir. It helps with upgrading, because you can easily switch modules (one could set global matches instead of per-module ones).

One could also have the compiler attach the latest match on the import, so each compilation uses the latest version without the user having to specify it. When the user distributes the code, it will have the proper matching elements in it.

e.g.,

import "std.process"[auto]; // auto will be replaced by the compiler with the appropriate matching attributes. Possibly better to specify through a command line arg instead.

Another thing one can do to help is to have the compiler automatically modify the source code to record which module it was compiled with, by hash and/or version.

When the code is recompiled a warning can be given that a different version was used.

March 06, 2013
On Tuesday, 5 March 2013 at 21:04:15 UTC, Vladimir Panteleev wrote:
> On Tuesday, 5 March 2013 at 20:19:06 UTC, Lars T. Kyllingstad wrote:
>> A special thanks to Vladimir P. for pointing out an egregious flaw in the original design.
>
> But wait, there's more!

Aw, man....


> (please don't hurt me)
>
> 1. Typo: "plattform"

That would be my native language shining through. :)


> 2. Is there any meaning in the idea of consolidating spawnProcess/pipeProcess/execute and spawnShell/pipeShell/shell? How about that collectOutput idea?

In principle, I like that idea.  In fact, you'll see that execute() and shell() are now both implemented using a function similar to (and inspired by) the collectOutput() method you suggested.  Furthermore, pipeProcess() and pipeShell() both forward to a pipeProcessImpl() function which takes a spawn function as a template parameter.

I'm not sure if this is the API we want to expose to the user, though.  Firstly,

    auto r = execute("foo");

is a lot easier on the eye than

    auto r = pipeProcess("foo", Redirect.stdout | Redirect.stderrToStdout)
             .collectOutput();

Secondly, I think a collectOutput() method would only be appropriate to use if one of the output streams is redirected into the other.  Consider this:

    auto r = pipeProcess("foo").collectOutput();

Now, the output and error streams are redirected into separate pipes.  But what if "foo" starts off by writing 1 MB of data to its error stream?

Maybe it could be solved by some intelligent behaviour on the part of collectOutput(), based on the redirect flags, but I think it is better to encapsulate pipe creation AND reading in one function, as is currently done with execute() and shell().

pipeProcess(), on the other hand, is another matter.  I wonder if pipeProcessImpl() would be a good public interface (with a different name, of course)?

> 3. Where are we with compatibility with the old module? One idea I haven't seen mentioned yet is: perhaps we could make the return value of "shell" have a deprecated "alias this" to the output string, so that it's implicitly convertible to a string to preserve compatibility.

If that works in all cases, I think it is a fantastic idea!  There is still the issue of the old shell() throwing when the process exits with a nonzero status, though.

Maybe I'll just have to bite the bullet and accept a different name. :(  It really seems to be the one thing that is preventing the two modules from being combined.  Suggestions, anyone?


> 4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?

I've wondered about that myself.  I don't know whether this is a problem std.process should aim to solve in any way, or if it should be treated as a general problem with File.

It is a real problem, though.  Pipe buffers are surprisingly small.


> 5. How about that Environment.opIn_r?

Forgot about it. :)  I'll add it.

Lars
March 06, 2013
On Tuesday, 5 March 2013 at 22:38:11 UTC, Vladimir Panteleev wrote:
> By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:
>
> std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)

"The parameter is incorrect" is a Windows system error message. Apparently, there is something wrong with one of the parameters we pass to CreateProcessW.  I don't have my dev computer with me now, but my first guess would be the command line or one of the pipe handles.  I'll check it out.

> I've also initially tried writing a different program:
>
> [...]
>
> However, I couldn't get it to work either on Windows (same exception) or on Linux (it just gets stuck, even with a very small input.txt). No idea whether I'm doing something wrong (maybe I need to indicate EOF in some way?) or the problem is elsewhere.

Usually, when such things have happened to me, it is because I've forgotten to flush a stream.  That doesn't seem to be the case here, though, since you close pipes.stdin manually.  Do you know where the program gets stuck?  I guess it is the read loop, but if you could verify that, it would be great.

Lars
March 06, 2013
On 2013-03-06 08:27, Lars T. Kyllingstad wrote:

> That would be my native language shining through. :)

Doesn't everyone have a spell checker in their editor? :)

-- 
/Jacob Carlborg
March 06, 2013
How about std.os.process or std.system.process as names?
March 06, 2013
On 05.03.2013 21:12, Lars T. Kyllingstad wrote:
> On Monday, 4 March 2013 at 06:51:15 UTC, Lars T. Kyllingstad wrote:
>> On Sunday, 3 March 2013 at 11:00:52 UTC, Sönke Ludwig wrote:
>>> Mini thing: Redirect.none is not documented
>>
>> Ok, thanks!
> 
> I ended up simply removing it.  There is no point in calling pipeProcess without any redirection at all.
> 
> Lars

OK, I was actually using pipeShell() with Redirect.none to get the output simply passed on to the console, but I overlooked that this is the job of spawnShell().
March 06, 2013
On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:

> On Tuesday, 5 March 2013 at 21:55:24 UTC, Steven Schveighoffer wrote:
>> On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
>>
>>> 4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?
>>
>> I don't know how this could happen, can you elaborate?  Perhaps an example?
>
> OK! Here's a program based off the pipeProcess/pipeShell example:
>
> [...]
>
> Now, everything works just fine when N is small. However, if you increase it to 10000, both the test program and my_application get stuck with 0% CPU usage.
>
> The reason for that is that the stderr pipe is clogged: my_application can't write to it, because nothing is reading from the other end. At the same time, the first program is blocked on reading from the stdout pipe, but nothing is coming out, because my_application is blocked on writing to stderr.

Right, the issue is that File does not make a good socket/pipe interface.  I don't know what to do about that.

A while ago (2008 or '09, I believe), I was using Tango's Process object to execute programs on a remote agent and forwarding all the resulting data back over the network.  On Linux, I used select to read data as it arrived.  On Windows, I think I had to spawn off a separate thread to wait for data/child processes.

But Tango did not base its I/O on FILE *, so I think we had more flexibility there.

Suggestions are welcome...

>
> By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:
>
> std.process2.ProcessException@std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)

I think Lars is on that.

>
> I've also initially tried writing a different program:
>
> [...]
>
> However, I couldn't get it to work either on Windows (same exception) or on Linux (it just gets stuck, even with a very small input.txt). No idea whether I'm doing something wrong (maybe I need to indicate EOF in some way?) or the problem is elsewhere.

Linux should work here.  From what I can tell, you are doing it right.

If I get some time, I'll try and debug this.

>
>> We are sort of stuck with File being the stream handler in phobos, which means we are currently stuck with FILE *.  I don't know if there is a way to do partial reads/writes on a FILE *, or checking to see if data is available.
>
> I guess you could always get the OS file handles/descriptors and query them directly, although there's also the matter of the internal FILE * buffers.

I think at that point, you would have to forgo all usage of File niceties (writeln, etc).  Which would really suck.

But on the read end, this is a very viable option.

-Steve