The new std.process is ready for review (page 22)

March 12, 2013

Re: The new std.process is ready for review

Posted by Steven Schveighoffer
in reply to Lars T. Kyllingstad

Permalink

Steven Schveighoffer

Posted in reply to Lars T. Kyllingstad

Permalink

On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:


>
> Nice catch!  Fortunately, the only bug here is that I've specified a null parameter in the spawn call in pipeProcessImpl().  If you look at the various spawnProcess() overloads, you'll see that they are designed as follows:
>
> If you omit the AA parameter altogether, the child inherits the parent's environment.  A null pointer is passed as 'envz' to spawnProcessImpl(), which in turn passes this straight to CreateProcess() on Windows and replaces it by 'environ' on POSIX.
>
> If you do specify the AA parameter, it is passed through toEnvz() on its way to spawnProcessImpl().  If the AA is empty/null, toEnvz() will create an empty (but non-null) environment block, and the child's environment will be empty.

I understand that point.  I am a little concerned, however, that passing null as env results in clearing the child environment.  These concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment," you have to call a different function.

I suppose it's no different than exec, which you have to call the right function depending on what you want.

So this sounds fine.  The toEnvz still should be fixed to add an extra null, I'm assuming you're doing that right? :)

> I think the bug stems from the fact that, at some intermediate stage of the development, pipeProcessImpl() used to call spawnProcessImpl() directly.  I later changed this to spawnProcess(), but forgot to remove the null parameter.  I'll simply remove it, and everything will work as intended.

OK, this makes sense.

> Initially, I wrote spawnProcess() the way you suggest, i.e. that a passing a null AA is equivalent to omitting it.  But then we are left with no simple way to explicitly clear the child's environment.  We could make a distinction between a "null" and an "empty" AA, but making a non-null empty AA is a hassle.  I think it is better the way it is now.

OK

> I don't think we need to support clearing the child's environment in pipeProcess().  I suspect it is a rare need, and the user will just have to deal with spawnProcess() directly in that case.  (It's like how there are all kinds of different exec*() functions in C, but only one, simple, popen() function.)

Agree.

-Steve

12-Mar-2013 18:09, Steven Schveighoffer пишет: > On Tue, 12 Mar 2013 05:13:59 -0400, Vladimir Panteleev > <vladimir@thecybershadow.net> wrote: > >> On Tuesday, 12 March 2013 at 08:58:13 UTC, Lars T. Kyllingstad wrote: >>> For AAs, 'in' returns a pointer to the element, which is null if the >>> element does not exist. I can't think of a good way to implement >>> this. Since we have to convert the raw environment variable to a D >>> string anyways, we'd have to do something like: >>> >>> string* opIn_r(string var) >>> { >>> auto val = get(var); >>> if (val is null) return null; >>> else return [val].ptr; >>> } >>> >>> but that seems rather pointless to me. >> >> Yes. That use of the "is" operator is mainly to allow updating the value > > you meant "in", not "is", right? > >> without looking up the key twice. This behavior could be implemented >> using a proxy object, but this is not what I was talking about. I >> meant the specific case of "if (key in environment)". > > I think Valdimir wants to have opIn_r return bool? > What's wrong with adding an actual AA (inside) as a write-through cache for environment variables? -- Dmitry Olshansky

On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer wrote: >> Yes. That use of the "is" operator is mainly to allow updating the value > > you meant "in", not "is", right? Yes. Sorry, the keys are right next to each other :) >> without looking up the key twice. This behavior could be implemented using a proxy object, but this is not what I was talking about. I meant the specific case of "if (key in environment)". > > I think Valdimir wants to have opIn_r return bool? Returning the string (doing the same as ".get(key, null)") should have the same effect in an if statement.

On Tuesday, 12 March 2013 at 14:47:53 UTC, Vladimir Panteleev wrote: > On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer wrote: (...) >> you meant "in", not "is", right? > > Yes. Sorry, the keys are right next to each other :) Go dvorak!

On Tue, 12 Mar 2013 10:47:52 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote: > On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer wrote: >>> Yes. That use of the "is" operator is mainly to allow updating the value >> >> you meant "in", not "is", right? > > Yes. Sorry, the keys are right next to each other :) > >>> without looking up the key twice. This behavior could be implemented using a proxy object, but this is not what I was talking about. I meant the specific case of "if (key in environment)". >> >> I think Valdimir wants to have opIn_r return bool? > > Returning the string (doing the same as ".get(key, null)") should have the same effect in an if statement. Yes that is true. Why doesn't .get work for your case again? environment.get(key) vs. key in environment Doesn't seem that different to me... I suppose an opIn_r alias is not difficult to add, if it's just for syntax sugar. -Steve

On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer wrote: >> OK, but it will still solve the specific problem with the other end being open. > > Yes it does. I'd rather solve both, though. OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do"). I was thinking that we could implement both approaches (closing all FDs after forking, and setting FD_CLOEXEC where appropriate), as an escape hatch: if later we suddenly find out that one of them was a horrible idea, we can simply remove it without much consequence.

On Tuesday, 12 March 2013 at 14:58:21 UTC, Steven Schveighoffer wrote: > Yes that is true. Why doesn't .get work for your case again? > > environment.get(key) > vs. > key in environment > > Doesn't seem that different to me... > > I suppose an opIn_r alias is not difficult to add, if it's just for syntax sugar. Yes, exactly. :) (We've seriously overblown this...)

On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote: > On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer wrote: >>> OK, but it will still solve the specific problem with the other end being open. >> >> Yes it does. I'd rather solve both, though. > > OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do"). I have those same reservations on scaling. Anecdotal testing shows it takes about .1 microseconds per call to close on Lars' machine, so it's likely not terrible. The "right thing to do" is impossible since the OS doesn't give us the tools :) The best interface would be a list of file descriptors to keep open as a parameter to exec/CreateProcess. That should have been the original interface on all platforms. Windows ALMOST has this right, as it takes handles for stdin/stdout/stderr, but it requires that you also inherit all other inheritable handles in order to do that... > I was thinking that we could implement both approaches (closing all FDs after forking, and setting FD_CLOEXEC where appropriate), as an escape hatch: if later we suddenly find out that one of them was a horrible idea, we can simply remove it without much consequence. Since all the solutions we are talking about are implementation details, not specifically requested by the user, it should be easy to switch from one to the other. -Steve

On Tuesday, 12 March 2013 at 15:19:25 UTC, Steven Schveighoffer wrote: > On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote: >> I was thinking that we could implement both approaches (closing all FDs after forking, and setting FD_CLOEXEC where appropriate), as an escape hatch: if later we suddenly find out that one of them was a horrible idea, we can simply remove it without much consequence. > > Since all the solutions we are talking about are implementation details, not specifically requested by the user, it should be easy to switch from one to the other. What I'm worried about is what we can't predict: unexpected side effects. Disabling one approach should have a less drastic impact than replacing it with another.

Forums