March 12, 2013
On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:

> What I'm worried about is what we can't predict: unexpected side effects. Disabling one approach should have a less drastic impact than replacing it with another.

If someone depends on the side effects of one approach, it won't matter how less drastic it is, for them it will be bad if we disable it :)

I think it's reasonable to expect phobos does what the normal OS functions do.  If we add the F_CLOEXEC to open pipes or other file descriptors, then we could never disable that, as people may depend on that.  As our current code does NOT do that, we also may break code simply by adding that flag.

-Steve
March 12, 2013
Am Tue, 12 Mar 2013 11:19:25 -0400
schrieb "Steven Schveighoffer" <schveiguy@yahoo.com>:

> On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
> 
> > On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer wrote:
> >>> OK, but it will still solve the specific problem with the other end being open.
> >>
> >> Yes it does.  I'd rather solve both, though.
> >
> > OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do").
> 
> I have those same reservations on scaling.

Daemon implementations on posix systems should the same thing according to best-practices: http://0pointer.de/public/systemd-man/daemon.html (step 1)

libdaemon also does that: http://git.0pointer.de/?p=libdaemon.git;a=blob;f=libdaemon/dfork.c;h=783033fe290e715df562f20d292adfc9f26e2e3a;hb=HEAD#l491

systemd probably as well so it really seems there is no better solution.
March 12, 2013
Am Tue, 12 Mar 2013 11:38:35 -0400
schrieb "Steven Schveighoffer" <schveiguy@yahoo.com>:

> On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev <vladimir@thecybershadow.net> wrote:
> 
> > What I'm worried about is what we can't predict: unexpected side effects. Disabling one approach should have a less drastic impact than replacing it with another.
> 
> If someone depends on the side effects of one approach, it won't matter how less drastic it is, for them it will be bad if we disable it :)
> 
> I think it's reasonable to expect phobos does what the normal OS functions do.  If we add the F_CLOEXEC to open pipes or other file descriptors, then we could never disable that, as people may depend on that.  As our current code does NOT do that, we also may break code simply by adding that flag.
> 
> -Steve

What makes you think Phobos should carry on all the OS specific quirks? I think a cross-platform library should offer the same behavior on all systems. In this case it can also be seen as covering up for an arguably bad Posix API design.

Secondly, unless you do a pure fork(), you wont have the data structures (like File, Socket) available any more in the sub-process to actually use the inherited file descriptors. But even if it breaks code, the affected developers will hopefully understand that the default of not inheriting descriptors by default is safer on the long run.

-- 
Marco

March 12, 2013
On Tue, 12 Mar 2013 13:48:04 -0400, Marco Leise <Marco.Leise@gmx.de> wrote:

> Am Tue, 12 Mar 2013 11:38:35 -0400
> schrieb "Steven Schveighoffer" <schveiguy@yahoo.com>:
>
>> On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev
>> <vladimir@thecybershadow.net> wrote:
>>
>> > What I'm worried about is what we can't predict: unexpected side
>> > effects. Disabling one approach should have a less drastic impact than
>> > replacing it with another.
>>
>> If someone depends on the side effects of one approach, it won't matter
>> how less drastic it is, for them it will be bad if we disable it :)
>>
>> I think it's reasonable to expect phobos does what the normal OS functions
>> do.  If we add the F_CLOEXEC to open pipes or other file descriptors, then
>> we could never disable that, as people may depend on that.  As our current
>> code does NOT do that, we also may break code simply by adding that flag.
>>
>> -Steve
>
> What makes you think Phobos should carry on all the OS
> specific quirks? I think a cross-platform library should
> offer the same behavior on all systems. In this case it can
> also be seen as covering up for an arguably bad Posix API
> design.

D's design should not preclude what is possible.  Someone may have a good reason to use that bad API.

What I want to avoid is having to require the user to pay attention to handle inheritance if he doesn't care, but also I don't want Phobos to puke when you decide to (or have to) circumvent phobos when creating file descriptors.

The most robust method is to close all the descriptors we don't want to pass, regardless of how they are opened/configured.

> Secondly, unless you do a pure fork(), you wont have the data
> structures (like File, Socket) available any more in the
> sub-process to actually use the inherited file descriptors.

Linux file descriptors are integers.  Not quite that hard to pass another file descriptor via number 3 or 4 for example.

Windows handles (I think) are guaranteed to be the same value in the child process.  But they aren't sequential integers starting at 0, so you would have to pass somehow, perhaps via parameters, what the handle values are.

I argued early on that the handles to spawnProcess should be unbuffered because the buffer will not be passed.  This can result in very weird output if both parent and child keep the same handle open.  But File is the standard structure for Phobos, so that was used.

> But even if it breaks code, the affected developers will
> hopefully understand that the default of not inheriting
> descriptors by default is safer on the long run.

This is not a good position to have as the should-be-agnostic standard library.  Phobos should make the most useful/common/safe idioms the default, but make the non-safe ones possible.  The idea of marking all descriptors as close on exec puts unnecessary burden on those who want to use straight fork/exec and want to pass Phobos-created descriptors.  Having spawnProcess depend on those flag settings puts unnecessary burden on people who use non-phobos calls to open file descriptors and want to use spawnProcess.  I'd rather avoid that.

Both methods are very similar, I just feel calling close on all descriptors between fork and exec is more effective, and puts the burden on spawnProcess instead of the user or other parts of Phobos.

-Steve
March 12, 2013
Am Tue, 12 Mar 2013 14:37:31 -0400
schrieb "Steven Schveighoffer" <schveiguy@yahoo.com>:

> On Tue, 12 Mar 2013 13:48:04 -0400, Marco Leise <Marco.Leise@gmx.de> wrote:
> 
> > What makes you think Phobos should carry on all the OS specific quirks? I think a cross-platform library should offer the same behavior on all systems. In this case it can also be seen as covering up for an arguably bad Posix API design.
> 
> D's design should not preclude what is possible.  Someone may have a good reason to use that bad API.

I don't want to stop anyone from doing that. I'm just trying
not to write book pages here. :)
It's good that we have "unwrappers" in "File":
getFP (for the C stream) and fileno (for the file descriptor)
as well as the open() syscall.

> What I want to avoid is having to require the user to pay attention to handle inheritance if he doesn't care, but also I don't want Phobos to puke when you decide to (or have to) circumvent phobos when creating file descriptors.

Exactly! And the thousands of bug reports (literally!) in other software seem to indicate that an uninstructed human being assumes that files don't remain open in sub-processes. We aren't even talking about fork here, but really new processes created with the exec family of calls.

Also common logic suggests that this behavior would not make sense, since what can a sub-process do with open files when I didn't give it the descriptor numbers?

> The most robust method is to close all the descriptors we don't want to pass, regardless of how they are opened/configured.

Ok I give in. After reading the Windows API documentation and also the SysV daemon writing tips, I had to come to the same conclusion. (In addition to not creating leaking file descriptors in Phobos in the first place that is.)

> Linux file descriptors are integers.  Not quite that hard to pass another file descriptor via number 3 or 4 for example.

No, but at that point you no longer ignorant about leaking descriptors, check what the Phobos file abstraction layer does and can unset FD_CLOEXEC:

int fileno = file.fileno();
int fdflags = fcntl(fileno, F_GETFD);
fcntl(fileno, F_SETFD, fdflags & ~FD_CLOEXEC);

or use syscalls directly.

> This is not a good position to have as the should-be-agnostic standard library.  Phobos should make the most useful/common/safe idioms the default, but make the non-safe ones possible.  The idea of marking all descriptors as close on exec puts unnecessary burden on those who want to use straight fork/exec and want to pass Phobos-created descriptors.

We agree on the "possible" and also on the most "useful/common/safe" aspect. But the conclusions that we draw are different. To you (as I read it) Phobos offers functionality to create Posix file descriptors, to me it only creates file abstractions that encapsulate and level OS quirks.

It looks to me like the issue was just not considered when writing Phobos and we can do it like Ruby and add that property to files that allows them to stay open in sub-processes. But this should not be a Posix only option. SetFileSecurity should do the job on Windows just as well.

> Having spawnProcess depend on those flag settings puts unnecessary burden on people who use non-phobos calls to open file descriptors and want to use spawnProcess.  I'd rather avoid that.
> 
> Both methods are very similar, I just feel calling close on all descriptors between fork and exec is more effective, and puts the burden on spawnProcess instead of the user or other parts of Phobos.
> 
> -Steve

Yeah, alright. Close them all by default. But keep in mind
what I said in the last post: as likely as it is that people
don't use Phobos to open files, a library might use fork/exec
directly and leak our Phobos files. The problem should not be
tackled in std.process alone.
I'll probably try myself on a pull request in the coming days
that closes Phobos file descriptors on exec and see about a
simple system-agnostic(!) property to enable it. After all,
why should Windows folks not be able to pass handles to
sub-processes?

-- 
Marco

March 12, 2013
On Tuesday, 12 March 2013 at 14:10:34 UTC, Steven Schveighoffer wrote:
> On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:
>>
>> If you omit the AA parameter altogether, the child inherits the parent's environment.  A null pointer is passed as 'envz' to spawnProcessImpl(), which in turn passes this straight to CreateProcess() on Windows and replaces it by 'environ' on POSIX.
>>
>> If you do specify the AA parameter, it is passed through toEnvz() on its way to spawnProcessImpl().  If the AA is empty/null, toEnvz() will create an empty (but non-null) environment block, and the child's environment will be empty.
>
> I understand that point.  I am a little concerned, however, that passing null as env results in clearing the child environment.  These concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment," you have to call a different function.

I'd be very interested to hear if you have a suggestion for a better way to do it, keeping in mind that there needs to be *some* way to clear the environment too.

> I suppose it's no different than exec, which you have to call the right function depending on what you want.
>
> So this sounds fine.  The toEnvz still should be fixed to add an extra null, I'm assuming you're doing that right? :)

Yes.  I have some local modifications based on the discussion here, which I haven't pushed yet.  Waiting for you guys to finish debating the file descriptor issue. ;)

Lars
March 12, 2013
On Tuesday, 12 March 2013 at 14:17:37 UTC, Dmitry Olshansky wrote:
>
> What's wrong with adding an actual AA (inside) as a write-through cache for environment variables?

That the environment variable may change after we've cached it.

Lars
March 12, 2013
On Tuesday, 12 March 2013 at 15:07:09 UTC, Vladimir Panteleev wrote:
> On Tuesday, 12 March 2013 at 15:02:46 UTC, Vladimir Panteleev wrote:
>> we could implement
>
> (I use the word "we" very liberally in this thread)

Nah, implementing it is easy.  Getting the API right is the hard part.  So "we" is perfectly OK. :)

Lars
March 12, 2013
On Tue, 12 Mar 2013 16:23:33 -0400, Marco Leise <Marco.Leise@gmx.de> wrote:

> Exactly! And the thousands of bug reports (literally!) in other
> software seem to indicate that an uninstructed human being
> assumes that files don't remain open in sub-processes. We
> aren't even talking about fork here, but really new processes
> created with the exec family of calls.

Given how awesome spawnProcess is, those problems should be very rare, who would use fork and exec? :)

> Also common logic suggests that this behavior would not make
> sense, since what can a sub-process do with open files when I
> didn't give it the descriptor numbers?

On unix at least, there can be an agreement that certain descriptors are passed as certain numbers.  The code that runs between fork and exec does the plumbing.

We already have agreement on what 0, 1, and 2 are.

However, I will agree that this is NOT a common thing to do.  I also agree that logic (and practicality) suggests the default on Unix should have been to not inherit descriptors.  In fact, you shouldn't really unset the close on exec flag except after calling fork to avoid racing with other threads that may be calling fork/exec at the same time.

>> Linux file descriptors are integers.  Not quite that hard to pass another
>> file descriptor via number 3 or 4 for example.
>
> No, but at that point you no longer ignorant about leaking
> descriptors, check what the Phobos file abstraction layer does
> and can unset FD_CLOEXEC:

Yeah, but why were we talking about unintended leaking?  I thought we were talking about intended leaking, and you were saying you can't use the inherited descriptors:

"Secondly, unless you do a pure fork(), you wont have the data
structures (like File, Socket) available any more in the
sub-process to actually use the inherited file descriptors."

I was contending it was quite possible to intentionally pass the descriptors as certain numbers to the child process, and then re-wrap them in new File objects.

>
> int fileno = file.fileno();
> int fdflags = fcntl(fileno, F_GETFD);
> fcntl(fileno, F_SETFD, fdflags & ~FD_CLOEXEC);

We actually need to do this in order to properly pass the standard handles.

>> This is not a good position to have as the should-be-agnostic standard
>> library.  Phobos should make the most useful/common/safe idioms the
>> default, but make the non-safe ones possible.  The idea of marking all
>> descriptors as close on exec puts unnecessary burden on those who want to
>> use straight fork/exec and want to pass Phobos-created descriptors.
>
> We agree on the "possible" and also on the most
> "useful/common/safe" aspect. But the conclusions that we draw
> are different. To you (as I read it) Phobos offers
> functionality to create Posix file descriptors, to me it only
> creates file abstractions that encapsulate and level OS quirks.
>
> It looks to me like the issue was just not considered when
> writing Phobos and we can do it like Ruby and add that
> property to files that allows them to stay open in
> sub-processes. But this should not be a Posix only option.
> SetFileSecurity should do the job on Windows just as well.

Having a property to fetch and set inheritance certainly would be a good addition to Phobos.  I still think we should do what the OS specifies by default.  There is no real "right" answer there.

-Steve
March 12, 2013
On Tue, 12 Mar 2013 17:18:43 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> On Tuesday, 12 March 2013 at 14:10:34 UTC, Steven Schveighoffer wrote:
>> On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:
>>>
>>> If you omit the AA parameter altogether, the child inherits the parent's environment.  A null pointer is passed as 'envz' to spawnProcessImpl(), which in turn passes this straight to CreateProcess() on Windows and replaces it by 'environ' on POSIX.
>>>
>>> If you do specify the AA parameter, it is passed through toEnvz() on its way to spawnProcessImpl().  If the AA is empty/null, toEnvz() will create an empty (but non-null) environment block, and the child's environment will be empty.
>>
>> I understand that point.  I am a little concerned, however, that passing null as env results in clearing the child environment.  These concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment," you have to call a different function.
>
> I'd be very interested to hear if you have a suggestion for a better way to do it, keeping in mind that there needs to be *some* way to clear the environment too.

Sadly, no I don't.  I had hoped [] would allocate an empty AA, but it fails to compile.

Note that you can "hack" it by setting a single environment variable which nobody will ever use.

i.e. spawnProcess("blah.exe", ["_____":"_____"]);

But that is really, really ugly.

>
>> I suppose it's no different than exec, which you have to call the right function depending on what you want.
>>
>> So this sounds fine.  The toEnvz still should be fixed to add an extra null, I'm assuming you're doing that right? :)
>
> Yes.  I have some local modifications based on the discussion here, which I haven't pushed yet.  Waiting for you guys to finish debating the file descriptor issue. ;)

I think all are in agreement at this point that closing the files between fork and exec is a good solution.  Whether or not to ALSO set F_CLOEXEC wherever Phobos opens a file descriptor is an additional matter.

As a fallback to standard unix behavior, we can have a Config option that says "don't do the close thing".

-Steve