February 25, 2013
On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. Kyllingstad wrote:
> Exceptions are designed to handle exceptional cases.  A missing environment variable isn't exceptional, it is commonplace.

I disagree. I don't know your uses cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.

> That would depend on the application.

Could you provide a specific example? It's difficult to discuss the merits of either approach without some use cases.

As I see, there are two major cases:

1) The program expects a variable to be set. An example of this is COMSPEC / SHELL. These variables ought to be set on any system, so the user is not expected to verify this himself. The variables not being set is an exceptional sitation.

2) A variable may or may not be set, such as the case of passing additional options via the environment (such as INCLUDE, or LD_PRELOAD). The program will take specific action if the variable is not set, such as pretending it is empty, or defaulting to some other setting like one in a configuration file.

It seems like your approach caters to the second situation exclusively. I've mentioned the problems of applying this approach to the first situation in my previous post.
February 25, 2013
On Monday, 25 February 2013 at 20:06:19 UTC, Lars T. Kyllingstad wrote:
> That is also incredibly obscure.  I'd venture a guess that only ~10% of D's user base have even heard of Lynx.  Everyone knows firefox, and will understand what the example is supposed to illustrate.  (I admit that the ls/grep examples will also be rather incomprehensible to someone not familiar with the *NIX command line, and I will replace them with something else.  The D toolchain, as you suggest below, is a very good idea.)

I still think using Firefox is a bad idea, but I've already presented my arguments.

> BTW, browse() should never have been added to std.process, in my opinion.  Maybe to some other utility module, but then it should at least be done right, and be properly documented.

What would you improve about it?

I have no opinion on its location in Phobos, but std.process is the most fitting one if you don't consider creating a new module.

> What does it actually do?  There is no way to tell unless you read the source.

I don't see why the documentation needs to be burdened with implementation details, other than perhaps mentioning that it returns immediately. The implementation is rather OS-specific... if we find out that there is a better way of accomplishing the task on a given platform, the documentation would need to be updated. Isn't the documentation considered part of the contract with the library user?

>  (And then, it turns out that it spawns a new process for the browser and returns immediately, but it does not return a process ID that you can poll or wait for.  Bah.)

That's exactly how it should work... due to technical reasons. Most browsers will communicate the need to open a web page in a new tab to an existing instance and exit immediately, if there is an existing instance. There is no practical way to wait for the user to close the web browser.

> It is not worse.  It is a lot simpler, because the programmer does not need to know anything about the underlying platform.  They only need to know one rule:  If your arguments contain spaces, use the array functions.  I don't think the generic process-spawning functions in std.process should be bound by, or tied to, the syntax of whatever shell the programmer (or the end user) prefers.

Yes, I agree that tying it to a certain shell syntax is bad. However, introducing a third syntax incompatible with either two major ones, which additionally is less expressive, is IMHO worse. The "universal" way to distinguish / quote arguments is to use an array.

> If your arguments contain spaces, use the array functions.

Not just arguments: programs as well. On Windows, all third-party software is expected to install itself under the "Program Files" directory. Unless whatever you're launching is expected to be in PATH, splitting the string by spaces won't get you far on Windows.

>> Then, a end-user tries setting the config variable to a path that contains spaces, and everything breaks. Wrapping the path in quotes does not help either. Due to the way the function is designed, it is IMPOSSIBLE for the end-user to configure the application to launch a program located at a path containing spaces. To end-users, this comes off as a classical problem in badly written applications that don't handle command-line escaping properly.
>
> Exposing the specifics of whatever programming language you are using to the end user?

I don't understand what you mean here. It's not exposing any specifics if you don't implement anything in a way specific to D.

> I would just call that bad application programming.  If anything, you should be using one of the 'shell' functions in this case, not spawnProcess.

Yes, if the user is expected to customize the arguments as well. Otherwise it could very well be the spawnProcess overload that takes an array of arguments.

> You almost have me convinced that the single-string non-shell functions must go.  In the case of pipeProcess() and execute(), pipeShell() and shell() do the same job, with (arguably, still) less surprises.  Maybe it would then be a good idea to add a spawnShell() function to go with spawnProcess().

How about something that converts a ProcessPipes instance into a Tuple!(int, "status", string, "output") as returned by shell? Then you could do e.g.

auto result = shell("command").collectOutput();
// use result.status or result.output

> The escape*() functions need to be better documented.  Am I correct that they quote according to 'cmd.exe' rules on Windows and 'sh' rules on POSIX?

Yes. On Windows, though, the rules are defined by CommandLineToArgvW for splitting/escaping the individual arguments, and cmd.exe for the whole string when it's passed to the shell. Check the reference URLs in the comments in escapeWindowsArgumentImpl.
February 25, 2013
On 02/25/2013 08:57 PM, Walter Bright wrote:
> On 2/23/2013 6:58 PM, H. S. Teoh wrote:
>> On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis wrote:
>>> Possibly, but Walter takes a very dim view on most any code breakage,
>>> even if it means simply changing a makefile to make your code work
>>> again,
>>
>> I find this rather frustrating...
>
> Consider the common complaint from numerous people that "my code breaks
> with every new release".
>

I might be biased but I think a good portion of them are because of regressions.

> Even if the fix is "simple".
>

Simple fixes should be (semi-)automated by appropriate tooling.

> Just today, rdmd doesn't compile anymore.
>

Because of a Phobos regression.
February 25, 2013
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. Kyllingstad wrote:
> It's been years in the coming, but we finally got it done. :)  The upshot is that the module has actually seen active use over those years, both by yours truly and others, so hopefully the worst wrinkles are already ironed out.
>
> Pull request:
> https://github.com/D-Programming-Language/phobos/pull/1151
>
> Code:
> https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d
>
> Documentation:
> http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
>
> I hope we can get it reviewed in time for the next release.  (The wiki page indicates that both std.benchmark and std.uni are currently being reviewed, but I fail to find any "official" review threads on the forum.  Is the wiki just out of date?)
>
> Lars


Very nice! Good job folks.

Got question, sorry if it was asked before.

Is there any way to call some functions after fork but before execve? Somekind of callback approach. It would be required to implement somekind of resources limiting in subprocess (with setrlimit) or droping root privilages in subprocces.
February 26, 2013
On Monday, February 25, 2013 21:21:53 Vladimir Panteleev wrote:
> On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. Kyllingstad
> 
> wrote:
> > Exceptions are designed to handle exceptional cases. A missing environment variable isn't exceptional, it is commonplace.
> 
> I disagree. I don't know your uses cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.

Agreed.

I would think that it would make sense for opIndex to throw, and get to return null. Unlike AAs, opIndex would throw an exception rather than an error, but I think that the situation is very similar to that of trying to open a file.

- Jonathan M Davis
February 26, 2013
On Monday, 25 February 2013 at 20:21:55 UTC, Vladimir Panteleev
wrote:
> On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. Kyllingstad wrote:
>> Exceptions are designed to handle exceptional cases.  A missing environment variable isn't exceptional, it is commonplace.
>
> I disagree. I don't know your uses cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.
>
>> That would depend on the application.
>
> Could you provide a specific example? It's difficult to discuss the merits of either approach without some use cases.

Well, take Phobos, for instance.  Besides std.process, there are
two places where environment/getenv is used: std.file.tempDir()
(POSIX version) and std.path.expandTilde().  None of them throw
if the variables in question don't exist, they both take some
default action instead.


> As I see, there are two major cases:
>
> 1) The program expects a variable to be set. An example of this is COMSPEC / SHELL. These variables ought to be set on any system, so the user is not expected to verify this himself. The variables not being set is an exceptional sitation.
>
> 2) A variable may or may not be set, such as the case of passing additional options via the environment (such as INCLUDE, or LD_PRELOAD). The program will take specific action if the variable is not set, such as pretending it is empty, or defaulting to some other setting like one in a configuration file.
>
> It seems like your approach caters to the second situation exclusively. I've mentioned the problems of applying this approach to the first situation in my previous post.

What if the variable is set, but empty?  Is that very different
from the situation where it doesn't exist at all?  In my opinion,
when it comes to environment variables, no.

You mention 'rm -rf $FOO/$BAR' as a "classic catastrophic bug" in
shell scripts.  This is just as much a problem if FOO and BAR are
simply empty, and throwing from opIndex() won't help you with
that.  You still have to test for empty(), which would *also*
test for null.

Lars
February 26, 2013
On Tuesday, February 26, 2013 08:08:33 Lars T. Kyllingstad wrote:
> What if the variable is set, but empty?  Is that very different from the situation where it doesn't exist at all?  In my opinion, when it comes to environment variables, no.

And yet, there _is_ a difference. I've dealt with code before that simply cared about whether an environment variable was set and not at all what it was set to. Regardless of whether that's desirable behavior, any program that needs to be compatible with a program that follows that behavior will need to be able to follow that behavior as well. So, if std.process is set up so that you can't tell the difference betwen an environment variable which hasn't been set and one that's been set to nothing, then that's a problem, even if it's not the most common case.

- Jonathan M Davis
February 26, 2013
On Monday, 25 February 2013 at 21:06:54 UTC, Vladimir Panteleev wrote:
> On Monday, 25 February 2013 at 20:06:19 UTC, Lars T. Kyllingstad wrote:
>> That is also incredibly obscure.  I'd venture a guess that only ~10% of D's user base have even heard of Lynx.  Everyone knows firefox, and will understand what the example is supposed to illustrate.  (I admit that the ls/grep examples will also be rather incomprehensible to someone not familiar with the *NIX command line, and I will replace them with something else.  The D toolchain, as you suggest below, is a very good idea.)
>
> I still think using Firefox is a bad idea, but I've already presented my arguments.
>
>> BTW, browse() should never have been added to std.process, in my opinion.  Maybe to some other utility module, but then it should at least be done right, and be properly documented.
>
> What would you improve about it?

1. I would document it properly.
2. As long as it runs in the background, I would return some kind of process ID from it.  (Yes, most browsers today may just signal another instance to open a new tab and then return, but would be surprised if they *all* do.)
(3. Maybe put it in a different module, I'm not sure.)

Also, and this is of course extremely subjective, it just looks out of place and very much "alone".  Where is writeEmailInDefaultClient(address)?  Where is openInAssociatedApp(file)?

> I have no opinion on its location in Phobos, but std.process is the most fitting one if you don't consider creating a new module.

Maybe.

[...]

>>> Then, a end-user tries setting the config variable to a path that contains spaces, and everything breaks. Wrapping the path in quotes does not help either. Due to the way the function is designed, it is IMPOSSIBLE for the end-user to configure the application to launch a program located at a path containing spaces. To end-users, this comes off as a classical problem in badly written applications that don't handle command-line escaping properly.
>>
>> Exposing the specifics of whatever programming language you are using to the end user?
>
> I don't understand what you mean here. It's not exposing any specifics if you don't implement anything in a way specific to D.

You pass a string directly from a config file into a rather low-level function in the programming language's standard library without any kind of validation.

[...]

>> You almost have me convinced that the single-string non-shell functions must go.  In the case of pipeProcess() and execute(), pipeShell() and shell() do the same job, with (arguably, still) less surprises.  Maybe it would then be a good idea to add a spawnShell() function to go with spawnProcess().
>
> How about something that converts a ProcessPipes instance into a Tuple!(int, "status", string, "output") as returned by shell? Then you could do e.g.
>
> auto result = shell("command").collectOutput();
> // use result.status or result.output

That's pretty elegant. :)

Lars
February 26, 2013
On Monday, 25 February 2013 at 22:32:44 UTC, nazriel wrote:
> Very nice! Good job folks.

Thanks!


> Got question, sorry if it was asked before.
>
> Is there any way to call some functions after fork but before execve? Somekind of callback approach. It would be required to implement somekind of resources limiting in subprocess (with setrlimit) or droping root privilages in subprocces.

Sorry, no.  I don't think we can do that, as it would be very *NIX specific.  On Windows there are no separate fork() and exec() calls, just one CreateProcess() call.

Lars
February 26, 2013
On Tuesday, 26 February 2013 at 07:16:34 UTC, Jonathan M Davis wrote:
> On Tuesday, February 26, 2013 08:08:33 Lars T. Kyllingstad wrote:
>> What if the variable is set, but empty?  Is that very different
>> from the situation where it doesn't exist at all?  In my opinion,
>> when it comes to environment variables, no.
>
> And yet, there _is_ a difference. I've dealt with code before that simply cared
> about whether an environment variable was set and not at all what it was set
> to. Regardless of whether that's desirable behavior, any program that needs to
> be compatible with a program that follows that behavior will need to be able
> to follow that behavior as well. So, if std.process is set up so that you
> can't tell the difference betwen an environment variable which hasn't been set
> and one that's been set to nothing, then that's a problem, even if it's not
> the most common case.

You can tell the difference.  In the former case environment.opIndex() will return null, in the latter it will return "".  Use 'is null' to determine which.

Lars
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19