14. Process Management

Contents:
Using system and exec
Using Backquotes
Using Processes as Filehandles
Using fork
Summary of Process Operations
Sending and Receiving Signals
Exercises

14.1 Using system and exec

When you give the shell a command line to execute, the shell usually creates a new process to execute the command. This new process becomes a child of the shell, executing independently, yet coordinating with the shell.

Similarly, a Perl program can launch new processes, and like most other operations, has more than one way to do so.

The simplest way to launch a new process is to use the system function. In its simplest form, this function hands a single string to a brand-new /bin/sh shell to be executed as a command. When the command is finished, the system function returns the command's exit status (0 if everything went OK; for a command that fails, the exit value itself is the return value divided by 256). Here's an example of a Perl program executing a date command using a shell:[1]

system("date");

[1] This doesn't actually use the shell: Perl performs the operations of the shell if the command line is simple enough, and this one is.

We're ignoring the return value here, but it's not likely that the date command is going to fail anyway.

Where does the command's output go? In fact, where does the input come from, if it's a command that wants input? These are good questions, and the answers to these questions are most of what distinguishes the various forms of process-creation.

For the system function, the three standard files (standard input, standard output, and standard error) are inherited from the Perl process. So for the date command in the previous example, the output goes wherever the print STDOUT output goes - probably the invoker's display screen. Because you are firing off a shell, you can change the location of the standard output using the normal /bin/sh I/O redirections. For example, to put the output of the date command into a file named right_now, something like this will work just fine:

system("date >right_now") && die "cannot create right_now";

This time, we not only send the output of the date command into a file with a redirection to the shell, but also check the return status. If the return status is true (nonzero), something went wrong with the shell command, and the die function will do its deed. This is backwards from normal Perl operator convention: a nonzero return value from the system operator generally indicates that something went wrong.
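As a minimal sketch of checking that status more closely (the sh -c 'exit 3' command is just an illustrative stand-in for a command that fails, not anything from the examples above), the following runs one command that succeeds and one that fails, then recovers the failing command's exit value from the high byte of what system returns:

```perl
use strict;
use warnings;

# A command that succeeds, and one that deliberately exits with value 3.
# Both strings contain shell metacharacters, so both go through /bin/sh.
my $ok_status  = system("date >/dev/null");
my $bad_status = system("sh -c 'exit 3'");   # illustrative failing command

# The exit value proper lives in the high byte of the returned status.
my $exit_value = $bad_status >> 8;

print "date succeeded\n" if $ok_status == 0;
print "second command failed with exit value $exit_value\n"
    if $bad_status != 0;
```

A return value of -1 would mean that the command could not be started at all, in which case $! holds the reason.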

The argument to system can be anything you would feed /bin/sh, so multiple commands can be included, separated by semicolons or newlines. Commands that end in & are launched and not waited for, just as if you had typed a line that ends in an & to the shell.

Here's an example of handing a date and a who command to the shell, sending the output to a file whose name is given by a Perl variable. This all takes place in the background so that we don't have to wait for it before continuing with the Perl script:

$where = "who_out.".++$i; # get a new filename
system "(date; who) >$where &";

The return value from system in this case is the exit value of the shell, and would thus indicate whether the background process had launched successfully, but not whether the date and who commands executed successfully. The double-quoted string is variable-interpolated, so $where is replaced with its value (by Perl, not by the shell). If you wanted to reference a shell variable named $where, you'd have to backslash the dollar sign or use a single-quoted string.
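To make the difference between Perl's $where and the shell's $where concrete, here is a small sketch. The values from_perl and from_shell are made up for illustration; each command string assigns a shell variable named where and then echoes $where, and only the quoting differs:

```perl
use strict;
use warnings;

my $where = "from_perl";   # Perl variable (illustrative value)

my $interpolated = "where=from_shell; echo $where";   # Perl fills in $where
my $escaped      = "where=from_shell; echo \$where";  # shell sees $where
my $single       = 'where=from_shell; echo $where';   # same as $escaped

system $interpolated;   # the shell runs: echo from_perl
system $escaped;        # the shell expands its own $where
system $single;         # likewise
```

The first command prints from_perl, because Perl substituted the value before the shell ever saw the string. The other two print from_shell, because the dollar sign reached the shell intact.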

A child process inherits many things from its parent besides the standard filehandles. These include the current umask, current directory, and of course, the user ID.

Additionally, all environment variables are inherited by the child. These variables are typically altered by the csh setenv command or the corresponding assignment and export in the /bin/sh shell. Environment variables are used by many utilities, including the shells, to alter or control the way that utility operates.

Perl gives you a way to examine and alter current environment variables through a special hash called %ENV (uppercase). Each key of this hash corresponds to the name of an environment variable, with the corresponding value being, well, the corresponding value. Examining this hash shows you the environment handed to Perl by the parent shell; altering the hash affects the environment used by Perl and by its child processes, but not the environment of its parent.

For example, here's a simple program that acts like printenv:

foreach $key (sort keys %ENV) {
    print "$key=$ENV{$key}\n";
}

Note the equal sign here is not an assignment, but simply a text character that the print is using to say stuff like TERM=xterm or USER=merlyn.
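Going the other direction, a change made to %ENV should be visible to any child launched afterward. The following is a sketch only: FRED_DEMO is a made-up variable name, and the child is simply the shell's test command, whose exit status tells us what it saw:

```perl
use strict;
use warnings;

# FRED_DEMO is a hypothetical variable name chosen for this demonstration.
$ENV{"FRED_DEMO"} = "bedrock";

# The child shell compares its own $FRED_DEMO against the expected value.
my $set_status = system 'test "$FRED_DEMO" = bedrock';
print "child saw FRED_DEMO=bedrock\n" if $set_status == 0;

# Deleting the hash element removes the variable for later children.
delete $ENV{"FRED_DEMO"};
my $gone_status = system 'test -z "$FRED_DEMO"';
print "child no longer sees FRED_DEMO\n" if $gone_status == 0;
```

The single-quoted strings keep Perl from interpolating $FRED_DEMO, so the expansion is done by the child shell from its inherited environment.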

Here's a program snippet that alters the value of PATH to make sure that the grep command run by system is looked for only in the normal places:

$oldPATH = $ENV{"PATH"};                 # save previous path
$ENV{"PATH"} = "/bin:/usr/bin:/usr/ucb"; # force known path
system("grep fred bedrock >output");     # run command
$ENV{"PATH"} = $oldPATH;                 # restore previous path

That's a lot of typing. It'd be faster just to set a local value for this hash element.

Despite its other shortcomings, the local operator can do one thing that my cannot: it can give just one element of an array or a hash a temporary value.

{
    local $ENV{"PATH"} = "/bin:/usr/bin:/usr/ucb";
    system "grep fred bedrock >output";
}
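If you'd like to convince yourself that the old value really does come back at the end of the block, a sketch along these lines should do it (the variable names $before, $inside, and $after are ours, and the shortened PATH is just an example):

```perl
use strict;
use warnings;

my $before = $ENV{"PATH"} // "";   # PATH as handed to us
my $inside;

{
    local $ENV{"PATH"} = "/bin:/usr/bin";    # temporary value
    $inside = $ENV{"PATH"};
    system 'echo "PATH seen by child: $PATH"';
}                                            # old value restored here

my $after = $ENV{"PATH"} // "";
print "PATH restored\n" if $after eq $before;
```

The child launched inside the block sees the temporary PATH; once the block exits, Perl and any later children are back to the original.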

The system function can also take a list of arguments rather than a single argument. In that case, rather than handing the list of arguments off to a shell, Perl treats the first argument as the command to run (located according to the PATH if necessary) and the remaining arguments as arguments to the command without normal shell interpretation. In other words, you don't need to quote whitespace or worry about arguments that contain angle brackets because those are all merely characters to hand to the program. So, the following two commands are equivalent:

system "grep 'fred flintstone' buffaloes";   # using shell
system "grep","fred flintstone","buffaloes"; # avoiding shell

Giving system a list rather than giving it a simple string saves one shell process as well, so do this when you can. (Actually, when the one-argument form of system is simple enough, Perl itself optimizes away the shell invocation entirely, calling the resulting program directly as if you had used the multiple-argument invocation.)

Here's another example of equivalent forms:

@cfiles = ("fred.c","barney.c");           # what to compile
@options = ("-DHARD","-DGRANITE");         # options
system "cc -o slate @options @cfiles";     # using shell
system "cc","-o","slate",@options,@cfiles; # avoiding shell
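The list form earns its keep when an argument contains characters the shell would otherwise act on. As a hedged sketch (the awkward filename below is invented for the purpose, and -q is grep's standard option for suppressing output), this creates a file whose name contains spaces and a semicolon, then searches it without any quoting gymnastics:

```perl
use strict;
use warnings;

# A hypothetical filename that a shell would split and misinterpret.
my $file = "stone age; echo oops";

open(my $out, ">", $file) or die "cannot create '$file': $!";
print $out "fred flintstone\n";
close $out;

# No shell is involved, so each list element arrives as one argument.
my $status = system "grep", "-q", "fred flintstone", $file;
print "grep found the name\n" if $status == 0;

unlink $file or warn "cannot remove '$file': $!";
```

Had the same filename been interpolated into a single command string, the shell would have treated the semicolon as a command separator and run echo oops as a second command.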