sh 1.11 documentation
sh (previously pbs) is a full-fledged subprocess interface for Python that allows you to call any program as if it were a function:
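For example (eth0 is a placeholder for a network interface on your system):

```python
from sh import ifconfig
print(ifconfig("eth0"))
```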
More examples:
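(These assume git, ls, and wc are on your $PATH, and that the script runs inside a git repository.)

```python
from sh import git, ls, wc

# checkout master branch
git.checkout("master")

# print the contents of this directory
print(ls("-l"))

# get the longest line of this file
longest_line = wc(__file__, "-L")
```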
Note that these aren’t Python functions; they run the binary commands on your system dynamically by resolving your $PATH, much as Bash does. In this way, all the programs on your system are easily available to you from within Python.
To install:
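```
pip install sh
```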
Follow it on GitHub: http://github.com/amoffat/sh
Basic Features
Command execution
Commands are called just like functions. They may be executed on the sh namespace, or imported directly from sh:
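For example:

```python
import sh
print(sh.ls("/"))

# or import the command directly
from sh import ls
print(ls("/"))
```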
For commands that have dashes in their names, for example /usr/bin/google-chrome, substitute the dash for an underscore:
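A sketch (assuming google-chrome is installed):

```python
import sh
sh.google_chrome("http://google.com")
```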
Note:
For commands with more exotic characters in their names, like a period, or if you just don’t like the “magic”-ness of dynamic lookups, you may use sh’s Command wrapper and pass in the command name or the absolute path of the executable:
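A sketch using ls (the path may differ on your system):

```python
from sh import Command

lscmd = Command("/bin/ls")
print(lscmd("-l"))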
Multiple arguments
Commands that take multiple arguments need to be invoked with a separate string for each argument rather than a single string for all the arguments together. One might expect the following to work, the way it works in a *nix shell, but it doesn’t:
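A sketch with illustrative paths:

```python
import sh

# WRONG: the whole string is handed to tar as a single argument
sh.tar("cvf /tmp/test.tar /my/home/directory/")
```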
You will run into an error that may seem baffling. The right way to do this is to break up your arguments before passing them into a program. A shell (bash) typically does this for you: it turns “tar cvf /tmp/test.tar /my/home/directory/” into 4 strings: “tar”, “cvf”, “/tmp/test.tar” and “/my/home/directory/” before passing them into the binary. You have to do this manually with sh.py:
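```python
import sh

sh.tar("cvf", "/tmp/test.tar", "/my/home/directory/")
```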
Arguments to sh’s Command wrapper
Similar to the above, arguments to sh.Command must be passed separately. For example, the following does not work:
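A sketch of the mistake:

```python
from sh import Command

# WRONG: the whole string is treated as the path to a single binary
tar = Command("/usr/bin/tar cvf /tmp/test.tar /my/home/directory/")
```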
You will run into a CommandNotFound exception even when the correct full path is specified.
The correct way to do this is to:
- build the Command object using only the binary
- pass the arguments to the object when invoking it, as follows:
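```python
from sh import Command

tar = Command("/usr/bin/tar")                       # only the binary
tar("cvf", "/tmp/test.tar", "/my/home/directory/")  # arguments at call time
```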
Keyword arguments
Commands support short-form -a and long-form --arg arguments as keyword arguments:
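For example, with curl:

```python
import sh

# resolves to "curl http://duckduckgo.com/ -o page.html --silent"
sh.curl("http://duckduckgo.com/", o="page.html", silent=True)

# the same thing, without keyword arguments
sh.curl("http://duckduckgo.com/", "-o", "page.html", "--silent")
```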
Background processes
By default, each command runs and completes its process before returning. If you have a long-running command, you can put it in the background with the _bg=True special keyword argument:
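For example, using sleep as a stand-in for a long-running command:

```python
import sh

# blocks for the full 3 seconds
sh.sleep(3)
print("...3 seconds later")

# doesn't block
p = sh.sleep(3, _bg=True)
print("prints immediately!")
p.wait()
print("...and 3 seconds later")
```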
Piping
Bash style piping is performed using function composition. Just pass one command as the input to another, and sh will create a pipe between the two:
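For example, the equivalent of "ls /etc -1 | wc -l" in Bash:

```python
import sh

print(sh.wc(sh.ls("/etc", "-1"), "-l"))
```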
By default, any command that is piping another command in waits for it to complete. This behavior can be changed with the _piped special keyword argument on the command being piped, which tells it not to complete before sending its data, but to send its data incrementally. See Advanced piping for examples of this.
Redirection
sh can redirect the standard and error output streams of a process to a file or file-like object. This is done with the _out and _err special keyword arguments. You can pass a filename or a file object as the argument value.
When the name of an already existing file is passed, the contents of the file
will be overwritten:
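A sketch (the output path is a placeholder):

```python
import sh

# overwrites /tmp/dir_contents with the output of ls
sh.ls("-l", _out="/tmp/dir_contents")

# a file object works as well
with open("/tmp/dir_contents", "w") as f:
    sh.ls("-l", _out=f)
```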
You can also redirect to a function. See STDOUT/ERR callbacks.
STDIN Processing
STDIN is sent to a process directly by using a command’s _in special keyword argument:
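For example:

```python
import sh

print(sh.cat(_in="test"))  # prints "test"
```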
Any command that takes input from STDIN can be used this way:
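For instance, tr:

```python
import sh

print(sh.tr("[:lower:]", "[:upper:]", _in="sh is awesome"))  # "SH IS AWESOME"
```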
You’re also not limited to using just strings. You may use a file object, a Queue, or any iterable (list, set, dictionary, etc):
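A sketch (the filename is a placeholder):

```python
import sh

# a file object
print(sh.tr("[:lower:]", "[:upper:]", _in=open("some_file.txt")))

# any iterable of strings
print(sh.tr("[:lower:]", "[:upper:]", _in=["sh ", "is ", "awesome"]))
```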
Sub-commands
Many programs have their own command subsets, like git (branch, checkout), svn (update, status), and sudo (where any command following sudo is considered a sub-command). sh handles subcommands through attribute access:
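For example:

```python
from sh import git, sudo

# resolves to "git branch -v"
print(git.branch("-v"))
print(git("branch", "-v"))  # the same command

# resolves to "sudo ls /root"
print(sudo.ls("/root"))
```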
Sub-commands are mainly syntactic sugar that makes calling some programs look conceptually nicer.
Note:
If you use sudo, the user executing the script must have the NOPASSWD option set for whatever command that user is running, otherwise sudo will hang.
Exit codes
Normal processes exit with exit code 0. This can be seen through a command’s exit_code property:
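```python
import sh

output = sh.ls("/")
print(output.exit_code)  # 0
```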
If a process ends with an error, and the exit code is not 0, an exception is generated dynamically. This lets you catch a specific return code, or catch all error return codes through the base class ErrorReturnCode:
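For example (GNU ls exits with code 2 for a missing path):

```python
import sh

try:
    print(sh.ls("/some/nonexistent/folder"))
except sh.ErrorReturnCode_2:
    print("folder doesn't exist!")
except sh.ErrorReturnCode:
    print("some other non-zero exit code")
```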
Note:
Signals will not raise an ErrorReturnCode. The command will return as if it succeeded, but its exit_code property will be set to -signal_num. So, for example, if a command is killed with a SIGHUP, its return code will be -1.
Some programs return strange error codes even though they succeed. If you know which codes a program might return and you don’t want to deal with no-op exception handling, you can use the _ok_code special keyword argument:
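A sketch (weird_program is hypothetical):

```python
import sh

# no exception is raised for exit codes 0, 3, or 5
sh.weird_program(_ok_code=[0, 3, 5])
```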
This means that the command will not generate an exception if the process exits with exit code 0, 3, or 5.
Note:
If you use _ok_code, you must specify all the exit codes that are considered “ok”, like (typically) 0.
Glob expansion
Glob expansion is not performed on your arguments, for example, this will not work:
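```python
import sh

sh.ls("*.py")  # the string "*.py" is passed to ls literally
```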
You’ll get an error to the effect of cannot access '*.py': No such file or directory.
This is because the *.py needs to be glob expanded, not passed in literally:
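```python
import sh

sh.ls(sh.glob("*.py"))
```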
Note:
Don’t use Python’s glob.glob function, use sh.glob. Python’s version has edge cases that break with sh.
Advanced Features
Baking
sh is capable of “baking” arguments into commands. This is similar to the stdlib functools.partial wrapper. Example:
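For example, baking -la into ls:

```python
from sh import ls

ls = ls.bake("-la")
print(ls)       # something like "/bin/ls -la", depending on your system
print(ls("/"))  # resolves to "ls -la /"
```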
The idea here is that now every call to ls will have the “-la” arguments already specified. This gets really interesting when you combine this with subcommands via attribute access:
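A sketch (the hostname and port are placeholders):

```python
from sh import ssh

# a lot to type out, especially for many commands on the same server
ssh("myserver.com", "-p", "1393", "whoami")

# bake the common parameters into a new callable
myserver = ssh.bake("myserver.com", p=1393)
```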
Now that the “myserver” callable represents a baked ssh command, you can call anything on the server easily:
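Continuing the sketch above:

```python
# resolves to "ssh myserver.com -p 1393 whoami"
print(myserver.whoami())

# resolves to "ssh myserver.com -p 1393 tail /var/log/dumb_daemon.log -n 100"
print(myserver.tail("/var/log/dumb_daemon.log", n=100))
```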
‘With’ contexts
Commands can be run within a with context. Popular commands using this might be sudo or fakeroot:
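For example (this requires sudo privileges; see the note below):

```python
import sh

with sh.sudo:
    print(sh.ls("/root"))
```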
If you need to run a command in a with context and pass in arguments, for example, specifying a -p prompt with sudo, you need to use the _with special keyword argument. This lets the command know that it’s being run from a with context so it can behave correctly:
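For example:

```python
import sh

with sh.sudo(k=True, _with=True):
    print(sh.ls("/root"))
```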
Note:
If you use sudo, the user executing the script must have the NOPASSWD option set for whatever command that user is running, otherwise sudo will hang.
Iterating over output
You can iterate over long-running commands with the _iter special keyword argument. This creates an iterator (technically, a generator) that you can loop over:
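A sketch (the log path is a placeholder):

```python
from sh import tail

# runs forever, printing each new line as it is written
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
    print(line)
```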
By default, _iter iterates over stdout, but you can set this specifically by passing either “err” or “out” to _iter (instead of True). Also by default, output is line-buffered, but you can change this by adjusting the buffer size (see Buffer sizes).
Note:
If you need a non-blocking iterator, use _iter_noblock. If the current iteration would block, errno.EWOULDBLOCK will be returned, otherwise you’ll receive a chunk of output, as normal.
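A sketch of a non-blocking loop (the log path is a placeholder):

```python
import errno
from sh import tail

for line in tail("-f", "/var/log/some_log_file.log", _iter_noblock=True):
    if line == errno.EWOULDBLOCK:
        pass  # no output ready yet; do other work here
    else:
        print(line)
```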
STDOUT/ERR callbacks
sh can use callbacks to process output incrementally. This is done much like redirection: by passing an argument to either the _out or _err (or both) special keyword arguments, except this time, you pass a callable. This callable will be called for each line (or chunk) of data that your command outputs:
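For example (the log path is a placeholder):

```python
from sh import tail

def process_output(line):
    print(line)

p = tail("-f", "/var/log/some_log_file.log", _out=process_output)
p.wait()
```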
To control whether the callback receives a line or a chunk, please see Buffer sizes. To “quit” your callback, simply return True. This tells the command not to call your callback anymore.
Note:
Returning True does not kill the process, it only keeps the callback from being called again. See Interactive callbacks for how to kill a process from a callback.
Note:
_out and _err don’t have to be callables. They can also be a file-like object, a Queue, a StringIO instance, or a filename. See Redirection for examples.
Interactive callbacks
Each command launched through sh has an internal STDIN Queue that can be used from callbacks:
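A sketch: adding a second argument to your callback gives it the STDIN queue (the program and its prompt here are hypothetical):

```python
import sh

def interact(line, stdin):
    # respond to a hypothetical password prompt
    if "password" in line.lower():
        stdin.put("secret\n")
        return True  # stop the callback from being called again

p = sh.some_interactive_program(_out=interact)
p.wait()
```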
You can also kill or terminate your process (or send any signal, really) from your callback by adding a third argument to receive the process object:
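For example:

```python
from sh import tail

def process_output(line, stdin, process):
    print(line)
    if "ERROR" in line:
        process.kill()
        return True

p = tail("-f", "some_log_file.log", _out=process_output)
p.wait()
```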
The above code will run, printing lines from some_log_file.log until the word “ERROR” appears in a line, at which point the tail process will be killed and the script will end.
Note:
You may also use .terminate() to send a SIGTERM, or .signal(sig) to send a general signal.
Buffer sizes
Buffer sizes are important to consider when you begin to use iterators, advanced piping, or callbacks. Tutorial 2: Entering an SSH password has a good example of why different buffering modes are needed. Buffer sizes control how STDIN is read and how STDOUT/ERR are written to. Consider the following:
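A sketch using tr:

```python
from sh import tr

def print_chunk(chunk):
    print(chunk)

# prints "TESTING" in one piece
tr("[:lower:]", "[:upper:]", _in="testing", _out=print_chunk)
```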
STDIN is, by default, unbuffered, so the string “testing” is read character by character. But the result is still “TESTING”, not “T”, “E”, “S”, “T”, “I”, “N”, “G”. Why? Because although STDIN is unbuffered, STDOUT is not. STDIN is being read character by character, but all of those single characters are being aggregated to STDOUT, whose default buffering is line buffering. Try this instead:
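```python
from sh import tr

def print_chunk(chunk):
    print(chunk)

# prints "T", "E", "S", "T", "I", "N", "G", one call per character
tr("[:lower:]", "[:upper:]", _in="testing", _out=print_chunk, _out_bufsize=0)
```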
Because we now set STDOUT to also be unbuffered with _out_bufsize=0, the result is “T”, “E”, “S”, “T”, “I”, “N”, “G”, as expected.
There are 2 bufsize special keyword arguments: _in_bufsize and _out_bufsize. They may be set to the following values:
- 0: Unbuffered. For STDIN, strings and file objects will be read character-by-character, while Queues, callables, and iterables will be read item by item.
- 1: Line buffered. For STDIN, data will be passed into the process line-by-line. For STDOUT/ERR, data will be output line-by-line. If any data remains in the STDOUT or STDIN buffers after all the lines have been consumed, it is also consumed/flushed.
- N: Buffered by N characters. For STDIN, data will be passed into the process <=N characters at a time. For STDOUT/ERR, data will be output <=N characters at a time. If any data remains in the STDOUT or STDIN buffers after all the lines have been consumed, it is also consumed/flushed.
Advanced piping
By default, all piped commands execute sequentially. What this means is that the inner command executes first, then sends its data to the outer command:
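For example:

```python
from sh import ls, wc

# "ls /etc -1" runs to completion first, then "wc -l" receives its output
print(wc(ls("/etc", "-1"), "-l"))
```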
In the above example, ls executes, gathers its output, then sends that output to wc. This is fine for simple commands, but for commands where you need parallelism, this isn’t good enough. Take the following example:
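A sketch (test.log is a placeholder):

```python
from sh import tail, tr

for line in tr(tail("-f", "test.log"), "[:upper:]", "[:lower:]", _iter=True):
    print(line)
```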
This won’t work because the tail -f command never finishes. What you need is for tail to send its output to tr as it receives it. This is where the _piped special keyword argument comes in handy:
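The same loop, with tail told to stream:

```python
from sh import tail, tr

for line in tr(tail("-f", "test.log", _piped=True), "[:upper:]", "[:lower:]", _iter=True):
    print(line)
```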
This works by telling tail -f that it is being used in a pipeline, and that it should send its output line-by-line to tr. By default, _piped sends stdout, but you can easily make it send stderr instead by using _piped="err".
Environments
The special keyword argument _env allows you to pass a dictionary of environment variables and their corresponding values:
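A sketch (the program and variable names are illustrative):

```python
import sh

sh.google_chrome(_env={"SOCKS_SERVER": "localhost:1234"})
```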
Note:
_env replaces your process’s environment completely. Only the key-value pairs in _env will be used for its environment. If you want to add new environment variables for a process in addition to your existing environment, try something like the following.
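A sketch, again with illustrative names:

```python
import os
import sh

# copy the current environment and add to it
new_env = os.environ.copy()
new_env["SOCKS_SERVER"] = "localhost:1234"

sh.google_chrome(_env=new_env)
```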
TTYs
Some applications behave differently depending on whether their standard file
descriptors are attached to a TTY or not. For
example, git will disable features intended for humans
such as colored and paged output when STDOUT is not attached to a TTY. Other
programs may disable interactive input if a TTY is not attached to STDIN. Still
other programs, such as SSH (without -n
), expect their input to come from a
TTY/terminal.
By default, sh emulates a TTY for STDOUT but not for STDIN. You can change the default behavior by passing in extra special keyword arguments, as such:
| FD     | KEYWORD  | DEFAULT      |
|--------|----------|--------------|
| STDOUT | _tty_out | True (tty)   |
| STDIN  | _tty_in  | False (pipe) |
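For instance, to give git a pipe instead of a TTY on STDOUT (disabling its colored and paged output):

```python
import sh

print(sh.git("status", _tty_out=False))
```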