Linux/UNIX Pipes, File Descriptors, and Process Substitution

  • POSIX Shell – A command interpreter that provides an environment and a set of commands for interacting with a UNIX or UNIX-like system.
    Examples include: sh (Bourne shell), bash, ash, ksh, etc.
  • Pipe – A unidirectional connection between two processes, used for inter-process communication.
  • File Descriptor (fd) – A small non-negative integer, local to a process and valid only while that process runs, that refers to an open file, pipe, or device. By default, descriptors 0, 1, and 2 are inherited from the parent process and refer to its standard input, output, and error streams, for example a (virtual) terminal’s input device (keyboard) and output device (display).
  • Process substitution – Lets the shell user write scripted constructs in-line that mimic functional and/or imperative programming syntax. The <( … ) form is an extension offered by bash, ksh, and zsh rather than a POSIX feature.

Every process started from a POSIX shell inherits three open file descriptors: Standard Input (0), Standard Output (1), and Standard Error (2). On systems that support it, these descriptors are also exposed as files under ‘/dev/fd/’.

Name             Abbrev.  File       Symbols
standard input   stdin    /dev/fd/0  < or <<
standard output  stdout   /dev/fd/1  1> or 1>>
standard error   stderr   /dev/fd/2  2> or 2>>
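
As a minimal sketch of the three streams in use, the hypothetical function `emit` below writes one line to stdout and one to stderr, and the redirections then capture each stream separately. The function name and the scratch file created with mktemp are assumptions made for illustration.

```sh
#!/bin/sh
# emit is a hypothetical helper: one line to stdout (fd 1), one to stderr (fd 2).
emit() {
    echo "to stdout"
    echo "to stderr" 1>&2
}

errfile=$(mktemp)                 # scratch file that will receive stderr

# Capture stdout via command substitution and send stderr to the scratch file.
out=$(emit 2>"$errfile")
err=$(cat <"$errfile")            # "<" attaches the file to stdin (fd 0) of cat

echo "stdout captured: $out"
echo "stderr captured: $err"
rm -f "$errfile"
```

Because the two streams are separate descriptors, redirecting one leaves the other untouched.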

POSIX Shell Plumbing

The POSIX (Bourne) shell and its derivatives provide various symbols for plumbing processes together by their file descriptors.

Name          Symbol Syntax           Function
pipe          { block } | { block }   attach stdout of the left script block to stdin of the right script block
success exec  cmd && { block }        if cmd exits with success (0), then execute block
failure exec  cmd || { block }        if cmd exits with an error (not 0), then execute block
write file    { block } #>file        attach /dev/fd/# of the left script block to file and overwrite it with the bytestream
write fd      { block } #>&?          attach /dev/fd/# of the left script block to /dev/fd/? and write the bytestream there
append        { block } #>>file       attach /dev/fd/# of the left script block to file and append the bytestream
close stdout  #>&-                    close /dev/fd/# (no # specified defaults to stdout)
read file     { block } <file         attach stdin of the left script block to file and read the bytestream
read fd       { block } <&?           attach stdin of the left script block to /dev/fd/? and read the bytestream
read until    { block } <<word        attach stdin of the left script block to the lines that follow (a here-document) and read until word
close stdin   <&-                     close stdin
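
The operators above can be combined. The following sketch exercises several of them against a scratch file created with mktemp; the variable names `log`, `ok`, `ko`, and `lines` are assumptions chosen for illustration.

```sh
#!/bin/sh
log=$(mktemp)

# write file: fd 1 of the block overwrites the file
{ echo "first"; } 1>"$log"

# append: fd 1 of the block is appended to the file
{ echo "second"; } 1>>"$log"

# write fd: fd 2 of the block is attached to wherever fd 1 points (the file)
{ echo "third" 1>&2; } 1>>"$log" 2>&1

# read until: a here-document feeds stdin of the block until the word EOF
{ cat; } <<EOF 1>>"$log"
fourth
EOF

# success exec / failure exec
true  && ok="ran after success"
false || ko="ran after failure"

# read file: attach the file to stdin of wc and count its lines
lines=$(wc -l <"$log" | tr -d ' ')

echo "$lines lines logged; $ok; $ko"
rm -f "$log"
```

Redirections on one command are processed left to right, which is why `1>>"$log"` has to come before `2>&1` in the third step.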

Process Substitution

The POSIX shell also provides the ability to perform inline substitution utilizing two methods:

Name            Symbol Syntax   Function
sub-shell exec  ( block )       execute block in a sub-shell, without environment persistence
sub-block exec  { block; }      execute block in-line at that location within the script, with environment persistence
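
The difference in environment persistence between the two forms can be sketched as follows. The variable names are assumptions chosen for illustration.

```sh
#!/bin/sh
FOO=outer

# Sub-shell: the assignment happens in a child environment and is discarded.
( FOO=subshell; echo "inside ( ): $FOO" )
after_subshell=$FOO

# Brace group: the assignment happens in the current environment and persists.
{ FOO=group; echo "inside { }: $FOO"; }
after_group=$FOO

echo "after ( ): $after_subshell"
echo "after { }: $after_group"
```

Note that the brace form needs a space after `{` and a `;` or newline before `}`.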

…for examples see below

Examples

Basic Pipeline

find / | grep '^/bin' | sed 's|^|/usr|'

This basic example connects stdout (1) of the find utility to stdin (0) of the grep utility, which keeps only the paths that start with ‘/bin’. The stdout of grep is in turn connected to sed, which prepends ‘/usr’ to each path. The intent is to translate all ‘/bin’ paths in the output of find to ‘/usr/bin’.
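
To try the same pipeline with a predictable result, the sketch below replaces `find /` with a fixed list of paths printed by printf. The path list is an assumption made purely for illustration.

```sh
#!/bin/sh
# A fixed path list stands in for the output of find, so the result is predictable.
result=$(
    printf '%s\n' /bin/ls /etc/hosts /bin/sh /usr/lib/libc.so |
        grep '^/bin' |          # keep only paths that start with /bin
        sed 's|^|/usr|'         # prepend /usr to each remaining path
)
echo "$result"
```

Using `|` as the sed delimiter avoids having to escape the slashes in the paths.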

Dynamic File Descriptors

$ FOO=who                         # SET VARIABLE FOO TO who
$ cat <(                          # READ FD GENERATED BY <( ... )
         FOO=bar ;                # SET LOCAL VARIABLE FOO TO bar
         echo $FOO ;              # PRINT LOCAL VARIABLE FOO
        )                         # DESTROY ENVIRONMENT
bar
$ echo $FOO                       # PRINT VARIABLE IN OUTER CONTEXT
who
$ echo <( FOO=bar ; echo $FOO )   # PRINT FD AS FILE PATH
/dev/fd/63

The above example highlights that what <( … ) returns is a file path, not the contents. It is the responsibility of the script block, command, or executable to its left to read the contents of this file descriptor. It also shows that the commands within <( … ) are executed in a separate context: they start with a copy of the calling script's variables, but assignments made inside are not visible to the calling script.
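
The same behaviour can be checked non-interactively. This sketch assumes bash, since <( … ) is not available in plain POSIX sh; the variable names `path` and `contents` are illustrative.

```bash
#!/usr/bin/env bash
# Process substitution is a bash/ksh/zsh feature, not plain POSIX sh.
FOO=who

# The construct expands to a path; the command on its left must open and read it.
path=$(echo <( echo ignored ))
contents=$(cat <( FOO=bar; echo "$FOO" ))

echo "expands to: $path"          # a path, typically under /dev/fd/
echo "contents:   $contents"
echo "outer FOO:  $FOO"           # unchanged by the assignment inside <( )
```

The exact path varies by system and by how many descriptors are already open, so scripts should not hard-code it.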

Local execution of a Script generated on a Remote Server

$ /bin/sh <( curl http://www.example.com/getScriptToRun.sh ) 

Executes cURL, retrieving the contents generated by ‘getScriptToRun.sh’ on the web server hosted at http://www.example.com, and runs the script locally without saving the file to disk first. To inspect what is happening, replace /bin/sh with echo: as in the earlier example, this prints the path of the dynamically created file descriptor (such as /dev/fd/63) instead of executing the script.
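
The mechanism can be tried without a network connection. In this sketch, which assumes bash for the <( … ) syntax, the hypothetical function `generate_script` stands in for the curl download and prints a two-line script on stdout.

```bash
#!/usr/bin/env bash
# generate_script is a hypothetical stand-in for the curl download: it prints
# a small shell script on stdout.
generate_script() {
    printf '%s\n' 'echo "hello from a generated script"' 'echo "read from: $0"'
}

# /bin/sh receives a file-descriptor path as its script argument;
# nothing is written to disk.
output=$(/bin/sh <( generate_script ))
echo "$output"
```

The second line of output shows the path that /bin/sh was given as its script name.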

Pipemill

 

{ { { tee /dev/fd/3 /dev/fd/4 | command1 >&9;
 } 3>&1 | command2 >&9;
 } 4>&1 | command3 >&9;
} 9>&1

‘tee /dev/fd/3 /dev/fd/4’ – takes stdin (/dev/fd/0) and duplicates it into three separate piped paths:

  1. stdout /dev/fd/1 -> command1 -> /dev/fd/9 -> stdout (/dev/fd/1)
  2. /dev/fd/3 -> command2 -> /dev/fd/9 -> stdout (/dev/fd/1)
  3. /dev/fd/4 -> command3 -> /dev/fd/9 -> stdout (/dev/fd/1)

NOTE: stderr (/dev/fd/2) is maintained end to end without redirection.
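
A concrete run of the three paths listed above might look like the sketch below. It assumes a system that exposes /dev/fd (Linux, for example), and uses three sed commands as stand-ins for command1, command2, and command3, each tagging lines with the number of the path they travelled.

```sh
#!/bin/sh
# Requires /dev/fd. The sed commands are illustrative stand-ins for
# command1, command2, and command3.
result=$(
    printf '%s\n' a b |
    { { { tee /dev/fd/3 /dev/fd/4 | sed 's/^/1:/' >&9;
     } 3>&1 | sed 's/^/2:/' >&9;
     } 4>&1 | sed 's/^/3:/' >&9;
    } 9>&1
)

# The three branches run concurrently, so sort before displaying or comparing.
printf '%s\n' "$result" | sort
```

Every input line appears once per branch, so two input lines yield six output lines. Their arrival order on fd 9 is not guaranteed.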

Other Interesting tidbits

# CREATE SCRIPT

$ echo '#!/bin/sh
echo import.sh[1]: first line of imported script
echo import.sh[2]: bla bla...' > import.sh

$ echo '#!/bin/sh
echo main.sh[1]: this is the first line of the main script
echo main.sh[2]: lets inline import import.sh
. ./import.sh
echo main.sh[4]: this is the fourth line of the main script' > main.sh

# EXECUTE SCRIPT

$ /bin/sh ./main.sh

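. filename reads and executes the commands in filename in the current shell, as if they had occurred at that place in the script. When filename contains no slash, POSIX shells search PATH for it, so a file in the current directory is best written as ./filename.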
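
The practical consequence is that a sourced file can change the current shell's variables, while a file run in a separate shell cannot. A minimal sketch, in which the scratch file created with mktemp stands in for import.sh and the variable `GREETING` is an assumption chosen for illustration:

```sh
#!/bin/sh
lib=$(mktemp)                     # scratch file standing in for import.sh
echo 'GREETING="set by the imported file"' >"$lib"

# Running the file in a separate shell cannot change this shell's variables.
/bin/sh "$lib"
before=${GREETING:-unset}

# Sourcing it with "." executes its lines in the current shell.
. "$lib"
after=$GREETING

echo "after /bin/sh: $before"
echo "after dot:     $after"
rm -f "$lib"
```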
