Chapter 4 — Coprocesses and Bidirectional IPC
Every time your script calls $(command) it forks a subshell, execs the command, collects the output through a pipe, and waits for the child to exit. For occasional calls that cost is negligible. For thousands of calls inside a tight loop it becomes the dominant bottleneck. Coprocesses and explicit bidirectional pipes give you a long-running peer process you can converse with — sending input and reading output repeatedly without ever re-forking the child.
1 — How coproc Works
The coproc keyword starts a command asynchronously and wires two anonymous pipes between it and the shell: one for the shell to write to the coprocess's stdin, and one to read from its stdout.
```bash
# Syntax forms
coproc NAME { command; }   # named coprocess: can be referenced by NAME
coproc command             # unnamed: stored in array COPROC
```
After the coproc line runs, Bash exposes the pipe file descriptors in a two-element array:
| Variable | Meaning |
|---|---|
| NAME[0] | FD open for reading (shell reads coprocess stdout) |
| NAME[1] | FD open for writing (shell writes coprocess stdin) |
| $NAME_PID | PID of the coprocess |
```bash
# Minimal example: a coprocess running bc
coproc BC { bc -l; }
bc_pid=$BC_PID   # save now: Bash unsets BC and BC_PID once the coprocess exits

# Send an expression to bc's stdin
printf '%s\n' '3.14159 * 2' >&"${BC[1]}"

# Read bc's answer from bc's stdout
read -r answer <&"${BC[0]}"
echo "Result: $answer"   # Result: 6.28318

# Keep using it: no extra fork
printf '%s\n' 'sqrt(2)' >&"${BC[1]}"
read -r answer <&"${BC[0]}"
echo "Result: $answer"   # Result: 1.41421356237309504880 (bc -l sets scale=20)

# Clean up: close the write end so bc sees EOF on its stdin, then reap it.
# Note the braces: "exec ${BC[1]}>&-" expands to "exec 63 >&-" and tries to
# run a command named 63. The {BC[1]} form needs Bash 4.3+.
exec {BC[1]}>&-
wait "$bc_pid"
```
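The unnamed form mentioned in the syntax summary can be sketched the same way. This is a minimal illustration, assuming `cat` as a stand-in worker (it copies stdin to stdout without stdio buffering); the scalar names `echo_in`, `echo_out` and `echo_pid` are hypothetical and exist only so the values survive after Bash unsets `COPROC`.

```shell
#!/usr/bin/env bash
# Unnamed coprocess: Bash stores the FDs in COPROC and the PID in COPROC_PID.
coproc cat

# Copy the values immediately; Bash unsets COPROC and COPROC_PID once the
# child has exited.
echo_out=${COPROC[0]} echo_in=${COPROC[1]} echo_pid=$COPROC_PID

# One round trip: write a line, read it back (give up after 2 seconds).
printf '%s\n' 'ping' >&"$echo_in"
IFS= read -r -t 2 reply <&"$echo_out"
printf 'reply=%s\n' "$reply"

exec {echo_in}>&-     # EOF on cat's stdin, so it exits
wait "$echo_pid"
```

Copying the descriptors into scalars also sidesteps the redirection pitfall: `exec {echo_in}>&-` closes the FD whose number is stored in `echo_in`.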
2 — The Buffering Problem
The single most common coprocess failure is deadlock caused by output buffering. Most programs built on C stdio switch from line buffering to full (block) buffering when stdout is a pipe rather than a terminal. Your script writes a query, calls read, and blocks forever because the child's answer sits unflushed in its internal buffer.
```bash
# BROKEN: Python block-buffers sys.stdout when stdout is a pipe
coproc PY {
    python3 -c '
import sys
for line in sys.stdin:
    sys.stdout.write("ECHO: " + line)   # BUFFERED: shell will hang
'
}

# FIX 1: the -u flag disables Python's stdout buffering (Python-specific);
# the explicit flush() is then redundant but harmless
coproc PY {
    python3 -u -c '
import sys
for line in sys.stdin:
    sys.stdout.write("ECHO: " + line)
    sys.stdout.flush()
'
}

# FIX 2: stdbuf (coreutils) works for programs that rely on C stdio defaults
coproc GREP { stdbuf -oL grep 'ERROR'; }   # -oL = line-buffered stdout

# FIX 3: script -q (opens a pty, forcing line buffering)
coproc TOOL { script -q -c 'some-program' /dev/null; }

# FIX 4: expect's unbuffer utility
coproc TOOL { unbuffer -p some-program; }
```
| Technique | Works for | Requires |
|---|---|---|
| python3 -u | Python only | nothing extra |
| stdbuf -oL | Dynamically linked C stdio programs that do not set their own buffering | GNU coreutils |
| unbuffer -p | Any program (PTY trick) | expect package |
| script -q -c | Any program (PTY trick) | util-linux |
| Explicit flush after every reply (optionally with a \0 delimiter + read -d '') | Programs you control | nothing |
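A timed read makes the stall observable without hanging the script. The sketch below assumes GNU `tr` and `stdbuf` are installed; `probe` is a hypothetical helper name, not part of any tool. It starts a command as a coprocess, sends one line, and reports whether a reply arrives within one second.

```shell
#!/usr/bin/env bash
# probe CMD [ARGS...]: run CMD as a coprocess, send one line, and report
# whether a reply arrives within one second.
probe() {
    local reply rfd wfd pid
    coproc PROBE { "$@"; }
    rfd=${PROBE[0]} wfd=${PROBE[1]} pid=$PROBE_PID

    printf '%s\n' 'hello' >&"$wfd"
    if IFS= read -r -t 1 reply <&"$rfd"; then
        printf 'reply: %s\n' "$reply"
    else
        printf 'no reply within 1s\n'
    fi

    exec {wfd}>&-      # EOF lets the child flush and exit
    wait "$pid"
}

probe tr a-z A-Z               # block-buffered stdout: the reply stays inside tr
probe stdbuf -oL tr a-z A-Z    # line-buffered stdout: the reply arrives
```

With GNU coreutils the first call is expected to time out and the second to print the upper-cased line, which is the difference the table above summarises.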
3 — Synchronisation Sentinels
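Flushing solves only half the problem: when a reply can span any number of lines, the shell still needs to know where it ends. Use a sentinel: a known output line that marks the end of a response. Your script writes the query followed by a command that makes the child print the sentinel, then reads until the sentinel appears. A sentinel does not cure buffering; the child must still flush, or the sentinel itself stays stuck in the buffer.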
```bash
# Pattern: send a query + "echo SENTINEL", read until SENTINEL appears
coproc SH { bash; }   # a subordinate shell as coprocess
sh_pid=$SH_PID        # save now: SH_PID is unset once the coprocess exits

sh_run() {
    local cmd="$1"
    local -a output=()
    local line
    local sentinel="__DONE_$$__"   # unique per PID

    # Send the command, then emit the sentinel via the subordinate shell
    printf '%s\necho %s\n' "$cmd" "$sentinel" >&"${SH[1]}"

    # Collect lines until we see the sentinel
    while IFS= read -r line <&"${SH[0]}"; do
        [[ $line == "$sentinel" ]] && break
        output+=( "$line" )
    done

    # Print only if something was captured (avoids a stray blank line)
    if (( ${#output[@]} )); then printf '%s\n' "${output[@]}"; fi
}

sh_run 'ls /etc | head -3'
sh_run 'echo $HOSTNAME'

# Clean up ({SH[1]} closes the FD stored in SH[1]; Bash 4.3+)
exec {SH[1]}>&-
wait "$sh_pid"
```
4 — Explicit Bidirectional Pipes with exec
Before coproc was added in Bash 4.0, scripts wired numbered FDs to process substitutions and FIFOs. This technique still works in all Bash versions and is often clearer when you need finer control over FD lifetimes. A process substitution covers only one direction; the other direction needs a FIFO.
```bash
# One direction only: FD 3 reads the child's stdout; its stdin is not wired.
# Mind the space in "3< <(...)": "3<<(" is parsed as a here-document operator.
exec 3< <(stdbuf -oL some-command)

# For true bidirectional wiring without coproc, use a FIFO
FIFO=$(mktemp -u)
mkfifo "$FIFO"

# Start the child with its stdin reading from the FIFO
# (its stdout stays on the script's stdout in this fragment)
stdbuf -oL some-command < "$FIFO" &
CHILD_PID=$!

# Open FD 4 for writing to the FIFO (child's stdin)
exec 4> "$FIFO"
rm "$FIFO"   # safe to unlink: FD 4 holds the write end open

# Write to child's stdin via FD 4
printf 'query\n' >&4

# Clean up
exec 4>&- 3<&-
wait "$CHILD_PID"
```
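The two halves can be combined into one channel: a FIFO feeds the child's stdin while a process substitution exposes its stdout. The following is a sketch under stated assumptions: the child is a stand-in Bash loop that prefixes each line, and the variable names `from_child` and `to_child` are hypothetical.

```shell
#!/usr/bin/env bash
# Bidirectional channel without coproc.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/in"

# Stand-in child: prefixes every line it reads. Its stdout is the process
# substitution pipe, its stdin is the FIFO.
exec {from_child}< <(
    while IFS= read -r line; do
        printf 'echo: %s\n' "$line"
    done < "$fifo_dir/in"
)

# Opening the write end unblocks the child's pending open of the FIFO.
exec {to_child}> "$fifo_dir/in"
rm -r "$fifo_dir"     # the open FDs keep the pipe alive

printf 'query\n' >&"$to_child"
IFS= read -r -t 2 reply <&"$from_child"
printf '%s\n' "$reply"

exec {to_child}>&-    # the child sees EOF and leaves its loop
exec {from_child}<&-
```

Letting Bash pick the descriptor numbers with `{varname}` avoids clashing with FDs 3 and 4 that a larger script may already use.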
5 — Practical Coprocess Patterns
Pattern 1: Reusable database connection
```bash
# Open one persistent psql session instead of forking per query
coproc PSQL {
    psql --no-psqlrc --no-align --tuples-only \
         --field-separator='|' \
         -d "$DB_NAME" -U "$DB_USER"
}
psql_pid=$PSQL_PID
_PSQL_SENTINEL="__PSQL_END_$$__"

db_query() {
    local sql="$1"
    local -n __rows="$2"   # nameref to the caller's array (Bash 4.3+)
    local line
    __rows=()

    # Send SQL + a SELECT that emits the sentinel
    printf "%s;\nSELECT '%s';\n" "$sql" "$_PSQL_SENTINEL" >&"${PSQL[1]}"

    while IFS= read -r line <&"${PSQL[0]}"; do
        [[ $line == "$_PSQL_SENTINEL" ]] && break
        [[ -n $line ]] && __rows+=( "$line" )
    done
}

declare -a users
db_query 'SELECT id, name FROM users WHERE active' users
for row in "${users[@]}"; do
    IFS='|' read -r id name <<< "$row"
    printf 'User %s: %s\n' "$id" "$name"
done

# Disconnect ({PSQL[1]} closes the FD stored in PSQL[1]; Bash 4.3+)
exec {PSQL[1]}>&-
wait "$psql_pid"
```
Pattern 2: Continuous log filter
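```bash
# Use a coprocess to stream-filter a log with grep, then react in the shell
coproc FILTER { grep --line-buffered -E 'ERROR|WARN'; }
filter_pid=$FILTER_PID

# Coprocess FDs are not available in subshells, and a background job runs in
# one, so hand tail a duplicate of the write end instead
exec {filter_in}>&"${FILTER[1]}"
tail -f /var/log/app.log >&"$filter_in" &
TAIL_PID=$!
exec {filter_in}>&-   # the shell itself no longer needs the duplicate

# React to each filtered line (runs until the filter exits or the script is interrupted)
while IFS= read -r line <&"${FILTER[0]}"; do
    printf '[ALERT] %s\n' "$line"
    # Could call send_slack_alert, page on-call, etc.
done

# Shutdown
kill "$TAIL_PID"
exec {FILTER[1]}>&-
wait "$filter_pid"
```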
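The Bash manual notes that coprocess file descriptors are not available in subshells, which matters as soon as a background job or a command substitution has to talk to the coprocess. One workaround is to duplicate the descriptors onto ordinary FDs first. The sketch below assumes `cat` as a stand-in coprocess; the names `SINK`, `sink_in`, `sink_out` and `sink_w` are hypothetical.

```shell
#!/usr/bin/env bash
# Stand-in coprocess: cat echoes its input.
coproc SINK { cat; }
sink_pid=$SINK_PID

# Duplicate both ends onto ordinary FDs; these are inherited by subshells.
exec {sink_in}>&"${SINK[1]}" {sink_out}<&"${SINK[0]}"

# A background job runs in a subshell: it writes through the duplicate.
printf 'from a background job\n' >&"$sink_in" &
wait "$!"

IFS= read -r -t 2 reply <&"$sink_out"
printf '%s\n' "$reply"

# Close every copy of the write end so cat sees EOF, then reap it.
sink_w=${SINK[1]}
exec {sink_in}>&- {sink_w}>&- {sink_out}<&-
wait "$sink_pid"
```

The cost of this approach is bookkeeping: the coprocess only sees EOF once every duplicate of the write end has been closed.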
Pattern 3: Long-running encoder / hash worker
```bash
# sha256sum takes file names as arguments; it does not read them from stdin
# line by line, so a construct such as
#     coproc SHA { sha256sum -- $(cat); }
# blocks until stdin reaches EOF and cannot hold a conversation.
# For plain batch hashing, xargs is usually cleaner.
# For a genuine conversation pattern, use a Python worker:
coproc HASHER {
    python3 -u -c '
import sys, hashlib
for path in sys.stdin:
    path = path.rstrip("\n")
    try:
        with open(path, "rb") as fh:
            h = hashlib.sha256(fh.read()).hexdigest()
        print(h, path, flush=True)
    except Exception as e:
        print("ERROR", path, str(e), flush=True)
'
}
hasher_pid=$HASHER_PID

hash_file() {
    local path="$1" result
    printf '%s\n' "$path" >&"${HASHER[1]}"
    IFS= read -r result <&"${HASHER[0]}"
    printf '%s\n' "$result"
}

# The protocol is line-based, so paths containing newlines are not supported
while IFS= read -r -d '' f; do
    hash_file "$f"
done < <(find /data -name '*.bin' -print0)

exec {HASHER[1]}>&-
wait "$hasher_pid"
```
6 — Multiple Coprocesses and Multiplexed IPC
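Named coprocesses allow multiple long-running peers simultaneously. Each has its own FD array and PID variable. The Bash manual states that only one coprocess may be active at a time, so expect a "still exists" warning on stderr when the second one starts, and test the pattern on every Bash version you target.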
```bash
# Two coprocesses running in parallel
coproc GEOCODER { python3 -u geocode_worker.py; }
coproc VALIDATOR { node validate_worker.js; }

# Send to each independently; the reply is left in REPLY
geocode()  { printf '%s\n' "$1" >&"${GEOCODER[1]}";  IFS= read -r REPLY <&"${GEOCODER[0]}"; }
validate() { printf '%s\n' "$1" >&"${VALIDATOR[1]}"; IFS= read -r REPLY <&"${VALIDATOR[0]}"; }

geocode "1600 Amphitheatre Parkway"
validate "user@example.com"

# Teardown helper: takes the write FD number and the PID
coproc_close() {
    local wfd="$1" pid="$2"
    exec {wfd}>&-   # Bash 4.1+: exec {varname}>&- closes the FD stored in varname
    wait "$pid"
}

coproc_close "${GEOCODER[1]}" "$GEOCODER_PID"
coproc_close "${VALIDATOR[1]}" "$VALIDATOR_PID"
```
7 — Bidirectional IPC Without coproc: FIFO Pairs
FIFOs are the portable alternative and are especially useful when the child process needs to be started by another mechanism (e.g. a service manager), or when you want multiple independent readers and writers.
```bash
bidir_open() {
    # Creates two FIFOs and stores the FD numbers in the given variable names
    local name="$1"        # name prefix for FIFOs
    local -n __rfd="$2"    # caller's variable to receive read FD
    local -n __wfd="$3"    # caller's variable to receive write FD
    local -n __pid="$4"    # caller's variable to receive child PID
    local fin="/tmp/${name}_in.$$"
    local fout="/tmp/${name}_out.$$"
    local __r __w
    mkfifo "$fin" "$fout" || return 1

    # Start child reading from fin, writing to fout
    shift 4
    "$@" < "$fin" > "$fout" &
    __pid=$!

    # Open the FIFOs in the same order as the child (fin, then fout).
    # Opening a FIFO blocks until the other end is opened, so the reverse
    # order would deadlock.
    exec {__w}> "$fin"
    exec {__r}< "$fout"
    __wfd=$__w __rfd=$__r
    rm -f "$fin" "$fout"   # unlink immediately: FDs keep the pipes alive
}

declare rfd wfd cpid
bidir_open worker rfd wfd cpid stdbuf -oL ./worker.sh

printf 'hello\n' >&"$wfd"
read -r response <&"$rfd"
echo "Worker replied: $response"

exec {wfd}>&-; exec {rfd}<&-
wait "$cpid"
```
8 — Error Handling and Timeouts
```bash
# Ignore SIGPIPE so that a write to a dead coprocess fails with an error
# instead of killing the script
trap '' PIPE

# read -t N: read with timeout (fractional seconds supported)
coproc_read_timeout() {
    local fd="$1"
    local timeout="$2"
    local -n __out="$3"
    local rc

    if IFS= read -r -t "$timeout" __out <&"$fd"; then
        return 0
    else
        rc=$?
        if (( rc > 128 )); then
            printf 'coproc_read_timeout: timed out after %ss\n' "$timeout" >&2
            return 1
        else
            # rc=1: EOF, the coprocess has exited
            printf 'coproc_read_timeout: coprocess exited\n' >&2
            return 2
        fi
    fi
}

# Check if a coprocess is still alive before sending
coproc_alive() {
    local pid="$1"
    kill -0 "$pid" 2>/dev/null
}

# Wrapper with alive-check, write, and timed read
# Usage: coproc_call WFD RFD PID REQUEST TIMEOUT REPLY_VAR
coproc_call() {
    local wfd="$1" rfd="$2" pid="$3" request="$4" timeout="${5:-5}"
    local -n __reply="$6"

    coproc_alive "$pid" || { echo "coproc_call: process dead" >&2; return 1; }
    printf '%s\n' "$request" >&"$wfd" || { echo "coproc_call: write failed" >&2; return 1; }
    coproc_read_timeout "$rfd" "$timeout" __reply
}
```
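The three outcomes of a timed read (data, timeout, end of file) can be seen in isolation with a deliberately slow worker. This is a sketch, not a replacement for the helpers above: `read_status` and the `SLOW` worker are hypothetical, and the delays are arbitrary.

```shell
#!/usr/bin/env bash
# read_status FD TIMEOUT: classify one timed read as ok, timeout, or eof.
read_status() {
    local fd=$1 timeout=$2 line rc=0
    IFS= read -r -t "$timeout" line <&"$fd" || rc=$?
    if (( rc == 0 )); then
        echo ok
    elif (( rc > 128 )); then
        echo timeout      # read -t reports a timeout as a status above 128
    else
        echo eof
    fi
}

# Worker: answers one request after a one-second delay, then exits.
coproc SLOW { IFS= read -r req; sleep 1; printf 'done %s\n' "$req"; }
slow_pid=$SLOW_PID
# Work on duplicates so the FDs stay valid in subshells and after Bash
# reaps the coprocess.
exec {slow_out}<&"${SLOW[0]}" {slow_in}>&"${SLOW[1]}"

printf 'job1\n' >&"$slow_in"
first=$(read_status "$slow_out" 0.2)    # worker is still sleeping
second=$(read_status "$slow_out" 3)     # reply arrives within the limit
third=$(read_status "$slow_out" 3)      # worker has exited: end of file
printf '%s %s %s\n' "$first" "$second" "$third"

exec {slow_in}>&- {slow_out}<&-
wait "$slow_pid" 2>/dev/null || true
```

Distinguishing timeout from end of file matters for recovery: a timeout may justify a retry, while end of file means the worker must be restarted.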
9 — Performance: When Coprocesses Pay Off
A fork+exec costs roughly 1–5 ms on a modern Linux system, so 100 calls spend about 0.1–0.5 s on process creation alone. The payoff threshold for a coprocess is therefore around 50–100 calls; below that, the time saved rarely justifies the extra code complexity.
```bash
# Benchmark: 1000 SHA256 calls, subprocess fork vs coprocess
declare -a files
mapfile -t -n 1000 files < <(find /usr/lib -name '*.so' -print)

# Method A: subprocess per file
TIMEFORMAT='A (subproc): %Rs'
time {
    for f in "${files[@]}"; do
        sha256sum "$f"
    done >/dev/null
}

# Method B: all files via a single xargs invocation (not a coprocess, but fair)
TIMEFORMAT='B (xargs): %Rs'
time { printf '%s\0' "${files[@]}" | xargs -0 sha256sum >/dev/null; }

# Method C: coprocess Python worker
# (as shown in pattern 3 above)

# Typical ratio on a fast machine:
#   A (subproc): ~5.2s  (1000 forks)
#   B (xargs):   ~0.8s  (1 fork, batched args)
#   C (coproc):  ~0.4s  (1 fork, 1000 round-trips over pipes)
```
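The per-call figure behind the payoff threshold is easy to measure on your own machine. The sketch below spawns a trivial external command repeatedly and divides the elapsed time by the number of spawns; `N=200` is an arbitrary choice, and the script assumes an external `true` binary exists on `PATH`.

```shell
#!/usr/bin/env bash
# Estimate the per-call cost of fork+exec by spawning a trivial command N times.
N=200
true_bin=$(type -P true)     # the external binary, not the builtin
TIMEFORMAT='%R'              # elapsed seconds with three decimals

elapsed=$( { time for (( i = 0; i < N; i++ )); do "$true_bin"; done; } 2>&1 )

# Drop the decimal separator: "0.213" becomes 0213, read as 213 ms (base 10).
total_ms=$(( 10#${elapsed//[!0-9]/} ))
per_call_us=$(( total_ms * 1000 / N ))

printf '%d spawns: %d ms total, about %d us per fork+exec\n' \
    "$N" "$total_ms" "$per_call_us"
```

Real commands cost more than `true` because of dynamic linking and start-up work, so treat the result as a lower bound when deciding whether a coprocess is worth it.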
Exercises
Exercise 1 — Persistent bc calculator
Write a function calc EXPR that evaluates a bc -l expression and prints the result. The coprocess must be started on first use and reused for all subsequent calls (lazy initialisation via a flag variable). Calling calc_close should cleanly shut it down. Handle the case where the coprocess has died unexpectedly by restarting it transparently.
```bash
_CALC_STARTED=0

_calc_start() {
    coproc _BC { bc -l; }
    _CALC_STARTED=1
}

# Call calc directly: result=$(calc ...) would start the coprocess in a subshell
calc() {
    local expr="$1"
    local result

    # Start or restart coprocess if needed
    if (( _CALC_STARTED == 0 )) || ! kill -0 "${_BC_PID:-}" 2>/dev/null; then
        _calc_start
    fi

    # bc prints the result followed by a newline, so one read suffices
    printf '%s\n' "$expr" >&"${_BC[1]}"
    IFS= read -r -t 5 result <&"${_BC[0]}" || {
        printf 'calc: timed out or EOF\n' >&2
        return 1
    }
    printf '%s\n' "$result"
}

calc_close() {
    (( _CALC_STARTED )) || return 0
    local pid="${_BC_PID:-}"
    if [[ -n $pid ]]; then
        exec {_BC[1]}>&-   # braces, not ${...}: closes the FD stored in _BC[1] (Bash 4.3+)
        wait "$pid" 2>/dev/null
    fi
    _CALC_STARTED=0
}

# Usage
calc '2^10'                # 1024
calc 'scale=4; sqrt(2)'    # 1.4142
for i in {1..20}; do
    calc "$i * $i"
done
calc_close
```
Exercise 2 — Coprocess-based CSV validator
Write a function csv_validate_stream INPUT_FILE that processes a CSV file line by line. For each data row (skip the header), send the row to a Python coprocess that validates the fields and returns either OK or ERR: reason. Collect and print a summary of how many rows passed and failed. Use stdbuf -oL or python3 -u to prevent buffering deadlock.
```bash
csv_validate_stream() {
    local file="$1"
    [[ -f "$file" ]] || { echo "File not found: $file" >&2; return 1; }

    coproc _VAL {
        python3 -u -c '
import sys, re
EMAIL_RE = re.compile(r"^[^@]+@[^@]+\.[^@]+$")
for raw in sys.stdin:
    row = raw.rstrip("\n").split(",")
    if len(row) != 3:
        print(f"ERR: expected 3 fields, got {len(row)}", flush=True)
        continue
    name, email, age = row
    if not name.strip():
        print("ERR: name is empty", flush=True)
    elif not EMAIL_RE.match(email.strip()):
        print("ERR: invalid email", flush=True)
    elif not age.strip().isdigit():
        print("ERR: age not numeric", flush=True)
    else:
        print("OK", flush=True)
'
    }
    local val_pid=$_VAL_PID

    local ok=0 fail=0 line lineno=0 reply
    while IFS= read -r line; do
        (( ++lineno ))
        (( lineno == 1 )) && continue   # skip header
        printf '%s\n' "$line" >&"${_VAL[1]}"
        IFS= read -r -t 3 reply <&"${_VAL[0]}" || { echo "Validator timeout" >&2; break; }
        if [[ $reply == "OK" ]]; then
            (( ++ok ))
        else
            (( ++fail ))
            printf 'Row %d: %s\n' "$lineno" "$reply"
        fi
    done < "$file"

    exec {_VAL[1]}>&-   # closes the FD stored in _VAL[1] (Bash 4.3+)
    wait "$val_pid"
    printf '\nSummary: %d OK, %d FAILED\n' "$ok" "$fail"
}
```
Exercise 3 — FIFO-based worker pool (single worker)
Without using coproc, implement a bidirectional communication channel using two FIFOs (one for requests, one for responses). Start a worker script that reads one job per line and prints a result per line. Write a job_submit JOB function and a job_result function that reads the response. Demonstrate submitting 5 jobs and collecting their results. Include a worker_shutdown function that sends a quit signal and waits.
```bash
#!/usr/bin/env bash
# worker.sh: process jobs from stdin, print results to stdout
while IFS= read -r job; do
    [[ $job == "QUIT" ]] && break
    # Example: reverse each line
    rev <<< "$job"
done
```

```bash
#!/usr/bin/env bash
# controller.sh
_REQ_FIFO="/tmp/worker_req.$$"
_RES_FIFO="/tmp/worker_res.$$"
_WORKER_PID=""

worker_start() {
    mkfifo "$_REQ_FIFO" "$_RES_FIFO"
    stdbuf -oL ./worker.sh < "$_REQ_FIFO" > "$_RES_FIFO" &
    _WORKER_PID=$!
    # Open in the same order as the worker (requests, then results) to avoid deadlock
    exec {_WREQ}> "$_REQ_FIFO"
    exec {_WRES}< "$_RES_FIFO"
    rm -f "$_REQ_FIFO" "$_RES_FIFO"
}

job_submit() {
    printf '%s\n' "$1" >&"$_WREQ"
}

job_result() {
    local -n _r="$1"
    IFS= read -r -t 5 _r <&"$_WRES"
}

worker_shutdown() {
    job_submit "QUIT"
    exec {_WREQ}>&-
    exec {_WRES}<&-
    wait "$_WORKER_PID"
}

worker_start
for word in hello world bash script example; do
    job_submit "$word"
    declare result
    job_result result
    printf '%s → %s\n' "$word" "$result"
done
worker_shutdown
```
Exercise 4 — Benchmark: subprocess vs coprocess
Write a script that measures the wall-clock time to perform 500 integer additions using three methods:
- A fresh $(( )) expansion per iteration (no fork; the baseline)
- A coprocess running python3 -u -c '...' that reads A+B expressions and prints results
- A fresh python3 -c 'print(...)' subprocess per iteration
Use TIMEFORMAT='%R' and { time ...; } 2>&1 to capture elapsed time for each method. Print a comparison table.
```bash
#!/usr/bin/env bash
set -euo pipefail

N=500
TIMEFORMAT='%R'

# Method 1: pure Bash arithmetic, no fork
t1=$( { time {
    for (( i=0; i<N; i++ )); do
        r=$(( i + i * 2 ))
    done
}; } 2>&1 )

# Method 2: coprocess python3
coproc _PY {
    python3 -u -c '
import sys
for line in sys.stdin:
    a, b = map(int, line.split("+"))
    print(a + b, flush=True)
'
}
py_pid=$_PY_PID
# $(...) runs in a subshell, where coprocess FDs are not available,
# so time the loop against duplicates of them
exec {py_out}<&"${_PY[0]}" {py_in}>&"${_PY[1]}"

t2=$( { time {
    for (( i=0; i<N; i++ )); do
        printf '%d+%d\n' "$i" $(( i*2 )) >&"$py_in"
        IFS= read -r _ <&"$py_out"
    done
}; } 2>&1 )

# Close every copy of the write end so python3 sees EOF, then reap it
exec {py_in}>&- {py_out}<&- {_PY[1]}>&-
wait "$py_pid"

# Method 3: subprocess per call
t3=$( { time {
    for (( i=0; i<N; i++ )); do
        python3 -c "print($i + $i * 2)"
    done >/dev/null
}; } 2>&1 )

printf '\n%-30s %8s\n' "Method ($N iterations)" "Seconds"
printf '%s\n' "$(printf '%.0s-' {1..40})"
printf '%-30s %8s\n' "1. Bash arithmetic" "$t1"
printf '%-30s %8s\n' "2. Python3 coprocess" "$t2"
printf '%-30s %8s\n' "3. Python3 subprocess" "$t3"
```