Chapter 12 — Scripting for Cron and Systemd
A script that works perfectly when you run it interactively can silently fail when a scheduler runs it at 3 AM. The reasons are almost always environmental: missing PATH entries, no terminal, no home directory, or a second copy launched before the first one finished. This chapter covers everything you need to write scripts that behave correctly when run unattended.
1 — Cron Syntax and the Crontab
A crontab line has five time fields followed by the command:
    # ┌───────────── minute (0–59)
    # │ ┌─────────── hour (0–23)
    # │ │ ┌───────── day of month (1–31)
    # │ │ │ ┌─────── month (1–12 or JAN–DEC)
    # │ │ │ │ ┌───── day of week (0–7, 0 and 7 = Sunday, or SUN–SAT)
    # │ │ │ │ │
    # * * * * * command

    # Run at 02:30 every day
    30 2 * * * /opt/myapp/bin/backup.sh

    # Every 15 minutes
    */15 * * * * /opt/myapp/bin/poll.sh

    # 9 AM on weekdays only
    0 9 * * 1-5 /opt/myapp/bin/report.sh

    # First day of every month at midnight
    0 0 1 * * /opt/myapp/bin/monthly.sh

    # Multiple values with commas
    0 8,12,17 * * * /opt/myapp/bin/check.sh
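To make the five-field layout concrete, the following sketch splits a user-crontab entry into its named parts. The function name `explain_cron_line` is invented for this illustration. It does not validate ranges and does not handle `@daily`-style shortcuts or the extra USER field of system crontabs.

```shell
#!/usr/bin/env bash
# explain_cron_line: hypothetical helper, not part of cron itself.
# Splits a user-crontab entry into its five time fields and the command.
explain_cron_line() {
    local line="$1"
    local minute hour dom month dow command
    # read splits on whitespace; the last variable receives the rest of the line,
    # so a command with arguments stays in one piece.
    read -r minute hour dom month dow command <<<"$line"
    if [[ -z "$command" ]]; then
        echo "not a valid entry: need 5 time fields and a command" >&2
        return 1
    fi
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow"
    printf 'command=%s\n' "$command"
}
```

Calling `explain_cron_line '0 9 * * 1-5 /opt/myapp/bin/report.sh'` prints the schedule fields on one line and the command on the next, which is a quick way to check that an entry has the number of fields you think it has.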
Managing crontabs
    # Edit your crontab interactively
    crontab -e

    # List current crontab
    crontab -l

    # Install from a file (replaces entire crontab)
    crontab crontab.txt

    # Non-interactively add a line; safe even if crontab is empty
    ( crontab -l 2>/dev/null; echo "30 2 * * * /opt/myapp/bin/backup.sh" ) | crontab -

    # Remove a specific line (match and delete)
    crontab -l | grep -v 'backup.sh' | crontab -

    # Edit another user's crontab (as root)
    crontab -u deploy -e

    # System-wide crontab: has an extra USER field
    # /etc/crontab and files in /etc/cron.d/:
    # 30 2 * * * deploy /opt/myapp/bin/backup.sh
2 — The Cron Environment Problem
Cron runs jobs with a minimal environment: typically just HOME, LOGNAME, SHELL (usually /bin/sh), and a very short PATH (often just /usr/bin:/bin). Everything your interactive shell loads (~/.bashrc, nvm, pyenv, /usr/local/bin) is absent.
The three defences
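    # 1. Set PATH explicitly at the top of your crontab
    PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
    # Send output/errors here; an empty string (MAILTO="") discards them.
    # Keep comments on their own line: crontab does not allow them after a setting.
    MAILTO=ops@example.com
    SHELL=/bin/bash

    # A bare name works only if backup.sh lives in one of the PATH directories above
    30 2 * * * backup.sh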
    #!/usr/bin/env bash
    # 2. Use absolute paths for every binary inside the script
    PYTHON=/usr/bin/python3
    RSYNC=/usr/bin/rsync
    CURL=/usr/bin/curl
    #!/usr/bin/env bash
    # 3. Source your profile at the top of the script (when you NEED its env)
    # Deliberately load the user environment; use sparingly
    # shellcheck source=/dev/null
    [[ -f /etc/profile ]] && source /etc/profile
    [[ -f "${HOME}/.bashrc" ]] && source "${HOME}/.bashrc"
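Whichever defence you choose, it helps to fail fast when a binary is still missing. The sketch below checks a list of commands against the current PATH before any real work starts. `require_cmds` is a hypothetical helper name, not a standard utility.

```shell
#!/usr/bin/env bash
# require_cmds: hypothetical helper. Returns non-zero and reports every
# command that cannot be found on the current PATH.
require_cmds() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'missing command: %s (PATH=%s)\n' "$cmd" "$PATH" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script would typically call it near the top, for example `require_cmds rsync curl python3 || exit 127`, so that a PATH problem shows up as one clear log line instead of a failure halfway through the job.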
Diagnosing environment issues
    # Simulate the cron environment locally
    env -i HOME="$HOME" LOGNAME="$LOGNAME" PATH='/usr/bin:/bin' \
        SHELL=/bin/bash bash --norc --noprofile ./myscript.sh

    # Dump the actual cron environment to a file for inspection
    * * * * * env > /tmp/cron-env.txt 2>&1
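Once you have a dump of the cron environment, comparing it with a dump from your interactive shell shows what is missing. This is a minimal sketch: the helper name `missing_in_cron` and the file names are assumptions, and values that span several lines may produce false matches.

```shell
#!/usr/bin/env bash
# missing_in_cron: hypothetical helper. Given two files produced by `env`,
# prints the variable names that appear in the first but not in the second.
missing_in_cron() {
    local shell_env="$1" cron_env="$2"
    # Extract "NAME=" prefixes, strip the "=", and sort so comm can compare.
    comm -23 \
        <(grep -oE '^[A-Za-z_][A-Za-z0-9_]*=' "$shell_env" | tr -d '=' | sort -u) \
        <(grep -oE '^[A-Za-z_][A-Za-z0-9_]*=' "$cron_env"  | tr -d '=' | sort -u)
}
```

For example, run `env > /tmp/shell-env.txt` in your terminal, then `missing_in_cron /tmp/shell-env.txt /tmp/cron-env.txt`. Variables such as NVM_DIR or PYENV_ROOT showing up in the output are the usual suspects.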
3 — Locking: Preventing Overlapping Runs
Without a lock, a slow job can still be running when cron starts the next instance. Two copies writing to the same file or database is rarely safe.
flock — the right tool for the job
    # Inline: entire script protected in one line from the crontab
    * * * * * flock -n /var/lock/myjob.lock /opt/myapp/bin/myjob.sh

    # -n   = non-blocking: exit immediately if lock is held (skip this run)
    # -w N = wait up to N seconds before giving up
    flock -w 10 /var/lock/myjob.lock /opt/myapp/bin/myjob.sh
flock inside the script
    #!/usr/bin/env bash
    set -euo pipefail

    LOCKFILE=/var/lock/myjob.lock

    # Open the lock file on fd 9 and attempt an exclusive non-blocking lock
    exec 9>"$LOCKFILE"
    if ! flock -n 9; then
        echo "Another instance is already running. Exiting." >&2
        exit 1
    fi

    # Lock is held for the rest of the script; released automatically on exit
    echo "Running job..."
    # … work …
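flock uses kernel-level advisory locks. The kernel drops the lock as soon as the last file descriptor referring to it is closed, which happens automatically when the process exits (even after a crash or kill -9), so no cleanup is needed and a stale lock file on disk is harmless.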
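The release-on-exit behaviour can be observed directly. The sketch below assumes the util-linux `flock` command is installed. `try_lock` and `demo_lock_release` are hypothetical names, and the one-second pause is a crude way to let the holder start.

```shell
#!/usr/bin/env bash
# try_lock and demo_lock_release are hypothetical names used for illustration.

# try_lock FILE: returns 0 if the lock on FILE could be taken right now,
# 1 if another process holds it. The lock taken here is dropped as soon as
# the subshell exits and its fd 9 is closed.
try_lock() {
    local lockfile="$1"
    ( exec 9>"$lockfile" && flock -n 9 )
}

demo_lock_release() {
    local lockfile holder
    lockfile=$(mktemp)

    # Holder: take the lock, keep it for 2 seconds, then exit without cleanup.
    ( exec 9>"$lockfile" && flock -n 9 && sleep 2 ) &
    holder=$!
    sleep 1    # crude: give the holder time to acquire the lock

    if try_lock "$lockfile"; then
        echo "lock was free (unexpected)"
    else
        echo "held while holder runs"
    fi

    wait "$holder"
    if try_lock "$lockfile"; then
        echo "free after holder exited"
    else
        echo "still held (unexpected)"
    fi
    rm -f "$lockfile"
}
```

Running `demo_lock_release` shows that the second attempt succeeds even though the holder never removed the lock file or unlocked it explicitly.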
Logging with timestamps for unattended jobs
    #!/usr/bin/env bash
    LOGFILE=/var/log/myapp/myjob.log
    MAXLOG=10485760   # 10 MB; rotate when exceeded

    log() { printf '[%s] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$*"; }

    # Rotate log if too large (stat -c is the GNU coreutils form)
    if [[ -f $LOGFILE ]] && (( $(stat -c '%s' "$LOGFILE") > MAXLOG )); then
        mv "$LOGFILE" "$LOGFILE.$(date +%Y%m%d%H%M%S)"
    fi

    # Redirect all output (stdout + stderr) to the log file
    exec >>"$LOGFILE" 2>&1

    log "=== Job started (PID $$) ==="
    # … work …
    log "=== Job finished ==="
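The rotation step above renames the log with a timestamp suffix but never deletes old copies, so they accumulate. One way to cap them is sketched below, assuming the same `LOGFILE.YYYYmmddHHMMSS` naming. The function name `prune_rotated_logs` and the default of five retained files are assumptions for this example.

```shell
#!/usr/bin/env bash
# prune_rotated_logs: hypothetical helper. Keeps the newest KEEP rotated
# copies of LOGFILE (named LOGFILE.<timestamp>) and deletes the rest.
prune_rotated_logs() {
    local logfile="$1" keep="${2:-5}"
    local -a rotated=()
    local f i excess

    # The fixed-width timestamp suffix sorts lexicographically in
    # chronological order, and globs expand sorted, so oldest comes first.
    for f in "$logfile".[0-9]*; do
        if [[ -e "$f" ]]; then      # skips the literal pattern when nothing matches
            rotated+=("$f")
        fi
    done

    excess=$(( ${#rotated[@]} - keep ))
    for (( i = 0; i < excess; i++ )); do
        rm -f -- "${rotated[i]}"
    done
    return 0
}
```

Calling `prune_rotated_logs "$LOGFILE" 7` right after the rotation check would keep a week of history for a job that rotates at most once per day. The live log file itself is never touched because it has no numeric suffix.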
4 — Systemd Timer Units
Systemd timers are the modern alternative to cron. They offer better logging (via journald), dependency management, missed-run handling, and per-service resource controls.
Every timer needs two files: a .service unit (what to run) and a .timer unit (when to run it).
The service unit
    # /etc/systemd/system/myjob.service
    [Unit]
    Description=My scheduled job
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=oneshot
    User=deploy
    Group=deploy
    WorkingDirectory=/opt/myapp
    ExecStart=/opt/myapp/bin/myjob.sh
    StandardOutput=journal
    StandardError=journal
    SyslogIdentifier=myjob

    # Resource limits: prevent runaway jobs
    TimeoutStartSec=300
    MemoryMax=512M
    CPUQuota=50%

    [Install]
    WantedBy=multi-user.target
The timer unit
    # /etc/systemd/system/myjob.timer
    [Unit]
    Description=Run myjob daily at 02:30

    [Timer]
    # Calendar expression: like cron but human-readable
    OnCalendar=*-*-* 02:30:00
    # If the system was off at the scheduled time, run ASAP on next boot
    Persistent=true
    # Spread load: delay the start by a random amount of up to 5 minutes
    RandomizedDelaySec=5min

    [Install]
    WantedBy=timers.target
Timer management commands
    # Enable and start the timer
    sudo systemctl daemon-reload
    sudo systemctl enable --now myjob.timer

    # Check timer status and next run time
    systemctl status myjob.timer
    systemctl list-timers --all

    # Run the job immediately (without waiting for the timer)
    sudo systemctl start myjob.service

    # View recent logs
    journalctl -u myjob.service -n 50

    # Follow logs in real time
    journalctl -u myjob.service -f

    # Logs since last run
    journalctl -u myjob.service \
        --since "$(systemctl show myjob.service -p ExecMainStartTimestamp --value)"
OnCalendar expressions
| Expression | Meaning |
|---|---|
| daily | Every day at midnight (shorthand) |
| hourly | Every hour at :00 |
| weekly | Every Monday at midnight |
| monthly | First day of every month at midnight |
| *-*-* 02:30:00 | Every day at 02:30 |
| Mon..Fri *-*-* 09:00:00 | Weekdays at 09:00 |
| *-*-1 00:00:00 | First of every month at midnight |
| *:0/15 | Every 15 minutes |
    # Test a calendar expression without running anything
    systemd-analyze calendar 'Mon..Fri *-*-* 09:00:00'

    # Example output (the dates depend on when you run it):
      Original form: Mon..Fri *-*-* 09:00:00
    Normalized form: Mon..Fri *-*-* 09:00:00
        Next elapse: Mon 2026-06-08 09:00:00 UTC
           (in UTC): Mon 2026-06-08 09:00:00 UTC
5 — User Timers (No Root Required)
You don't need root to use systemd timers. User timers live under ~/.config/systemd/user/ and run as your own user.
    # ~/.config/systemd/user/sync.service
    [Unit]
    Description=Sync notes to remote

    [Service]
    Type=oneshot
    # %h expands to the user's home directory.
    # Unit files have no trailing comments, so this note sits on its own line.
    ExecStart=%h/bin/sync-notes.sh

    # ~/.config/systemd/user/sync.timer
    [Unit]
    Description=Sync notes every 30 minutes

    [Timer]
    OnCalendar=*:0/30
    Persistent=true

    [Install]
    WantedBy=timers.target
    # Manage user timers; no sudo needed
    systemctl --user daemon-reload
    systemctl --user enable --now sync.timer
    systemctl --user list-timers
    journalctl --user -u sync.service -f

    # Allow user timers to run even when logged out
    sudo loginctl enable-linger "$USER"
6 — A Production-Ready Cron Script Template
This template handles every common unattended-job concern in one place:
    #!/usr/bin/env bash
    # =============================================================================
    # myjob.sh: nightly data export
    # Safe for cron and systemd. Run as: deploy user.
    # =============================================================================
    set -euo pipefail

    # ── Identity ─────────────────────────────────────────────────────────────────
    PROG="$(basename "${BASH_SOURCE[0]}")"
    PIDSTR="[$$]"

    # ── Environment ──────────────────────────────────────────────────────────────
    PATH=/usr/local/bin:/usr/bin:/bin

    # ── Paths ────────────────────────────────────────────────────────────────────
    APP_DIR=/opt/myapp
    LOG_DIR=/var/log/myapp
    LOCK_FILE=/var/lock/myjob.lock
    LOG_FILE="$LOG_DIR/myjob.log"

    mkdir -p "$LOG_DIR"

    # ── Logging ──────────────────────────────────────────────────────────────────
    exec >>"$LOG_FILE" 2>&1
    log()  { printf '[%s] %s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$PIDSTR" "$*"; }
    warn() { printf '[%s] %s WARN %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$PIDSTR" "$*"; }

    # ── Locking ──────────────────────────────────────────────────────────────────
    exec 9>"$LOCK_FILE"
    if ! flock -n 9; then
        log "Already running; skipping this invocation"
        exit 0
    fi

    # ── Traps ────────────────────────────────────────────────────────────────────
    on_exit() {
        local rc=$?
        if (( rc != 0 )); then
            log "=== Job FAILED (exit $rc) ==="
        else
            log "=== Job finished OK ==="
        fi
    }
    trap on_exit EXIT

    # ── Main ─────────────────────────────────────────────────────────────────────
    log "=== Job started ==="

    # … your work here …
    log "Exporting data..."
    # export_data ...

    log "Done."
7 — Sending Alerts on Failure
Silent failures are the worst kind. Build notification into the job itself, not just the crontab's MAILTO setting (which requires a working MTA).
    #!/usr/bin/env bash
    SLACK_WEBHOOK="${SLACK_WEBHOOK_URL:?set SLACK_WEBHOOK_URL}"
    PROG="$(basename "$0")"
    HOST=$(hostname -s)

    notify_failure() {
        local exit_code="$1"
        local message
        printf -v message '*FAILED* `%s` on `%s` (exit %d) at %s' \
            "$PROG" "$HOST" "$exit_code" "$(date '+%Y-%m-%d %H:%M:%S')"
        curl -s -X POST -H 'Content-Type: application/json' \
            --data "{\"text\":\"${message}\"}" \
            "$SLACK_WEBHOOK" || true   # never let alerting kill the job
    }

    on_exit() {
        local rc=$?
        (( rc != 0 )) && notify_failure "$rc"
    }
    trap on_exit EXIT

    # … rest of script …
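The `--data` payload in the script above is assembled by plain string interpolation, which yields invalid JSON as soon as the message contains a double quote, a backslash, or a newline (for instance, if you later append an error line from the job). A minimal pure-bash sketch of an escaping step follows. `json_escape` is a hypothetical helper, and it covers only the most common characters, not every control character the JSON grammar forbids.

```shell
#!/usr/bin/env bash
# json_escape: hypothetical helper. Escapes the characters that most often
# break a hand-built JSON string. Other control characters are NOT handled.
json_escape() {
    local s="$1"
    s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}      # double quote
    s=${s//$'\n'/\\n}    # newline
    s=${s//$'\r'/\\r}    # carriage return
    s=${s//$'\t'/\\t}    # tab
    printf '%s' "$s"
}
```

With it, the payload line could become `--data "{\"text\":\"$(json_escape "$message")\"}"`. If `jq` is available on the host, building the payload with it is the more robust choice. This sketch is for hosts where adding a dependency is not an option.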
Using systemd OnFailure for alert routing
    # /etc/systemd/system/myjob.service
    [Unit]
    Description=My job
    # %n = full unit name (here: myjob.service).
    # Unit files have no trailing comments, so this note sits on its own line.
    OnFailure=notify-failure@%n.service

    # /etc/systemd/system/notify-failure@.service (template unit)
    [Unit]
    Description=Notify on failure for %i

    [Service]
    Type=oneshot
    ExecStart=/opt/myapp/bin/notify-failure.sh %i
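The template unit runs /opt/myapp/bin/notify-failure.sh with the failed unit's name as its only argument, but the script itself is not shown. The following is one possible shape for its core, offered purely as an assumption: the function name `failure_summary` and the choice to attach the last 20 journal lines are made up for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical core of notify-failure.sh; the real script is not shown in
# this chapter, so everything here is an assumption.

# failure_summary UNIT: prints a one-line summary followed, when possible,
# by the unit's most recent journal lines.
failure_summary() {
    local unit="${1:?usage: failure_summary UNIT}"
    printf 'Unit %s failed on %s at %s\n' \
        "$unit" "${HOSTNAME:-unknown-host}" "$(date '+%Y-%m-%d %H:%M:%S')"
    # journalctl may be missing or the caller may lack permission; the
    # summary line above is still useful in that case.
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -u "$unit" -n 20 --no-pager 2>/dev/null || true
    fi
}
```

The script body could then be as small as `failure_summary "$1" | mail -s "unit failure: $1" ops@example.com`, or it could feed the text to a webhook. Either delivery path is again an illustration, not something the units above prescribe.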
8 — Cron vs Systemd Timers: Quick Comparison
| Feature | Cron | Systemd timer |
|---|---|---|
| Availability | Universal | Linux with systemd only |
| Logging | Email / redirect manually | journald — automatic, queryable |
| Missed runs | Silently skipped | Persistent=true catches up |
| Dependencies | None | After=, Wants=, Requires= |
| Resource limits | None built-in | Memory, CPU, IO per unit |
| Overlapping runs | Requires manual locking | Default: one instance at a time |
| User jobs (no root) | crontab -e | ~/.config/systemd/user/ |
| Schedule syntax | 5-field numeric | Human-readable calendar strings |
| Randomised jitter | Manual with sleep $RANDOM | RandomizedDelaySec= |
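For the cron side of the "Randomised jitter" row, the sleep is best done inside the script rather than in the crontab line, because `%` is a special character in crontab commands and must be escaped there. A minimal sketch follows. `jitter_seconds` is a hypothetical helper name, and since bash's `RANDOM` only ranges from 0 to 32767, it is unsuitable for windows longer than about nine hours.

```shell
#!/usr/bin/env bash
# jitter_seconds: hypothetical helper. Prints a pseudo-random integer in
# the range 0 .. MAX-1, or 0 when MAX is not positive.
jitter_seconds() {
    local max="${1:?usage: jitter_seconds MAX}"
    if (( max <= 0 )); then
        echo 0
        return 0
    fi
    echo $(( RANDOM % max ))
}
```

A job that should start somewhere within five minutes of its scheduled time would begin with `sleep "$(jitter_seconds 300)"`. Take the lock after the sleep, not before, so that a sleeping instance does not block others.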
Exercises
Exercise 1 — Hardened cron script
Write db_backup.sh — a script meant to run daily from cron that:
- Sets an explicit PATH at the top
- Uses flock on fd 9 to prevent overlapping runs
- Redirects all output to /var/log/myapp/db_backup.log with timestamps
- Rotates the log file when it exceeds 5 MB
- Traps EXIT to log success or failure
- The actual "backup" can be a stub (sleep 2 && echo "Backup complete")
    #!/usr/bin/env bash
    set -euo pipefail

    PATH=/usr/local/bin:/usr/bin:/bin

    LOG_DIR=/var/log/myapp
    LOG_FILE="$LOG_DIR/db_backup.log"
    LOCK_FILE=/var/lock/db_backup.lock
    MAX_LOG=5242880   # 5 MB

    mkdir -p "$LOG_DIR"

    # Rotate log if needed
    if [[ -f $LOG_FILE ]] && \
       (( $(stat -c '%s' "$LOG_FILE") > MAX_LOG )); then
        mv "$LOG_FILE" "$LOG_FILE.$(date +%Y%m%d%H%M%S)"
    fi

    # Send stdout and stderr to the log from here on
    exec >>"$LOG_FILE" 2>&1

    log() { printf '[%s] [%d] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$$" "$*"; }

    # Lock
    exec 9>"$LOCK_FILE"
    if ! flock -n 9; then
        log "Already running; skipping"
        exit 0
    fi

    on_exit() {
        local rc=$?
        if (( rc == 0 )); then
            log "=== FINISHED OK ==="
        else
            log "=== FAILED (exit $rc) ==="
        fi
    }
    trap on_exit EXIT

    log "=== DB backup started ==="

    # Stub backup work
    sleep 2
    log "Backup complete"
Exercise 2 — Systemd timer unit
Write a pair of systemd unit files for a weekly report generator:
- weekly-report.service: runs /opt/reports/generate.sh as user reports, depends on network being available, journals all output, times out after 10 minutes
- weekly-report.timer: fires every Monday at 06:00, catches up on missed runs, spreads load with up to 10 minutes of random jitter
Also write the three systemctl commands needed to deploy and start the timer
without rebooting.
    # /etc/systemd/system/weekly-report.service
    [Unit]
    Description=Weekly report generator
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=oneshot
    User=reports
    Group=reports
    ExecStart=/opt/reports/generate.sh
    StandardOutput=journal
    StandardError=journal
    SyslogIdentifier=weekly-report
    TimeoutStartSec=600

    [Install]
    WantedBy=multi-user.target
    # /etc/systemd/system/weekly-report.timer
    [Unit]
    Description=Run weekly report every Monday at 06:00

    [Timer]
    OnCalendar=Mon *-*-* 06:00:00
    Persistent=true
    RandomizedDelaySec=10min

    [Install]
    WantedBy=timers.target
    # Deploy commands
    sudo systemctl daemon-reload
    sudo systemctl enable weekly-report.timer
    sudo systemctl start weekly-report.timer
Exercise 3 — crontab management script
Write a script cron-manage.sh with subcommands:
- add SCHEDULE COMMAND: add an entry if it doesn't already exist (match on COMMAND to avoid duplicates)
- remove PATTERN: remove all entries whose command matches PATTERN
- list: pretty-print the current crontab, skipping comment lines
- check: verify every command path in the crontab exists and is executable
    #!/usr/bin/env bash
    set -euo pipefail

    PROG="$(basename "$0")"

    cmd_add() {
        local schedule="${1:?add requires SCHEDULE}"
        local command="${2:?add requires COMMAND}"
        local entry="$schedule $command"
        local current
        current=$(crontab -l 2>/dev/null || true)

        if grep -qF -- "$command" <<<"$current"; then
            echo "Entry already exists for: $command"
            return 0
        fi
        # Drop blank lines so an empty crontab does not gain a leading empty line
        ( echo "$current"; echo "$entry" ) | grep -v '^$' | crontab -
        echo "Added: $entry"
    }

    cmd_remove() {
        local pattern="${1:?remove requires PATTERN}"
        # grep -v exits 1 when nothing is left; "|| true" keeps pipefail from aborting
        { crontab -l 2>/dev/null | grep -v -- "$pattern" || true; } | crontab -
        echo "Removed entries matching: $pattern"
    }

    cmd_list() {
        crontab -l 2>/dev/null | grep -v '^[[:space:]]*#' | \
            grep -v '^[[:space:]]*$' | \
            awk '{ printf "%-40s %s\n", $1" "$2" "$3" "$4" "$5, substr($0, index($0,$6)) }'
    }

    cmd_check() {
        local ok=0 bad=0
        local m h dom mon dow cmd rest
        while read -r m h dom mon dow cmd rest; do
            # cmd is the first token after the 5 time fields
            if [[ -x "$cmd" ]]; then
                printf '  OK:      %s\n' "$cmd"
                ok=$(( ok + 1 ))     # (( ok++ )) returns 1 when ok is 0 and trips set -e
            else
                printf '  MISSING: %s\n' "$cmd"
                bad=$(( bad + 1 ))
            fi
        # Skip comment lines and blank lines
        done < <(crontab -l 2>/dev/null | grep -Ev '^[[:space:]]*(#|$)')
        printf '%d OK, %d missing\n' "$ok" "$bad"
        (( bad == 0 ))
    }

    CMD="${1:-list}"; shift || true
    if declare -f "cmd_${CMD}" >/dev/null 2>&1; then
        "cmd_${CMD}" "$@"
    else
        printf '%s: unknown command: %s\n' "$PROG" "$CMD" >&2
        exit 1
    fi
Exercise 4 — Resilient job wrapper
Write a general-purpose wrapper script run-job.sh COMMAND [ARGS...] that
can be used from both cron and systemd to wrap any command with:
- Exclusive locking based on a hash of the command path (so two different jobs don't share a lock)
- Timestamped logging to /var/log/jobs/JOBNAME.log where JOBNAME is the basename of the wrapped command
- A timeout: TIMEOUT env var in seconds (default 3600) after which the job is killed
- Exit code pass-through: the wrapper exits with the same code as the wrapped command
- A summary line: FINISHED|FAILED|TIMEOUT - duration Xs
    #!/usr/bin/env bash
    set -euo pipefail

    CMD="${1:?usage: $0 COMMAND [ARGS...]}"
    JOBNAME="$(basename "$CMD")"
    TIMEOUT="${TIMEOUT:-3600}"
    LOG_DIR=/var/log/jobs
    LOG_FILE="$LOG_DIR/$JOBNAME.log"

    # Lock file based on the full command path (md5 to keep filename safe)
    LOCK_HASH=$(printf '%s' "$CMD" | md5sum | cut -c1-8)
    LOCK_FILE="/var/lock/job_${LOCK_HASH}.lock"

    mkdir -p "$LOG_DIR"
    exec >>"$LOG_FILE" 2>&1

    ts()  { date '+%Y-%m-%dT%H:%M:%S'; }
    log() { printf '[%s] %s\n' "$(ts)" "$*"; }

    exec 9>"$LOCK_FILE"
    if ! flock -n 9; then
        log "LOCKED - $JOBNAME already running, skipping"
        exit 0
    fi

    START=$(date '+%s')
    log "START $JOBNAME (timeout=${TIMEOUT}s, pid=$$)"

    exit_code=0
    timed_out=0

    # Run with timeout; timeout exits 124 on expiry.
    # "|| exit_code=$?" captures the status without tripping set -e.
    timeout --kill-after=5 "$TIMEOUT" "$@" || exit_code=$?
    if (( exit_code == 124 )); then
        timed_out=1
    fi

    ELAPSED=$(( $(date '+%s') - START ))

    if (( timed_out )); then
        log "TIMEOUT - $JOBNAME killed after ${ELAPSED}s"
    elif (( exit_code != 0 )); then
        log "FAILED - $JOBNAME exit $exit_code after ${ELAPSED}s"
    else
        log "FINISHED - $JOBNAME OK in ${ELAPSED}s"
    fi

    exit "$exit_code"