Chapter 12 — Scripting for Cron and Systemd

A script that works perfectly when you run it interactively can silently fail when a scheduler runs it at 3 AM. The reasons are almost always environmental: missing PATH entries, no terminal, no home directory, or a second copy launched before the first one finished. This chapter covers everything you need to write scripts that behave correctly when run unattended.

1 — Cron Syntax and the Crontab

A crontab line has five time fields followed by the command:

# ┌──────────── minute        (0–59)
# │  ┌─────────── hour          (0–23)
# │  │  ┌────────── day of month  (1–31)
# │  │  │  ┌───────── month         (1–12 or JAN–DEC)
# │  │  │  │  ┌────────── day of week   (0–7, 0 and 7 = Sunday, or SUN–SAT)
# │  │  │  │  │
# *  *  *  *  *  command

# Run at 02:30 every day
30  2  *  *  *  /opt/myapp/bin/backup.sh

# Every 15 minutes
*/15  *  *  *  *  /opt/myapp/bin/poll.sh

# 9 AM on weekdays only
0  9  *  *  1-5  /opt/myapp/bin/report.sh

# First day of every month at midnight
0  0  1  *  *  /opt/myapp/bin/monthly.sh

# Multiple values with commas
0  8,12,17  *  *  *  /opt/myapp/bin/check.sh
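
To make the step syntax concrete, the sketch below lists the values that a field such as */15 selects. expand_step is a hypothetical helper written for this illustration, not part of cron; it assumes the step counts from the field's minimum value, which is how cron treats */N.

```shell
#!/usr/bin/env bash
# expand_step STEP MIN MAX (hypothetical helper, for illustration only)
# Prints the values that "*/STEP" selects in a field whose range is MIN..MAX.
expand_step() {
  local step=$1 min=$2 max=$3 v
  local out=()
  for (( v = min; v <= max; v += step )); do
    out+=("$v")
  done
  echo "${out[*]}"
}

expand_step 15 0 59   # minute field with */15
expand_step 6 0 23    # hour field with */6
expand_step 7 0 59    # minute field with */7
```

The last call shows a common surprise: a step that does not divide the range evenly leaves a shorter gap at the wrap-around (56 is followed by 0 of the next hour).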

Managing crontabs

# Edit your crontab interactively
crontab -e

# List current crontab
crontab -l

# Install from a file (replaces entire crontab)
crontab crontab.txt

# Non-interactively add a line — safe even if crontab is empty
( crontab -l 2>/dev/null; echo "30 2 * * * /opt/myapp/bin/backup.sh" ) | crontab -

# Remove a specific line (match and delete)
crontab -l | grep -v 'backup.sh' | crontab -

# Edit another user's crontab (as root)
crontab -u deploy -e

# System-wide crontab — has an extra USER field
# /etc/crontab and files in /etc/cron.d/:
# 30 2 * * *  deploy  /opt/myapp/bin/backup.sh

Tip: use crontab.guru to check your schedule expressions interactively before deploying them.
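
The non-interactive "add a line" command shown earlier appends a duplicate every time it is run. One possible way to make it idempotent is to filter the crontab text through a small function first. ensure_line is a hypothetical helper, and the sketch works on sample text so that nothing is installed when you try it.

```shell
#!/usr/bin/env bash
# ensure_line ENTRY (hypothetical helper)
# Reads crontab text on stdin and writes it to stdout, appending ENTRY
# only when no existing line matches it exactly.
ensure_line() {
  local entry=$1 current
  current=$(cat)
  if [[ -n $current ]]; then
    printf '%s\n' "$current"
  fi
  if ! grep -qxF -- "$entry" <<<"$current"; then
    printf '%s\n' "$entry"
  fi
}

# Demo on sample text; the real crontab is not touched.
sample='0 9 * * 1-5 /opt/myapp/bin/report.sh'
printf '%s\n' "$sample" | ensure_line '30 2 * * * /opt/myapp/bin/backup.sh'
```

Against a real crontab the same function would sit in the middle of the pipeline: crontab -l 2>/dev/null | ensure_line "30 2 * * * /opt/myapp/bin/backup.sh" | crontab -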

2 — The Cron Environment Problem

Cron runs jobs with a minimal environment: typically just HOME, LOGNAME, SHELL (usually /bin/sh), and a very short PATH (often just /usr/bin:/bin). Everything your interactive shell loads (~/.bashrc, nvm, pyenv, /usr/local/bin) is absent.

The three defences

# 1. Set PATH explicitly at the top of your crontab
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
# Send output/errors here; MAILTO="" discards them.
# Keep comments on their own lines: cron treats trailing text as part of the value.
MAILTO=ops@example.com
SHELL=/bin/bash

30 2 * * * /opt/myapp/bin/backup.sh

# 2. Use absolute paths for every binary inside the script
#!/usr/bin/env bash
PYTHON=/usr/bin/python3
RSYNC=/usr/bin/rsync
CURL=/usr/bin/curl

# 3. Source your profile at the top of the script (when you NEED its env)
#!/usr/bin/env bash
# Deliberately load the user environment; use sparingly
# shellcheck source=/dev/null
[[ -f /etc/profile ]] && source /etc/profile
[[ -f "${HOME}/.bashrc" ]] && source "${HOME}/.bashrc"
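
Whichever defence you pick, it can help to fail fast when a dependency is still missing rather than halfway through the job. The sketch below uses a hypothetical helper, require_cmds, built on the shell builtin command -v; the command names passed to it are only examples.

```shell
#!/usr/bin/env bash
# require_cmds NAME... (hypothetical helper)
# Returns non-zero and reports on stderr if any named command
# cannot be found in the current PATH.
require_cmds() {
  local missing=0 name
  for name in "$@"; do
    if ! command -v "$name" >/dev/null 2>&1; then
      printf 'required command not found in PATH: %s\n' "$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

if require_cmds sh ls; then
  echo "all required commands found"
fi
```

In a job script the call would typically be followed by an early exit, for example: require_cmds rsync curl || exit 1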

Diagnosing environment issues

# Simulate the cron environment locally
env -i HOME="$HOME" LOGNAME="$LOGNAME" PATH='/usr/bin:/bin' \
  SHELL=/bin/bash bash --norc --noprofile ./myscript.sh

# Dump the actual cron environment to a file for inspection
* * * * * env > /tmp/cron-env.txt 2>&1
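
Once you have a dump from cron, comparing it with a dump from your interactive shell (env > file) shows which variables the job is missing. The following is a rough sketch: missing_in_cron is a hypothetical helper, it compares variable names only (not values), and it assumes single-line values. The demo uses two small sample dumps instead of real ones.

```shell
#!/usr/bin/env bash
# missing_in_cron INTERACTIVE_DUMP CRON_DUMP (hypothetical helper)
# Prints the names of variables present in the first dump but absent
# from the second. Both files are expected to contain NAME=value lines.
missing_in_cron() {
  local interactive=$1 cron=$2
  comm -23 \
    <(cut -d= -f1 "$interactive" | sort -u) \
    <(cut -d= -f1 "$cron" | sort -u)
}

# Demo with sample dumps
tmp=$(mktemp -d)
printf 'HOME=/home/deploy\nPATH=/usr/local/bin:/usr/bin:/bin\nNVM_DIR=/home/deploy/.nvm\n' \
  >"$tmp/interactive.txt"
printf 'HOME=/home/deploy\nPATH=/usr/bin:/bin\n' >"$tmp/cron.txt"
missing_in_cron "$tmp/interactive.txt" "$tmp/cron.txt"
```

PATH does not appear in the result even though its value differs, so compare the values of shared variables separately (diff on the sorted dumps is enough).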

3 — Locking: Preventing Overlapping Runs

Without a lock, a slow job can still be running when cron starts the next instance. Two copies writing to the same file or database is rarely safe.

flock — the right tool for the job

# Inline — entire script protected in one line from the crontab
* * * * *  flock -n /var/lock/myjob.lock /opt/myapp/bin/myjob.sh

# -n = non-blocking: exit immediately if lock is held (skip this run)
# -w N = wait up to N seconds before giving up
flock -w 10 /var/lock/myjob.lock /opt/myapp/bin/myjob.sh

flock inside the script

#!/usr/bin/env bash
set -euo pipefail

LOCKFILE=/var/lock/myjob.lock

# Open the lock file on fd 9 and attempt an exclusive non-blocking lock
exec 9>"$LOCKFILE"
if ! flock -n 9; then
  echo "Another instance is already running. Exiting." >&2
  exit 1
fi
# Lock is held for the rest of the script; released automatically on exit

echo "Running job..."
# … work …

Why not PID files? Classic PID-file locking is race-prone and leaves stale files after crashes. flock uses kernel-level advisory locks that are automatically released when the process exits, so no cleanup is needed.
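
The behaviour described above can be observed with a throwaway lock file. This is a small demonstration, assuming the util-linux flock command is installed; the sleep durations are arbitrary and only there to make the overlap visible.

```shell
#!/usr/bin/env bash
lockfile=$(mktemp)

# First "job": holds the lock for one second in the background
flock "$lockfile" sleep 1 &
holder=$!
sleep 0.3   # give the background job time to take the lock

# Second "job" starts while the first is still running: -n refuses to wait
if flock -n "$lockfile" true; then second=acquired; else second=skipped; fi

# Once the holder has exited, the kernel has already released the lock
wait "$holder"
if flock -n "$lockfile" true; then third=acquired; else third=skipped; fi

echo "during first run: $second, after it exited: $third"
rm -f "$lockfile"
```

The second attempt should be skipped and the third should succeed, even though nothing ever deleted or reset the lock file between the two.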

Logging with timestamps for unattended jobs

#!/usr/bin/env bash
LOGFILE=/var/log/myapp/myjob.log
MAXLOG=10485760   # 10 MB — rotate when exceeded

log() { printf '[%s] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$*"; }

# Rotate log if too large
if [[ -f $LOGFILE ]] && (( $(stat -c '%s' "$LOGFILE") > MAXLOG )); then
  mv "$LOGFILE" "$LOGFILE.$(date +%Y%m%d%H%M%S)"
fi

# Redirect all output (stdout + stderr) to the log file
exec >>"$LOGFILE" 2>&1

log "=== Job started (PID $$) ==="
# … work …
log "=== Job finished ==="
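
The rotation check above relies on stat -c, which is specific to GNU coreutils. If the script may also run on other systems, one option is to read the size with wc -c instead. rotate_if_large is a hypothetical helper name, and the demo below works in a temporary directory.

```shell
#!/usr/bin/env bash
# rotate_if_large FILE MAX_BYTES (hypothetical helper)
# Renames FILE with a timestamp suffix when it is larger than MAX_BYTES.
rotate_if_large() {
  local file=$1 max=$2 size
  [[ -f $file ]] || return 0
  size=$(wc -c <"$file")
  if (( size > max )); then
    mv -- "$file" "$file.$(date +%Y%m%d%H%M%S)"
  fi
}

# Demo: a 2 KiB log with a 1 KiB limit gets rotated
tmp=$(mktemp -d)
head -c 2048 /dev/zero >"$tmp/job.log"
rotate_if_large "$tmp/job.log" 1024
ls "$tmp"
```

Call the function before the exec redirection, as in the script above, so that the job starts writing to a fresh file.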

4 — Systemd Timer Units

Systemd timers are the modern alternative to cron. They offer better logging (via journald), dependency management, missed-run handling, and per-service resource controls.

Every timer needs two files: a .service unit (what to run) and a .timer unit (when to run it).

The service unit

# /etc/systemd/system/myjob.service
[Unit]
Description=My scheduled job
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=deploy
Group=deploy
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/bin/myjob.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myjob

# Resource limits — prevent runaway jobs
TimeoutStartSec=300
MemoryMax=512M
CPUQuota=50%

[Install]
WantedBy=multi-user.target

The timer unit

# /etc/systemd/system/myjob.timer
[Unit]
Description=Run myjob daily at 02:30

[Timer]
# Calendar expression — like cron but human-readable
OnCalendar=*-*-* 02:30:00

# If the system was off at the scheduled time, run ASAP on next boot
Persistent=true

# Spread load — randomise start within 5 minutes of the scheduled time
RandomizedDelaySec=5min

[Install]
WantedBy=timers.target

Timer management commands

# Enable and start the timer
sudo systemctl daemon-reload
sudo systemctl enable --now myjob.timer

# Check timer status and next run time
systemctl status myjob.timer
systemctl list-timers --all

# Run the job immediately (without waiting for the timer)
sudo systemctl start myjob.service

# View recent logs
journalctl -u myjob.service -n 50

# Follow logs in real time
journalctl -u myjob.service -f

# Logs since last run
journalctl -u myjob.service --since "$(systemctl show myjob.service -p ExecMainStartTimestamp --value)"

OnCalendar expressions

Expression                 Meaning
daily                      Every day at midnight (shorthand)
hourly                     Every hour at :00
weekly                     Every Monday at midnight
monthly                    First day of every month at midnight
*-*-* 02:30:00             Every day at 02:30
Mon..Fri *-*-* 09:00:00    Weekdays at 09:00
*-*-1 00:00:00             First of every month at midnight
*:0/15                     Every 15 minutes

# Test a calendar expression without running anything
systemd-analyze calendar 'Mon..Fri *-*-* 09:00:00'
  Original form: Mon..Fri *-*-* 09:00:00
Normalized form: Mon..Fri *-*-* 09:00:00
    Next elapse: Mon 2026-06-08 09:00:00 UTC
       (in UTC): Mon 2026-06-08 09:00:00 UTC
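
When migrating existing crontab entries, the mapping from the five cron fields to an OnCalendar expression is mechanical for the simplest case. The sketch below handles only fixed daily times (numeric minute and hour, with * in the other three fields) and rejects everything else. cron_daily_to_oncalendar is a hypothetical helper, not a systemd tool.

```shell
#!/usr/bin/env bash
# cron_daily_to_oncalendar "MIN HOUR * * *" (hypothetical helper)
# Prints the equivalent OnCalendar expression, or returns 1 if the
# schedule is anything other than a fixed daily time.
cron_daily_to_oncalendar() {
  local min hour dom mon dow
  read -r min hour dom mon dow <<<"$1"
  [[ $min =~ ^[0-9]+$ && $hour =~ ^[0-9]+$ ]] || return 1
  [[ $dom == '*' && $mon == '*' && $dow == '*' ]] || return 1
  # 10# forces base 10 so that values like 08 are not read as octal
  (( 10#$hour <= 23 && 10#$min <= 59 )) || return 1
  printf '*-*-* %02d:%02d:00\n' "$((10#$hour))" "$((10#$min))"
}

cron_daily_to_oncalendar '30 2 * * *'
```

Whatever produces the expression, it is still worth passing the result through systemd-analyze calendar before putting it in a timer unit.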

5 — User Timers (No Root Required)

You don't need root to use systemd timers. User timers live under ~/.config/systemd/user/ and run as your own user.

# ~/.config/systemd/user/sync.service
[Unit]
Description=Sync notes to remote

[Service]
Type=oneshot
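# %h expands to the user's home directory. Comments must be on their own
# line: systemd would pass trailing text to the command as arguments.
ExecStart=%h/bin/sync-notes.sh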

# ~/.config/systemd/user/sync.timer
[Unit]
Description=Sync notes every 30 minutes

[Timer]
OnCalendar=*:0/30
Persistent=true

[Install]
WantedBy=timers.target

# Manage user timers (no sudo needed)
systemctl --user daemon-reload
systemctl --user enable --now sync.timer
systemctl --user list-timers
journalctl --user -u sync.service -f

# Allow user timers to run even when logged out
sudo loginctl enable-linger "$USER"
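
Because a timer always comes as a service/timer pair, a small generator can keep the two files in step. The following is a sketch only: write_user_timer is a hypothetical helper, the unit contents mirror the sync example above, and the demo writes into a temporary directory rather than ~/.config/systemd/user/.

```shell
#!/usr/bin/env bash
# write_user_timer DIR NAME COMMAND SCHEDULE (hypothetical helper)
# Writes NAME.service and NAME.timer into DIR.
write_user_timer() {
  local dir=$1 name=$2 command=$3 schedule=$4
  mkdir -p "$dir"
  cat >"$dir/$name.service" <<EOF
[Unit]
Description=$name job

[Service]
Type=oneshot
ExecStart=$command
EOF
  cat >"$dir/$name.timer" <<EOF
[Unit]
Description=Timer for $name

[Timer]
OnCalendar=$schedule
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Demo: generate the pair in a scratch directory and show the timer
tmp=$(mktemp -d)
write_user_timer "$tmp" sync '%h/bin/sync-notes.sh' '*:0/30'
cat "$tmp/sync.timer"
```

After copying the files into ~/.config/systemd/user/, the daemon-reload and enable --now commands shown above activate the timer.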

6 — A Production-Ready Cron Script Template

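This template combines the techniques from the previous sections in one place: an explicit PATH, timestamped logging, flock-based locking, and an EXIT trap that records success or failure.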

#!/usr/bin/env bash
# =============================================================================
# myjob.sh — nightly data export
# Safe for cron and systemd. Run as: deploy user.
# =============================================================================
set -euo pipefail

# ── Identity ──────────────────────────────────────────────────────────────────
PROG="$(basename "${BASH_SOURCE[0]}")"
PIDSTR="[$$]"

# ── Environment ───────────────────────────────────────────────────────────────
PATH=/usr/local/bin:/usr/bin:/bin

# ── Paths ─────────────────────────────────────────────────────────────────────
APP_DIR=/opt/myapp
LOG_DIR=/var/log/myapp
LOCK_FILE=/var/lock/myjob.lock
LOG_FILE="$LOG_DIR/myjob.log"
mkdir -p "$LOG_DIR"

# ── Logging ───────────────────────────────────────────────────────────────────
exec >>"$LOG_FILE" 2>&1
log()  { printf '[%s] %s %s\n'    "$(date '+%Y-%m-%dT%H:%M:%S')" "$PIDSTR" "$*"; }
warn() { printf '[%s] %s WARN %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$PIDSTR" "$*"; }

# ── Locking ───────────────────────────────────────────────────────────────────
exec 9>"$LOCK_FILE"
if ! flock -n 9; then
  log "Already running — skipping this invocation"
  exit 0
fi

# ── Traps ─────────────────────────────────────────────────────────────────────
on_exit() {
  local rc=$?
  if (( rc != 0 )); then
    log "=== Job FAILED (exit $rc) ==="
  else
    log "=== Job finished OK ==="
  fi
}
trap on_exit EXIT

# ── Main ──────────────────────────────────────────────────────────────────────
log "=== Job started ==="

# … your work here …
log "Exporting data..."
# export_data ...

log "Done."

7 — Sending Alerts on Failure

Silent failures are the worst kind. Build notification into the job itself, not just the crontab's MAILTO setting (which requires a working MTA).

#!/usr/bin/env bash
SLACK_WEBHOOK="${SLACK_WEBHOOK_URL:?set SLACK_WEBHOOK_URL}"
PROG="$(basename "$0")"
HOST=$(hostname -s)

notify_failure() {
  local exit_code="$1"
  local message
  printf -v message '*FAILED* `%s` on `%s` (exit %d) at %s' \
    "$PROG" "$HOST" "$exit_code" "$(date '+%Y-%m-%d %H:%M:%S')"
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "{\"text\":\"${message}\"}" \
    "$SLACK_WEBHOOK" || true   # never let alerting kill the job
}

on_exit() {
  local rc=$?
  (( rc != 0 )) && notify_failure "$rc"
}
trap on_exit EXIT

# … rest of script …
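
The script above interpolates the message straight into the JSON body, which is fine for the fixed text it builds but breaks as soon as a message contains a double quote, backslash, or newline (for example, captured error output). A minimal sketch of escaping in pure Bash follows. json_escape and slack_payload are hypothetical helpers, and only the four most common characters are handled, not every control character JSON forbids.

```shell
#!/usr/bin/env bash
# json_escape STRING (hypothetical helper)
# Escapes backslash, double quote, newline and tab for use inside a
# JSON string. Backslashes must be handled first.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  printf '%s' "$s"
}

# slack_payload MESSAGE (hypothetical helper)
# Builds the {"text": ...} body used by the webhook call above.
slack_payload() {
  printf '{"text":"%s"}' "$(json_escape "$1")"
}

slack_payload 'backup "nightly" failed'
echo
```

The result can be passed to curl with --data "$(slack_payload "$message")". If jq is available on the host, letting it build the JSON is the more robust choice.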

Using systemd OnFailure for alert routing

# /etc/systemd/system/myjob.service
[Unit]
Description=My job
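# %n expands to the full name of this unit (myjob.service). Keep the comment
# on its own line: systemd would read trailing words as more unit names.
OnFailure=notify-failure@%n.service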

# /etc/systemd/system/notify-failure@.service  (template unit)
[Unit]
Description=Notify on failure for %i

[Service]
Type=oneshot
ExecStart=/opt/myapp/bin/notify-failure.sh %i

8 — Cron vs Systemd Timers: Quick Comparison

Feature               Cron                         Systemd timer
Availability          Universal                    Linux with systemd only
Logging               Email / redirect manually    journald (automatic, queryable)
Missed runs           Silently skipped             Persistent=true catches up
Dependencies          None                         After=, Wants=, Requires=
Resource limits       None built-in                Memory, CPU, IO per unit
Overlapping runs      Requires manual locking      Default: one instance at a time
User jobs (no root)   crontab -e                   ~/.config/systemd/user/
Schedule syntax       5-field numeric              Human-readable calendar strings
Randomised jitter     Manual with sleep $RANDOM    RandomizedDelaySec=
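
On the cron side, the manual jitter mentioned in the table needs an upper bound: $RANDOM ranges from 0 to 32767, so a bare sleep $RANDOM could delay a job by about nine hours. A bounded version might look like the sketch below, where jitter_seconds is a hypothetical helper and 300 is an arbitrary example limit.

```shell
#!/usr/bin/env bash
# jitter_seconds MAX (hypothetical helper)
# Prints a pseudo-random delay in the range 0..MAX-1.
jitter_seconds() {
  local max=$1
  echo $(( RANDOM % max ))
}

delay=$(jitter_seconds 300)
echo "would sleep ${delay}s before starting the job"
# In a real cron script: sleep "$delay"
```

The modulo introduces a slight bias toward smaller values, which does not matter for spreading load.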

Exercises

Exercise 1 — Hardened cron script

Write db_backup.sh — a script meant to run daily from cron that:

  • Sets an explicit PATH at the top
  • Uses flock on fd 9 to prevent overlapping runs
  • Redirects all output to /var/log/myapp/db_backup.log with timestamps
  • Rotates the log file when it exceeds 5 MB
  • Traps EXIT to log success or failure
  • The actual "backup" can be a stub (sleep 2 && echo "Backup complete")

#!/usr/bin/env bash
set -euo pipefail

PATH=/usr/local/bin:/usr/bin:/bin

LOG_DIR=/var/log/myapp
LOG_FILE="$LOG_DIR/db_backup.log"
LOCK_FILE=/var/lock/db_backup.lock
MAX_LOG=5242880   # 5 MB

mkdir -p "$LOG_DIR"

# Rotate log if needed
if [[ -f $LOG_FILE ]] && \
   (( $(stat -c '%s' "$LOG_FILE") > MAX_LOG )); then
  mv "$LOG_FILE" "$LOG_FILE.$(date +%Y%m%d%H%M%S)"
fi

exec >>"$LOG_FILE" 2>&1

log() { printf '[%s] [%d] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$$" "$*"; }

# Lock
exec 9>"$LOCK_FILE"
if ! flock -n 9; then
  log "Already running — skipping"
  exit 0
fi

on_exit() {
  local rc=$?
  if (( rc == 0 )); then
    log "=== FINISHED OK ==="
  else
    log "=== FAILED (exit $rc) ==="
  fi
}
trap on_exit EXIT

log "=== DB backup started ==="
# Stub backup work
sleep 2
log "Backup complete"

Exercise 2 — Systemd timer unit

Write a pair of systemd unit files for a weekly report generator:

  • weekly-report.service — runs /opt/reports/generate.sh as user reports, depends on network being available, journals all output, times out after 10 minutes
  • weekly-report.timer — fires every Monday at 06:00, catches up on missed runs, spreads load with up to 10 minutes of random jitter

Also write the three systemctl commands needed to deploy and start the timer without rebooting.

# /etc/systemd/system/weekly-report.service
[Unit]
Description=Weekly report generator
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=reports
Group=reports
ExecStart=/opt/reports/generate.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=weekly-report
TimeoutStartSec=600

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/weekly-report.timer
[Unit]
Description=Run weekly report every Monday at 06:00

[Timer]
OnCalendar=Mon *-*-* 06:00:00
Persistent=true
RandomizedDelaySec=10min

[Install]
WantedBy=timers.target

# Deploy commands
sudo systemctl daemon-reload
sudo systemctl enable weekly-report.timer
sudo systemctl start  weekly-report.timer

Exercise 3 — crontab management script

Write a script cron-manage.sh with subcommands:

  • add SCHEDULE COMMAND — add an entry if it doesn't already exist (match on COMMAND to avoid duplicates)
  • remove PATTERN — remove all entries whose command matches PATTERN
  • list — pretty-print the current crontab, skipping comment lines
  • check — verify every command path in the crontab exists and is executable
#!/usr/bin/env bash
set -euo pipefail
PROG="$(basename "$0")"

cmd_add() {
  local schedule="${1:?add requires SCHEDULE}"
  local command="${2:?add requires COMMAND}"
  local entry="$schedule $command"
  local current
  current=$(crontab -l 2>/dev/null || true)

  if grep -qF "$command" <<<"$current" 2>/dev/null; then
    echo "Entry already exists for: $command"
    return 0
  fi

  ( echo "$current"; echo "$entry" ) | grep -v '^$' | crontab -
  echo "Added: $entry"
}

cmd_remove() {
  local pattern="${1:?remove requires PATTERN}"
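  # The || true guards stop pipefail from aborting when there is no crontab
  # or when grep has nothing left to print
  { crontab -l 2>/dev/null || true; } | { grep -v -- "$pattern" || true; } | crontab -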
  echo "Removed entries matching: $pattern"
}

cmd_list() {
  # The || true guards keep pipefail from aborting on an empty crontab
  { crontab -l 2>/dev/null || true; } | \
  { grep -Ev '^[[:space:]]*(#|$)' || true; } | \
  awk '{ printf "%-40s %s\n", $1" "$2" "$3" "$4" "$5, substr($0, index($0,$6)) }'
}

cmd_check() {
  local ok=0 bad=0 cmd
  # Skip the five time fields; cmd is the first token after them
  while read -r _ _ _ _ _ cmd _; do
    if [[ -x "$cmd" ]]; then
      # Plain assignment: (( ok++ )) returns 1 when ok is 0 and would trip set -e
      printf '  OK:      %s\n' "$cmd"; ok=$(( ok + 1 ))
    else
      printf '  MISSING: %s\n' "$cmd"; bad=$(( bad + 1 ))
    fi
  done < <(crontab -l 2>/dev/null | grep -Ev '^[[:space:]]*(#|$)')
  printf '%d OK, %d missing\n' "$ok" "$bad"
  (( bad == 0 ))
}

CMD="${1:-list}"; shift || true
if declare -f "cmd_${CMD}" >/dev/null 2>&1; then
  "cmd_${CMD}" "$@"
else
  printf '%s: unknown command: %s\n' "$PROG" "$CMD" >&2
  exit 1
fi

Exercise 4 — Resilient job wrapper

Write a general-purpose wrapper script run-job.sh COMMAND [ARGS...] that can be used from both cron and systemd to wrap any command with:

  • Exclusive locking based on a hash of the command path (so two different jobs don't share a lock)
  • Timestamped logging to /var/log/jobs/JOBNAME.log where JOBNAME is the basename of the wrapped command
  • A timeout: TIMEOUT env var in seconds (default 3600) after which the job is killed
  • Exit code pass-through — the wrapper exits with the same code as the wrapped command
  • A summary line: FINISHED|FAILED|TIMEOUT — duration Xs

#!/usr/bin/env bash
set -euo pipefail

CMD="${1:?usage: $0 COMMAND [ARGS...]}"
JOBNAME="$(basename "$CMD")"
TIMEOUT="${TIMEOUT:-3600}"
LOG_DIR=/var/log/jobs
LOG_FILE="$LOG_DIR/$JOBNAME.log"

# Lock file based on the full command path (md5 to keep filename safe)
LOCK_HASH=$(printf '%s' "$CMD" | md5sum | cut -c1-8)
LOCK_FILE="/var/lock/job_${LOCK_HASH}.lock"

mkdir -p "$LOG_DIR"
exec >>"$LOG_FILE" 2>&1

ts()  { date '+%Y-%m-%dT%H:%M:%S'; }
log() { printf '[%s] %s\n' "$(ts)" "$*"; }

exec 9>"$LOCK_FILE"
if ! flock -n 9; then
  log "LOCKED — $JOBNAME already running, skipping"
  exit 0
fi

START=$(date '+%s')
log "START $JOBNAME (timeout=${TIMEOUT}s, pid=$$)"

exit_code=0
timed_out=0

# Run with timeout — timeout exits 124 on expiry
timeout --kill-after=5 "$TIMEOUT" "$@" || {
  exit_code=$?
  (( exit_code == 124 )) && timed_out=1
}

ELAPSED=$(( $(date '+%s') - START ))

if (( timed_out )); then
  log "TIMEOUT — $JOBNAME killed after ${ELAPSED}s"
elif (( exit_code != 0 )); then
  log "FAILED — $JOBNAME exit $exit_code after ${ELAPSED}s"
else
  log "FINISHED — $JOBNAME OK in ${ELAPSED}s"
fi

exit "$exit_code"