Chapter 9 — Automating rsync

Running rsync by hand works for one-off transfers. For backups and deploys, you want them to happen automatically — on a schedule, without any human involvement, and with some kind of rotation so old snapshots don't fill your disk. This chapter ties everything together: cron scheduling, passphrase-free key auth for unattended runs, backup rotation strategies, and a complete home-server backup routine.

Passphrase-Free Keys for Unattended rsync

A cron job can't type a passphrase. For automated rsync over SSH, you need a dedicated key pair with no passphrase — used only for backup automation, with its permissions locked down on the server.

philip@debian — creating a passphrase-free backup key
# Generate a dedicated key: press Enter twice (no passphrase)
philip@debian:~$ ssh-keygen -t ed25519 -C "backup-cron" -f ~/.ssh/id_backup
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase):   ← press Enter
Enter same passphrase again:                  ← press Enter again
Your identification has been saved in /home/philip/.ssh/id_backup
Your public key has been saved in /home/philip/.ssh/id_backup.pub

# Copy it to the remote server
philip@debian:~$ ssh-copy-id -i ~/.ssh/id_backup.pub server
Number of key(s) added: 1

# Test: should connect without any prompt
philip@debian:~$ ssh -i ~/.ssh/id_backup server "echo ok"
ok
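For scripted provisioning, the same key can be generated without prompts and its file permissions checked afterwards. The following is a minimal sketch, assuming GNU stat (as shipped on Debian); the function name key_perms_ok is made up for illustration.

```shell
# Non-interactive equivalent of the session above: -N "" sets an empty passphrase.
#   ssh-keygen -t ed25519 -N "" -C "backup-cron" -f ~/.ssh/id_backup

# Succeed if the file exists and grants nothing to group or others (mode 600, 400, ...).
key_perms_ok() {
    [ -f "$1" ] || return 1
    mode=$(stat -c '%a' "$1") || return 1   # GNU stat: octal permission bits
    case $mode in
        ?00) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example:
#   key_perms_ok ~/.ssh/id_backup || chmod 600 ~/.ssh/id_backup
```

ssh refuses to use a private key that other users can read, so a check like this is a cheap way to catch a bad copy or restore before the cron job hits it.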
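A passphrase-free key is powerful, so lock it down. On the server, in ~/.ssh/authorized_keys, prefix the backup key entry with a command= option plus no-pty and the forwarding restrictions. sshd then runs the forced command no matter what the client asks for, so even a stolen key can only run that one rsync invocation:
# ~/.ssh/authorized_keys on the SERVER
# The backup key is restricted: it can only run this rsync command and cannot open a shell.
# The forced command replaces whatever the client requests, so its path and option
# string must match the backup job (here: rsync -az from /home/philip/).
# The option string varies between rsync versions. To see what your client sends,
# run the transfer once with -e 'ssh -v' and look for the "Sending command" line.
command="rsync --server --sender -logDtprze.iLsfxCIvu . /home/philip/",no-pty,no-agent-forwarding,no-port-forwarding ssh-ed25519 AAAA... backup-cron

# Your normal key sits below it without restrictions
ssh-ed25519 AAAA... philip@laptop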
Separate keys for separate jobs. Your daily-use key has a strong passphrase and full shell access. The backup key has no passphrase and restricted access. If the backup key leaks, the attacker can only pull files — they can't get a shell, forward ports, or use your agent.
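Instead of hard-coding one exact rsync command, the forced command can point at a small wrapper that inspects SSH_ORIGINAL_COMMAND (set by sshd to whatever the client requested) and runs it only if it has the expected shape. This is a rough sketch, not a vetted security boundary: the wrapper path and function name are hypothetical, and it assumes the client sends exactly six words, as a plain rsync -az pull does. The rrsync helper distributed with rsync does this job more thoroughly.

```shell
# Hypothetical wrapper, e.g. /home/philip/bin/backup-only.sh on the server, referenced as
#   command="/home/philip/bin/backup-only.sh",no-pty,... ssh-ed25519 AAAA... backup-cron

# Succeed only for: rsync --server --sender -<flags> . <allowed_path>
is_allowed_backup_command() {
    requested=$1
    allowed_path=$2
    # Allow only letters, digits, space and . / _ - so no shell metacharacters get through.
    leftover=$(printf '%s' "$requested" | tr -d 'A-Za-z0-9 ./_-')
    [ -z "$leftover" ] || return 1
    set -f                      # no globbing while splitting into words
    set -- $requested
    set +f
    [ "$#" -eq 6 ] || return 1
    [ "$1" = rsync ] && [ "$2" = --server ] && [ "$3" = --sender ] || return 1
    case $4 in
        -[A-Za-z.]*) ;;         # a short-option bundle, never a --long-option
        *) return 1 ;;
    esac
    [ "$5" = . ] && [ "$6" = "$allowed_path" ]
}

# Entry point when sshd runs this file as the forced command.
if [ -n "${SSH_ORIGINAL_COMMAND:-}" ]; then
    if is_allowed_backup_command "$SSH_ORIGINAL_COMMAND" "/home/philip/"; then
        set -f
        exec $SSH_ORIGINAL_COMMAND   # word splitting intended; input was validated above
    fi
    echo "rejected: $SSH_ORIGINAL_COMMAND" >&2
    exit 1
fi
```

The advantage over a literal command= string is that the option bundle may change with the rsync version without breaking the job, while the path stays pinned.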

Adding the Backup Key to ~/.ssh/config

# ~/.ssh/config: add a dedicated stanza for automated backup
Host server-backup
    HostName 192.168.1.100
    User philip
    IdentityFile ~/.ssh/id_backup
    IdentitiesOnly yes
    # Fail loudly if the host key changes
    StrictHostKeyChecking yes
    ConnectTimeout 10

# rsync commands in cron use server-backup as the alias:
# rsync -az server-backup:/home/philip/ ~/backups/server/

Cron — Scheduling the Job

Cron runs commands on a schedule. Each line in your crontab is a rule specifying when to run a command:

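30 2 * * * /home/philip/scripts/backup.sh >> /var/log/backup.log 2>&1

Field      Position   Allowed values
minute     1st        0–59
hour       2nd        0–23
day        3rd        1–31 (day of month)
month      4th        1–12
weekday    5th        0–7 (0 and 7 both mean Sunday)
command    rest       command to run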
# Common cron schedule examples
30 2 * * *    /path/to/backup.sh    # every day at 02:30
0 3 * * 0     /path/to/backup.sh    # every Sunday at 03:00
0 1 1 * *     /path/to/backup.sh    # 1st of every month at 01:00
*/15 * * * *  /path/to/script.sh    # every 15 minutes
@daily        /path/to/backup.sh    # shorthand: once a day at midnight
@weekly       /path/to/backup.sh    # shorthand: once a week (Sunday)
@reboot       /path/to/startup.sh   # run once at system boot
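To make the field order concrete, the sketch below splits one crontab line into its five schedule fields and the command. The function name describe_cron_line is invented for this example, and it handles only the five-field form, not shorthands such as @daily.

```shell
# Label the five schedule fields and the command of one crontab line.
describe_cron_line() {
    # read splits on whitespace without expanding the * characters;
    # the last variable receives the rest of the line (the command).
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day=%s month=%s weekday=%s command=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow" "$command"
    }
}

describe_cron_line '30 2 * * * /home/philip/scripts/backup.sh'
# prints: minute=30 hour=2 day=* month=* weekday=* command=/home/philip/scripts/backup.sh
```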
philip@debian — editing the crontab
# Open your personal crontab for editing (uses $EDITOR; set it to vi)
philip@debian:~$ crontab -e

# List current cron jobs
philip@debian:~$ crontab -l
30 2 * * * /home/philip/scripts/backup.sh >> /var/log/backup.log 2>&1

# Check cron ran (look for entries from the cron daemon)
# On systems that log only to the journal, use: journalctl -u cron
philip@debian:~$ grep CRON /var/log/syslog | tail -5
Jun 17 02:30:01 debian CRON[12345]: (philip) CMD (/home/philip/scripts/backup.sh)
Cron has a minimal environment — full paths everywhere. Cron doesn't load your .bashrc or .profile. The PATH is bare (/usr/bin:/bin). Always use absolute paths in cron scripts: /usr/bin/rsync, not just rsync. Find the path with which rsync.
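One way to apply this inside a script is to set PATH explicitly at the top and resolve each required tool once, failing early if it is missing. A minimal sketch; require_cmd is a hypothetical helper name, and the PATH value is just a typical Debian default.

```shell
# Near the top of a cron-run script: define PATH yourself instead of inheriting cron's.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# Print the absolute path of a required command, or fail with a message on stderr.
require_cmd() {
    cmd_path=$(command -v "$1") || {
        echo "missing required command: $1" >&2
        return 1
    }
    printf '%s\n' "$cmd_path"
}

# Example:
#   RSYNC=$(require_cmd rsync) || exit 1
#   "$RSYNC" -az server-backup:/home/philip/ /mnt/backup/server/
```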

Backup Rotation Strategies

A single rsync mirror is useful but fragile. If a file is corrupted or overwritten, the next backup copies the damage, and if you mirror with --delete, an accidentally deleted file disappears from the backup too. Rotation keeps multiple snapshots so you can go back in time.

Strategy 1: Dated snapshots
Each backup run creates a new timestamped directory. Simple, but it uses a lot of disk space: every file is duplicated each time, even unchanged ones.

Strategy 2: Hard-link rotation
Each snapshot directory hard-links unchanged files from the previous snapshot, so they share the same inode on disk. Each snapshot looks like a full copy but uses only the space of changed files.

Strategy 3: Grandfather-Father-Son
Keep daily backups for 7 days, weekly for 4 weeks, monthly for 12 months. This classic tape rotation scheme adapts easily to rsync with a rotation script.
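A Grandfather-Father-Son rotation script needs a rule for which tier each snapshot belongs to. The sketch below uses one possible convention that is an assumption, not something the scheme prescribes: the 1st of the month counts as monthly and Sundays as weekly. It relies on GNU date, and both function names are illustrative.

```shell
# Classify a snapshot date (YYYY-MM-DD) into a GFS tier.
gfs_tier() {
    day_of_month=$(date -d "$1" +%d) || return 1
    day_of_week=$(date -d "$1" +%u)      # 1 = Monday ... 7 = Sunday
    if [ "$day_of_month" = "01" ]; then
        echo monthly
    elif [ "$day_of_week" = "7" ]; then
        echo weekly
    else
        echo daily
    fi
}

# Retention per tier in days: 7 days, 4 weeks, roughly 12 months.
gfs_keep_days() {
    case $1 in
        daily)   echo 7 ;;
        weekly)  echo 28 ;;
        monthly) echo 365 ;;
        *)       return 1 ;;
    esac
}

# Example: how long should the snapshot from 2026-06-14 (a Sunday) be kept?
#   gfs_keep_days "$(gfs_tier 2026-06-14)"
```

A rotation script would then delete a snapshot once its age exceeds the retention of its tier.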

Hard-link rotation with --link-dest

rsync's --link-dest flag points to a previous snapshot. Files that haven't changed are hard-linked (not copied) — the new snapshot appears complete but uses almost no extra disk space:

# Create today's snapshot, hard-linking unchanged files from yesterday
TODAY=$(date +%Y-%m-%d)
YESTERDAY=$(date -d yesterday +%Y-%m-%d)   # GNU date syntax
BACKUP_DIR="/mnt/backup"

rsync -az \
    --link-dest="$BACKUP_DIR/$YESTERDAY" \
    server-backup:/home/philip/ \
    "$BACKUP_DIR/$TODAY/"

# Result in /mnt/backup/:
#   2026-06-15/   full snapshot (first run: all files copied)
#   2026-06-16/   only changed files actually stored; rest are hard links
#   2026-06-17/   same: looks like a complete copy, costs almost nothing
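The example above assumes yesterday's snapshot exists. If a run was skipped, --link-dest points at a missing directory and rsync falls back to copying everything. One way around that is to look up the newest snapshot that actually exists. A sketch under the assumption that snapshot directories are named YYYY-MM-DD; previous_snapshot is a hypothetical helper.

```shell
# Print the newest YYYY-MM-DD entry under directory $1 that is older than date $2.
# Prints nothing if there is no earlier snapshot.
previous_snapshot() {
    ls -1 "$1" 2>/dev/null \
        | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' \
        | awk -v today="$2" '($0 "") < (today "")' \
        | sort \
        | tail -n 1
}

# Example (ISO dates sort correctly as plain strings):
#   PREV=$(previous_snapshot "$BACKUP_DIR" "$TODAY")
#   if [ -n "$PREV" ]; then set -- --link-dest="$BACKUP_DIR/$PREV"; else set --; fi
#   rsync -az "$@" server-backup:/home/philip/ "$BACKUP_DIR/$TODAY/"
```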

A Complete Backup Script

This is a production-ready backup script that pulls a server's home directory to a local machine, uses hard-link rotation, and deletes snapshots older than 30 days:

#!/bin/bash
# /home/philip/scripts/backup.sh
# Pulls philip's home dir from server, keeps 30 days of snapshots

## --- Config ---
SOURCE="server-backup:/home/philip/"   # ssh config alias + remote path
DEST="/mnt/backup/server"              # local backup root
KEEP_DAYS=30                           # delete snapshots older than this
LOG="/var/log/backup-server.log"       # must be writable by the user running the script

## --- Setup ---
TODAY=$(date +%Y-%m-%d)
LATEST="$DEST/latest"                  # symlink to most recent snapshot
SNAPSHOT="$DEST/$TODAY"

# Append a timestamped line to the log
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >> "$LOG"; }

## --- Run ---
log "Starting backup"

/usr/bin/rsync -az \
    --link-dest="$LATEST" \
    --exclude='.cache/' \
    --exclude='tmp/' \
    --exclude='*.log' \
    "$SOURCE" "$SNAPSHOT/" \
    >> "$LOG" 2>&1

if [ $? -eq 0 ]; then
    # rsync -a copies the source directory's mtime onto the snapshot directory,
    # so reset it to now; the rotation step below selects by mtime.
    touch "$SNAPSHOT"
    # Update the 'latest' symlink to point to today's snapshot
    ln -snf "$SNAPSHOT" "$LATEST"
    log "Backup OK → $SNAPSHOT"
else
    log "Backup FAILED"
    # Exit non-zero so cron wrappers and monitoring can see the failure,
    # and skip rotation so old snapshots survive while backups are failing.
    exit 1
fi

## --- Rotate ---
# Delete snapshot directories older than $KEEP_DAYS days
find "$DEST" -maxdepth 1 -type d -name '????-??-??' \
    -mtime +"$KEEP_DAYS" -exec rm -rf {} \;
log "Rotation complete"
philip@debian — installing the backup script and cron job
# Make the script executable
philip@debian:~$ chmod +x ~/scripts/backup.sh

# /var/log is normally writable only by root, so create the log file once
# and hand it to the user who runs the backup
philip@debian:~$ sudo touch /var/log/backup-server.log
philip@debian:~$ sudo chown philip /var/log/backup-server.log

# Test it manually first (always do this before scheduling)
philip@debian:~$ ~/scripts/backup.sh
philip@debian:~$ cat /var/log/backup-server.log
[2026-06-17 14:23:01] Starting backup
[2026-06-17 14:23:08] Backup OK → /mnt/backup/server/2026-06-17
[2026-06-17 14:23:08] Rotation complete

# Schedule it: every day at 02:30
philip@debian:~$ crontab -e
# Add this line:
30 2 * * * /home/philip/scripts/backup.sh

# Verify the snapshot structure
philip@debian:~$ ls -la /mnt/backup/server/
drwxr-xr-x  2026-06-15/
drwxr-xr-x  2026-06-16/
drwxr-xr-x  2026-06-17/
lrwxrwxrwx  latest -> /mnt/backup/server/2026-06-17
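Rotation with find -mtime depends on the modification time of each snapshot directory, which rsync -a sets from the source directory rather than from the time of the backup. An alternative is to decide by the date in the directory name. The sketch below assumes GNU date and YYYY-MM-DD snapshot names; prune_snapshots is a hypothetical helper, not part of the script above.

```shell
# Delete snapshot directories under $1 whose YYYY-MM-DD name is more than $2 days old.
prune_snapshots() {
    dest=$1
    keep_days=$2
    cutoff=$(date -d "$keep_days days ago" +%Y-%m-%d) || return 1
    for dir in "$dest"/????-??-??; do
        # Skip unmatched globs, plain files and symlinks such as 'latest'.
        [ -d "$dir" ] && [ ! -L "$dir" ] || continue
        name=${dir##*/}
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
            *) continue ;;                  # not a date-named snapshot
        esac
        # ISO dates order correctly as plain strings: the smaller one sorts first.
        older=$(printf '%s\n%s\n' "$name" "$cutoff" | LC_ALL=C sort | head -n 1)
        if [ "$name" != "$cutoff" ] && [ "$older" = "$name" ]; then
            rm -rf -- "$dir"
        fi
    done
}

# Example: prune_snapshots /mnt/backup/server 30
```

Because unchanged files are hard links, deleting an old snapshot frees only the data that no newer snapshot still references.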

Verifying Backups Actually Work

A backup you've never tested is not a backup. Build verification into your routine:

philip@debian — spot-checking the backup
# Check the latest snapshot has recent files
philip@debian:~$ ls -lt /mnt/backup/server/latest/ | head -10
-rw-r--r-- 1 philip philip 2048 Jun 17 10:22 index.html
-rw-r--r-- 1 philip philip 8192 Jun 17 09:14 notes.txt

# Confirm hard-linking is working: the link count should be 2+ for unchanged files
philip@debian:~$ stat /mnt/backup/server/2026-06-16/notes.txt
  File: notes.txt
  Size: 8192   Blocks: 16   IO Block: 4096
  Links: 3     ← hard-linked across 3 snapshots

# Check how much disk space the backup actually uses
philip@debian:~$ du -sh /mnt/backup/server/
1.2G  /mnt/backup/server/
philip@debian:~$ du -sh /mnt/backup/server/*/
1.1G  /mnt/backup/server/2026-06-15/   ← first run: full copy
4.2M  /mnt/backup/server/2026-06-16/   ← only changed files
2.1M  /mnt/backup/server/2026-06-17/   ← only changed files
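The hard-link check can also be scripted, so a verification job can compare the same file in two snapshots instead of reading stat output by eye. A small sketch assuming GNU stat; the helper names are invented for this example.

```shell
# Succeed if both paths exist and refer to the same file on disk (same device and inode).
same_inode() {
    [ -e "$1" ] && [ -e "$2" ] || return 1
    [ "$(stat -c '%d:%i' "$1")" = "$(stat -c '%d:%i' "$2")" ]
}

# Print the hard-link count of a file.
link_count() {
    stat -c '%h' "$1"
}

# Example: an unchanged file should be shared between consecutive snapshots.
#   same_inode /mnt/backup/server/2026-06-15/notes.txt \
#              /mnt/backup/server/2026-06-16/notes.txt && echo "hard-linked"
```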

Getting Notified on Failure

A cron job that silently fails is worse than no backup at all. Three ways to get alerts:

# Option 1: cron's built-in MAILTO (mails whatever the job prints to stdout or stderr)
MAILTO=emubantam@gmail.com
30 2 * * * /home/philip/scripts/backup.sh

# Option 2: send a notification only on failure (in the script, right after the rsync command)
if [ $? -ne 0 ]; then
    echo "Backup failed on $(hostname) at $(date)" | \
        mail -s "BACKUP FAILED" emubantam@gmail.com
fi

# Option 3: use healthchecks.io (free service, pings a URL on success)
# The service alerts you if it doesn't receive a ping on schedule.
# This only works if backup.sh exits non-zero when the backup fails.
# A crontab entry must stay on one line; cron does not support backslash continuation.
30 2 * * * /home/philip/scripts/backup.sh && curl -fsS --retry 3 https://hc-ping.com/YOUR-UUID > /dev/null
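The failure-only pattern can be factored into a small wrapper that runs any command, calls a notifier when it fails, and passes the original exit code through. This is a sketch: run_with_alert and notify_by_mail are hypothetical names, and the mail variant assumes a working local mail command.

```shell
# Run a command; if it fails, call the notifier with a message. Returns the command's exit code.
run_with_alert() {
    notifier=$1
    shift
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        "$notifier" "Backup failed on $(uname -n) with exit code $status"
    fi
    return "$status"
}

# Hypothetical notifier: mails the message to the address used above.
notify_by_mail() {
    printf '%s\n' "$1" | mail -s "BACKUP FAILED" emubantam@gmail.com
}

# Example:
#   run_with_alert notify_by_mail /home/philip/scripts/backup.sh
```

Keeping the exit code intact means the wrapper can still be chained with && in a crontab entry.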

Automation Checklist

  • Dedicated passphrase-free key for cron jobs, separate from your daily-use key
  • Key restricted in authorized_keys with command= and no-pty
  • SSH config alias pointing to the backup key (IdentitiesOnly yes)
  • Absolute paths in the script (/usr/bin/rsync not rsync)
  • Script tested manually before scheduling — check the log output
  • Exit code checked in the script (if [ $? -eq 0 ])
  • Rotation in place — old snapshots deleted automatically
  • Failure notification configured
  • Periodic manual spot-check that recent files are actually in the snapshot
  • Disk space monitored — backups silently stop when the disk is full
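For the disk-space item, a pre-flight check at the start of the backup script is one simple form of monitoring. A minimal sketch using POSIX df output; the function names and the 10 GiB threshold in the example are arbitrary choices for illustration.

```shell
# Print the free space in KiB on the filesystem that holds path $1.
free_kib() {
    # -P gives the portable one-line-per-filesystem format, -k reports 1024-byte blocks;
    # column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KiB are free at path $1.
has_free_space() {
    avail=$(free_kib "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Example: refuse to start a backup with less than 10 GiB free.
#   has_free_space /mnt/backup/server 10485760 || { echo "backup disk nearly full" >&2; exit 1; }
```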

Quick Reference

Command / concept                        What it does
ssh-keygen -f ~/.ssh/id_backup           Generate a dedicated backup key (no passphrase)
ssh-copy-id -i id_backup.pub server      Install the backup key on the server
crontab -e                               Edit your personal crontab
crontab -l                               List current cron jobs
30 2 * * *                               Run daily at 02:30
--link-dest=/path/to/prev                Hard-link unchanged files from a previous snapshot
ln -snf new latest                       Update the "latest" symlink to the newest snapshot
find … -mtime +30 -exec rm -rf {} \;     Delete directories older than 30 days
du -sh snapshots/*/                      Show disk usage per snapshot (hard links make this tiny)
stat file                                Show inode link count (confirms hard-linking is working)
grep CRON /var/log/syslog                Check cron ran the job