2014-03-22 01:01
in Btrfs, Linux
Doing much faster incremental backups than rsync with btrfs send and btrfs receive
If you are doing backups with rsync, you know that on big filesystems it takes a long time for rsync to scan all the files on each side before it can finally sync them. You also know that rsync does not track file renames (unless you use --fuzzy and the file is in the same directory, and --fuzzy can be very expensive if you have directories with many files; I had it blow through my Comcast bandwidth quota when I was rsyncing maildir backups).
Just like ZFS, btrfs can compute the list of blocks that changed between 2 snapshots and send only those blocks to the other side, making backups much, much faster.
At the time I'm writing this, it does work, but there are still a few bugs that can cause it to abort (no data loss, but it will stop syncing until you start over from scratch). Most of those bugs have been fixed in kernel 3.14, so it is recommended you use that kernel or newer unless you're just trying this out for testing.
How does it work?
This is all based on subvolumes, so please put all your data in subvolumes (even your root filesystem).
You make a read-only snapshot at the source (let's say in /mnt/btrfs_pool1, you snapshot root to root_ro_timestamp).
You do one btrfs send/receive that sends that entire snapshot to the other side.
The following times, you tell btrfs send to send only the diff between that last read-only snapshot and a new one you just made.
On the other side, you only run btrfs receive in a btrfs pool (let's say /mnt/btrfs_pool2). You do not give it any argument linked to the backup name, because it keeps track of the snapshot names from what was sent at the source.
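The steps above boil down to a handful of commands. Here is a minimal sketch using the example paths from the text; as a precaution it defaults BTRFS to echo so it only prints the commands (a dry run), and you set BTRFS=btrfs and run as root on real btrfs pools to do it for real:

```shell
# Dry-run sketch of the manual snapshot + send/receive cycle.
# BTRFS=echo just prints the commands; BTRFS=btrfs runs them for real.
BTRFS=${BTRFS:-echo}
SRC=/mnt/btrfs_pool1    # source pool (example path from the text)
DST=/mnt/btrfs_pool2    # destination pool
TS=$(date '+%Y%m%d_%H:%M:%S')

# 1. Make a read-only snapshot at the source.
$BTRFS subvolume snapshot -r "$SRC/root" "$SRC/root_ro.$TS"
sync    # make sure the snapshot is committed to disk before sending

# 2. First run only: send the entire snapshot.
$BTRFS send "$SRC/root_ro.$TS" | $BTRFS receive "$DST/"

# 3. Later runs: send only the diff against the previous read-only
#    snapshot (PREV is a hypothetical earlier timestamp).
PREV=20140321_07:00:35
$BTRFS send -p "$SRC/root_ro.$PREV" "$SRC/root_ro.$TS" | $BTRFS receive "$DST/"
```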
If you'd like many more details, you can find some here:
http://lwn.net/Articles/506244
https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
In real life, this is tedious to do by hand, and even the script to write is not super obvious, so I wrote one that I'm sharing here. I actually do a fair amount of backups on the same machine (for instance, I back up the SSD on my laptop to a hard drive in the same laptop every hour, because SSDs fail, and they could fail while I'm away from home without my regular off-laptop backups), but the script does allow sending the backup to another machine (--dest).
This backup script does a bit more in the following ways:
As per my post on hourly/daily/weekly snapshots, I like snapshots, so I use this backup script's snapshots as local data-recovery snapshots too, and therefore keep several of them around, not just the last one (see --keep/-k num).
On my laptop, I want the destination snapshot to be writable, and I want to know automatically which snapshot is the latest, so the script creates snapshot_last and snapshot_last_rw symlinks. Using them, I can boot my system from those snapshots and use the system normally if my main boot SSD dies and I need to boot from the HD. Thankfully, btrfs accepts -o subvol=root_last_rw as a subvolume name and will follow the symlink to the real volume: root_rw.20140321_07:00:35
At the same time as it creates the extra _ro and _rw snapshots for time-based recovery, the script automatically rotates them and deletes the oldest ones (--keep says how many to keep).
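The symlink and rotation bookkeeping can be sketched with plain directories standing in for snapshots, so it runs anywhere without btrfs (in the real script, rmdir is btrfs subvolume delete):

```shell
# Sketch of the _last symlink and --keep rotation, using plain
# directories in a scratch "pool" instead of real btrfs snapshots.
pool=$(mktemp -d)
cd "$pool"
keep=2
mkdir "root_ro.20140319_07:00:35" "root_ro.20140320_07:00:35" "root_ro.20140321_07:00:35"

# Point root_last at the newest snapshot; -n replaces the symlink
# itself rather than following it into its target.
ln -snf "root_ro.20140321_07:00:35" root_last

# The next incremental run reads the previous snapshot name back:
prev=$(readlink -e root_last)

# Rotation: list snapshots newest first, skip the first $keep,
# delete the rest.
ls -rd root_ro.* | tail -n +$(( keep + 1 )) | while read snap; do
    rmdir "$snap"    # the real script runs: btrfs subvolume delete "$snap"
done
ls -d root_ro.*    # only the $keep newest remain
```

On a real pool, the symlink is what lets you mount the latest snapshot with something like mount -o subvol=root_last_rw /dev/sdX /mnt (device name hypothetical).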
As another option, Ruedi Steinmann wrote the fancier btrbck. It's more complicated since it's much bigger and written in Java, but it's more featureful, so you may prefer it.
Here is a link to the latest version of my btrfs-subvolume-backup script and a paste of a potentially outdated version for you to look at:
#!/bin/bash
# By Marc MERLIN <marc_soft@merlins.org>
# License: Apache-2.0
# Source: http://marc.merlins.org/linux/scripts
# $Id: btrfs-subvolume-backup 1012 2014-06-25 21:56:54Z svnuser $
#
# Documentation and details at
# http://marc.merlins.org/perso/btrfs/2014-03.html#Btrfs-Tips_-Doing-Fast-Incremental-Backups-With-Btrfs-Send-and-Receive
# cron jobs might not have /sbin in their path.
export PATH="$PATH:/sbin"
set -o nounset
set -o errexit
set -o pipefail
# From https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
# bash shortcut for `basename $0`
PROG=${0##*/}
lock=/var/run/$PROG
usage() {
cat <<EOF
Usage:
cd /mnt/source_btrfs_pool
$PROG [--init] [--keep|-k num] [--dest hostname] volume_name /mnt/backup_btrfs_pool
Options:
-h|--help: Print this help message and exit.
--init: Do the initial full copy instead of an incremental diff (use on the first run).
--keep num: Keep the last num snapshots for local backups (5 by default).
--dest hostname: If present, ssh to that machine to make the copy.
--prefix|-p pfx: Add pfx to the snapshot names (lets you keep separate backup sets).
--diff: Show an approximate diff between the snapshots.
This will snapshot volume_name in a btrfs pool, and send the diff
between it and the previous snapshot (volume_name_last) to another btrfs
pool (on other drives).
If your backup destination is another machine, use --dest hostname and
the copy will go over ssh.
The num snapshots to keep give you snapshots you can recover data from;
they get deleted after num runs. Set to 0 to disable (one snapshot will
always be kept since it's required for the next diff to be computed).
EOF
exit 0
}
die () {
msg=${1:-}
# don't loop on ERR
trap '' ERR
rm $lock
echo "$msg" >&2
echo >&2
# This is a fancy shell core dumper
if echo $msg | grep -q 'Error line .* with status'; then
line=`echo $msg | sed 's/.*Error line \(.*\) with status.*/\1/'`
echo " DIE: Code dump:" >&2
nl -ba $0 | grep -3 "\b$line\b" >&2
fi
exit 1
}
# Trap errors for logging before we die (so that they can be picked up
# by the log checker)
trap 'die "Error line $LINENO with status $?"' ERR
init=""
# Keep the last 5 snapshots by default
keep=5
TEMP=$(getopt --longoptions help,usage,init,keep:,dest:,prefix:,diff -o hk:d:p: -- "$@") || usage
dest=localhost
ssh=""
pf=""
diff=""
# getopt quotes arguments with single quotes; we use eval to strip them.
eval set -- $TEMP
while :
do
case "$1" in
-h|--help|--usage)
usage
shift
;;
--prefix|-p)
shift
pf=_$1
lock="$lock.$pf"
shift
;;
--keep|-k)
shift
keep=$1
shift
;;
--dest|-d)
shift
dest=$1
ssh="ssh $dest"
shift
;;
--init)
init=1
shift
;;
--diff)
diff=1
shift
;;
--)
shift
break
;;
*)
echo "Internal error from getopt!"
exit 1
;;
esac
done
[[ "$keep" -lt 1 ]] && die "Must keep at least one snapshot for things to work ($keep given)"
DATE="$(date '+%Y%m%d_%H:%M:%S')"
[[ $# != 2 ]] && usage
vol="$1"
dest_pool="$2"
# shlock (from inn) does the right thing and grabs a lock for a dead process
# (it checks the PID in the lock file and if it's not there, it
# updates the PID with the value given to -p)
if ! shlock -p $$ -f $lock; then
echo "$lock held for $PROG, quitting" >&2
exit
fi
if [[ -z "$init" ]]; then
test -e "${vol}${pf}_last" || die "Cannot sync $vol, ${vol}${pf}_last missing. Try --init?"
src_snap="$(readlink -e ${vol}${pf}_last)"
fi
src_newsnap="${vol}${pf}_ro.$DATE"
src_newsnaprw="${vol}${pf}_rw.$DATE"
$ssh test -d "$dest_pool/" || die "ABORT: $dest_pool not a directory (on $dest)"
btrfs subvolume snapshot -r "$vol" "$src_newsnap"
if [[ -n "$diff" ]]; then
echo diff between "$src_snap" "$src_newsnap"
btrfs-diff "$src_snap" "$src_newsnap"
fi
# There is currently an issue that the snapshots to be used with "btrfs send"
# must be physically on the disk, or you may receive a "stale NFS file handle"
# error. This is accomplished by "sync" after the snapshot
sync
if [[ -n "$init" ]]; then
btrfs send "$src_newsnap" | $ssh btrfs receive "$dest_pool/"
else
btrfs send -p "$src_snap" "$src_newsnap" | $ssh btrfs receive "$dest_pool/"
fi
# We make a read-write snapshot in case you want to use it for a chroot
# and some testing with a writeable filesystem or want to boot from a
# last good known snapshot.
btrfs subvolume snapshot "$src_newsnap" "$src_newsnaprw"
$ssh btrfs subvolume snapshot "$dest_pool/$src_newsnap" "$dest_pool/$src_newsnaprw"
# Keep track of the last snapshot to send a diff against.
ln -snf $src_newsnap ${vol}${pf}_last
# The rw version can be used for mounting with subvol=vol_last_rw
ln -snf $src_newsnaprw ${vol}${pf}_last_rw
$ssh ln -snf $src_newsnaprw $dest_pool/${vol}${pf}_last_rw
# How many snapshots to keep on the source btrfs pool (both read
# only and read-write).
ls -rd ${vol}${pf}_ro* | tail -n +$(( $keep + 1 ))| while read snap
do
btrfs subvolume delete "$snap" | grep -v 'Transaction commit:'
done
ls -rd ${vol}${pf}_rw* | tail -n +$(( $keep + 1 ))| while read snap
do
btrfs subvolume delete "$snap" | grep -v 'Transaction commit:'
done
# Same thing for destination (assume the same number of snapshots to keep,
# you can change this if you really want).
$ssh ls -rd $dest_pool/${vol}${pf}_ro* | tail -n +$(( $keep + 1 ))| while read snap
do
$ssh btrfs subvolume delete "$snap" | grep -v 'Transaction commit:'
done
$ssh ls -rd $dest_pool/${vol}${pf}_rw* | tail -n +$(( $keep + 1 ))| while read snap
do
$ssh btrfs subvolume delete "$snap" | grep -v 'Transaction commit:'
done
rm $lock