32

While diffing mounted snapshots would work, it sounds like it could be horribly slow in many cases.

Is there btrfs-specific functionality for diffing snapshots? (I was unable to find any in the docs.)

Catskul
  • While it might be possible to find out which blocks were changed and how, you need to consider the case when a change has been reversed later, if you really want to compare filesystems (directories). For example if you have file `A` containing `a`, write `b` in its snapshot and later change it back to `a`, the file didn't really change at all. – Cristian Ciupitu Jun 18 '12 at 19:33
  • It seems like it would be completely analogous to source code revision control where this sort of thing is done all the time, unless I'm missing something. – Catskul Jun 18 '12 at 19:54
  • An additional problem of running something like rsync on a btrfs filesystem is that, unless the noatime mount option was used, reading all files to check if they have changed would effectively modify them and the next snapshot would be large even if no file was actually modified. See https://lwn.net/Articles/499293/ for a discussion. – Luca Citi Mar 24 '17 at 17:00

5 Answers

14

btrfs send, which appeared in Linux 3.6 (2012), "generates a stream of changes between two subvolume snapshots." You can use it just to produce a fast metadata comparison by adding the --no-data flag.

btrfs send --no-data -p /snapshots/parent /snapshots/child

Normally, you would drop the --no-data flag and pipe the output into btrfs receive, to do incremental backups. For example, if /snapshots/parent already exists at /backup/snapshots/parent, btrfs send would stream only those changes to the /backup filesystem:

btrfs send -p /snapshots/parent /snapshots/child | btrfs receive /backup/snapshots
sjy
amcnabb
13

I was running Debian stable, which did not have btrfs send at the time, so I looked for a solution using btrfs subvolume find-new.

Update: btrfs send was added in Linux 3.6, which was released in 2012 and included in Debian stable by 2015.

If you have snapshot1 and snapshot2 and you want to know what changed in the later one, snapshot2, since snapshot1 was made, you can use the script below, which provides

btrfs-diff oldsnapshot/ newsnapshot/

which will list all files changed in newsnapshot/ since oldsnapshot/.

#!/bin/bash
# Print an error and the usage message to stderr, then exit.
usage() { echo "$@" >&2; echo "Usage: $0 <older-snapshot> <newer-snapshot>" >&2; exit 1; }

[ $# -eq 2 ] || usage "Incorrect invocation"
SNAPSHOT_OLD=$1
SNAPSHOT_NEW=$2

[ -d "$SNAPSHOT_OLD" ] || usage "$SNAPSHOT_OLD does not exist"
[ -d "$SNAPSHOT_NEW" ] || usage "$SNAPSHOT_NEW does not exist"

# Asking for a generation far in the future makes find-new print only
# the "transid marker was N" line, i.e. the old snapshot's generation.
OLD_TRANSID=$(btrfs subvolume find-new "$SNAPSHOT_OLD" 9999999)
OLD_TRANSID=${OLD_TRANSID#transid marker was }
[ -n "$OLD_TRANSID" ] && [ "$OLD_TRANSID" -gt 0 ] || usage "Failed to find generation for $SNAPSHOT_OLD"

# Drop the trailing marker line, keep only the path (field 17 onward), de-duplicate.
btrfs subvolume find-new "$SNAPSHOT_NEW" "$OLD_TRANSID" | sed '$d' | cut -f17- -d' ' | sort -u

To explain: btrfs subvolume find-new finds files changed after a particular 'generation' of snapshot. It also reports the current generation number.
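The generation lookup can be isolated into a small helper. This is only a sketch: parse_transid is a hypothetical name, and it assumes the marker line has exactly the form "transid marker was N", which is what the script above relies on.

```shell
# Hypothetical helper: read `btrfs subvolume find-new` output on stdin and
# print the generation number from the final "transid marker was N" line.
# Assumes the marker line has exactly that form; prints nothing otherwise.
parse_transid() {
    sed -n 's/^transid marker was \([0-9][0-9]*\)$/\1/p' | tail -n 1
}

# Possible usage (needs a real btrfs subvolume, so it is left commented out):
#   btrfs subvolume find-new "$SNAPSHOT_OLD" 9999999 | parse_transid
```

Matching the whole line rather than stripping a prefix means an unexpected error message yields an empty result instead of a bogus generation.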

Caveats

For example, take the case of daily snapshots of a subvolume:

mkdir test && cd test
btrfs subvolume create live
date >live/foo1
date >live/bar1
btrfs subvolume snapshot live/ snap1
date >live/foo2  # new file
date >>live/bar1 # modify file
rm live/foo1     # delete file
btrfs subvolume snapshot live/ snap2
date >live/foo3  # new file
mv live/bar{1,2} # rename file
rm live/foo2     # delete file

What changed between snap1 and snap2?

$ btrfs-diff snap1/ snap2/
bar1
foo2

So we can see the new file, see the modified file, but the deletion is not reported. This is because the command reports on files that exist, not ones that now don't.
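If deletions matter, one possible workaround is to compare the path listings of the two snapshots directly. The sketch below is not part of btrfs-diff: list_deleted is a hypothetical helper, it assumes both snapshots are mounted and readable, it walks every directory entry (so it may be slow on large trees), and it does not handle file names containing newlines.

```shell
# Hypothetical helper: print paths that exist under the older snapshot ($1)
# but no longer exist under the newer one ($2).
list_deleted() {
    old_list=$(mktemp) && new_list=$(mktemp) || return 1
    # List paths relative to each snapshot root so the two lists are comparable.
    (cd "$1" && find . | LC_ALL=C sort) > "$old_list"
    (cd "$2" && find . | LC_ALL=C sort) > "$new_list"
    # comm -23 keeps only the lines unique to the first (older) list.
    LC_ALL=C comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}

# Possible usage with the example above:
#   list_deleted snap1/ snap2/
```

Only names are compared, not file contents, so a renamed file shows up as deleted under its old name.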

What changed between snap2 and the live subvolume?

$ btrfs-diff snap2/ live/
foo3

The renamed file is not reported, because its data has not changed.

Now what if we add data to the renamed file?

$ date >>live/bar2
$ btrfs-diff snap2/ live/
bar2
foo3

OK, makes sense. But let's make a new file

$ date >live/lala
$ btrfs-diff snap2/ live/
bar2
foo3

Eh? Where's lala? If you add another file, lala appears. So this behaviour is a bit odd, which is probably why the wiki says:

The find-new approach has some serious limitations and thus is not really usable for something like send/receive.

However, the oddness comes when you compare a live subvolume against a previous state, not when you're comparing (read-only) snapshots. So this could still be useful unless you want to also identify deleted files.

sjy
artfulrobot
  • Hey there, I've extended your tool a little. This tool will show you a stream of all the changes that have happened in snapshots (it can also select individual links) https://github.com/talwrii/btrlog – Att Righ Apr 05 '17 at 17:12
5

This is supported by the snapshot convenience tool snapper.

sudo snapper -c config diff 445..446

Of course this requires you to be using snapper for your snapshots.

The snapshot IDs can be found using snapper list -a. Unfortunately, at the time of writing snapper did not support listing snapshots for a single config, though these numbers can also be found from the subvolume names.

Att Righ
4

Current solution:

btrfs send --no-data -p SNAPSHOT_OLD SNAPSHOT_NEW | btrfs receive --dump | grep ^update_extent
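
Grepping for update_extent shows only data changes. To also see deletions and renames, the dump can be filtered on more operations. This is a rough sketch: changed_paths is a hypothetical helper, the operation names (unlink, rmdir, rename) are assumed to match what btrfs receive --dump prints in your btrfs-progs version, and the parsing assumes whitespace-separated fields, so paths containing spaces may be cut short.

```shell
# Hypothetical helper: read `btrfs receive --dump` text on stdin and print
# "<operation> <path>" once for each selected operation and path.
# Assumes the first field is the operation and the second is the path.
changed_paths() {
    awk '$1 ~ /^(update_extent|unlink|rmdir|rename)$/ { print $1, $2 }' |
        LC_ALL=C sort -u
}

# Possible usage (needs real snapshots, so it is left commented out):
#   btrfs send --no-data -p SNAPSHOT_OLD SNAPSHOT_NEW | btrfs receive --dump | changed_paths
```

For rename lines this prints only the source path; the destination appears later on the same line of the dump.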
lustra_pl
1

The backup utility btrbk (https://github.com/digint/btrbk) also has a diff subcommand, as well as an extents diff subcommand.