Marc's Public Blog








2014/04/09 Mathias' 0x28 birthday
π 2014-04-09 00:00 in Public
Jenny invited us all for a nice dinner for Mathias' BD. A good evening was had by all:













See more images for Mathias' 0x28 birthday
2014/03/26 Finally Some Snow Days in South Lake Tahoe
π 2014-03-26 00:00 in Snow
Johannes and I went during the week to catch a couple of POW days at Kirkwood. We then went to get leftovers at Heavenly on a sunny day. The next day, I took off to work since the weather was going to be crap anyway (and it sure was), and we finished off with an 18 inch day at Kirkwood on Sunday, which was pretty epic, except for the crowds ;)

Day 1 was a good powder day:



doggy was happy running down the wall





Day #2 was quite good too:




Day #3 was sunny at Heavenly:



the middle rock field was pretty bad due to very low snow though


No one came here due to the many many rocks just above

Pinnacles was closed of course

And then, I followed Johannes to Killebrew. The run down wasn't bad, but the bottom was pitiful:






this looks nice, must be fun during the summer

We took Saturday off, and we were right to: many lifts were closed for wind, but by the end of the day we started getting real snow, and more into the night. As predicted, we got 18 inches at Kirkwood, and it was a pretty epic day, even if the crowds got everything tracked out way too quickly.



Dude, some snow fell on your car :)

The line at Cornice was a bit ridiculous










A big thanks to Johannes for driving there and back, and Jim for having us at his condo for the first 2 nights.

2014/03/23 Btrfs Raid5 Status
π 2014-03-23 00:00 in Btrfs, Linux

How to use Btrfs raid5/6

Since I didn't find good documentation on where Btrfs raid5/raid6 is at, I did some tests and, with some help from list members, can now write this page.

This is as of kernel 3.14 with btrfs-tools 3.12. If you are using a kernel, and especially tools, older than that, chances are things will work less well.

Btrfs raid5/6 in a nutshell

It is important to know that raid5/raid6 is more experimental than btrfs itself is. Do not use this for production systems, or if you do and things break, you were warned :)

If you're coming from the mdadm raid5 world, here's what you need to know:

  • btrfs does not yet seem to know that if you removed a drive from an array and plug it back in later, that drive is out of date. It will auto-add the out of date drive back into the array, which will likely cause data loss by hiding files that were written after the drive was removed. This means you should wipe a drive cleanly before you put it back into an array it used to be part of. See https://bugzilla.kernel.org/show_bug.cgi?id=72811
  • btrfs does not deal well with a drive that is present but not working. It does not know how to kick it from the array, nor can it be removed (btrfs device delete) because this causes reading from the drive that isn't working. This means btrfs will try to write to the bad drive forever. The solution there is to umount the array, remount it with the bad drive missing (it cannot be seen by btrfs, or it'll get automounted/added), and then rebuild on a new drive or rebuild/shrink the array to be one drive smaller (this is explained below).
  • You can add and remove drives from an array and rebalance to grow/shrink the array without umounting it. Note that this is slow since it forces a rewrite of all data blocks; it took about 3H per 100GB (or 30H per terabyte) with 10 drives on a dual core CPU.
  • If you are missing a drive, btrfs will refuse to mount the array and give an obscure error unless you mount with -o degraded
  • btrfs has no special rebuild procedure. Rebuilding is done by rebalancing the array. You could actually rebalance a degraded array to a smaller array by rebuilding/balancing without adding a drive, or you can add a drive, rebalance on it, and that will force a read/rewrite of all data blocks, which will restripe them nicely.
  • btrfs replace does not work, but you can easily do btrfs device add of the new drive followed by btrfs device delete of the old drive, and this will do the same thing.
  • btrfs device add will not cause an auto rebalance. You could choose not to rebalance existing data and only have new data be balanced properly.
  • btrfs device delete will force all data from the deleted drive to be rebalanced and the command completes when the drive has been freed up.
  • The magic command to delete an unused drive from an array while it is missing from the system is btrfs device delete missing .
  • btrfs doesn't easily tell you that your array is in degraded mode (run btrfs fi show, and it'll show a missing drive as well as how much of your total data is still on it). This does mean you can have an array that is half degraded: half the files are striped over the current drives because they were written after the drive was removed, or were written by a rebalance that hasn't finished, while the other half of your data could be in degraded mode.
  • You can see this by looking at the amount of data on each drive: anything on drive 11 is properly striped 10 ways, while anything on drive 3 is in degraded mode:
    polgara:~# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 11 FS bytes used 564.54GiB
            devid    1 size 465.76GiB used 63.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 63.14GiB path /dev/dm-1
            devid    3 size 465.75GiB used 30.00GiB path   <- this device is missing
            devid    4 size 465.76GiB used 63.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 63.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 63.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdj1
            devid    9 size 465.76GiB used 63.14GiB path /dev/dm-7
            devid    10 size 465.76GiB used 63.14GiB path /dev/dm-8
            devid    11 size 465.76GiB used 33.14GiB path /dev/mapper/crypt_sde1 <- this device was added

    Create a raid5 array

    polgara:/dev/disk/by-id# mkfs.btrfs -f -d raid5 -m raid5 -L backupcopy /dev/mapper/crypt_sd[bdfghijkl]1
    

    WARNING! - Btrfs v3.12 IS EXPERIMENTAL
    WARNING! - see http://btrfs.wiki.kernel.org before using

    Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
    Turning ON incompat feature 'raid56': raid56 extended format
    adding device /dev/mapper/crypt_sdd1 id 2
    adding device /dev/mapper/crypt_sdf1 id 3
    adding device /dev/mapper/crypt_sdg1 id 4
    adding device /dev/mapper/crypt_sdh1 id 5
    adding device /dev/mapper/crypt_sdi1 id 6
    adding device /dev/mapper/crypt_sdj1 id 7
    adding device /dev/mapper/crypt_sdk1 id 8
    adding device /dev/mapper/crypt_sdl1 id 9
    fs created label backupcopy on /dev/mapper/crypt_sdb1
            nodesize 16384 leafsize 16384 sectorsize 4096 size 4.09TiB

    polgara:/dev/disk/by-id# mount -L backupcopy /mnt/btrfs_backupcopy
    polgara:/mnt/btrfs_backupcopy# df -h .
    Filesystem              Size  Used Avail Use% Mounted on
    /dev/mapper/crypt_sdb1  4.1T  3.0M  4.1T   1% /mnt/btrfs_backupcopy

    As another example, you could use -d raid5 -m raid1 to have metadata be raid1 while data is raid5. This specific combination isn't actually that useful, but it shows that data and metadata can use different profiles.
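    A minimal sketch of that variant, reusing the same device names as above (purely illustrative):

    mkfs.btrfs -f -d raid5 -m raid1 -L backupcopy /dev/mapper/crypt_sd[bdfghijkl]1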

    Replacing a drive that hasn't failed yet on a running raid5 array

    btrfs replace does not work:

    polgara:/mnt/btrfs_backupcopy# btrfs replace start -r /dev/mapper/crypt_sde1 /dev/mapper/crypt_sdm1  .
    Mar 23 14:56:06 polgara kernel: [53501.511493] BTRFS warning (device dm-9): dev_replace cannot yet handle RAID5/RAID6

    No big deal, this can be done in 2 steps:

  • Add the new drive
  • polgara:/mnt/btrfs_backupcopy# btrfs device add -f /dev/mapper/crypt_sdm1 .
    polgara:/mnt/btrfs_backupcopy# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 11 FS bytes used 114.35GiB
            devid    1 size 465.76GiB used 32.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
            devid    4 size 465.76GiB used 32.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 32.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 32.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 32.14GiB path /dev/dm-6
            devid    9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
            devid    10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
            devid    11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1
            devid    12 size 465.75GiB used 0.00 path /dev/mapper/crypt_sdm1

  • btrfs device delete the drive to remove. This neatly causes a rebalance which will happen to use the new drive you just added
  • polgara:/mnt/btrfs_backupcopy# btrfs device delete /dev/mapper/crypt_sde1 .
    Mar 23 11:13:31 polgara kernel: [40145.908207] BTRFS info (device dm-9): relocating block group 945203314688 flags 129
    Mar 23 14:51:51 polgara kernel: [53245.955444] BTRFS info (device dm-9): found 5576 extents
    Mar 23 14:51:57 polgara kernel: [53251.874925] BTRFS info (device dm-9): found 5576 extents
    polgara:/mnt/btrfs_backupcopy# 

    Note that this is slow, 3.5h for just 115GB of data. It could take days for a terabyte array.

    polgara:/mnt/btrfs_backupcopy# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 10 FS bytes used 114.35GiB
            devid    1 size 465.76GiB used 13.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 13.14GiB path /dev/mapper/crypt_sdd1
            devid    4 size 465.76GiB used 13.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 13.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 13.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 13.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 13.14GiB path /dev/dm-6
            devid    9 size 465.76GiB used 13.14GiB path /dev/mapper/crypt_sdk1
            devid    10 size 465.76GiB used 13.14GiB path /dev/mapper/crypt_sdl1
            devid    12 size 465.75GiB used 13.14GiB path /dev/mapper/crypt_sdm1

    There we go: I'm back to 10 devices, almost as good as a btrfs replace, it simply took 2 steps.

    Replacing a missing drive on a running raid5 array

    Normal mount will not work:

    polgara:~# mount -v -t btrfs -o compress=zlib,space_cache,noatime LABEL=backupcopy /mnt/btrfs_backupcopy
    mount: wrong fs type, bad option, bad superblock on /dev/mapper/crypt_sdj1,
           missing codepage or helper program, or other error
           In some cases useful info is found in syslog - try
           dmesg | tail  or so
    Mar 21 22:29:45 polgara kernel: [ 2288.285068] BTRFS info (device dm-8): disk space caching is enabled
    Mar 21 22:29:45 polgara kernel: [ 2288.285369] BTRFS: failed to read the system array on dm-8
    Mar 21 22:29:45 polgara kernel: [ 2288.316067] BTRFS: open_ctree failed
    

    So we do a mount with -o degraded:

    polgara:~# mount -v -t btrfs -o compress=zlib,space_cache,noatime,degraded LABEL=backupcopy /mnt/btrfs_backupcopy
    /dev/mapper/crypt_sdj1 on /mnt/btrfs_backupcopy type btrfs (rw,noatime,compress=zlib,space_cache,degraded)
    Mar 21 22:29:51 polgara kernel: [ 2295.042421] BTRFS: device label backupcopy devid 8 transid 3446 /dev/mapper/crypt_sdj1
    Mar 21 22:29:51 polgara kernel: [ 2295.065951] BTRFS info (device dm-8): allowing degraded mounts
    Mar 21 22:29:51 polgara kernel: [ 2295.065955] BTRFS info (device dm-8): disk space caching is enabled
    Mar 21 22:30:32 polgara kernel: [ 2336.189000] BTRFS: device label backupcopy devid 3 transid 8 /dev/dm-9
    Mar 21 22:30:32 polgara kernel: [ 2336.203175] BTRFS: device label backupcopy devid 3 transid 8 /dev/dm-9

    Then we add the new drive:

    polgara:/mnt/btrfs_backupcopy# btrfs device add -f /dev/mapper/crypt_sde1 .
    polgara:/mnt/btrfs_backupcopy# df .
    /dev/dm-0       5.1T  565G  4.0T  13% /mnt/btrfs_backupcopy   < bad, it should be 4.5T, but I get space for 11 drives

    https://btrfs.wiki.kernel.org/index.php/FAQ#What_does_.22balance.22_do.3F says:
    "On a filesystem with damaged replication (e.g. a RAID-1 FS with a dead and removed disk), it will force the FS to rebuild the missing copy of the data on one of the currently active devices, restoring the RAID-1 capability of the filesystem."

    See also: https://btrfs.wiki.kernel.org/index.php/Balance_Filters

    If we have written data since the drive was removed, or if we are recovering from an unfinished balance, doing a filter on devid=3 tells balance to only rewrite data and metadata that have a chunk on missing device #3. This is a good way to finish the balance in multiple passes if you have to reboot in between, or if the filesystem deadlocks during a balance (which unfortunately is still common as of kernel 3.14).

    polgara:/mnt/btrfs_backupcopy# btrfs balance start -ddevid=3 -mdevid=3 -v .
    Mar 22 13:15:55 polgara kernel: [20275.690827] BTRFS info (device dm-9): relocating block group 941277446144 flags 130
    Mar 22 13:15:56 polgara kernel: [20276.604760] BTRFS info (device dm-9): relocating block group 940069486592 flags 132
    Mar 22 13:19:27 polgara kernel: [20487.196844] BTRFS info (device dm-9): found 52417 extents
    Mar 22 13:19:28 polgara kernel: [20488.056749] BTRFS info (device dm-9): relocating block group 938861527040 flags 132
    Mar 22 13:22:41 polgara kernel: [20681.588762] BTRFS info (device dm-9): found 70146 extents
    Mar 22 13:22:42 polgara kernel: [20682.380957] BTRFS info (device dm-9): relocating block group 937653567488 flags 132
    Mar 22 13:26:12 polgara kernel: [20892.816204] BTRFS info (device dm-9): found 71497 extents
    Mar 22 13:26:14 polgara kernel: [20894.819258] BTRFS info (device dm-9): relocating block group 927989891072 flags 129

    As balancing happens, data is taken out of devid 3 (the missing one) and added to devid 11 (the one just added):

    polgara:~# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 11 FS bytes used 564.54GiB
            devid    1 size 465.76GiB used 63.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 63.14GiB path /dev/dm-1
            devid    3 size 465.75GiB used 30.00GiB path   <- this device is missing
            devid    4 size 465.76GiB used 63.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 63.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 63.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 63.14GiB path /dev/mapper/crypt_sdj1
            devid    9 size 465.76GiB used 63.14GiB path /dev/dm-7
            devid    10 size 465.76GiB used 63.14GiB path /dev/dm-8
            devid    11 size 465.76GiB used 33.14GiB path /dev/mapper/crypt_sde1 <- this device was added

    You can see status with:

    polgara:/mnt/btrfs_backupcopy# while :
    > do
    > btrfs balance status .
    > sleep 60
    > done
    1 out of about 72 chunks balanced (2 considered),  99% left
    2 out of about 72 chunks balanced (3 considered),  97% left
    3 out of about 72 chunks balanced (4 considered),  96% left

    At the end (and this can take hours to days), you get:

    polgara:/mnt/btrfs_backupcopy# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 11 FS bytes used 114.35GiB
            devid    1 size 465.76GiB used 32.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
            devid    3 size 465.75GiB used 0.00 path  <----  drive is freed up now.
            devid    4 size 465.76GiB used 32.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 32.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 32.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 32.14GiB path /dev/dm-6
            devid    9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
            devid    10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
            devid    11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1
    Btrfs v3.12

    But the array still shows 11 drives with one missing and will not mount without -o degraded.
    You remove the record of the missing drive with:

    polgara:/mnt/btrfs_backupcopy# btrfs device delete missing .
    polgara:/mnt/btrfs_backupcopy# btrfs fi show
    Label: backupcopy  uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1
            Total devices 10 FS bytes used 114.35GiB
            devid    1 size 465.76GiB used 32.14GiB path /dev/dm-0
            devid    2 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdd1
            devid    4 size 465.76GiB used 32.14GiB path /dev/dm-2
            devid    5 size 465.76GiB used 32.14GiB path /dev/dm-3
            devid    6 size 465.76GiB used 32.14GiB path /dev/dm-4
            devid    7 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdi1
            devid    8 size 465.76GiB used 32.14GiB path /dev/dm-6
            devid    9 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdk1
            devid    10 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sdl1
            devid    11 size 465.76GiB used 32.14GiB path /dev/mapper/crypt_sde1

    And there we go, we're back in business!

    From the above, you've also learned how to grow a raid5 array (add a drive, then run a balance) or shrink it by a drive (just run btrfs device delete and the automatic rebalance will restripe your entire array over n-1 drives). A quick recap is sketched below.
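    For reference, here is a minimal recap of both operations; the device name /dev/mapper/crypt_sdn1 is made up for illustration:

    # Grow the array: add a drive, then restripe existing data over all drives.
    btrfs device add -f /dev/mapper/crypt_sdn1 /mnt/btrfs_backupcopy
    btrfs balance start /mnt/btrfs_backupcopy

    # Shrink the array: deleting a device rebalances its data onto the remaining drives.
    btrfs device delete /dev/mapper/crypt_sdn1 /mnt/btrfs_backupcopy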

    2014/03/23 What I Did Not Learn In School
    π 2014-03-23 00:00 in Public
    I love science shows on TV and learning about the fascinating post-Einstein world that even the great French school system didn't teach me at all (a bit disappointing). My sources are NOVA, Horizon (the BBC equivalent), and other universe shows from PBS or the BBC.

    A few interesting things I learned:

  • Gravity bends spacetime, which in turn bends light (we can see 2 pictures of the same galaxy due to this effect).
  • Newton's laws of gravity are a good approximation, but they're not fully correct. By measuring the distance of the moon from the earth, we know it is not quite in the location Newton's laws predict.
  • This is separate from the fact that the moon is slowly spiraling away from the earth: the energy that tidal friction takes out of the earth's rotation goes into raising the moon's orbit.

    Turns out we've also known for around a century that Mercury doesn't follow the orbit it should either, and we now know that it's because of the sun bending spacetime and, in turn, Mercury's orbit. I also didn't know that before Einstein gave us the theory of relativity and spacetime, scientists made up a small planet, not visible to us, that they called Vulcan to account for the real orbit of Mercury (see the Encyclopedia Britannica and the Shapiro time delay).

    By the way, the friction from the tides the moon causes on the earth also slows the earth down, making our days longer (that's the flip side of the moon spinning away from us). Brian Cox says in one of his shows that the earth day used to be 22 hours 600 million years ago (the earth is about 4.5 billion years old, as a reminder). In another NOVA/PBS show, I remember they found fossil evidence that the earth year was some 420 days (based on layer deposits), which in turn showed that the earth day was only 18 hours back then. And if you know a few things about the moon, you also know that it always faces us from the same side; since it is tidally locked and its orbit keeps getting larger and slower, its rotation on its own axis is slowing down too.

    Cool stuff...

    If you're interested, you can watch two old Horizon (BBC NOVA equivalent) shows with Brian Cox:

  • Do You Know What Time It Is? (wait for the 10sec delay after you hit play)
  • What On Earth Is Wrong With Gravity?
    2014/03/22 Btrfs Tips: Doing Fast Incremental Backups With Btrfs Send and Receive
    π 2014-03-22 00:00 in Btrfs, Linux

    Doing much faster incremental backups than rsync with btrfs send and btrfs receive

    If you are doing backups with rsync, you know that on big filesystems, it takes a long time for rsync to scan all the files on each side before it can finally sync them. You also know that rsync does not track file renames (unless you use --fuzzy and the file is in the same directory), and --fuzzy can be very expensive if you have directories with many files; I had it blow through my comcast bandwidth cap when I was rsyncing maildir backups.

    Just like ZFS, btrfs can compute the list of block changes between 2 snapshots and only send those blocks to the other side, making the backups much, much faster.
    At the time I'm writing this, it does work, but there are still a few bugs that could cause it to abort (no data loss, but it will refuse to sync further unless you start over from scratch). Most of those bugs have been fixed in kernel 3.14, so it is recommended you use that kernel or newer unless you're just trying this out for testing.

    How does it work?

  • This is all based on subvolumes, so please put all your data in subvolumes (even your root filesystem).
  • you make a read only snapshot at the source (let's say in /mnt/btrfs_pool1, you snapshot root to root_ro_timestamp)
  • you do one btrfs send/receive that sends that entire snapshot to the other side
  • the following times, you tell btrfs send to send the diff between that last read only snapshot and a new one you just made (a minimal manual example is sketched after this list)
  • on the other side, you only run btrfs receive in a btrfs block pool (let's say /mnt/btrfs_pool2). You do not give it any arguments linked to the backup name because it keeps track of the snapshot names from what was sent at the source.
  • If you'd like many more details, you can find some here:

  • http://lwn.net/Articles/506244
  • https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive
  • https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
  • In real life, this is tedious to do by hand, and even the script to write is not super obvious, so I wrote one that I'm sharing here. I actually do a fair amount of backups on the same machine (like I backup the SSD on my laptop to a hard drive on the same laptop every hour, because SSDs fail, and they could fail while I away from home without my regular off laptop backups), but the script does allow sending the backup to another machine (--dest).
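    Before looking at the script, here is roughly what the manual sequence looks like, using the pool paths from the bullets above and made-up snapshot names/dates:

    # First run: make a read-only snapshot and send all of it.
    cd /mnt/btrfs_pool1
    btrfs subvolume snapshot -r root root_ro.20140322
    sync    # make sure the snapshot is on disk before sending (see the note in the script below)
    btrfs send root_ro.20140322 | btrfs receive /mnt/btrfs_pool2/

    # Subsequent runs: snapshot again and send only the diff against the previous snapshot.
    btrfs subvolume snapshot -r root root_ro.20140323
    sync
    btrfs send -p root_ro.20140322 root_ro.20140323 | btrfs receive /mnt/btrfs_pool2/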

    This backup script does a bit more in the following ways:

  • As per my post on hourly/daily/weekly snapshots, I like snapshots, so I am using this backup script's snapshots as local data recovery snapshots too, and therefore keep a number of them around, not just the last one (see -k num).
  • On my laptop, I want the destination snapshot to be writable and I want to know automatically which snapshot is the latest, so the script creates snapshot_last and snapshot_last_rw symlinks. Using them, I can boot my system from those snapshots and use the system normally if my main boot SSD dies and I need to boot from the HD. Thankfully btrfs supports using -o subvol=root_last_rw as a subvolume name and will follow the symlink to the real volume: root_rw.20140321_07:00:35
  • At the same time as creating the extra _ro and _rw snapshots for time based recovery, it automatically rotates them out and deletes the oldest ones (--keep says how many to keep).
  • Here is a link to the latest version of btrfs-subvolume-backup and a paste of a potentially outdated version for you to look at:

    #!/bin/bash

    # By Marc MERLIN <marc_soft@merlins.org>
    # License: GPL-2 or BSD at your choice.

    # Source: http://marc.merlins.org/linux/scripts
    # $Id: btrfs-subvolume-backup 958 2014-03-16 00:23:28Z svnuser $

    # cron jobs might not have /sbin in their path.
    export PATH="$PATH:/sbin"

    set -o nounset
    set -o errexit
    set -o pipefail

    # From https://btrfs.wiki.kernel.org/index.php/Incremental_Backup

    # bash shortcut for `basename $0`
    PROG=${0##*/}
    lock=/var/run/$PROG

    usage() {
        cat <<EOF
    Usage:
    cd /mnt/source_btrfs_pool
    $PROG [--init] [--keep|-k num] [--dest hostname] volume_name /mnt/backup_btrfs_pool

    Options:
      --init:          Do a full copy instead of an incremental diff (required for the first run).
      --keep num:      Keep the last snapshots for local backups (5 by default)
      --dest hostname: If present, ssh to that machine to make the copy.

    This will snapshot volume_name in a btrfs pool, and send the diff between it
    and the previous snapshot (volume_name_last) to another btrfs pool (on other drives)

    If your backup destination is another machine, you'll need to add a few ssh
    commands to this script

    The num snapshots to keep is to give snapshots you can recover data from and
    they get deleted after num runs. Set to 0 to disable (one snapshot will be
    kept since it's required for the next diff to be computed).
    EOF
        exit 0
    }

    die () {
        msg=${1:-}
        # don't loop on ERR
        trap '' ERR

        rm $lock

        echo "$msg" >&2
        echo >&2

        # This is a fancy shell core dumper
        if echo $msg | grep -q 'Error line .* with status'; then
            line=`echo $msg | sed 's/.*Error line \(.*\) with status.*/\1/'`
            echo " DIE: Code dump:" >&2
            nl -ba $0 | grep -3 "\b$line\b" >&2
        fi

        exit 1
    }

    # Trap errors for logging before we die (so that they can be picked up
    # by the log checker)
    trap 'die "Error line $LINENO with status $?"' ERR

    init=""
    # Keep the last 5 snapshots by default
    keep=5
    TEMP=$(getopt --longoptions help,usage,init,keep:,dest: -o h,k:,d: -- "$@") || usage
    dest=localhost
    ssh=""

    # getopt quotes arguments with ' We use eval to get rid of that
    eval set -- $TEMP

    while :
    do
        case "$1" in
            -h|--help|--usage) usage; shift ;;
            --keep|-k) shift; keep=$1; shift ;;
            --dest|-d) shift; dest=$1; ssh="ssh $dest"; shift ;;
            --init) init=1; shift ;;
            --) shift; break ;;
            *) echo "Internal error!"; exit 1 ;;
        esac
    done

    [[ "$keep" -lt 1 ]] && die "Must keep at least one snapshot for things to work ($keep given)"

    DATE="$(date '+%Y%m%d_%H:%M:%S')"

    [[ $# != 2 ]] && usage
    vol="$1"
    dest_pool="$2"

    # shlock (from inn) does the right thing and grabs a lock for a dead process
    # (it checks the PID in the lock file and if it's not there, it
    # updates the PID with the value given to -p)
    if ! shlock -p $$ -f $lock; then
        echo "$lock held for $PROG, quitting" >&2
        exit
    fi

    if [[ -z "$init" ]]; then
        test -e "${vol}_last" || die "Cannot sync $vol, ${vol}_last missing. Try --init?"
        src_snap="$(readlink -e ${vol}_last)"
    fi
    src_newsnap="${vol}_ro.$DATE"
    src_newsnaprw="${vol}_rw.$DATE"

    $ssh test -d "$dest_pool/" || die "ABORT: $dest_pool not a directory (on $dest)"

    btrfs subvolume snapshot -r "$vol" "$src_newsnap"

    # There is currently an issue that the snapshots to be used with "btrfs send"
    # must be physically on the disk, or you may receive a "stale NFS file handle"
    # error. This is accomplished by "sync" after the snapshot
    sync

    if [[ -n "$init" ]]; then
        btrfs send "$src_newsnap" | $ssh btrfs receive "$dest_pool/"
    else
        btrfs send -p "$src_snap" "$src_newsnap" | $ssh btrfs receive "$dest_pool/"
    fi

    # We make a read-write snapshot in case you want to use it for a chroot
    # and some testing with a writeable filesystem or want to boot from a
    # last good known snapshot.
    btrfs subvolume snapshot "$src_newsnap" "$src_newsnaprw"
    $ssh btrfs subvolume snapshot "$dest_pool/$src_newsnap" "$dest_pool/$src_newsnaprw"

    # Keep track of the last snapshot to send a diff against.
    ln -snf $src_newsnap ${vol}_last
    # The rw version can be used for mounting with subvol=vol_last_rw
    ln -snf $src_newsnaprw ${vol}_last_rw
    $ssh ln -snf $src_newsnaprw $dest_pool/${vol}_last_rw

    # How many snapshots to keep on the source btrfs pool (both read
    # only and read-write).
    ls -rd ${vol}_ro* | tail -n +$(( $keep + 1 )) | while read snap; do btrfs subvolume delete "$snap"; done
    ls -rd ${vol}_rw* | tail -n +$(( $keep + 1 )) | while read snap; do btrfs subvolume delete "$snap"; done

    # Same thing for destination (assume the same number of snapshots to keep,
    # you can change this if you really want).
    $ssh ls -rd $dest_pool/${vol}_ro* | tail -n +$(( $keep + 1 )) | while read snap; do $ssh btrfs subvolume delete "$snap"; done
    $ssh ls -rd $dest_pool/${vol}_rw* | tail -n +$(( $keep + 1 )) | while read snap; do $ssh btrfs subvolume delete "$snap"; done

    rm $lock
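    For reference, assuming the pool paths from the bullets above (purely illustrative), invoking the script looks like this:

    # First backup ever: full copy, run from the source btrfs pool.
    cd /mnt/btrfs_pool1
    btrfs-subvolume-backup --init root /mnt/btrfs_pool2

    # Later runs (e.g. hourly from cron): incremental diff, keeping 5 local snapshots.
    btrfs-subvolume-backup --keep 5 root /mnt/btrfs_pool2

    # Same thing, but receiving on another machine over ssh.
    btrfs-subvolume-backup --keep 5 --dest otherhost root /mnt/btrfs_pool2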

    2014/03/21 Btrfs Tips: How To Setup Netapp Style Snapshots
    π 2014-03-21 00:00 in Btrfs, Linux

    How to get Netapp[tm]-like snapshots with BTRFS

    Filesystem snapshots are something you'll never want to live without once you've had them. I learned about them in 1997 when I was working for Network Appliance, but unfortunately, due to software patents, they managed to prevent most others from enjoying them until more recently.

    Linux did have crappy snapshots if you used LVM: both LVM and LVM2 snapshots were so bad performance-wise that they were barely usable (they were not meant to be long lived, and even a single snapshot would slow your filesystem down significantly, never mind multiple levels).

    If you can't use btrfs but you still want historical snapshots, you should look into LVM thin provisioning snapshots, which are new as of kernel 3.4. They are supposed to be faster for multiple levels of snapshots. Considering how bad LVM2 is, I'm sure they are faster no matter what, but I haven't had a use for them now that I'm using btrfs, so I can't speak to their performance. You can read up more here:

  • https://www.kernel.org/doc/Documentation/device-mapper/thin-provisioning.txt
  • https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/thinly-provisioned_snapshot_volumes.html
  • Back to btrfs: I use the recommended layout of putting each filesystem in its own subvolume. It looks like this (a hypothetical fstab for this layout is sketched after the listing below):

  • /mnt/btrfs_pool1 -> actual btrfs filesystem
  • /mnt/btrfs_pool1/root -> gets mounted to / with -o subvol=root
  • /mnt/btrfs_pool1/usr -> gets mounted to /usr with -o subvol=usr
  • /mnt/btrfs_pool1/var -> gets mounted to /var with -o subvol=var
  • After running my script, I get multiple levels of snapshots (I'll show only root here for brevity). With this, you can restore files from older versions 3 hours ago, 3 days ago, or 3 weeks ago. Here is the partial output of /mnt/btrfs_pool1:

    drwxr-xr-x 1 root root  370 Feb 24 10:38 root
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_daily_20140316_00:05:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_daily_20140318_00:05:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_daily_20140319_00:05:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_daily_20140320_00:05:00
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_hourly_20140316_22:33:00
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_hourly_20140318_00:05:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_hourly_20140319_00:05:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_hourly_20140320_00:05:00
    drwxr-xr-x 1 root root 336 Feb 19 21:40 root_weekly_20140223_00:06:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_weekly_20140302_00:06:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_weekly_20140309_00:06:01
    drwxr-xr-x 1 root root 370 Feb 24 10:38 root_weekly_20140316_00:06:01
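    For reference, a hypothetical /etc/fstab matching the layout above could look like this (the label btrfs_pool1 is made up; use your own label or UUID):

    LABEL=btrfs_pool1  /mnt/btrfs_pool1  btrfs  defaults,noatime              0  0
    LABEL=btrfs_pool1  /                 btrfs  defaults,noatime,subvol=root  0  0
    LABEL=btrfs_pool1  /usr              btrfs  defaults,noatime,subvol=usr   0  0
    LABEL=btrfs_pool1  /var              btrfs  defaults,noatime,subvol=var   0  0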

    Note that snapshots are not backups: they only give you a view into the past if your filesystem hasn't been corrupted and the disk you were using didn't die.

    I then have a cronjob that runs this:

    0 * * * * root btrfs-snaps hourly 3 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
    2 0 * * * root btrfs-snaps daily  4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
    3 0 * * 0 root btrfs-snaps weekly 4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'

    This is using the script, btrfs-snaps, for which I'll paste a most likely outdated copy here:

    #!/bin/bash

    # By Marc MERLIN <marc_soft@merlins.org>
    # License GPL-2 or BSD at your option.

    # This lets you create sets of snapshots at any interval (I use hourly,
    # daily, and weekly) and delete the older ones automatically.

    # Usage: this is called from /etc/cron.d like so:
    # 0 * * * * root btrfs-snaps hourly 3 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
    # 1 0 * * * root btrfs-snaps daily  4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
    # 2 0 * * 0 root btrfs-snaps weekly 4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'

    : ${BTRFSROOT:=/mnt/btrfs_pool1}
    DATE="$(date '+%Y%m%d_%H:%M:%S')"
    type=${1:-hourly}
    keep=${2:-3}

    cd "$BTRFSROOT"

    for i in $(btrfs subvolume list -q . | grep "parent_uuid -" | awk '{print $11}'); do
        # Skip duplicate dirs once a year on DST 1h rewind.
        test -d "$BTRFSROOT/${i}_${type}_$DATE" && continue
        echo "Making snapshot of $type"
        btrfs subvolume snapshot "$BTRFSROOT"/$i "$BTRFSROOT/${i}_${type}_$DATE"
        count="$(ls -d ${i}_${type}_* | wc -l)"
        clip=$(( $count - $keep ))
        if [ $clip -gt 0 ]; then
            echo "Will delete the oldest $clip snapshots for $type"
            for sub in $(ls -d ${i}_${type}_* | head -n $clip); do
                #echo "Will delete $sub"
                btrfs subvolume delete "$sub"
            done
        fi
    done

    2014/03/20 Btrfs Tips: ACPI S3 Sleep aka Suspend And Btrfs Scrub
    π 2014-03-20 00:00 in Btrfs, Linux

    Btrfs and S3 Sleep (Suspend)

    As of kernel 3.14, btrfs doesn't do the right thing to freeze a running scrub, which can prevent a laptop or machine from going into ACPI sleep.

    This is discussed in more detail in this thread: http://comments.gmane.org/gmane.comp.file-systems.btrfs/33106

    For now, I am using this crude workaround, which I added to /etc/acpi/sleep.sh:
    awk '/btrfs/ { print $1 }' /proc/mounts | sort -u | while read fs; do btrfs scrub cancel $fs; done

    This could easily be improved by running scrub status, pausing scrubs that are running instead of cancelling them, and resuming them after coming back from sleep.
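    A rough, untested sketch of that idea ("btrfs scrub cancel" acts as a pause that "btrfs scrub resume" can pick up later; this assumes the status output contains "running" while a scrub is in progress):

    # Before suspend: remember which btrfs filesystems had a scrub running, then cancel them.
    : > /var/run/btrfs-paused-scrubs
    awk '/btrfs/ { print $1 }' /proc/mounts | sort -u | while read fs; do
        if btrfs scrub status "$fs" | grep -q running; then
            btrfs scrub cancel "$fs"
            echo "$fs" >> /var/run/btrfs-paused-scrubs
        fi
    done

    # After waking up: resume only the scrubs that were paused.
    while read fs; do
        btrfs scrub resume "$fs"
    done < /var/run/btrfs-paused-scrubs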

