Marc's Public Blog - Linux Hacking


All | Aquariums | Arduino | Btrfs | Cars | Cats | Clubbing | Dining | Diving | Electronics | Exercising | Flying | Hiking | Linux | Linuxha | Public | Rc | Sciencemuseums | Snow | Solar | Trips

This page has a few of my blog entries about linux, but my main linux page is here
Picture of Linus


Table of Content for linux:

More pages: January 2017 October 2016 August 2016 July 2016 June 2016 February 2016 January 2016 May 2015 March 2015 January 2015 October 2014 May 2014 April 2014 March 2014 January 2014 November 2013 September 2013 May 2013 March 2013 January 2013 December 2012 August 2012 May 2012 March 2012 January 2012 December 2011 August 2011 July 2011 January 2011 October 2010 August 2010 June 2010 April 2010 March 2010 January 2010 December 2009 November 2009 September 2009 August 2009 July 2009 May 2009 January 2009 December 2008 November 2008 October 2008 January 2008 November 2007 August 2007 July 2006 January 2006 August 2005 April 2005 November 2004 March 2004 February 2004



2014/05/21 My Btrfs Talk at Linuxcon JP 2014
π 2014-05-21 00:00 in Btrfs, Linux
This all started with my trying Btrfs after Avi Miller's talk at Linux.conf.au. I started asking some questions on the mailing list, and since the wiki wasn't up to date on them, I ddi the obvious thing to start updating the Main Btrfs Wiki here and there.

After I had learned a fair amount about Btrfs already I felt others would benefit from a introduction to it, as well as best practises and what features it offers that would make you want to switch, I ended up submitting a talk to Linuxcon. In the process of writing the talk, I learned even more about Btrfs, and wrote some of them on my Btrfs blog while linking or putting the other relevant ones on the Main Btrfs Wiki.



The fancy talk description is here:

The presentation will give you everything you know to get up to speed with btfrs, why you should want to trust your data to btrfs, how it offers a lot of what ZFS offers without the licensing problems, as well as best practices for using it.

I will go into:

  • the basics of administration of a btrfs filesystem
  • How btrfs, swraid, dmcrypt, and lvm fit or don't fit together
  • how to work with a single storage pool and create all your partitions from it without having to ever resize them, or require LVM as a slow and somewhat unreliable block layer.
  • how to have virtually as many snapshots as you want and why you really want this
  • how to do very efficient block level backups of changes only and much faster than rsync ever will
  • how those block backups can be used to deploy OS upgrades at the file level like I explained in my talk on how Google maintains its many servers last year.
  • The video:

    The quality isn't great due to poor lighting, sorry about this, but you get the talk slides here or view them inline below.
    For easier to click on links, you may prefer the pdf version or the libreoffice version.


    > To click on URLs in the presentation, click on 'with contents' in the upper left, and use those links (open to a new tab)

    See more images for My Btrfs Talk at Linuxcon JP 2014
    2014/05/20 Historical Snapshots of Backups With Btrfs
    π 2014-05-20 00:00 in Btrfs, Linux

    How to manage historical snapshots of backups with Btrfs

    I have a setup where I backup a certain number of machines to a central server. There are multiple ways to do hierarchical backups with btrfs.

    snapshots and rsync on top

    http://marc.merlins.org/linux/talks/Btrfs-LC2014-JP/html/img33.html
  • not great because of COW relationship lost to unix tools,
  • not great because backing up server on another one requires a lot more snapshots of snapshots for btrfs send of old things that never change
  • works for deduping data that partially changes or changes owners
  • cp -a --link + rsync

    http://marc.merlins.org/linux/talks/Btrfs-LC2014-JP/html/img34.html
  • newer btrfs should be ok with many hardlinks
  • du can figure out the data saved
  • you can use hardlinks.py instead of bedup
  • can be transferred via cp/rsync without losing links
  • but hardlinks will not work across subvolumes
  • cp -a --reflink + rsync

    http://marc.merlins.org/linux/talks/Btrfs-LC2014-JP/html/img35.html
  • very nice, does not require hardlinks which you can't do across subvolumes
  • works for deduping data that partially changes or changes owners
  • but totally lost as soon as you copy it anywhere else (unless you use btrfs send)
  • 2014/05/20 Linuxcon Japan 2014 in Tokyo
    π 2014-05-20 00:00 in Linux
    I was thankful to be invited to Linuxcon Japan to speak again this year, and I did a talk about the usage for btrfs.


    A few pictures from the event:

    Keynotes:





    A few talk slides:




    A very interesting talk for Japanese people on how they can better interact with the linux community and what pitfalls they should avoid:




    And gatherings/people:



    it had been ages since I had last seen Daniel Veillard from RH, a containerization expert. Made for good talks.
    it had been ages since I had last seen Daniel Veillard from RH, a containerization expert. Made for good talks.



    my buddy Horms
    my buddy Horms

    the French connection :)
    the French connection :)


    See more images for Linuxcon Japan 2014 in Tokyo
    2014/05/19 Btrfs-diff Between Snapshots
    π 2014-05-19 00:00 in Btrfs, Linux

    Differences between two btrfs snapshots

    When you have historical snapshots, it may be useful to know what changed between 2 snapshots.

    The best way to do this long term is to modify "btrfs send" to compute changes between the snapshots and just output the filelist instead of a stream with data.

    However, until then, there is a hack that shows you files that got added and removed between two snapshots. It's not bulletproof like btrfs send, but it can give you a quick mostly working diff between two snapshots (*it will not show renames or deletes*). See more caveats on this original serverfault post.

    legolas:/mnt/btrfs_pool1# btrfs-diff usr_ro.20140513_05:00:01/ usr_ro.20140514_06:00:02/ share/doc/linux-image-3.15.0-rc5-amd64-i915-preempt-20140216s1/buildinfo.gz share/doc/linux-image-3.15.0-rc5-amd64-i915-preempt-20140216s1/Buildinfo.gz share/doc/linux-image-3.15.0-rc5-amd64-i915-preempt-20140216s1/changelog.Debian.gz share/doc/linux-image-3.15.0-rc5-amd64-i915-preempt-20140216s1/Changes.gz (...)

    You can download my latest snapshot of btrfs-diff. Note that I am not the author, it was copied from this serverfault post.

    #!/bin/bash
    

    # Author: http://serverfault.com/users/96883/artfulrobot # License: Unknown # # This script will show most files that got modified or added. # Renames and deletions will not be shown. # Read limitations on: # http://serverfault.com/questions/399894/does-btrfs-have-an-efficient-way-to-compare-snapshots # # btrfs send is the best way to do this long term, but as of kernel # 3.14, btrfs send cannot just send a list of changed files without # scanning and sending all the changed data blocks along.

    usage() { echo $@ >&2; echo "Usage: $0 <older-snapshot> <newer-snapshot>" >&2; exit 1; }

    [ $# -eq 2 ] || usage "Incorrect invocation"; SNAPSHOT_OLD=$1; SNAPSHOT_NEW=$2;

    [ -d $SNAPSHOT_OLD ] || usage "$SNAPSHOT_OLD does not exist"; [ -d $SNAPSHOT_NEW ] || usage "$SNAPSHOT_NEW does not exist";

    OLD_TRANSID=`btrfs subvolume find-new "$SNAPSHOT_OLD" 9999999` OLD_TRANSID=${OLD_TRANSID#transid marker was } [ -n "$OLD_TRANSID" -a "$OLD_TRANSID" -gt 0 ] || usage "Failed to find generation for $SNAPSHOT_NEW"

    btrfs subvolume find-new "$SNAPSHOT_NEW" $OLD_TRANSID | sed '$d' | cut -f17- -d' ' | sort | uniq

    2014/05/04 Fixing Btrfs Filesystem Full Problems
    π 2014-05-04 00:00 in Btrfs, Linux

    Fixing Btrfs Filesystem Full Problems

    Clear space now

    If you have historical snapshots, the quickest way to get space back so that you can look at the filesystem and apply better fixes and cleanups is to drop the oldest historical snapshots.

    Two things to note:

  • If you have historical snapshots as described here, delete the oldest ones first, and wait (see below). However if you just just deleted 100GB, and replaced it with another 100GB which failed to fully write, giving you out of space, all your snapshots will have to be deleted to clear the blocks of that old file you just removed to make space for the new one (actually if you know exactly what file it is, you can go in all your snapshots and manually delete it, but in the common case it'll be multiple files and you won't know which ones, so you'll have to drop all your snapshots before you get the space back).
  • After deleting snapshots, it can take a minute or more for btrfs fi show to show the space freed. Do not be too impatient, run btrfs fi show in a loop and see if the number changes every minute. If it does not, carry on and delete other snapshots or look at rebalancing.
  • Note that even in the cases described below, you may have to clear one snapshot or more to make space before btrfs balance can run. As a corollary, btrfs can get in states where it's hard to get it out of the 'no space' state it's in. As a result, even if you don't need snapshot, keeping at least one around to free up space should you hit that mis-feature/bug, can be handy

    Is your filesystem really full? Mis-balanced data chunks

    Look at filesystem show output:

    legolas:~# btrfs fi show
    Label: btrfs_pool1  uuid: 4850ee22-bf32-4131-a841-02abdb4a5ba6
    	Total devices 1 FS bytes used 441.69GiB
    	devid    1 size 865.01GiB used 751.04GiB path /dev/mapper/cryptroot
    Only about 50% of the space is used (441 out of 865GB), but the device is 88% full (751 out of 865MB). Unfortunately it's not uncommon for a btrfs device to fill up due to the fact that it does not rebalance chunks (3.18+ has started freeing empty chunks, which is a step in the right direction).

    In the case above, because the filesystem is only 55% full, I can ask balance to rewrite all chunks that have less than 55% space used. Rebalancing those blocks actually means taking the data in those blocks, and putting it in fuller blocks so that you end up being able to free the less used blocks.
    This means the bigger the -dusage value, the more work balance will have to do (i.e. taking fuller and fuller blocks and trying to free them up by putting their data elsewhere). Also, if your FS is 55% full, using -dusage=55 is ok, but there isn't a 1 to 1 correlation and you'll likely be ok with a smaller dusage number, so start small and ramp up as needed.

    legolas:~# btrfs balance start -dusage=55 /mnt/btrfs_pool1 &

    # Follow the progress along with: legolas:~# while :; do btrfs balance status -v /mnt/btrfs_pool1; sleep 60; done Balance on '/mnt/btrfs_pool1' is running 10 out of about 315 chunks balanced (22 considered), 97% left Dumping filters: flags 0x1, state 0x1, force is off DATA (flags 0x2): balancing, usage=55 Balance on '/mnt/btrfs_pool1' is running 16 out of about 315 chunks balanced (28 considered), 95% left Dumping filters: flags 0x1, state 0x1, force is off DATA (flags 0x2): balancing, usage=55 (...)

    When it's over, the filesystem now looks like this (note devid used is now 513GB instead of 751GB):

    legolas:~# btrfs fi show
    Label: btrfs_pool1  uuid: 4850ee22-bf32-4131-a841-02abdb4a5ba6
    	Total devices 1 FS bytes used 441.64GiB
    	devid    1 size 865.01GiB used 513.04GiB path /dev/mapper/cryptroot

    Before you ask, yes, btrfs should do this for you on its own, but currently doesn't as of 3.14.

    Is your filesystem really full? Misbalanced metadata

    Unfortunately btrfs has another failure case where the metadata space can fill up. When this happens, even though you have data space left, no new files will be writeable.

    In the example below, you can see Metadata DUP 9.5GB out of 10GB. Btrfs keeps 0.5GB for itself, so in the case above, metadata is full and prevents new writes.

    One suggested way is to force a full rebalance, and in the example below you can see metadata goes back down to 7.39GB after it's done. Yes, there again, it would be nice if btrfs did this on its own. It will one day (some if it is now in 3.18).

    Sometimes, just using -dusage=0 is enough to rebalance metadata (this is now done automatically in 3.18 and above), but if it's not enough, you'll have to increase the number.

    legolas:/mnt/btrfs_pool2# btrfs fi df . Data, single: total=800.42GiB, used=636.91GiB System, DUP: total=8.00MiB, used=92.00KiB System, single: total=4.00MiB, used=0.00 Metadata, DUP: total=10.00GiB, used=9.50GiB Metadata, single: total=8.00MiB, used=0.00

    legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2 Dumping filters: flags 0x1, state 0x0, force is off DATA (flags 0x2): balancing, usage=0 Done, had to relocate 91 out of 823 chunks

    legolas:/mnt/btrfs_pool2# btrfs fi df . Data, single: total=709.01GiB, used=603.85GiB System, DUP: total=8.00MiB, used=88.00KiB System, single: total=4.00MiB, used=0.00 Metadata, DUP: total=10.00GiB, used=7.39GiB Metadata, single: total=8.00MiB, used=0.00

    Balance cannot run because the filesystem is full

    One trick to get around this is to add a device (even a USB key will do) to your btrfs filesystem. This should allow balance to start, and then you can remove the device with btrfs device delete when the balance is finished.
    It's also been said on the list that kernel 3.14 can fix some balancing issues that older kernels can't, so give that a shot if your kernel is old.

    Note, it's even possible for a filesystem to be full in a way that you cannot even delete snapshots to free space. This shows how you would work around it:

    root@polgara:/mnt/btrfs_pool2# btrfs fi df .
    Data, single: total=159.67GiB, used=80.33GiB
    System, single: total=4.00MiB, used=24.00KiB
    Metadata, single: total=8.01GiB, used=7.51GiB <<<< BAD
    root@polgara:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2
    Dumping filters: flags 0x1, state 0x0, force is off
      DATA (flags 0x2): balancing, usage=0
    Done, had to relocate 0 out of 170 chunks
    root@polgara:/mnt/btrfs_pool2# btrfs balance start -v -dusage=1 /mnt/btrfs_pool2
    Dumping filters: flags 0x1, state 0x0, force is off
      DATA (flags 0x2): balancing, usage=1
    ERROR: error during balancing '/mnt/btrfs_pool2' - No space left on device
    There may be more info in syslog - try dmesg | tail
    root@polgara:/mnt/btrfs_pool2# dd if=/dev/zero of=/var/tmp/btrfs bs=1G count=5
    5+0 records in
    5+0 records out
    5368709120 bytes (5.4 GB) copied, 7.68099 s, 699 MB/s
    root@polgara:/mnt/btrfs_pool2# losetup -v -f /var/tmp/btrfs
    Loop device is /dev/loop0
    root@polgara:/mnt/btrfs_pool2# btrfs device add /dev/loop0 .
    Performing full device TRIM (5.00GiB) ...
    root@polgara:/mnt/btrfs_pool2# btrfs subvolume delete space2_daily_20140603_00:05:01
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140603_00:05:01'
    root@polgara:/mnt/btrfs_pool2# for i in *daily*; do btrfs subvolume delete $i; done
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140604_00:05:01'
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140605_00:05:01'
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140606_00:05:01'
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140607_00:05:01'
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140608_00:05:01'
    Delete subvolume '/mnt/btrfs_pool2/space2_daily_20140609_00:05:01'
    root@polgara:/mnt/btrfs_pool2# btrfs device delete /dev/loop0 .
    root@polgara:/mnt/btrfs_pool2# btrfs balance start -v -dusage=1 /mnt/btrfs_pool2
    Dumping filters: flags 0x1, state 0x0, force is off
      DATA (flags 0x2): balancing, usage=1
    Done, had to relocate 5 out of 169 chunks
    

    root@polgara:/mnt/btrfs_pool2# btrfs fi df . Data, single: total=154.01GiB, used=80.06GiB System, single: total=4.00MiB, used=28.00KiB Metadata, single: total=8.01GiB, used=4.88GiB <<< GOOD

    Misc Balance Resources

    For more info, please read:
  • https://btrfs.wiki.kernel.org/index.php/FAQ#Raw_disk_usage
  • https://btrfs.wiki.kernel.org/index.php/Balance_Filters

  • More pages: January 2017 October 2016 August 2016 July 2016 June 2016 February 2016 January 2016 May 2015 March 2015 January 2015 October 2014 May 2014 April 2014 March 2014 January 2014 November 2013 September 2013 May 2013 March 2013 January 2013 December 2012 August 2012 May 2012 March 2012 January 2012 December 2011 August 2011 July 2011 January 2011 October 2010 August 2010 June 2010 April 2010 March 2010 January 2010 December 2009 November 2009 September 2009 August 2009 July 2009 May 2009 January 2009 December 2008 November 2008 October 2008 January 2008 November 2007 August 2007 July 2006 January 2006 August 2005 April 2005 November 2004 March 2004 February 2004