I learned most of this from Michael W. Lucas’s book FreeBSD Mastery: ZFS. Don’t let the title fool you. 99% of it is relevant to ZFS on Linux too.
A virtual device, or VDEV, is a named collection of one or more chunks of storage (usually disk partitions).
A VDEV with one device is called a stripe. It has no redundancy, so if the underlying device fails, the data is gone.
(More generally, a stripe is a chunk of data written to a single device.)
A VDEV with multiple disks/devices can store a copy of its complete data on each disk/device. This is called a mirrored VDEV. All but one device can fail without losing any data.
Otherwise a VDEV is RAID-Z and, depending on the configuration (RAID-Z1, RAID-Z2, or RAID-Z3), can lose one, two, or three devices without losing data.
In most cases, you can create the VDEVs as part of pool creation and do not need to create them first.
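A pool has a name and one or more VDEVs in it. You can add more VDEVs to a pool, but plan as if you can never remove one: newer OpenZFS releases can remove single-device and mirrored VDEVs with zpool remove in some cases, but not RAID-Z VDEVs.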
Best practice is to use the same kind of VDEV throughout a pool. E.g. don’t mix mirrored with RAID-Z1, or RAID-Z1 with RAID-Z3.
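Always set the “ashift” to 12 when creating new pools. The value is the base-2 logarithm of the sector size ZFS assumes, so 12 means 4096-byte (4K) sectors, and it cannot be changed once a VDEV has been created.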
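As a rough sketch of what the different VDEV kinds look like at pool creation time, here are two small shell wrappers around zpool create. The function names and the device names in the comments are made up for illustration; mirror and raidz2 are the VDEV keywords that zpool create accepts.

```shell
# Sketches only: pool and device names are supplied by the caller, and
# nothing touches a disk until one of these functions is called.

# Mirrored VDEV: every listed device holds a full copy of the data.
create_mirror_pool() {
    pool=$1; shift
    zpool create -o ashift=12 "$pool" mirror "$@"
}

# RAID-Z2 VDEV: survives the loss of any two of its devices.
create_raidz2_pool() {
    pool=$1; shift
    zpool create -o ashift=12 "$pool" raidz2 "$@"
}

# Example calls (hypothetical pool and devices):
#   create_mirror_pool tank /dev/sdb /dev/sdc
#   create_raidz2_pool tank /dev/sdb /dev/sdc /dev/sdd /dev/sde
```

Either way you end up with a pool containing a single VDEV, which is what the "same kind of VDEV" advice is easiest to follow with.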
A dataset is the thing that you mount into a system’s file tree as a directory. It’s part of a pool. Datasets form a hierarchy, with the pool forming a kind of default root dataset for the pool.
You have a new disk at /dev/sdc and you want to make it its own ZFS file system.
First, make it a pool.
    # zpool create -o ashift=12 poolname /dev/sdc
    # zpool list
    NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
    poolname  3.62T   480K  3.62T        -         -     0%     0%  1.00x    ONLINE  -
    # zfs list
    NAME       USED  AVAIL  REFER  MOUNTPOINT
    poolname   516K  3.51T    96K  /poolname
    # zpool status -LP rpool
      pool: rpool
     state: ONLINE
      scan: none requested
    config:
        NAME         STATE     READ WRITE CKSUM
        rpool        ONLINE       0     0     0
          /dev/sda7  ONLINE       0     0     0
Notice that the pool is kind of a dataset of its own, and even gets mounted by default, but you usually want to create child datasets to actually store stuff in.
    # zfs create -p poolname/dataset1
    # zfs list
    NAME                USED  AVAIL  REFER  MOUNTPOINT
    poolname            516K  3.51T    96K  /poolname
    poolname/dataset1    96K  3.51T    96K  /poolname/dataset1
-p creates all non-existing parent datasets, inheriting mountpoints from their parents. Any -o options on this command are not applied to parent datasets created due to -p.
Suppose you decide you don’t want that dataset anymore.
    # zfs destroy poolname/dataset1
    #
Enable compression with compression=on. Specifying on instead of lz4 or another specific algorithm lets ZFS use its current default compression algorithm rather than pinning one.
    # zfs set compress=on dataset
    # zfs get compress rpool
    NAME   PROPERTY     VALUE  SOURCE
    rpool  compression  on     local
By default, ZFS datasets do not show up in /etc/fstab, the traditional file where we configure mounted filesystems in Unix. ZFS mounts them itself based on the mount data configured directly on the datasets as properties. See below how to use /etc/fstab if you really want to.
Pools normally have mountpoints named after the pool, e.g. pool poolname would be mounted at /poolname. Children inherit that, so poolname/dataset1 ends up at /poolname/dataset1.
To control where a dataset is mounted, set the mountpoint property:
    # zfs set mountpoint=/opt poolname/dataset1
If you don’t want a dataset mounted, you can set the property canmount to off:
    # zfs set canmount=off poolname/dataset1
Ordinarily, properties not explicitly set on a dataset are inherited from their parent. But that does not apply to canmount for some reason. Child datasets will still be mounted after setting canmount=off on their parent.
Why would you have a dataset you didn’t want to mount? Maybe to set properties on it that its children can inherit.
To see the properties interesting for mounting:
    # zfs list -o name,canmount,mountpoint
    NAME              CANMOUNT  MOUNTPOINT
    sipower           off       /sipower
    sipower/Art       on        /opt/art
    sipower/books     on        /usr/local/books
    sipower/photo     on        /sipower/photo
    sipower/software  on        /sipower/software
    wdnas4            on        /wdnas4
If you want or need to control mounting of a ZFS dataset using mount commands, set its mountpoint property to legacy:
    # zfs set mountpoint=legacy poolname/dataset1
Then you can mount using filesystem type zfs either in /etc/fstab or using the mount command.
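A sketch of both options, assuming the dataset already has mountpoint=legacy. The helper name zfs_fstab_line and the defaults/0/0 fields are illustrative choices, not something ZFS requires, so check the line before adding it to /etc/fstab.

```shell
# Print an /etc/fstab line for a dataset whose mountpoint is "legacy".
# Fields: dataset, mount directory, filesystem type, options, dump, fsck pass.
zfs_fstab_line() {
    printf '%s\t%s\tzfs\tdefaults\t0\t0\n' "$1" "$2"
}

# Review the line, then append it yourself, e.g.:
#   zfs_fstab_line poolname/dataset1 /opt >> /etc/fstab
# One-off mount without touching /etc/fstab:
#   mount -t zfs poolname/dataset1 /opt
```

The first field is the dataset name, not a device node, since ZFS finds the underlying devices through the pool.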
By “removable media” here I mean any storage device you can disconnect in any way, even if you first have to power down the system, like hard drives and SSDs, not just USB thumb drives etc.
You can start using ZFS on removable media by just attaching it, finding the device where it’s showing up (maybe by using lsblk on Linux), and creating pools and datasets as above.
BEFORE REMOVING THE MEDIA, use the zpool export command. This will unmount things and tell ZFS not to consider this device part of the system anymore.
    # zpool list
    NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
    sipower   928G   122G   806G        -         -     0%    13%  1.00x    ONLINE  -
    wdnas4   3.62T   564K  3.62T        -         -     0%     0%  1.00x    ONLINE  -
    # zpool export wdnas4
    # zpool list
    NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
    sipower   928G   122G   806G        -         -     0%    13%  1.00x    ONLINE  -
Now you can disconnect the device, maybe take it to another system entirely or store it for a while, and eventually connect it to some system that supports ZFS. To make ZFS aware of it, use zpool import. Running it without arguments lists the pools available to import; run it again with a pool name to import that pool.
    # zpool import
       pool: wdnas4
         id: 4409664093715767562
      state: ONLINE
     action: The pool can be imported using its name or numeric identifier.
     config:
            wdnas4      ONLINE
              sdc       ONLINE
    # zpool import wdnas4
    # zpool list
    NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
    sipower   928G   122G   806G        -         -     0%    13%  1.00x    ONLINE  -
    wdnas4   3.62T   732K  3.62T        -         -     0%     0%  1.00x    ONLINE  -
    #
As part of importing, the pool’s datasets will be mounted according to their properties.
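Since forgetting the export step is the easy mistake here, a small check before unplugging can help. This is only a sketch: pool_is_imported is a made-up helper that reads the output of zpool list -H -o name (one pool name per line, no header) and reports whether the named pool is still in it.

```shell
# Succeeds (exit status 0) if pool "$1" appears on stdin as a whole line.
# Feed it the output of `zpool list -H -o name`.
pool_is_imported() {
    grep -qxF -- "$1"
}

# Usage:
#   if zpool list -H -o name | pool_is_imported wdnas4; then
#       echo "wdnas4 is still imported; run: zpool export wdnas4"
#   fi
```

Matching the whole line (-x) keeps a pool called wdnas4 from being confused with one called wdnas44.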
This gets really complicated. See chapter 6 of FreeBSD Mastery: ZFS for all the gory details.
Looking at pools
    # zpool get allocated,size,capacity
    NAME   PROPERTY   VALUE  SOURCE
    bpool  allocated  720M   -
    bpool  size       1.88G  -
    bpool  capacity   37%    -
    rpool  allocated  18.1G  -
    rpool  size       232G   -
    rpool  capacity   7%     -
    spool  allocated  1.68T  -
    spool  size       3.62T  -
    spool  capacity   46%    -
    # zpool get allocated,size,capacity,free spool
    NAME   PROPERTY   VALUE  SOURCE
    spool  allocated  1.68T  -
    spool  size       3.62T  -
    spool  capacity   46%    -
    spool  free       1.95T  -
But what’s using up all the space in our pools? That’s harder.
You can get a start with
    # zfs list
    NAME                                               USED  AVAIL  REFER  MOUNTPOINT
    rpool                                             18.1G   207G    96K  /
    rpool/ROOT                                        15.0G   207G    96K  none
    rpool/ROOT/ubuntu_u9xzty                          15.0G   207G  3.58G  /
    rpool/ROOT/ubuntu_u9xzty/srv                        96K   207G    96K  /srv
    rpool/ROOT/ubuntu_u9xzty/usr                      3.23M   207G    96K  /usr
    rpool/ROOT/ubuntu_u9xzty/usr/local                3.13M   207G  2.16M  /usr/local
    rpool/ROOT/ubuntu_u9xzty/var                      7.27G   207G    96K  /var
    rpool/ROOT/ubuntu_u9xzty/var/games                  96K   207G    96K  /var/games
    rpool/ROOT/ubuntu_u9xzty/var/lib                  6.88G   207G  2.38G  /var/lib
    rpool/ROOT/ubuntu_u9xzty/var/lib/AccountsService   816K   207G   104K  /var/lib/AccountsService
    rpool/ROOT/ubuntu_u9xzty/var/lib/NetworkManager   1.68M   207G   172K  /var/lib/NetworkManager
    rpool/ROOT/ubuntu_u9xzty/var/lib/apt               303M   207G   104M  /var/lib/apt
    rpool/ROOT/ubuntu_u9xzty/var/lib/dpkg              126M   207G  39.2M  /var/lib/dpkg
    rpool/ROOT/ubuntu_u9xzty/var/log                   401M   207G   192M  /var/log
    rpool/ROOT/ubuntu_u9xzty/var/mail                   96K   207G    96K  /var/mail
    rpool/ROOT/ubuntu_u9xzty/var/snap                  760K   207G   592K  /var/snap
    rpool/ROOT/ubuntu_u9xzty/var/spool                1.45M   207G   144K  /var/spool
    rpool/ROOT/ubuntu_u9xzty/var/www                   108K   207G   108K  /var/www
    rpool/USERDATA                                    3.03G   207G    96K  /
    rpool/USERDATA/devpi_ps1uzq                        394M   207G   394M  /home/devpi
    rpool/USERDATA/homeassistant_79drum               1.15G   207G   513M  /home/homeassistant
    rpool/USERDATA/hometheater_s261g2                  125M   207G  93.0M  /home/hometheater
    rpool/USERDATA/root_ndpbl6                         793M   207G   791M  /root
    rpool/USERDATA/strange_dyi0il                      618M   207G   225M  /home/strange
This shows a bunch of nested datasets, and each dataset’s USED space includes that of all the nested datasets, so you can’t just add them up as-is.
The AVAIL column is a bit more useful, but you have to remember that because snapshots and clones use Copy-On-Write, the AVAIL space could seemingly contain many times that much data.
You might think from this example that REFER tells you the unique space used by each dataset and you could just add that up, but again, no. Multiple datasets can REFER to the same collection of data. (Again, snapshots and clones do this.)
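One way to get numbers that don’t double-count children is to ask for the usedbydataset and usedbysnapshots properties instead of USED (zfs list -o space shows the same breakdown as extra columns). The helper below is a sketch with a made-up name; it only sorts that output so the datasets holding the most data of their own come first.

```shell
# Rank datasets by the space their own data occupies, largest first.
# Expects tab-separated "name, usedbydataset, usedbysnapshots" lines on
# stdin, i.e. the output of:
#   zfs list -Hp -o name,usedbydataset,usedbysnapshots
# (-H drops the header and separates fields with tabs, -p prints exact
# byte counts instead of rounded sizes like 3.58G).
# Optional argument: how many lines to show (default 10).
rank_by_own_usage() {
    sort -t "$(printf '\t')" -k2,2nr | head -n "${1:-10}"
}

# Usage:
#   zfs list -Hp -o name,usedbydataset,usedbysnapshots | rank_by_own_usage 5
```

The third column is left in so you can spot datasets whose snapshots hold more space than their live data.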
Deleting stuff doesn’t necessarily free space.
ZFS can take some time to asynchronously update snapshots and clones, so you might see the statistics continue to change for a while.
Stuff you delete might be referred to elsewhere, so until you find and remove all the references, that space will still be in use.
I’m not going into this any deeper here. Go read chapter 6 of the book.
If you want, you can have ZFS handle NFS export of a dataset itself, rather than adding the export to the NFS server’s own configuration. This blog post has more details.
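The property behind this is sharenfs. A minimal sketch, with made-up function names: sharenfs=on exports the dataset with default options, and which other option strings are accepted depends on the platform.

```shell
# Nothing runs until a function is called; the dataset name is the caller's.

# Export a dataset over NFS with default options.
nfs_share_on() {
    zfs set sharenfs=on "$1"
}

# Stop exporting it.
nfs_share_off() {
    zfs set sharenfs=off "$1"
}

# Example (hypothetical dataset):
#   nfs_share_on poolname/dataset1
#   zfs get sharenfs poolname/dataset1
```

Like most properties, sharenfs is inherited, so setting it on a parent dataset also shares the children unless they override it.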