From: Wolfgang Bumiller
To: pve-devel@lists.proxmox.com
Date: Wed, 23 Jun 2021 14:46:48 +0200
Message-Id: <20210623124648.191520-1-w.bumiller@proxmox.com>
Subject: [pve-devel] [RFC docs] add a basic BTRFS section
List-Id: Proxmox VE development discussion

Signed-off-by: Wolfgang Bumiller
---
Please give some feedback as to what else to add/change/remove.

 local-btrfs.adoc       | 177 +++++++++++++++++++++++++++++++++++++++++
 pve-storage-btrfs.adoc |  55 +++++++++++++
 pvesm.adoc             |   1 +
 sysadmin.adoc          |   2 +
 4 files changed, 235 insertions(+)
 create mode 100644 local-btrfs.adoc
 create mode 100644 pve-storage-btrfs.adoc

diff --git a/local-btrfs.adoc b/local-btrfs.adoc
new file mode 100644
index 0000000..e9b0c34
--- /dev/null
+++ b/local-btrfs.adoc
@@ -0,0 +1,177 @@
+[[chapter_btrfs]]
+BTRFS
+-----
+ifdef::wiki[]
+:pve-toplevel:
+endif::wiki[]
+
+BTRFS is a modern copy-on-write file system natively supported by the Linux
+kernel, implementing features such as snapshots, built-in RAID and self-healing
+via checksums for data and metadata. Starting with {pve} 7.0, BTRFS is
+introduced as an optional selection for the root file system.
+
+.General BTRFS advantages
+
+* Main system setup almost identical to the traditional ext4-based setup
+
+* Snapshots
+
+* Data compression on file system level
+
+* Copy-on-write clone
+
+* RAID0, RAID1 and RAID10
+
+* Protection against data corruption
+
+* Self-healing
+
+* Natively supported by the Linux kernel
+
+* ...
+
+.Caveats
+
+* RAID levels 5/6 are experimental and dangerous
+
+Installation as Root File System
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When you install using the {pve} installer, you can choose BTRFS for the root
+file system. You need to select the RAID type at installation time:
+
+[horizontal]
+RAID0:: Also called ``striping''. The capacity of such a volume is the sum
+of the capacities of all disks. But RAID0 does not add any redundancy,
+so the failure of a single drive makes the volume unusable.
+
+RAID1:: Also called ``mirroring''. Data is written identically to all
+disks. This mode requires at least 2 disks of the same size. The
+resulting capacity is that of a single disk.
+
+RAID10:: A combination of RAID0 and RAID1. Requires at least 4 disks.
+
+The installer automatically partitions the disks and creates an additional
+subvolume at `/var/lib/pve/local-btrfs`. In order to use that with the {pve}
+tools, the installer creates the following configuration entry in
+`/etc/pve/storage.cfg`:
+
+----
+dir: local
+        path /var/lib/vz
+        content iso,vztmpl,backup
+        disable
+
+btrfs: local-btrfs
+        path /var/lib/pve/local-btrfs
+        content iso,vztmpl,backup,images,rootdir
+----
+
+This explicitly disables the default `local` storage in favor of a
+BTRFS-specific storage entry on the additional subvolume.
+
+The `btrfs` command is used to configure and manage the BTRFS file system.
+After the installation, the following command lists all additional subvolumes:
+
+----
+# btrfs subvolume list /
+ID 256 gen 6 top level 5 path var/lib/pve/local-btrfs
+----
+
+BTRFS Administration
+~~~~~~~~~~~~~~~~~~~~
+
+This section gives you some usage examples for common tasks.
+
+Creating a BTRFS file system
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To create BTRFS file systems, `mkfs.btrfs` is used. The `-d` and `-m` parameters
+are used to set the profile for data and metadata respectively. With the
+optional `-L` parameter, a label can be set.
+
+Generally, the following modes are supported: `single`, `raid0`, `raid1`,
+`raid10`.
+
+Create a BTRFS file system on `/dev/sdb1`:
+
+----
+ # mkfs.btrfs -m single -d single -L My-Storage /dev/sdb1
+----
+
+Or create a RAID1 on `/dev/sdb1` and `/dev/sdc1`:
+
+----
+ # mkfs.btrfs -m raid1 -d raid1 -L My-Storage /dev/sdb1 /dev/sdc1
+----
+
+This can then be mounted or used in `/etc/fstab` like any other file system.
+
+For example:
+
+----
+ # mkdir /my-storage
+ # mount /dev/sdb1 /my-storage
+----
+
+Creating a subvolume
+^^^^^^^^^^^^^^^^^^^^
+
+Creating a subvolume links it to a path in the BTRFS file system, where it will
+appear as a regular directory.
+
+----
+# btrfs subvolume create /some/path
+----
+
+Afterwards `/some/path` will act like a regular directory.
+
+Deleting a subvolume
+^^^^^^^^^^^^^^^^^^^^
+
+Contrary to directories removed via `rmdir`, subvolumes do not need to be empty
+in order to be deleted via the `btrfs` command.
+
+----
+# btrfs subvolume delete /some/path
+----
+
+Creating a snapshot of a subvolume
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+BTRFS does not actually distinguish between snapshots and normal subvolumes, so
+taking a snapshot can also be seen as creating an arbitrary copy of a subvolume.
+By convention, {pve} will use the read-only flag when creating snapshots of
+guest disks or subvolumes, but this flag can also be changed later on.
+
+----
+# btrfs subvolume snapshot -r /some/path /a/new/path
+----
+
+This will create a read-only "clone" of the subvolume on `/some/path` at
+`/a/new/path`. Any future modifications to `/some/path` cause the modified data
+to be copied before modification.
+
+If the read-only (`-r`) option is left out, both subvolumes will be writable.
+
+Enabling compression
+^^^^^^^^^^^^^^^^^^^^
+
+By default, BTRFS does not compress data. To enable compression, the `compress`
+mount option can be added. Note that data already written will not be compressed
+after the fact.
+
+By default, the rootfs will be listed in `/etc/fstab` as follows:
+
+----
+UUID= / btrfs defaults 0 1
+----
+
+You can simply append `compress=zstd`, `compress=lzo`, or `compress=zlib` to the
+`defaults` above like so:
+
+----
+UUID= / btrfs defaults,compress=zstd 0 1
+----
+
+This change will take effect after rebooting.
diff --git a/pve-storage-btrfs.adoc b/pve-storage-btrfs.adoc
new file mode 100644
index 0000000..8947a76
--- /dev/null
+++ b/pve-storage-btrfs.adoc
@@ -0,0 +1,55 @@
+[[storage_btrfs]]
+BTRFS Backend
+-------------
+ifdef::wiki[]
+:pve-toplevel:
+:title: Storage: BTRFS
+endif::wiki[]
+
+Storage pool type: `btrfs`
+
+On the surface, this storage type is very similar to the directory storage type,
+so see the directory backend section for a general overview.
+
+The main difference is that with this storage type `raw` formatted disks will be
+placed in a subvolume, in order to allow taking snapshots and to support offline
+storage migration with snapshots being preserved.
+
+NOTE: BTRFS will honor the `O_DIRECT` flag when opening files, meaning VMs
+should not use cache mode `none`, otherwise there will be checksum errors.
+
+Configuration
+~~~~~~~~~~~~~
+
+This backend is configured similarly to the directory storage. Note that when
+adding a directory that is not itself the mount point as a BTRFS storage, it is
+highly recommended to specify the actual mount point via the `is_mountpoint`
+option.
+
+For example, if a BTRFS file system is mounted at `/mnt/data2` and its
+`pve-storage/` subdirectory (which may be a snapshot; this is recommended)
+should be added as a storage pool called `data2`, you can use the following
+entry:
+
+----
+btrfs: data2
+        path /mnt/data2/pve-storage
+        content rootdir,images
+        is_mountpoint /mnt/data2
+----
+
+Snapshots
+~~~~~~~~~
+
+When taking a snapshot of a subvolume or `raw` file, the snapshot will be
+created as a read-only subvolume with the same path followed by an `@` and the
+snapshot's name.
+
+ifdef::wiki[]
+
+See Also
+~~~~~~~~
+
+* link:/wiki/Storage[Storage]
+
+endif::wiki[]
diff --git a/pvesm.adoc b/pvesm.adoc
index 2e8ee89..15c0b16 100644
--- a/pvesm.adoc
+++ b/pvesm.adoc
@@ -433,6 +433,7 @@ include::pve-storage-rbd.adoc[]
 
 include::pve-storage-cephfs.adoc[]
 
+include::pve-storage-btrfs.adoc[]
 
 
 ifdef::manvolnum[]
diff --git a/sysadmin.adoc b/sysadmin.adoc
index 3f62619..361fe02 100644
--- a/sysadmin.adoc
+++ b/sysadmin.adoc
@@ -62,6 +62,8 @@ include::local-lvm.adoc[]
 
 include::local-zfs.adoc[]
 
+include::local-btrfs.adoc[]
+
 include::pvenode.adoc[]
 
 include::certificate-management.adoc[]
-- 
2.30.2
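
The naming rule from the Snapshots section (same path, an `@`, then the
snapshot's name) and the `btrfs subvolume snapshot -r` command from the
administration section can be combined in shell roughly as follows. This is
only an illustrative sketch: the helper names `snapshot_path` and
`print_snapshot_cmd` and the snapshot name `before-upgrade` are made up for
this example and are not part of Proxmox VE. The script only prints the
command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): build the snapshot path as
# described in the Snapshots section, i.e. the volume path, an `@`, and the
# snapshot's name.
snapshot_path() {
    printf '%s@%s\n' "$1" "$2"
}

# Print (do not execute) the btrfs command that would create a read-only
# snapshot of the subvolume $1 under the snapshot name $2.
print_snapshot_cmd() {
    printf 'btrfs subvolume snapshot -r %s %s\n' "$1" "$(snapshot_path "$1" "$2")"
}

# Example with the placeholder path used throughout the documentation.
print_snapshot_cmd /some/path before-upgrade
```

Printing rather than executing keeps the sketch safe to try on any machine; to
actually take the snapshot, the printed command would have to be run as root
on a BTRFS file system.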