From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dominik Csapak
To: pbs-devel@lists.proxmox.com
Date: Fri, 4 Nov 2022 10:49:34 +0100
Message-Id: <20221104094934.1135932-1-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pbs-devel] [PATCH proxmox-backup v5] datastore: make 'filesystem' the default sync-level

The rationale is that it makes backups much safer than 'none', but does
not incur as big of a performance hit as 'file'.

Here are some benchmarks:

data to be backed up:
~14GiB semi-random test images between 12kiB and 4GiB,
resulting in ~11GiB of chunks (more than the RAM available on the target)

PBS setup:
virtualized (on an idle machine), PBS itself was also idle
8 cores (kvm64 on an Intel 12700k) and 8 GiB memory
all virtual disks are on LVM with discard and iothread on
the HDD is a 4TB Seagate ST4000DM000 drive, and the NVMe is a 2TB
Crucial CT2000P5PSSD8

I tested each disk with ext4/xfs/zfs (default created with the GUI),
with 5 runs each; in between runs, the caches were flushed and the
filesystem synced.

I removed the biggest and smallest result and built the average of the
remaining 3 results (percentages are relative to the 'none' result):

result:
test          none     filesystem          file
hdd  - ext4   125.67s  140.39s (+11.71%)   358.10s (+184.95%)
hdd  - xfs     92.18s  102.64s (+11.35%)   351.58s (+281.41%)
hdd  - zfs     94.82s  104.00s  (+9.68%)   309.13s (+226.02%)
nvme - ext4    60.44s   60.26s  (-0.30%)    60.47s   (+0.05%)
nvme - xfs     60.11s   60.47s  (+0.60%)    60.49s   (+0.63%)
nvme - zfs     60.83s   60.85s  (+0.03%)    60.80s   (-0.05%)

So all in all, it does not seem to make a difference for NVMe drives;
for HDDs, 'filesystem' increases backup time by ~10%, while for 'file'
it largely depends on the filesystem, but is always in the range of a
factor of ~3 to ~4.

Note that this does not take into account parallel actions, such as GC,
verify or other backups.
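The averaging described above (drop the best and worst of 5 runs, average the remaining 3, report the result relative to the 'none' baseline) can be sketched as follows. This is only an illustration of the arithmetic; `trimmed_mean` and `relative_pct` are hypothetical helpers, not part of the PBS code base, and the run times in `main` are made up:

```rust
/// Trimmed mean as used in the benchmark: drop the single biggest and
/// smallest result, then average the remaining runs.
fn trimmed_mean(runs: &mut [f64]) -> f64 {
    runs.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let inner = &runs[1..runs.len() - 1];
    inner.iter().sum::<f64>() / inner.len() as f64
}

/// Percentage relative to the 'none' baseline, e.g. +11.71% for hdd-ext4.
fn relative_pct(value: f64, baseline: f64) -> f64 {
    (value / baseline - 1.0) * 100.0
}

fn main() {
    // hypothetical 'filesystem' run times on hdd-ext4, in seconds
    let mut runs = [140.0, 139.5, 141.2, 150.0, 130.0];
    let avg = trimmed_mean(&mut runs);
    // baseline 125.67s is the 'none' average from the table above
    println!("avg = {avg:.2}s, vs none: {:+.2}%", relative_pct(avg, 125.67));
}
```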
Signed-off-by: Dominik Csapak
---
changes from v4:
* included benchmark & rationale in the commit message

 docs/storage.rst               | 4 ++--
 pbs-api-types/src/datastore.rs | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/storage.rst b/docs/storage.rst
index c4e44c72..d61c3a40 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -344,13 +344,13 @@ and only available on the CLI:
   the crash resistance of backups in case of a powerloss or hard shutoff.
   There are currently three levels:
 
-  - `none` (default): Does not do any syncing when writing chunks. This is fast
+  - `none` : Does not do any syncing when writing chunks. This is fast
     and normally OK, since the kernel eventually flushes writes onto the disk.
     The kernel sysctls `dirty_expire_centisecs` and `dirty_writeback_centisecs`
     are used to tune that behaviour, while the default is to flush old data
     after ~30s.
 
-  - `filesystem` : This triggers a ``syncfs(2)`` after a backup, but before
+  - `filesystem` (default): This triggers a ``syncfs(2)`` after a backup, but before
     the task returns `OK`. This way it is ensured that the written backups
     are on disk. This is a good balance between speed and consistency. Note
     that the underlying storage device still needs to protect itself against
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 4c9eda2f..ff8fe55f 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -181,7 +181,6 @@ pub enum DatastoreFSyncLevel {
     /// which reduces IO pressure.
     /// But it may cause losing data on powerloss or system crash without any uninterruptible power
     /// supply.
-    #[default]
     None,
     /// Triggers a fsync after writing any chunk on the datastore. While this can slow down
     /// backups significantly, depending on the underlying file system and storage used, it
@@ -196,6 +195,7 @@ pub enum DatastoreFSyncLevel {
     /// Depending on the setup, it might have a negative impact on unrelated write operations
     /// of the underlying filesystem, but it is generally a good compromise between performance
     /// and consitency.
+    #[default]
     Filesystem,
 }
 
-- 
2.30.2
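For illustration, here is a minimal sketch of where each sync level would flush during chunk writing. `FsyncLevel` and `write_chunk` are hypothetical, std-only stand-ins (the real `DatastoreFSyncLevel` lives in pbs-api-types, and the real 'filesystem' level calls `syncfs(2)` once after the backup, which std does not expose, so that step is only described in a comment):

```rust
use std::fs::File;
use std::io::Write;

/// Simplified mirror of DatastoreFSyncLevel; only the variant names and the
/// new default are taken from the patch, the rest is illustrative.
#[derive(Default, Debug, PartialEq)]
enum FsyncLevel {
    None,
    File,
    #[default]
    Filesystem, // the new default introduced by this patch
}

/// Hypothetical chunk writer showing where each level syncs.
fn write_chunk(path: &std::path::Path, data: &[u8], level: &FsyncLevel) -> std::io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(data)?;
    if let FsyncLevel::File = level {
        // 'file': fsync every single chunk -- safest, but ~3-4x slower on HDDs
        f.sync_all()?;
    }
    // 'filesystem': no per-chunk sync; a single syncfs(2) on the datastore
    // runs after the whole backup, before the task returns OK (needs libc).
    // 'none': rely on kernel writeback (dirty_expire_centisecs etc.).
    Ok(())
}

fn main() -> std::io::Result<()> {
    let level = FsyncLevel::default();
    assert_eq!(level, FsyncLevel::Filesystem);
    let path = std::env::temp_dir().join("chunk.bin");
    write_chunk(&path, b"chunk data", &level)
}
```

The `#[default]` attribute on the `Filesystem` variant matches what the patch does: `Default::default()` now yields the safer level without callers having to opt in.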