From: Dominik Csapak <d.csapak@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Fri,  4 Nov 2022 10:49:34 +0100
Message-Id: <20221104094934.1135932-1-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pbs-devel] [PATCH proxmox-backup v5] datastore: make 'filesystem'
 the default sync-level

The rationale is that it makes the backup much safer than 'none', but does
not incur as big a performance hit as 'file'.

Here are some benchmarks:

Data to be backed up:
~14GiB of semi-random test images, between 12kiB and 4GiB each,
resulting in ~11GiB of chunks (more than the RAM available on the target).

PBS setup:
virtualized (on an otherwise idle machine; PBS itself was also idle),
8 cores (kvm64 on an Intel 12700k) and 8 GiB of memory.

All virtual disks are on LVM with discard and iothread enabled.
The HDD is a 4TB Seagate ST4000DM000 drive, and the NVMe is a 2TB
Crucial CT2000P5PSSD8.

I tested each disk with ext4/xfs/zfs (created with the GUI defaults),
with 5 runs each; in between runs, the caches were flushed and the
filesystems synced. I dropped the biggest and smallest result and
averaged the remaining 3 (percentages are relative to the 'none'
result), as sketched below.
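
As a rough sketch of that aggregation (hypothetical helper names; the
actual benchmark script is not part of this patch):

    fn trimmed_mean(mut runs: [f64; 5]) -> f64 {
        // drop the smallest and the biggest run, average the middle three
        runs.sort_by(|a, b| a.partial_cmp(b).unwrap());
        runs[1..4].iter().sum::<f64>() / 3.0
    }

    fn delta_vs_none(result: f64, none: f64) -> f64 {
        // percentage relative to the 'none' baseline
        (result - none) / none * 100.0
    }

For example, for hdd - ext4 with 'filesystem':
(140.39 - 125.67) / 125.67 ≈ +11.71%, as in the table below.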

Results:

test         none     filesystem         file
hdd - ext4   125.67s  140.39s (+11.71%)  358.10s (+184.95%)
hdd - xfs    92.18s   102.64s (+11.35%)  351.58s (+281.41%)
hdd - zfs    94.82s   104.00s (+9.68%)   309.13s (+226.02%)
nvme - ext4  60.44s   60.26s (-0.30%)    60.47s (+0.05%)
nvme - xfs   60.11s   60.47s (+0.60%)    60.49s (+0.63%)
nvme - zfs   60.83s   60.85s (+0.03%)    60.80s (-0.05%)

All in all, it does not seem to make a difference for NVMe drives.
For HDDs, 'filesystem' increases backup time by ~10%, while the cost
of 'file' depends largely on the filesystem, but is always in the
range of a factor of ~3 to ~4.

Note that this does not take into account parallel actions, such as gc,
verify or other backups.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v4:
* included benchmark & rationale in the commit message
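
For context, a minimal sketch (not code from this patch; assumes the
libc crate) of what the three sync levels roughly map to on Linux:

    use std::fs::File;
    use std::io::Write;
    use std::os::unix::io::AsRawFd;

    enum SyncLevel {
        None,       // rely on kernel writeback (old data flushed after ~30s)
        Filesystem, // a single syncfs(2) after the whole backup
        File,       // an fsync(2) after every chunk
    }

    fn write_chunk(path: &str, data: &[u8], level: &SyncLevel) -> std::io::Result<()> {
        let mut f = File::create(path)?;
        f.write_all(data)?;
        if let SyncLevel::File = level {
            f.sync_all()?; // fsync(2): durable per chunk, but slow on HDDs
        }
        Ok(())
    }

    fn finish_backup(datastore: &File, level: &SyncLevel) -> std::io::Result<()> {
        if let SyncLevel::Filesystem = level {
            // flush the whole filesystem containing the datastore before
            // the backup task returns OK
            if unsafe { libc::syncfs(datastore.as_raw_fd()) } != 0 {
                return Err(std::io::Error::last_os_error());
            }
        }
        Ok(())
    }

This also matches the HDD numbers above: one syncfs(2) amortizes over
all chunks of a backup, while a per-chunk fsync(2) forces a flush for
every write.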

 docs/storage.rst               | 4 ++--
 pbs-api-types/src/datastore.rs | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/storage.rst b/docs/storage.rst
index c4e44c72..d61c3a40 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -344,13 +344,13 @@ and only available on the CLI:
   the crash resistance of backups in case of a powerloss or hard shutoff.
   There are currently three levels:
 
-  - `none` (default): Does not do any syncing when writing chunks. This is fast
+  - `none` : Does not do any syncing when writing chunks. This is fast
     and normally OK, since the kernel eventually flushes writes onto the disk.
     The kernel sysctls `dirty_expire_centisecs` and `dirty_writeback_centisecs`
     are used to tune that behaviour, while the default is to flush old data
     after ~30s.
 
-  - `filesystem` : This triggers a ``syncfs(2)`` after a backup, but before
+  - `filesystem` (default): This triggers a ``syncfs(2)`` after a backup, but before
     the task returns `OK`. This way it is ensured that the written backups
     are on disk. This is a good balance between speed and consistency.
     Note that the underlying storage device still needs to protect itself against
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 4c9eda2f..ff8fe55f 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -181,7 +181,6 @@ pub enum DatastoreFSyncLevel {
     /// which reduces IO pressure.
     /// But it may cause losing data on powerloss or system crash without any uninterruptible power
     /// supply.
-    #[default]
     None,
     /// Triggers a fsync after writing any chunk on the datastore. While this can slow down
     /// backups significantly, depending on the underlying file system and storage used, it
@@ -196,6 +195,7 @@ pub enum DatastoreFSyncLevel {
     /// Depending on the setup, it might have a negative impact on unrelated write operations
     /// of the underlying filesystem, but it is generally a good compromise between performance
     /// and consitency.
+    #[default]
     Filesystem,
 }
 
-- 
2.30.2