Date: Thu, 20 Oct 2022 15:09:18 +0200
From: Wolfgang Bumiller
To: Dominik Csapak
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores
Message-ID: <20221020130918.gh5ktoksnzqu46c5@casey.proxmox.com>
In-Reply-To: <20221020074058.717152-4-d.csapak@proxmox.com>

Looks mostly good to me, minor nits inline:

On Thu, Oct 20, 2022 at 09:40:56AM +0200, Dominik Csapak wrote:
> currently, we don't (f)sync on chunk insertion (or at any point after
> that), which can lead to broken chunks in case of e.g. an unexpected
> powerloss. To fix that, offer a tuning option for datastores that
> controls the level of syncs it does:
>
> * None (default): same as current state, no (f)syncs done at any point
> * Filesystem: at the end of a backup, the datastore issues
>   a syncfs(2) to the filesystem of the datastore
> * File: issues an fsync on each chunk as it gets inserted
>   (using our 'replace_file' helper) and an fsync on the directory handle
>
> a small benchmark showed the following (times in mm:ss):
> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
>
> size                 none    filesystem  file
> 2GiB (fits in ram)   00:13   00:41       01:00
> 33GiB                05:21   05:31       13:45
>
> so if the backup fits in memory, there is a large difference between all
> of the modes (expected), but as soon as it exceeds the memory size,
> the difference between not syncing and syncing the fs at the end becomes
> much smaller.
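Side note, mostly for readers of the list archive: as described above,
`filesystem` amounts to a single syncfs(2) on the datastore's backing
filesystem when a backup finishes, while `file` fsyncs each chunk plus its
parent directory. Conceptually the former is just something like the
following - illustration only, the variable names are made up, the actual
call sites are in the datastore.rs/environment.rs hunks I haven't quoted:

    use std::os::unix::io::AsRawFd;

    // flush everything on the filesystem backing the datastore root
    if sync_level == DatastoreFSyncLevel::Filesystem {
        let base = std::fs::File::open(&datastore_base_dir)?;
        nix::unistd::syncfs(base.as_raw_fd())?;
    }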
>
> i also tested on an nvme, but there the syncs basically made no difference
>
> Signed-off-by: Dominik Csapak
> ---
>  pbs-api-types/src/datastore.rs   | 37 ++++++++++++++++++++
>  pbs-datastore/src/chunk_store.rs | 59 +++++++++++++++++++++++++++-----
>  pbs-datastore/src/datastore.rs   | 39 ++++++++++++++++++---
>  src/api2/backup/environment.rs   |  2 ++
>  src/api2/config/datastore.rs     |  9 +++--
>  5 files changed, 131 insertions(+), 15 deletions(-)
>
> diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
> index 0af11b33..15ea80cd 100644
> --- a/pbs-api-types/src/datastore.rs
> +++ b/pbs-api-types/src/datastore.rs
> @@ -168,6 +168,42 @@ pub enum ChunkOrder {
>      Inode,
>  }
>
> +#[api]
> +#[derive(PartialEq, Serialize, Deserialize)]

+ Clone, Copy, Debug, Eq

also: you can derive Default now... (full sketch at the end of this mail)

> +#[serde(rename_all = "lowercase")]
> +/// The level of syncing that is done when writing into a datastore.
> +pub enum DatastoreFSyncLevel {
> +    /// No special fsync or syncfs calls are triggered. The system default dirty write back
> +    /// mechanism ensures that data gets flushed eventually via the `dirty_writeback_centisecs`
> +    /// and `dirty_expire_centisecs` kernel sysctls, defaulting to ~ 30s.
> +    ///
> +    /// This mode provides generally the best performance, as all write back can happen async,
> +    /// which reduces IO pressure.
> +    /// But it may cause losing data on powerloss or system crash without any uninterruptible power
> +    /// supply.

... if you add #[default] here.

> +    None,
> +    /// Triggers an fsync after writing any chunk on the datastore. While this can slow down
> +    /// backups significantly, depending on the underlying file system and storage used, it
> +    /// will ensure fine-grained consistency. Depending on the exact setup, there might be no
> +    /// benefits over the file system level sync, so if the setup allows it, you should prefer
> +    /// that one. Despite the possible negative impact in performance, it's the most consistent
> +    /// mode.
> +    File,
> +    /// Trigger a filesystem wide sync after all backup data got written but before finishing the
> +    /// task. This ensures that every finished backup is fully written back to storage
> +    /// while reducing the impact on many file systems in contrast to the file level sync.
> +    /// Depending on the setup, it might have a negative impact on unrelated write operations
> +    /// of the underlying filesystem, but it is generally a good compromise between performance
> +    /// and consistency.
> +    Filesystem,
> +}
> +
> +impl Default for DatastoreFSyncLevel {

Then you can drop this.

> +    fn default() -> Self {
> +        DatastoreFSyncLevel::None
> +    }
> +}
> +
>  #[api(
>      properties: {
>          "chunk-order": {
> @@ -182,6 +218,7 @@ pub enum ChunkOrder {
>  pub struct DatastoreTuning {
>      /// Iterate chunks in this order
>      pub chunk_order: Option<ChunkOrder>,
> +    pub sync_level: Option<DatastoreFSyncLevel>,
>  }
>
>  pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 145d2c7e..a8582485 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs

(...)

> @@ -460,9 +468,27 @@ impl ChunkStore {
>              }
>          }
>
> -        proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
> -            |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
> -        )?;
> +        let chunk_dir_path = chunk_path
> +            .parent()
> +            .ok_or_else(|| format_err!("unable to get chunk dir"))?
> +            .to_owned();

No need to clone this slice, just drop the `.to_owned()`...

> +
> +        proxmox_sys::fs::replace_file(
> +            chunk_path,

...and use `&chunk_path` here, this takes an `AsRef<Path>`

> +            raw_data,
> +            CreateOptions::new(),
> +            self.sync_level == DatastoreFSyncLevel::File,
> +        )
> +        .map_err(|err| {
> +            format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
> +        })?;
> +
> +        if self.sync_level == DatastoreFSyncLevel::File {
> +            // fsync dir handle to persist the tmp rename
> +            let dir = std::fs::File::open(chunk_dir_path)?;
> +            nix::unistd::fsync(dir.as_raw_fd())
> +                .map_err(|err| format_err!("fsync failed: {err}"))?;
> +        }
>
>          drop(lock);
>
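Putting both replace_file nits together, the insertion path would then read
roughly like this (untested sketch, still inside insert_chunk, same names
and error handling as in your hunk):

    let chunk_dir_path = chunk_path
        .parent()
        .ok_or_else(|| format_err!("unable to get chunk dir"))?;

    // `chunk_dir_path` now only borrows from `chunk_path`, so the path is
    // passed by reference below instead of being moved
    proxmox_sys::fs::replace_file(
        &chunk_path,
        raw_data,
        CreateOptions::new(),
        self.sync_level == DatastoreFSyncLevel::File,
    )
    .map_err(|err| {
        format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
    })?;

    if self.sync_level == DatastoreFSyncLevel::File {
        // fsync the parent directory handle to persist the tmp-file rename as well
        let dir = std::fs::File::open(chunk_dir_path)?;
        nix::unistd::fsync(dir.as_raw_fd())
            .map_err(|err| format_err!("fsync failed: {err}"))?;
    }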
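And to spell out the derive nit from further up (the sketch promised above) -
untested, doc comments shortened here for brevity:

    #[api]
    #[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]
    #[serde(rename_all = "lowercase")]
    /// The level of syncing that is done when writing into a datastore.
    pub enum DatastoreFSyncLevel {
        /// No special fsync or syncfs calls are triggered.
        #[default]
        None,
        /// Triggers an fsync after writing any chunk on the datastore.
        File,
        /// Trigger a filesystem wide sync after all backup data got written.
        Filesystem,
    }

With that, the manual `impl Default` block can go away entirely.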