From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores
Date: Thu, 20 Oct 2022 15:09:18 +0200
Message-ID: <20221020130918.gh5ktoksnzqu46c5@casey.proxmox.com>
In-Reply-To: <20221020074058.717152-4-d.csapak@proxmox.com>
Looks mostly good to me, minor nits inline:
On Thu, Oct 20, 2022 at 09:40:56AM +0200, Dominik Csapak wrote:
> currently, we don't (f)sync on chunk insertion (or at any point after
> that), which can lead to broken chunks in case of e.g. an unexpected
> power loss. To fix that, offer a tuning option for datastores that
> controls the level of syncs it does:
>
> * None (default): same as current state, no (f)syncs done at any point
> * Filesystem: at the end of a backup, the datastore issues
> a syncfs(2) to the filesystem of the datastore
> * File: issues an fsync on each chunk as it gets inserted
> (using our 'replace_file' helper) and an fsync on the directory handle
>
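(Aside for readers: the Filesystem variant essentially boils down to opening
any handle on the datastore's filesystem and calling syncfs(2) on its fd.
A minimal sketch, assuming nix's Linux-only `syncfs` binding; `base_dir` is
just an illustrative name for the datastore base path:

    use std::os::unix::io::AsRawFd;

    // open any handle on the datastore's filesystem...
    let base = std::fs::File::open(&base_dir)?;
    // ...and flush *all* dirty data of that filesystem, not just our own writes
    nix::unistd::syncfs(base.as_raw_fd())
        .map_err(|err| format_err!("syncfs failed: {err}"))?;
)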
> a small benchmark showed the following (times in mm:ss):
> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
>
> size                 none    filesystem    file
> 2GiB (fits in RAM)   00:13   00:41         01:00
> 33GiB                05:21   05:31         13:45
>
> so if the backup fits in memory, there is a large difference between all
> of the modes (as expected), but as soon as it exceeds the memory size,
> the difference between not syncing and syncing the fs at the end becomes
> much smaller.
>
> I also tested on an NVMe, but there the syncs made basically no difference.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> pbs-api-types/src/datastore.rs | 37 ++++++++++++++++++++
> pbs-datastore/src/chunk_store.rs | 59 +++++++++++++++++++++++++++-----
> pbs-datastore/src/datastore.rs | 39 ++++++++++++++++++---
> src/api2/backup/environment.rs | 2 ++
> src/api2/config/datastore.rs | 9 +++--
> 5 files changed, 131 insertions(+), 15 deletions(-)
>
> diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
> index 0af11b33..15ea80cd 100644
> --- a/pbs-api-types/src/datastore.rs
> +++ b/pbs-api-types/src/datastore.rs
> @@ -168,6 +168,42 @@ pub enum ChunkOrder {
> Inode,
> }
>
> +#[api]
> +#[derive(Clone, Copy, Debug, Eq, PartialEq, Serialize, Deserialize)]
also: you can derive Default now...
> +#[serde(rename_all = "lowercase")]
> +/// The level of syncing that is done when writing into a datastore.
> +pub enum DatastoreFSyncLevel {
> + /// No special fsync or syncfs calls are triggered. The system default dirty write back
> + /// mechanism ensures that data gets flushed eventually via the `dirty_writeback_centisecs`
> + /// and `dirty_expire_centisecs` kernel sysctls, defaulting to ~ 30s.
> + ///
> + /// This mode generally provides the best performance, as all write back can happen
> + /// asynchronously, which reduces IO pressure.
> + /// But it may cause data loss on power loss or system crash when no uninterruptible
> + /// power supply is present.
... if you add #[default] here.
> + None,
> + /// Triggers an fsync after writing any chunk on the datastore. While this can slow down
> + /// backups significantly, depending on the underlying file system and storage used, it
> + /// will ensure fine-grained consistency. Depending on the exact setup, there might be no
> + /// benefits over the file system level sync, so if the setup allows it, you should prefer
> + /// that one. Despite the possible negative impact on performance, it's the most consistent
> + /// mode.
> + File,
> + /// Triggers a filesystem-wide sync after all backup data got written but before finishing
> + /// the task. This ensures that every finished backup is fully written back to storage
> + /// while reducing the impact on many file systems in contrast to the file level sync.
> + /// Depending on the setup, it might have a negative impact on unrelated write operations
> + /// of the underlying filesystem, but it is generally a good compromise between performance
> + /// and consistency.
> + Filesystem,
> +}
> +
> +impl Default for DatastoreFSyncLevel {
Then you can drop this.
> + fn default() -> Self {
> + DatastoreFSyncLevel::None
> + }
> +}
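With both changes the whole declaration then collapses to something like
this (sketch of the derive-based form, doc comments elided):

    #[api]
    #[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]
    #[serde(rename_all = "lowercase")]
    pub enum DatastoreFSyncLevel {
        #[default]
        None,
        File,
        Filesystem,
    }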
> +
> #[api(
> properties: {
> "chunk-order": {
> @@ -182,6 +218,7 @@ pub enum ChunkOrder {
> pub struct DatastoreTuning {
> /// Iterate chunks in this order
> pub chunk_order: Option<ChunkOrder>,
> + pub sync_level: Option<DatastoreFSyncLevel>,
> }
>
> pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
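(For context: via the schema above this surfaces in the tuning property
string, so presumably something along the lines of

    # proxmox-backup-manager datastore update <name> --tuning 'sync-level=filesystem'

analogous to the existing chunk-order option.)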
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 145d2c7e..a8582485 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
(...)
> @@ -460,9 +468,27 @@ impl ChunkStore {
> }
> }
>
> - proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
> - |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
> - )?;
> + let chunk_dir_path = chunk_path
> + .parent()
> + .ok_or_else(|| format_err!("unable to get chunk dir"))?
> + .to_owned();
No need to clone this slice, just drop the `.to_owned()`...
> +
> + proxmox_sys::fs::replace_file(
> + chunk_path,
...and use `&chunk_path` here, this takes an `AsRef<Path>`
> + raw_data,
> + CreateOptions::new(),
> + self.sync_level == DatastoreFSyncLevel::File,
> + )
> + .map_err(|err| {
> + format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
> + })?;
> +
> + if self.sync_level == DatastoreFSyncLevel::File {
> + // fsync dir handle to persist the tmp rename
> + let dir = std::fs::File::open(chunk_dir_path)?;
> + nix::unistd::fsync(dir.as_raw_fd())
> + .map_err(|err| format_err!("fsync failed: {err}"))?;
> + }
>
> drop(lock);
>
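FWIW, with both nits applied, the insertion path would read roughly like
this (untested sketch):

    let chunk_dir_path = chunk_path
        .parent()
        .ok_or_else(|| format_err!("unable to get chunk dir"))?;

    proxmox_sys::fs::replace_file(
        &chunk_path,
        raw_data,
        CreateOptions::new(),
        self.sync_level == DatastoreFSyncLevel::File,
    )
    .map_err(|err| {
        format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
    })?;

    if self.sync_level == DatastoreFSyncLevel::File {
        // fsync the parent directory handle to persist the tmp-file rename
        let dir = std::fs::File::open(chunk_dir_path)?;
        nix::unistd::fsync(dir.as_raw_fd())
            .map_err(|err| format_err!("fsync failed: {err}"))?;
    }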