From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores
Date: Thu, 20 Oct 2022 15:09:18 +0200
Message-ID: <20221020130918.gh5ktoksnzqu46c5@casey.proxmox.com>
In-Reply-To: <20221020074058.717152-4-d.csapak@proxmox.com>
Looks mostly good to me, minor nits inline:
On Thu, Oct 20, 2022 at 09:40:56AM +0200, Dominik Csapak wrote:
> currently, we don't (f)sync on chunk insertion (or at any point after
> that), which can lead to broken chunks in case of e.g. an unexpected
> power loss. To fix that, offer a tuning option for datastores that
> controls the level of syncs it does:
>
> * None (default): same as the current state, no (f)syncs done at any point
> * Filesystem: at the end of a backup, the datastore issues
>   a syncfs(2) to the filesystem of the datastore
> * File: issues an fsync on each chunk as it gets inserted
>   (using our 'replace_file' helper) and an fsync on the directory handle
>
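(As an aside for readers: the 'File' level is the classic write-tmp,
fsync, rename, fsync-parent-dir dance. A minimal std-only sketch of the
pattern, with the tmp-path handling simplified and the helper name made
up for illustration:

    use std::fs::File;
    use std::io::Write;
    use std::path::Path;

    // Write `data` durably to `path`: fsync the file contents, rename
    // the tmp file into place, then fsync the parent directory so the
    // rename itself survives a crash.
    fn write_durable(path: &Path, data: &[u8]) -> std::io::Result<()> {
        let tmp = path.with_extension("tmp");
        let mut file = File::create(&tmp)?;
        file.write_all(data)?;
        file.sync_all()?; // fsync(2): flush file data + metadata
        std::fs::rename(&tmp, path)?;
        let dir = File::open(path.parent().expect("path has a parent"))?;
        dir.sync_all()?; // fsync(2) on the dir handle persists the rename
        Ok(())
    }
)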
> a small benchmark showed the following (times in mm:ss):
> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
>
> size                 none    filesystem   file
> 2GiB (fits in ram)   00:13   00:41        01:00
> 33GiB                05:21   05:31        13:45
>
> so if the backup fits in memory, there is a large difference between all
> of the modes (expected), but as soon as it exceeds the memory size,
> the difference between not syncing and syncing the fs at the end becomes
> much smaller.
>
> I also tested on an NVMe, but there the syncs basically made no difference.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> pbs-api-types/src/datastore.rs | 37 ++++++++++++++++++++
> pbs-datastore/src/chunk_store.rs | 59 +++++++++++++++++++++++++++-----
> pbs-datastore/src/datastore.rs | 39 ++++++++++++++++++---
> src/api2/backup/environment.rs | 2 ++
> src/api2/config/datastore.rs | 9 +++--
> 5 files changed, 131 insertions(+), 15 deletions(-)
>
> diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
> index 0af11b33..15ea80cd 100644
> --- a/pbs-api-types/src/datastore.rs
> +++ b/pbs-api-types/src/datastore.rs
> @@ -168,6 +168,42 @@ pub enum ChunkOrder {
> Inode,
> }
>
> +#[api]
> +#[derive(Clone, Copy, Debug, Eq, PartialEq, Serialize, Deserialize)]
also: you can derive Default now...
> +#[serde(rename_all = "lowercase")]
> +/// The level of syncing that is done when writing into a datastore.
> +pub enum DatastoreFSyncLevel {
> + /// No special fsync or syncfs calls are triggered. The system default dirty write back
> + /// mechanism ensures that data gets flushed eventually via the `dirty_writeback_centisecs`
> + /// and `dirty_expire_centisecs` kernel sysctls, defaulting to ~ 30s.
> + ///
> + /// This mode generally provides the best performance, as all write back can happen
> + /// asynchronously, which reduces IO pressure.
> + /// But it may cause data loss on power loss or system crash if there is no uninterruptible
> + /// power supply.
... if you add #[default] here.
> + None,
> + /// Triggers an fsync after writing any chunk on the datastore. While this can slow down
> + /// backups significantly, depending on the underlying file system and storage used, it
> + /// will ensure fine-grained consistency. Depending on the exact setup, there might be no
> + /// benefit over the file system level sync, so if the setup allows it, you should prefer
> + /// that one. Despite the possible negative impact on performance, it's the most consistent
> + /// mode.
> + File,
> + /// Triggers a filesystem wide sync after all backup data has been written but before
> + /// finishing the task. This ensures that every finished backup is fully written back to
> + /// storage while reducing the impact on many file systems in contrast to the file level
> + /// sync. Depending on the setup, it might have a negative impact on unrelated write
> + /// operations of the underlying filesystem, but it is generally a good compromise between
> + /// performance and consistency.
> + Filesystem,
> +}
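(Similarly, the 'Filesystem' level maps to syncfs(2). Roughly, assuming
you already hold an open handle on the datastore root - the helper name
here is made up, and this hunk doesn't show how the patch actually wires
it up:

    use std::os::unix::io::AsRawFd;

    // Flush all dirty data of the filesystem backing `dir` (Linux-only).
    fn sync_filesystem(dir: &std::fs::File) -> std::io::Result<()> {
        if unsafe { libc::syncfs(dir.as_raw_fd()) } < 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }
)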
> +
> +impl Default for DatastoreFSyncLevel {
Then you can drop this.
> + fn default() -> Self {
> + DatastoreFSyncLevel::None
> + }
> +}
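IOW, something like this (stable since Rust 1.62), sketched from the
variants above:

    #[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]
    #[serde(rename_all = "lowercase")]
    pub enum DatastoreFSyncLevel {
        #[default]
        None,
        File,
        Filesystem,
    }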
> +
> #[api(
> properties: {
> "chunk-order": {
> @@ -182,6 +218,7 @@ pub enum ChunkOrder {
> pub struct DatastoreTuning {
> /// Iterate chunks in this order
> pub chunk_order: Option<ChunkOrder>,
> + pub sync_level: Option<DatastoreFSyncLevel>,
> }
>
> pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
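(For anyone following along: once wired up, this should be settable via
the tuning property string, e.g. - assuming the field gets the usual
kebab-case rename:

    # in the datastore section of datastore.cfg, or via
    # `proxmox-backup-manager datastore update <name> --tuning ...`
    tuning sync-level=filesystem,chunk-order=inode
)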
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 145d2c7e..a8582485 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
(...)
> @@ -460,9 +468,27 @@ impl ChunkStore {
> }
> }
>
> - proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
> - |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
> - )?;
> + let chunk_dir_path = chunk_path
> + .parent()
> + .ok_or_else(|| format_err!("unable to get chunk dir"))?
> + .to_owned();
No need to clone this slice, just drop the `.to_owned()`...
> +
> + proxmox_sys::fs::replace_file(
> + chunk_path,
...and use `&chunk_path` here, this takes an `AsRef<Path>`
> + raw_data,
> + CreateOptions::new(),
> + self.sync_level == DatastoreFSyncLevel::File,
> + )
> + .map_err(|err| {
> + format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
> + })?;
> +
> + if self.sync_level == DatastoreFSyncLevel::File {
> + // fsync dir handle to persist the tmp rename
> + let dir = std::fs::File::open(chunk_dir_path)?;
> + nix::unistd::fsync(dir.as_raw_fd())
> + .map_err(|err| format_err!("fsync failed: {err}"))?;
> + }
>
> drop(lock);
>
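IOW, taken together (untested):

    let chunk_dir_path = chunk_path
        .parent()
        .ok_or_else(|| format_err!("unable to get chunk dir"))?;

    proxmox_sys::fs::replace_file(
        &chunk_path,
        raw_data,
        CreateOptions::new(),
        self.sync_level == DatastoreFSyncLevel::File,
    )
    .map_err(|err| {
        format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
    })?;

    if self.sync_level == DatastoreFSyncLevel::File {
        // fsync the dir handle to persist the tmp-file rename
        let dir = std::fs::File::open(chunk_dir_path)?;
        nix::unistd::fsync(dir.as_raw_fd())
            .map_err(|err| format_err!("fsync failed: {err}"))?;
    }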