public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options
@ 2022-10-20  7:40 Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/5] docs: add information about chunk order option for datastores Dominik Csapak
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

adds a sync-level option to the datastore tuning options, so that
the user can decide which level of file consistency is wanted, and
afterwards sets the default to 'filesystem'
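
for reference, once applied, the level can be chosen per datastore via the
tuning property string (the exact command is taken from the docs patch later
in this series):

  # proxmox-backup-manager datastore update <storename> --tuning 'sync-level=filesystem'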

changes from v2:
* rebase on master
* fix typos in documentation

changes from v1:
* rebase on master
* reword documentation to be slightly less opinionated about which
  option is better

changes from RFC:
* separate the introduction of the option from changing the default
* renaming to DatastoreFSyncLevel
* adding more and better comments + docs
* saving the whole level and not only a bool in the datastore/chunk_store
* adding an fsync on the dir handle for the 'file' case in insert_chunk
* split the change to 'replace_file' into a separate patch

the first patch is mostly unrelated, but it introduces a place where
we can document the option, and it could be applied independently of
the remaining patches of this series.

the second patch only changes the use of replace_file in insert_chunk,
so it could also be applied independently.
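
for context, the call shape after that change is simply (simplified from the
diff in patch 2; error handling omitted):

    proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false)?;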

Dominik Csapak (5):
  docs: add information about chunk order option for datastores
  pbs-datastore: chunk_store: use replace_file in insert_chunk
  datastore: implement sync-level tuning for datastores
  docs: add documentation about the 'sync-level' tuning
  datastore: make 'filesystem' the default sync-level

 docs/storage.rst                 | 59 ++++++++++++++++++++++++++++
 pbs-api-types/src/datastore.rs   | 37 ++++++++++++++++++
 pbs-datastore/src/chunk_store.rs | 66 +++++++++++++++++++++++---------
 pbs-datastore/src/datastore.rs   | 39 +++++++++++++++++--
 src/api2/backup/environment.rs   |  2 +
 src/api2/config/datastore.rs     |  9 ++++-
 6 files changed, 187 insertions(+), 25 deletions(-)

-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v3 1/5] docs: add information about chunk order option for datastores
  2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
@ 2022-10-20  7:40 ` Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 2/5] pbs-datastore: chunk_store: use replace_file in insert_chunk Dominik Csapak
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 docs/storage.rst | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/docs/storage.rst b/docs/storage.rst
index ed4e28f2..1909bd84 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -315,3 +315,26 @@ There are a few per-datastore options:
 * :ref:`Notifications <maintenance_notification>`
 * :ref:`Maintenance Mode <maintenance_mode>`
 * Verification of incoming backups
+
+Tuning
+^^^^^^
+There are some tuning-related options for the datastore that are more advanced
+and only available on the CLI:
+
+* ``chunk-order``: Chunk order for verify & tape backup:
+
+  You can specify the order in which Proxmox Backup Server iterates the chunks
+  when doing a verify or backing up to tape. The two options are:
+
+  - `inode` (default): Sorts the chunks by their inode number on the filesystem before iterating
+    over them. This should be fine for most storages, especially spinning disks.
+  - `none`: Iterates the chunks in the order they appear in the
+    index file (.fidx/.didx). While this might slow down iteration on many slow
+    storages, on very fast ones (for example, NVMes) the collecting and sorting
+    can take more time than is gained through the sorted iteration.
+  This option can be set with:
+
+.. code-block:: console
+
+  # proxmox-backup-manager datastore update <storename> --tuning 'chunk-order=none'
+
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v3 2/5] pbs-datastore: chunk_store: use replace_file in insert_chunk
  2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/5] docs: add information about chunk order option for datastores Dominik Csapak
@ 2022-10-20  7:40 ` Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores Dominik Csapak
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

replace_file does the same as the current open-coded version

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 3bebc645..145d2c7e 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -1,4 +1,3 @@
-use std::io::Write;
 use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, Mutex};
@@ -461,21 +460,9 @@ impl ChunkStore {
             }
         }
 
-        let mut tmp_path = chunk_path.clone();
-        tmp_path.set_extension("tmp");
-
-        let mut file = std::fs::File::create(&tmp_path).map_err(|err| {
-            format_err!("creating chunk on store '{name}' failed for {digest_str} - {err}")
-        })?;
-
-        file.write_all(raw_data).map_err(|err| {
-            format_err!("writing chunk on store '{name}' failed for {digest_str} - {err}")
-        })?;
-
-        if let Err(err) = std::fs::rename(&tmp_path, &chunk_path) {
-            if std::fs::remove_file(&tmp_path).is_err() { /* ignore */ }
-            bail!("atomic rename on store '{name}' failed for chunk {digest_str} - {err}");
-        }
+        proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
+            |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
+        )?;
 
         drop(lock);
 
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores
  2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/5] docs: add information about chunk order option for datastores Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 2/5] pbs-datastore: chunk_store: use replace_file in insert_chunk Dominik Csapak
@ 2022-10-20  7:40 ` Dominik Csapak
  2022-10-20 13:09   ` Wolfgang Bumiller
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 4/5] docs: add documentation about the 'sync-level' tuning Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 5/5] datastore: make 'filesystem' the default sync-level Dominik Csapak
  4 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

currently, we don't (f)sync on chunk insertion (or at any point after
that), which can lead to broken chunks in case of e.g. an unexpected
powerloss. To fix that, offer a tuning option for datastores that
controls the level of syncs it does:

* None (default): same as current state, no (f)syncs done at any point
* Filesystem: at the end of a backup, the datastore issues
  a syncfs(2) to the filesystem of the datastore
* File: issues an fsync on each chunk as they get inserted
  (using our 'replace_file' helper) and an fsync on the directory handle

a small benchmark showed the following (times in mm:ss):
setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner

size                none    filesystem  file
2GiB (fits in RAM)   00:13   00:41       01:00
33GiB                05:21   05:31       13:45

so if the backup fits in memory, there is a large difference between all
of the modes (expected), but as soon as it exceeds the memory size,
the difference between not syncing and syncing the fs at the end becomes
much smaller.

i also tested on an nvme, but there the syncs basically made no difference
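
for the 'filesystem' level, the sync boils down to a syncfs(2) on a file
descriptor of the datastore base directory, roughly (this just mirrors the
try_ensure_sync_level helper added below):

    let file = std::fs::File::open(self.base_path())?;
    if unsafe { libc::syncfs(file.as_raw_fd()) } < 0 {
        bail!("error during syncfs: {}", std::io::Error::last_os_error());
    }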

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 pbs-api-types/src/datastore.rs   | 37 ++++++++++++++++++++
 pbs-datastore/src/chunk_store.rs | 59 +++++++++++++++++++++++++++-----
 pbs-datastore/src/datastore.rs   | 39 ++++++++++++++++++---
 src/api2/backup/environment.rs   |  2 ++
 src/api2/config/datastore.rs     |  9 +++--
 5 files changed, 131 insertions(+), 15 deletions(-)

diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 0af11b33..15ea80cd 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -168,6 +168,42 @@ pub enum ChunkOrder {
     Inode,
 }
 
+#[api]
+#[derive(PartialEq, Serialize, Deserialize)]
+#[serde(rename_all = "lowercase")]
+/// The level of syncing that is done when writing into a datastore.
+pub enum DatastoreFSyncLevel {
+    /// No special fsync or syncfs calls are triggered. The system default dirty write back
+    /// mechanism ensures that data gets flushed eventually via the `dirty_writeback_centisecs`
+    /// and `dirty_expire_centisecs` kernel sysctls, defaulting to ~ 30s.
+    ///
+    /// This mode generally provides the best performance, as all write back can happen async,
+    /// which reduces IO pressure.
+    /// But data may be lost on powerloss or a system crash if there is no uninterruptible power
+    /// supply.
+    None,
+    /// Triggers an fsync after writing any chunk on the datastore. While this can slow down
+    /// backups significantly, depending on the underlying file system and storage used, it
+    /// will ensure fine-grained consistency. Depending on the exact setup, there might be no
+    /// benefit over the file system level sync, so if the setup allows it, you should prefer
+    /// that one. Despite the possible negative impact on performance, it's the most consistent
+    /// mode.
+    File,
+    /// Triggers a filesystem-wide sync after all backup data got written but before finishing the
+    /// task. This ensures that every finished backup is fully written back to storage
+    /// while reducing the impact on many file systems in contrast to the file level sync.
+    /// Depending on the setup, it might have a negative impact on unrelated write operations
+    /// of the underlying filesystem, but it is generally a good compromise between performance
+    /// and consistency.
+    Filesystem,
+}
+
+impl Default for DatastoreFSyncLevel {
+    fn default() -> Self {
+        DatastoreFSyncLevel::None
+    }
+}
+
 #[api(
     properties: {
         "chunk-order": {
@@ -182,6 +218,7 @@ pub enum ChunkOrder {
 pub struct DatastoreTuning {
     /// Iterate chunks in this order
     pub chunk_order: Option<ChunkOrder>,
+    pub sync_level: Option<DatastoreFSyncLevel>,
 }
 
 pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 145d2c7e..a8582485 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -4,7 +4,7 @@ use std::sync::{Arc, Mutex};
 
 use anyhow::{bail, format_err, Error};
 
-use pbs_api_types::GarbageCollectionStatus;
+use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
 use proxmox_sys::fs::{create_dir, create_path, file_type_from_file_stat, CreateOptions};
 use proxmox_sys::process_locker::{
     ProcessLockExclusiveGuard, ProcessLockSharedGuard, ProcessLocker,
@@ -21,6 +21,7 @@ pub struct ChunkStore {
     chunk_dir: PathBuf,
     mutex: Mutex<()>,
     locker: Option<Arc<Mutex<ProcessLocker>>>,
+    sync_level: DatastoreFSyncLevel,
 }
 
 // TODO: what about sysctl setting vm.vfs_cache_pressure (0 - 100) ?
@@ -67,6 +68,7 @@ impl ChunkStore {
             chunk_dir: PathBuf::new(),
             mutex: Mutex::new(()),
             locker: None,
+            sync_level: Default::default(),
         }
     }
 
@@ -87,6 +89,7 @@ impl ChunkStore {
         uid: nix::unistd::Uid,
         gid: nix::unistd::Gid,
         worker: Option<&dyn WorkerTaskContext>,
+        sync_level: DatastoreFSyncLevel,
     ) -> Result<Self, Error>
     where
         P: Into<PathBuf>,
@@ -143,7 +146,7 @@ impl ChunkStore {
             }
         }
 
-        Self::open(name, base)
+        Self::open(name, base, sync_level)
     }
 
     fn lockfile_path<P: Into<PathBuf>>(base: P) -> PathBuf {
@@ -157,7 +160,11 @@ impl ChunkStore {
     /// Note that this must be used with care, as it's dangerous to create two instances on the
     /// same base path, as closing the underlying ProcessLocker drops all locks from this process
     /// on the lockfile (even if separate FDs)
-    pub(crate) fn open<P: Into<PathBuf>>(name: &str, base: P) -> Result<Self, Error> {
+    pub(crate) fn open<P: Into<PathBuf>>(
+        name: &str,
+        base: P,
+        sync_level: DatastoreFSyncLevel,
+    ) -> Result<Self, Error> {
         let base: PathBuf = base.into();
 
         if !base.is_absolute() {
@@ -180,6 +187,7 @@ impl ChunkStore {
             chunk_dir,
             locker: Some(locker),
             mutex: Mutex::new(()),
+            sync_level,
         })
     }
 
@@ -460,9 +468,27 @@ impl ChunkStore {
             }
         }
 
-        proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
-            |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
-        )?;
+        let chunk_dir_path = chunk_path
+            .parent()
+            .ok_or_else(|| format_err!("unable to get chunk dir"))?
+            .to_owned();
+
+        proxmox_sys::fs::replace_file(
+            chunk_path,
+            raw_data,
+            CreateOptions::new(),
+            self.sync_level == DatastoreFSyncLevel::File,
+        )
+        .map_err(|err| {
+            format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
+        })?;
+
+        if self.sync_level == DatastoreFSyncLevel::File {
+            // fsync dir handle to persist the tmp rename
+            let dir = std::fs::File::open(chunk_dir_path)?;
+            nix::unistd::fsync(dir.as_raw_fd())
+                .map_err(|err| format_err!("fsync failed: {err}"))?;
+        }
 
         drop(lock);
 
@@ -519,13 +545,21 @@ fn test_chunk_store1() {
 
     if let Err(_e) = std::fs::remove_dir_all(".testdir") { /* ignore */ }
 
-    let chunk_store = ChunkStore::open("test", &path);
+    let chunk_store = ChunkStore::open("test", &path, DatastoreFSyncLevel::None);
     assert!(chunk_store.is_err());
 
     let user = nix::unistd::User::from_uid(nix::unistd::Uid::current())
         .unwrap()
         .unwrap();
-    let chunk_store = ChunkStore::create("test", &path, user.uid, user.gid, None).unwrap();
+    let chunk_store = ChunkStore::create(
+        "test",
+        &path,
+        user.uid,
+        user.gid,
+        None,
+        DatastoreFSyncLevel::None,
+    )
+    .unwrap();
 
     let (chunk, digest) = crate::data_blob::DataChunkBuilder::new(&[0u8, 1u8])
         .build()
@@ -537,7 +571,14 @@ fn test_chunk_store1() {
     let (exists, _) = chunk_store.insert_chunk(&chunk, &digest).unwrap();
     assert!(exists);
 
-    let chunk_store = ChunkStore::create("test", &path, user.uid, user.gid, None);
+    let chunk_store = ChunkStore::create(
+        "test",
+        &path,
+        user.uid,
+        user.gid,
+        None,
+        DatastoreFSyncLevel::None,
+    );
     assert!(chunk_store.is_err());
 
     if let Err(_e) = std::fs::remove_dir_all(".testdir") { /* ignore */ }
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a539ddc5..da7bdf87 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -18,8 +18,8 @@ use proxmox_sys::WorkerTaskContext;
 use proxmox_sys::{task_log, task_warn};
 
 use pbs_api_types::{
-    Authid, BackupNamespace, BackupType, ChunkOrder, DataStoreConfig, DatastoreTuning,
-    GarbageCollectionStatus, HumanByte, Operation, UPID,
+    Authid, BackupNamespace, BackupType, ChunkOrder, DataStoreConfig, DatastoreFSyncLevel,
+    DatastoreTuning, GarbageCollectionStatus, HumanByte, Operation, UPID,
 };
 
 use crate::backup_info::{BackupDir, BackupGroup};
@@ -59,6 +59,7 @@ pub struct DataStoreImpl {
     verify_new: bool,
     chunk_order: ChunkOrder,
     last_digest: Option<[u8; 32]>,
+    sync_level: DatastoreFSyncLevel,
 }
 
 impl DataStoreImpl {
@@ -72,6 +73,7 @@ impl DataStoreImpl {
             verify_new: false,
             chunk_order: ChunkOrder::None,
             last_digest: None,
+            sync_level: Default::default(),
         })
     }
 }
@@ -151,7 +153,15 @@ impl DataStore {
             }
             Arc::clone(&datastore.chunk_store)
         } else {
-            Arc::new(ChunkStore::open(name, &config.path)?)
+            let tuning: DatastoreTuning = serde_json::from_value(
+                DatastoreTuning::API_SCHEMA
+                    .parse_property_string(config.tuning.as_deref().unwrap_or(""))?,
+            )?;
+            Arc::new(ChunkStore::open(
+                name,
+                &config.path,
+                tuning.sync_level.unwrap_or_default(),
+            )?)
         };
 
         let datastore = DataStore::with_store_and_config(chunk_store, config, Some(digest))?;
@@ -207,7 +217,12 @@ impl DataStore {
     ) -> Result<Arc<Self>, Error> {
         let name = config.name.clone();
 
-        let chunk_store = ChunkStore::open(&name, &config.path)?;
+        let tuning: DatastoreTuning = serde_json::from_value(
+            DatastoreTuning::API_SCHEMA
+                .parse_property_string(config.tuning.as_deref().unwrap_or(""))?,
+        )?;
+        let chunk_store =
+            ChunkStore::open(&name, &config.path, tuning.sync_level.unwrap_or_default())?;
         let inner = Arc::new(Self::with_store_and_config(
             Arc::new(chunk_store),
             config,
@@ -254,6 +269,7 @@ impl DataStore {
             verify_new: config.verify_new.unwrap_or(false),
             chunk_order,
             last_digest,
+            sync_level: tuning.sync_level.unwrap_or_default(),
         })
     }
 
@@ -1294,4 +1310,19 @@ impl DataStore {
         todo!("split out the namespace");
     }
     */
+
+    /// Syncs the filesystem of the datastore if 'sync_level' is set to
+    /// [`DatastoreFSyncLevel::Filesystem`]. Uses syncfs(2).
+    pub fn try_ensure_sync_level(&self) -> Result<(), Error> {
+        if self.inner.sync_level != DatastoreFSyncLevel::Filesystem {
+            return Ok(());
+        }
+        let file = std::fs::File::open(self.base_path())?;
+        let fd = file.as_raw_fd();
+        log::info!("syncing filesystem");
+        if unsafe { libc::syncfs(fd) } < 0 {
+            bail!("error during syncfs: {}", std::io::Error::last_os_error());
+        }
+        Ok(())
+    }
 }
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index e9a5cbc8..4f07f9b4 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -623,6 +623,8 @@ impl BackupEnvironment {
             }
         }
 
+        self.datastore.try_ensure_sync_level()?;
+
         // marks the backup as successful
         state.finished = true;
 
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index 08adf7c9..362661f5 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -11,8 +11,8 @@ use proxmox_section_config::SectionConfigData;
 use proxmox_sys::WorkerTaskContext;
 
 use pbs_api_types::{
-    Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DATASTORE_SCHEMA,
-    PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY,
+    Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning,
+    DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY,
     PROXMOX_CONFIG_DIGEST_SCHEMA,
 };
 use pbs_config::BackupLockGuard;
@@ -70,6 +70,10 @@ pub(crate) fn do_create_datastore(
 ) -> Result<(), Error> {
     let path: PathBuf = datastore.path.clone().into();
 
+    let tuning: DatastoreTuning = serde_json::from_value(
+        DatastoreTuning::API_SCHEMA
+            .parse_property_string(datastore.tuning.as_deref().unwrap_or(""))?,
+    )?;
     let backup_user = pbs_config::backup_user()?;
     let _store = ChunkStore::create(
         &datastore.name,
@@ -77,6 +81,7 @@ pub(crate) fn do_create_datastore(
         backup_user.uid,
         backup_user.gid,
         worker,
+        tuning.sync_level.unwrap_or_default(),
     )?;
 
     config.set_data(&datastore.name, "datastore", &datastore)?;
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v3 4/5] docs: add documentation about the 'sync-level' tuning
  2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
                   ` (2 preceding siblings ...)
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores Dominik Csapak
@ 2022-10-20  7:40 ` Dominik Csapak
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 5/5] datastore: make 'filesystem' the default sync-level Dominik Csapak
  4 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 docs/storage.rst | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/docs/storage.rst b/docs/storage.rst
index 1909bd84..c4e44c72 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -338,3 +338,39 @@ and only available on the CLI:
 
   # proxmox-backup-manager datastore update <storename> --tuning 'chunk-order=none'
 
+* ``sync-level``: Datastore fsync level:
+
+  You can set the level of syncing on the datastore for chunks, which influences
+  the crash resistance of backups in case of a powerloss or hard shutoff.
+  There are currently three levels:
+
+  - `none` (default): Does not do any syncing when writing chunks. This is fast
+    and normally OK, since the kernel eventually flushes writes onto the disk.
+    The kernel sysctls `dirty_expire_centisecs` and `dirty_writeback_centisecs`
+    are used to tune that behaviour, while the default is to flush old data
+    after ~30s.
+
+  - `filesystem`: This triggers a ``syncfs(2)`` after a backup, but before
+    the task returns `OK`. This way it is ensured that the written backups
+    are on disk. This is a good balance between speed and consistency.
+    Note that the underlying storage device still needs to protect itself against
+    powerloss to flush its internal ephemeral caches to the permanent storage layer.
+
+  - `file`: With this mode, an fsync is triggered on every chunk insertion, which
+    makes sure each and every chunk reaches the disk as soon as possible. While
+    this provides the highest level of consistency, for many storages (especially
+    slower ones) it comes at the cost of speed. For many users the `filesystem`
+    mode is better suited, but for very fast storages this mode can be OK.
+
+  This can be set with:
+
+.. code-block:: console
+
+  # proxmox-backup-manager datastore update <storename> --tuning 'sync-level=filesystem'
+
+If you want to set multiple tuning options simultaneously, you can separate them
+with a comma, like this:
+
+.. code-block:: console
+
+  # proxmox-backup-manager datastore update <storename> --tuning 'sync-level=filesystem,chunk-order=none'
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v3 5/5] datastore: make 'filesystem' the default sync-level
  2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
                   ` (3 preceding siblings ...)
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 4/5] docs: add documentation about the 'sync-level' tuning Dominik Csapak
@ 2022-10-20  7:40 ` Dominik Csapak
  4 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2022-10-20  7:40 UTC (permalink / raw)
  To: pbs-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 docs/storage.rst               | 4 ++--
 pbs-api-types/src/datastore.rs | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/storage.rst b/docs/storage.rst
index c4e44c72..d61c3a40 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -344,13 +344,13 @@ and only available on the CLI:
   the crash resistance of backups in case of a powerloss or hard shutoff.
   There are currently three levels:
 
-  - `none` (default): Does not do any syncing when writing chunks. This is fast
+  - `none`: Does not do any syncing when writing chunks. This is fast
     and normally OK, since the kernel eventually flushes writes onto the disk.
     The kernel sysctls `dirty_expire_centisecs` and `dirty_writeback_centisecs`
     are used to tune that behaviour, while the default is to flush old data
     after ~30s.
 
-  - `filesystem`: This triggers a ``syncfs(2)`` after a backup, but before
+  - `filesystem` (default): This triggers a ``syncfs(2)`` after a backup, but before
     the task returns `OK`. This way it is ensured that the written backups
     are on disk. This is a good balance between speed and consistency.
     Note that the underlying storage device still needs to protect itself against
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 15ea80cd..f33fbc2d 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -200,7 +200,7 @@ pub enum DatastoreFSyncLevel {
 
 impl Default for DatastoreFSyncLevel {
     fn default() -> Self {
-        DatastoreFSyncLevel::None
+        DatastoreFSyncLevel::Filesystem
     }
 }
 
-- 
2.30.2






* Re: [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores
  2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores Dominik Csapak
@ 2022-10-20 13:09   ` Wolfgang Bumiller
  0 siblings, 0 replies; 7+ messages in thread
From: Wolfgang Bumiller @ 2022-10-20 13:09 UTC (permalink / raw)
  To: Dominik Csapak; +Cc: pbs-devel

Looks mostly good to me, minor nits inline:

On Thu, Oct 20, 2022 at 09:40:56AM +0200, Dominik Csapak wrote:
> currently, we don't (f)sync on chunk insertion (or at any point after
> that), which can lead to broken chunks in case of e.g. an unexpected
> powerloss. To fix that, offer a tuning option for datastores that
> controls the level of syncs it does:
> 
> * None (default): same as current state, no (f)syncs done at any point
> * Filesystem: at the end of a backup, the datastore issues
>   a syncfs(2) to the filesystem of the datastore
> * File: issues an fsync on each chunk as they get inserted
>   (using our 'replace_file' helper) and an fsync on the directory handle
> 
> a small benchmark showed the following (times in mm:ss):
> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
> 
> size                none    filesystem  file
> 2GiB (fits in RAM)   00:13   00:41       01:00
> 33GiB                05:21   05:31       13:45
> 
> so if the backup fits in memory, there is a large difference between all
> of the modes (expected), but as soon as it exceeds the memory size,
> the difference between not syncing and syncing the fs at the end becomes
> much smaller.
> 
> i also tested on an nvme, but there the syncs basically made no difference
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  pbs-api-types/src/datastore.rs   | 37 ++++++++++++++++++++
>  pbs-datastore/src/chunk_store.rs | 59 +++++++++++++++++++++++++++-----
>  pbs-datastore/src/datastore.rs   | 39 ++++++++++++++++++---
>  src/api2/backup/environment.rs   |  2 ++
>  src/api2/config/datastore.rs     |  9 +++--
>  5 files changed, 131 insertions(+), 15 deletions(-)
> 
> diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
> index 0af11b33..15ea80cd 100644
> --- a/pbs-api-types/src/datastore.rs
> +++ b/pbs-api-types/src/datastore.rs
> @@ -168,6 +168,42 @@ pub enum ChunkOrder {
>      Inode,
>  }
>  
> +#[api]
> +#[derive(PartialEq, Serialize, Deserialize)]

+ Clone, Copy, Debug, Eq
also: you can derive Default now...

> +#[serde(rename_all = "lowercase")]
> +/// The level of syncing that is done when writing into a datastore.
> +pub enum DatastoreFSyncLevel {
> +    /// No special fsync or syncfs calls are triggered. The system default dirty write back
> +    /// mechanism ensures that data gets flushed eventually via the `dirty_writeback_centisecs`
> +    /// and `dirty_expire_centisecs` kernel sysctls, defaulting to ~ 30s.
> +    ///
> +    /// This mode generally provides the best performance, as all write back can happen async,
> +    /// which reduces IO pressure.
> +    /// But data may be lost on powerloss or a system crash if there is no uninterruptible power
> +    /// supply.

... if you add #[default] here.

> +    None,
> +    /// Triggers an fsync after writing any chunk on the datastore. While this can slow down
> +    /// backups significantly, depending on the underlying file system and storage used, it
> +    /// will ensure fine-grained consistency. Depending on the exact setup, there might be no
> +    /// benefit over the file system level sync, so if the setup allows it, you should prefer
> +    /// that one. Despite the possible negative impact on performance, it's the most consistent
> +    /// mode.
> +    File,
> +    /// Triggers a filesystem-wide sync after all backup data got written but before finishing the
> +    /// task. This ensures that every finished backup is fully written back to storage
> +    /// while reducing the impact on many file systems in contrast to the file level sync.
> +    /// Depending on the setup, it might have a negative impact on unrelated write operations
> +    /// of the underlying filesystem, but it is generally a good compromise between performance
> +    /// and consistency.
> +    Filesystem,
> +}
> +
> +impl Default for DatastoreFSyncLevel {

Then you can drop this.

> +    fn default() -> Self {
> +        DatastoreFSyncLevel::None
> +    }
> +}
> +
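
Putting those two together, the enum could end up looking roughly like this
(just a sketch, doc comments elided, and assuming the #[api] macro is fine
with a derived Default):

    #[api]
    #[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]
    #[serde(rename_all = "lowercase")]
    pub enum DatastoreFSyncLevel {
        #[default]
        None,
        File,
        Filesystem,
    }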
>  #[api(
>      properties: {
>          "chunk-order": {
> @@ -182,6 +218,7 @@ pub enum ChunkOrder {
>  pub struct DatastoreTuning {
>      /// Iterate chunks in this order
>      pub chunk_order: Option<ChunkOrder>,
> +    pub sync_level: Option<DatastoreFSyncLevel>,
>  }
>  
>  pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 145d2c7e..a8582485 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
(...)
> @@ -460,9 +468,27 @@ impl ChunkStore {
>              }
>          }
>  
> -        proxmox_sys::fs::replace_file(chunk_path, raw_data, CreateOptions::new(), false).map_err(
> -            |err| format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}"),
> -        )?;
> +        let chunk_dir_path = chunk_path
> +            .parent()
> +            .ok_or_else(|| format_err!("unable to get chunk dir"))?
> +            .to_owned();

No need to clone this slice, just drop the `.to_owned()`...

> +
> +        proxmox_sys::fs::replace_file(
> +            chunk_path,

...and use `&chunk_path` here, this takes an `AsRef<Path>`

> +            raw_data,
> +            CreateOptions::new(),
> +            self.sync_level == DatastoreFSyncLevel::File,
> +        )
> +        .map_err(|err| {
> +            format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
> +        })?;
> +
> +        if self.sync_level == DatastoreFSyncLevel::File {
> +            // fsync dir handle to persist the tmp rename
> +            let dir = std::fs::File::open(chunk_dir_path)?;
> +            nix::unistd::fsync(dir.as_raw_fd())
> +                .map_err(|err| format_err!("fsync failed: {err}"))?;
> +        }
>  
>          drop(lock);
>  
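
Put together, the end of insert_chunk would then look roughly like this (a
sketch with the two nits applied, not a final version):

    let chunk_dir_path = chunk_path
        .parent()
        .ok_or_else(|| format_err!("unable to get chunk dir"))?;

    proxmox_sys::fs::replace_file(
        &chunk_path,
        raw_data,
        CreateOptions::new(),
        self.sync_level == DatastoreFSyncLevel::File,
    )
    .map_err(|err| {
        format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
    })?;

    if self.sync_level == DatastoreFSyncLevel::File {
        // fsync dir handle to persist the tmp rename
        let dir = std::fs::File::open(chunk_dir_path)?;
        nix::unistd::fsync(dir.as_raw_fd())
            .map_err(|err| format_err!("fsync failed: {err}"))?;
    }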





end of thread

Thread overview: 7+ messages
2022-10-20  7:40 [pbs-devel] [PATCH proxmox-backup v3 0/5] add 'sync-level' to datastore tuning options Dominik Csapak
2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/5] docs: add information about chunk order option for datastores Dominik Csapak
2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 2/5] pbs-datastore: chunk_store: use replace_file in insert_chunk Dominik Csapak
2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/5] datastore: implement sync-level tuning for datastores Dominik Csapak
2022-10-20 13:09   ` Wolfgang Bumiller
2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 4/5] docs: add documentation about the 'sync-level' tuning Dominik Csapak
2022-10-20  7:40 ` [pbs-devel] [PATCH proxmox-backup v3 5/5] datastore: make 'filesystem' the default sync-level Dominik Csapak
