From: Robert Obkircher <r.obkircher@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v3 7/7] datastore: fix sync level update propagation to chunk store
Date: Wed, 13 May 2026 12:42:29 +0200
Message-ID: <c45d5eef-08ec-49ef-b420-3d21aa4a4b33@proxmox.com>
In-Reply-To: <20260512131047.689089-8-c.ebner@proxmox.com>

On 12.05.26 15:09, Christian Ebner wrote:
> Changing the datastore tuning options triggers an invalidation of the
> datastore cache entry, leading to re-instantiation with the new
> config parameters on the next datastore lookup.
> Since commit 0bd9c8701 ("datastore: lookup: reuse ChunkStore on stale
> datastore re-open") this does however not lead to re-creation of the
> chunk store instance in order to avoid dropping the process locker,
> which would lead to loosing any existing shared lock. However, as a
> consequence the sync level is not updated on the chunk store.
>
> Fix this by:
> - Storing the sync level as runtime properties of the chunk store as
> state within the mutex syncing concurrent modify access. It is held
> where needed anyways.
> - Pass the mutex guard as additional parameters to the methods
> requiring the locked state. This encodes the requirement for the
> mutex guard directly into the function signature instead of
> labeling it as unsafe only.
> - Assuring the previous sync level on config changes for consistency.
> To do this, extend and rename try_ensure_sync_level to also take the
> mutex guard and update the sync level if given.
>
> Reported-by: Robert Obkircher <r.obkircher@proxmox.com>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 73 +++++++++++++------
> pbs-datastore/src/datastore.rs | 26 ++++++-
> .../src/local_datastore_lru_cache.rs | 12 +--
> 3 files changed, 78 insertions(+), 33 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index e8c279a62..6c97e31d7 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -1,7 +1,7 @@
> use std::os::unix::fs::MetadataExt;
> use std::os::unix::io::AsRawFd;
> use std::path::{Path, PathBuf};
> -use std::sync::{Arc, Mutex};
> +use std::sync::{Arc, Mutex, MutexGuard};
> use std::time::Duration;
>
> use anyhow::{bail, format_err, Context, Error};
> @@ -27,14 +27,20 @@ use crate::{DataBlob, LocalDatastoreLruCache};
>
> const USING_MARKER_FILENAME_EXT: &str = "using";
>
> +#[derive(Default)]
> +/// Configurable runtime properties of a chunk store
> +pub(crate) struct ChunkStoreProperties {
> + pub(crate) sync_level: DatastoreFSyncLevel,
> +}
> +
> /// File system based chunk store
> pub struct ChunkStore {
> name: String, // used for error reporting
> pub(crate) base: PathBuf,
> chunk_dir: PathBuf,
> - mutex: Mutex<()>,
> + // Mutex to sync chunk store access, including property updates
> + mutex: Arc<Mutex<ChunkStoreProperties>>,
> locker: Option<Arc<Mutex<ProcessLocker>>>,
> - sync_level: DatastoreFSyncLevel,
> }
>
> // TODO: what about sysctl setting vm.vfs_cache_pressure (0 - 100) ?
> @@ -79,9 +85,8 @@ impl ChunkStore {
> name: String::new(),
> base: PathBuf::new(),
> chunk_dir: PathBuf::new(),
> - mutex: Mutex::new(()),
> + mutex: Arc::new(Mutex::new(Default::default())),

This doesn't need to be wrapped in an Arc anymore.

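To illustrate, here is a minimal standalone sketch, assuming the chunk store itself keeps being shared as an Arc (as the Arc::clone(&datastore.chunk_store) further down suggests). Store, Properties, SyncLevel and update_via_clone are stand-in names for this example only, not the actual types:

```rust
use std::sync::{Arc, Mutex};

// Stand-in for DatastoreFSyncLevel (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum SyncLevel {
    #[default]
    None,
    File,
    Filesystem,
}

// Stand-in for ChunkStoreProperties.
#[derive(Default)]
struct Properties {
    sync_level: SyncLevel,
}

// Stand-in for ChunkStore.
struct Store {
    // A plain Mutex is enough: the outer Arc<Store> already shares it.
    mutex: Mutex<Properties>,
}

impl Store {
    fn new(sync_level: SyncLevel) -> Self {
        Self {
            mutex: Mutex::new(Properties { sync_level }),
        }
    }

    fn mutex(&self) -> &Mutex<Properties> {
        &self.mutex
    }
}

/// Update the level through one handle and read it back through another.
fn update_via_clone(store: &Arc<Store>, new_level: SyncLevel) -> SyncLevel {
    let other = Arc::clone(store);
    // The temporary guard is dropped at the end of this statement.
    other.mutex().lock().unwrap().sync_level = new_level;
    let level = store.mutex().lock().unwrap().sync_level;
    level
}
```

Both handles observe the same properties because they point at the same Store, so the inner Arc adds nothing here.
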
> locker: None,
> - sync_level: Default::default(),
> }
> }
>
> @@ -204,16 +209,19 @@ impl ChunkStore {
> base,
> chunk_dir,
> locker: Some(locker),
> - mutex: Mutex::new(()),
> - sync_level,
> + mutex: Arc::new(Mutex::new(ChunkStoreProperties { sync_level })),
> })
> }
>
> - fn touch_chunk_no_lock(&self, digest: &[u8; 32]) -> Result<(), Error> {
> + fn touch_chunk_no_lock(
> + &self,
> + digest: &[u8; 32],
> + mutex_guard: MutexGuard<ChunkStoreProperties>,
> + ) -> Result<(), Error> {
> // unwrap: only `None` in unit tests
> assert!(self.locker.is_some());
>
> - self.cond_touch_chunk_no_lock(digest, true)?;
> + self.cond_touch_chunk_no_lock(digest, true, mutex_guard)?;
> Ok(())
> }
>
> @@ -226,14 +234,15 @@ impl ChunkStore {
> digest: &[u8; 32],
> assert_exists: bool,
> ) -> Result<bool, Error> {
> - let _lock = self.mutex.lock();
> - self.cond_touch_chunk_no_lock(digest, assert_exists)
> + let lock = self.mutex.lock().unwrap();
> + self.cond_touch_chunk_no_lock(digest, assert_exists, lock)
> }
>
> fn cond_touch_chunk_no_lock(
> &self,
> digest: &[u8; 32],
> assert_exists: bool,
> + _mutex_guard: MutexGuard<ChunkStoreProperties>,
> ) -> Result<bool, Error> {
> // unwrap: only `None` in unit tests
> assert!(self.locker.is_some());
> @@ -423,7 +432,11 @@ impl ChunkStore {
> ProcessLocker::oldest_shared_lock(self.locker.clone().unwrap())
> }
>
> - pub(crate) fn mutex(&self) -> &std::sync::Mutex<()> {
> + /// Mutex to lock chunk store for exclusive access.
> + ///
> + /// Must be held when modifying chunk store contents and allows to update
> + /// chunk store runtime properties.
> + pub(crate) fn mutex(&self) -> &Mutex<ChunkStoreProperties> {
> &self.mutex
> }
>
> @@ -665,18 +678,17 @@ impl ChunkStore {
>
> //println!("DIGEST {}", hex::encode(digest));
>
> - let _lock = self.mutex.lock();
> + let lock = self.mutex.lock().unwrap();
>
> - // Safety: lock acquired above
> - unsafe { self.insert_chunk_nolock(chunk, digest, true) }
> + self.insert_chunk_nolock(chunk, digest, true, lock)
> }
>
> - /// Safety: requires holding the chunk store mutex!
> - pub(crate) unsafe fn insert_chunk_nolock(
> + pub(crate) fn insert_chunk_nolock(
> &self,
> chunk: &DataBlob,
> digest: &[u8; 32],
> warn_on_overwrite_empty: bool,
> + mutex_guard: MutexGuard<ChunkStoreProperties>,
> ) -> Result<(bool, u64), Error> {
> // unwrap: only `None` in unit tests
> assert!(self.locker.is_some());
> @@ -694,7 +706,7 @@ impl ChunkStore {
> }
> let old_size = metadata.len();
> if encoded_size == old_size {
> - self.touch_chunk_no_lock(digest)?;
> + self.touch_chunk_no_lock(digest, mutex_guard)?;
> return Ok((true, old_size));
> } else if old_size == 0 {
> if warn_on_overwrite_empty {
> @@ -721,11 +733,11 @@ impl ChunkStore {
> // compressed, the size mismatch could be caused by different zstd versions
> // so let's keep the one that was uploaded first, bit-rot is hopefully detected by
> // verification at some point..
> - self.touch_chunk_no_lock(digest)?;
> + self.touch_chunk_no_lock(digest, mutex_guard)?;
> return Ok((true, old_size));
> } else if old_size < encoded_size {
> log::debug!("Got another copy of chunk with digest '{digest_str}', existing chunk is smaller, discarding uploaded one.");
> - self.touch_chunk_no_lock(digest)?;
> + self.touch_chunk_no_lock(digest, mutex_guard)?;
> return Ok((true, old_size));
> } else {
> log::debug!("Got another copy of chunk with digest '{digest_str}', existing chunk is bigger, replacing with uploaded one.");
> @@ -742,17 +754,18 @@ impl ChunkStore {
> let gid = pbs_config::backup_group()?.gid;
> create_options = create_options.owner(uid).group(gid);
> }
> +
> proxmox_sys::fs::replace_file(
> &chunk_path,
> raw_data,
> create_options,
> - self.sync_level == DatastoreFSyncLevel::File,
> + mutex_guard.sync_level == DatastoreFSyncLevel::File,
> )
> .map_err(|err| {
> format_err!("inserting chunk on store '{name}' failed for {digest_str} - {err}")
> })?;
>
> - if self.sync_level == DatastoreFSyncLevel::File {
> + if mutex_guard.sync_level == DatastoreFSyncLevel::File {
> // fsync dir handle to persist the tmp rename
> let dir = std::fs::File::open(chunk_dir_path)?;
> nix::unistd::fsync(dir.as_raw_fd())
> @@ -967,10 +980,22 @@ impl ChunkStore {
> (chunk_path, counter)
> }
>
> - pub(super) fn try_ensure_sync_level(&self) -> Result<(), Error> {
> - if self.sync_level != DatastoreFSyncLevel::Filesystem {
> + pub(super) fn try_ensure_sync_level_with_update(
> + &self,
> + mut mutex_guard: MutexGuard<ChunkStoreProperties>,
> + update_sync_level: Option<DatastoreFSyncLevel>,
> + ) -> Result<(), Error> {
> + let assure_sync_level = mutex_guard.sync_level;
> + if let Some(sync_level) = update_sync_level {
> + mutex_guard.sync_level = sync_level;
> + }
> +
> + drop(mutex_guard); // never hold during syncfs
> +
> + if assure_sync_level != DatastoreFSyncLevel::Filesystem {
> return Ok(());
> }
> +
> let file = std::fs::File::open(self.base_path())?;
> let fd = file.as_raw_fd();
> info!("syncing filesystem");

Unrelated, but it might make sense to include the path in the log
message.

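Something along these lines, as a rough sketch only. syncfs_log_message is a hypothetical helper to keep the example standalone; in the patch the string would simply go into the existing info!() call, with self.base_path() as the path:

```rust
use std::path::Path;

// Hypothetical helper: builds the message that would be passed to info!().
fn syncfs_log_message(base: &Path) -> String {
    format!("syncing filesystem at {}", base.display())
}
```
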
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index b67ab2f3a..6ec52d3f5 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -595,6 +595,20 @@ impl DataStore {
> operation: Some(lookup.operation),
> }));
> }
> +
> + let tuning = pbs_config::datastore::parse_datastore_tuning_options(&config)?;
> + let sync_level = tuning.sync_level.unwrap_or_default();
> +
> + let mutex = datastore.chunk_store.mutex();
> + let mutex_guard = mutex.lock().unwrap();
> + if mutex_guard.sync_level != sync_level {
> + datastore
> + .chunk_store
> + .try_ensure_sync_level_with_update(mutex_guard, Some(sync_level))?;
> + } else {
> + drop(mutex_guard);
> + }
> +
> Arc::clone(&datastore.chunk_store)
> } else {
> let tuning = pbs_config::datastore::parse_datastore_tuning_options(&config)?;
> @@ -2466,7 +2480,8 @@ impl DataStore {
> Ok(guard) => guard,
> Err(_) => continue,
> };
> - let _guard = self.inner.chunk_store.mutex().lock().unwrap();
> + let mutex = self.inner.chunk_store.mutex();
> + let _guard = mutex.lock().unwrap();

Since mutex() returns a plain &Mutex now, this can be a one-liner again
(also in the other places).

>
> // Check local markers (created or atime updated during phase1) and
> // keep or delete chunk based on that.
> @@ -2995,7 +3010,11 @@ impl DataStore {
> /// Syncs the filesystem of the chunk store base path if 'sync_level' is set to
> /// [`DatastoreFSyncLevel::Filesystem`]. Uses syncfs(2).
> pub fn try_ensure_sync_level(&self) -> Result<(), Error> {
> - self.inner.chunk_store.try_ensure_sync_level()
> + let mutex = self.inner.chunk_store.mutex();
> + let mutex_guard = mutex.lock().unwrap();
> + self.inner
> + .chunk_store
> + .try_ensure_sync_level_with_update(mutex_guard, None)
> }
>
> /// Destroy a datastore. This requires that there are no active operations on the datastore.
> @@ -3417,7 +3436,8 @@ impl DataStore {
> .chunk_store
> .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
> }
> - let _lock = self.inner.chunk_store.mutex().lock().unwrap();
> + let mutex = self.inner.chunk_store.mutex();
> + let _lock = mutex.lock().unwrap();
>
> let (new_path, counter) = self.inner.chunk_store.next_bad_chunk_path(digest);
>
> diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
> index ac27d4637..cf38d4a57 100644
> --- a/pbs-datastore/src/local_datastore_lru_cache.rs
> +++ b/pbs-datastore/src/local_datastore_lru_cache.rs
> @@ -34,12 +34,11 @@ impl LocalDatastoreLruCache {
> ///
> /// Fails if the chunk cannot be inserted successfully.
> pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
> - let _lock = self.store.mutex().lock().unwrap();
> + let shared_mutex = self.store.mutex();
> + let lock = shared_mutex.lock().unwrap();
> +
> + self.store.insert_chunk_nolock(chunk, digest, false, lock)?;
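
This must pass the lock guard by reference, not by value.
insert_chunk_nolock() consumes the guard, so the mutex is released as
soon as the call returns and the self.cache.insert() below is no longer
protected by the lock!
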
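A small standalone sketch of the difference, using only std. Props, insert_by_value and insert_by_ref are made-up names for this example, not the real functions. It uses try_lock() to check whether the mutex is still held after the call:

```rust
use std::sync::{Mutex, MutexGuard};

// Stand-in for ChunkStoreProperties (hypothetical).
#[derive(Default)]
struct Props {
    sync_to_disk: bool,
}

// Takes the guard by value: it is dropped when this function returns,
// which releases the mutex.
fn insert_by_value(guard: MutexGuard<'_, Props>) -> bool {
    guard.sync_to_disk
}

// Only borrows the guard: the caller keeps holding the mutex.
fn insert_by_ref(guard: &MutexGuard<'_, Props>) -> bool {
    guard.sync_to_disk
}

/// Returns true if the mutex is still locked after the by-value call.
fn lock_still_held_after_by_value(mutex: &Mutex<Props>) -> bool {
    let guard = mutex.lock().unwrap();
    insert_by_value(guard);
    // try_lock() fails while the mutex is held and never blocks.
    let held = mutex.try_lock().is_err();
    held
}

/// Returns true if the mutex is still locked after the by-reference call.
fn lock_still_held_after_by_ref(mutex: &Mutex<Props>) -> bool {
    let guard = mutex.lock().unwrap();
    insert_by_ref(&guard);
    let held = mutex.try_lock().is_err();
    drop(guard);
    held
}
```

With the by-value variant the second function call in the caller would already run unlocked, which is the situation in insert() above. Taking &MutexGuard still encodes the locking requirement in the signature.
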
>
> - // Safety: lock acquire above
> - unsafe {
> - self.store.insert_chunk_nolock(chunk, digest, false)?;
> - }
> self.cache.insert(*digest, (), |digest| {
> // Safety: lock acquired above, this is executed inline!
> unsafe {
> @@ -80,7 +79,8 @@ impl LocalDatastoreLruCache {
> Ok(mut file) => match DataBlob::load_from_reader(&mut file) {
> // File was still cached with contents, load response from file
> Ok(chunk) => {
> - let _lock = self.store.mutex().lock().unwrap();
> + let shared_mutex = self.store.mutex();
> + let _lock = shared_mutex.lock().unwrap();
> self.cache.insert(*digest, (), |digest| {
> // Safety: lock acquired above, this is executed inline
> unsafe {

Thread overview:
2026-05-12 13:10 [PATCH proxmox{,-backup} v3 0/7] fix sync level updates for chunk store Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox v3 1/7] pbs-api-types: derive FromStr for DatastoreTuning parsing Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 2/7] datastore: GC: avoid double parsing of datastore tuning options Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 3/7] pbs-config/datastore: add common helper for parsing " Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 4/7] datastore: restrict chunk store mutex scope to crate only Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 5/7] datastore: avoid useless double borrowing of datastore Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 6/7] datastore: move try_ensure_sync_level() implementation to chunk store Christian Ebner
2026-05-12 13:10 ` [PATCH proxmox-backup v3 7/7] datastore: fix sync level update propagation " Christian Ebner
2026-05-13 10:42 ` Robert Obkircher [this message]
2026-05-14 8:06 ` superseded: [PATCH proxmox{,-backup} v3 0/7] fix sync level updates for " Christian Ebner