From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v5 2/9] datastore: add namespace-level locking
Date: Tue, 24 Mar 2026 14:00:16 +0100 [thread overview]
Message-ID: <1774353503.z2ym4rt8ui.astroid@yuna.none> (raw)
In-Reply-To: <20260319161325.206846-3-h.laimer@proxmox.com>
On March 19, 2026 5:13 pm, Hannes Laimer wrote:
> Add exclusive/shared namespace locking keyed at
> `/run/proxmox-backup/locks/{store}/{ns}/`
>
> Operations that read from or write into a namespace hold a shared lock
> for their duration. Structural operations (move, delete) hold an
> exclusive lock. The shared lock is hierarchical: locking a/b/c also
> locks a/b and a, so an exclusive lock on any ancestor blocks all
> active operations below it. Walking up the ancestor chain costs
> O(depth), which is bounded by the maximum namespace depth of 8,
> whereas locking all descendants would be arbitrarily expensive.
but locking all ancestors also makes the number of locks go up quite a
bit in the cold path, if you access a deeply nested namespace.
whereas locking the children would only be required on the "cold" path
(when changing the namespace tree), which is a much rarer operation.
some more comments below, all conditioned on introducing this new locks
in the first place!
> Backup jobs and pull/push sync acquire the shared lock via
> create_locked_backup_group and pull_ns/push_namespace respectively.
> Verify and prune acquire it per snapshot/group and skip gracefully if
> the lock cannot be taken, since a concurrent move is a transient
> condition.
>
> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
> ---
> pbs-datastore/src/backup_info.rs | 112 +++++++++++++++++++++++++++++++
> pbs-datastore/src/datastore.rs | 47 ++++++++++---
> src/api2/admin/namespace.rs | 6 +-
> src/api2/backup/environment.rs | 4 ++
> src/api2/backup/mod.rs | 14 ++--
> src/api2/tape/restore.rs | 9 +--
> src/backup/verify.rs | 19 +++++-
> src/server/prune_job.rs | 11 +++
> src/server/pull.rs | 8 ++-
> src/server/push.rs | 6 ++
> 10 files changed, 215 insertions(+), 21 deletions(-)
>
> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
> index c33eb307..57e0448f 100644
> --- a/pbs-datastore/src/backup_info.rs
> +++ b/pbs-datastore/src/backup_info.rs
> [..]
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index ef378c69..18712074 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -38,7 +38,8 @@ use pbs_config::{BackupLockGuard, ConfigVersionCache};
> use proxmox_section_config::SectionConfigData;
>
> use crate::backup_info::{
> - BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> + lock_namespace, lock_namespace_shared, BackupDir, BackupGroup, BackupInfo, NamespaceLockGuard,
> + OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> };
> use crate::chunk_store::ChunkStore;
> use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
> @@ -83,6 +84,8 @@ const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
> const S3_DELETE_BATCH_LIMIT: usize = 100;
> // max defer time for s3 batch deletions
> const S3_DELETE_DEFER_LIMIT_SECONDS: Duration = Duration::from_secs(60 * 5);
> +// timeout for acquiring shared namespace locks in worker tasks
> +const NS_SHARED_LOCK_TIMEOUT: Duration = Duration::from_secs(2);
>
> /// checks if auth_id is owner, or, if owner is a token, if
> /// auth_id is the user of the token
> @@ -1123,18 +1126,46 @@ impl DataStore {
> Ok(())
> }
>
> - /// Create (if it does not already exists) and lock a backup group
> + /// Acquires an exclusive lock on a backup namespace and shared locks on all its ancestors.
> ///
> - /// And set the owner to 'userid'. If the group already exists, it returns the
> - /// current owner (instead of setting the owner).
> + /// Operations that structurally modify a namespace (move, delete) must hold this for their
> + /// duration to prevent concurrent readers or writers from accessing the namespace while it is
> + /// being relocated or destroyed.
> + pub fn lock_namespace(
> + self: &Arc<Self>,
> + ns: &BackupNamespace,
> + ) -> Result<NamespaceLockGuard, Error> {
> + lock_namespace(self.name(), ns)
> + }
this one makes sense as a standalone interface, since we need to acquire
this lock on its own when modifying the namespace hierarchy..
> +
> + /// Acquires shared locks on a backup namespace and all its non-root ancestors.
> ///
> - /// This also acquires an exclusive lock on the directory and returns the lock guard.
> + /// Operations that read from or write into a namespace (backup, sync, verify, prune) must hold
> + /// this for their duration to prevent a concurrent `move_namespace` or `delete_namespace` from
> + /// relocating or destroying the namespace under them.
> + pub fn lock_namespace_shared(
> + self: &Arc<Self>,
> + ns: &BackupNamespace,
> + ) -> Result<NamespaceLockGuard, Error> {
> + lock_namespace_shared(self.name(), ns, Some(NS_SHARED_LOCK_TIMEOUT))
> + }
but this one doesn't make a lot of sense? we need this lock whenever we
obtain a shared lock on a group or snapshot.. so we should obtain it
whenever we lock a group or snapshot, and not separately, or else we
risk forgetting to do so (quickly stepping through some code paths with
this series applied already shows problematic ones, e.g. deleting a
group via the API will lock the group, then the snapshot, then the
manifest, and then remove the snapshot's contents, all without locking
any namespace..)
> + /// Create (if it does not already exist) and lock a backup group.
> + ///
> + /// Sets the owner to `auth_id`. If the group already exists, returns the current owner.
> + ///
> + /// Returns `(owner, ns_lock, group_lock)`. Both locks must be kept alive for the duration of
> + /// the backup session: the shared namespace lock prevents concurrent namespace moves or
> + /// deletions, and the exclusive group lock prevents concurrent backups to the same group.
> pub fn create_locked_backup_group(
> self: &Arc<Self>,
> ns: &BackupNamespace,
> backup_group: &pbs_api_types::BackupGroup,
> auth_id: &Authid,
> - ) -> Result<(Authid, BackupLockGuard), Error> {
> + ) -> Result<(Authid, NamespaceLockGuard, BackupLockGuard), Error> {
which in practice would mean that this should no longer return a
BackupLockGuard, but some new type?
> + let ns_guard = lock_namespace_shared(self.name(), ns, Some(NS_SHARED_LOCK_TIMEOUT))
> + .with_context(|| format!("failed to acquire shared namespace lock for '{ns}'"))?;
> +
> let backup_group = self.backup_group(ns.clone(), backup_group.clone());
>
> // create intermediate path first
> @@ -1155,14 +1186,14 @@ impl DataStore {
> return Err(err);
> }
> let owner = self.get_owner(ns, backup_group.group())?; // just to be sure
> - Ok((owner, guard))
> + Ok((owner, ns_guard, guard))
> }
> Err(ref err) if err.kind() == io::ErrorKind::AlreadyExists => {
> let guard = backup_group.lock().with_context(|| {
> format!("while creating locked backup group '{backup_group:?}'")
> })?;
> let owner = self.get_owner(ns, backup_group.group())?; // just to be sure
> - Ok((owner, guard))
> + Ok((owner, ns_guard, guard))
> }
> Err(err) => bail!("unable to create backup group {:?} - {}", full_path, err),
> }
> [.. snip ..]
next prev parent reply other threads:[~2026-03-24 13:00 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-19 16:13 [PATCH proxmox-backup v5 0/9] fixes #6195: add support for moving groups and namespaces Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 1/9] ui: show empty groups Hannes Laimer
2026-03-24 10:28 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 2/9] datastore: add namespace-level locking Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler [this message]
2026-03-19 16:13 ` [PATCH proxmox-backup v5 3/9] datastore: add move_group Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler
2026-03-25 6:20 ` Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 4/9] datastore: add move_namespace Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler
2026-03-25 6:32 ` Hannes Laimer
2026-03-25 8:55 ` Fabian Grünbichler
2026-03-19 16:13 ` [PATCH proxmox-backup v5 5/9] api: add PUT endpoint for move_group Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 6/9] api: add PUT endpoint for move_namespace Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 7/9] ui: add move group action Hannes Laimer
2026-03-24 10:41 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 8/9] ui: add move namespace action Hannes Laimer
2026-03-24 10:56 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 9/9] cli: add move-namespace and move-group commands Hannes Laimer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1774353503.z2ym4rt8ui.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=h.laimer@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.