From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v5 2/9] datastore: add namespace-level locking
Date: Tue, 24 Mar 2026 14:00:16 +0100 [thread overview]
Message-ID: <1774353503.z2ym4rt8ui.astroid@yuna.none> (raw)
In-Reply-To: <20260319161325.206846-3-h.laimer@proxmox.com>
On March 19, 2026 5:13 pm, Hannes Laimer wrote:
> Add exclusive/shared namespace locking keyed at
> `/run/proxmox-backup/locks/{store}/{ns}/`
>
> Operations that read from or write into a namespace hold a shared lock
> for their duration. Structural operations (move, delete) hold an
> exclusive lock. The shared lock is hierarchical: locking a/b/c also
> locks a/b and a, so an exclusive lock on any ancestor blocks all
> active operations below it. Walking up the ancestor chain costs
> O(depth), which is bounded by the maximum namespace depth of 8,
> whereas locking all descendants would be arbitrarily expensive.
but locking all ancestors also makes the number of locks go up quite a
bit in the cold path, if you access a deeply nested namespace.
whereas locking the children would only be required on the "cold" path
(when changing the namespace tree), which is a much rarer operation.
some more comments below, all conditioned on introducing this new locks
in the first place!
> Backup jobs and pull/push sync acquire the shared lock via
> create_locked_backup_group and pull_ns/push_namespace respectively.
> Verify and prune acquire it per snapshot/group and skip gracefully if
> the lock cannot be taken, since a concurrent move is a transient
> condition.
>
> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
> ---
> pbs-datastore/src/backup_info.rs | 112 +++++++++++++++++++++++++++++++
> pbs-datastore/src/datastore.rs | 47 ++++++++++---
> src/api2/admin/namespace.rs | 6 +-
> src/api2/backup/environment.rs | 4 ++
> src/api2/backup/mod.rs | 14 ++--
> src/api2/tape/restore.rs | 9 +--
> src/backup/verify.rs | 19 +++++-
> src/server/prune_job.rs | 11 +++
> src/server/pull.rs | 8 ++-
> src/server/push.rs | 6 ++
> 10 files changed, 215 insertions(+), 21 deletions(-)
>
> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
> index c33eb307..57e0448f 100644
> --- a/pbs-datastore/src/backup_info.rs
> +++ b/pbs-datastore/src/backup_info.rs
> [..]
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index ef378c69..18712074 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -38,7 +38,8 @@ use pbs_config::{BackupLockGuard, ConfigVersionCache};
> use proxmox_section_config::SectionConfigData;
>
> use crate::backup_info::{
> - BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> + lock_namespace, lock_namespace_shared, BackupDir, BackupGroup, BackupInfo, NamespaceLockGuard,
> + OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> };
> use crate::chunk_store::ChunkStore;
> use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
> @@ -83,6 +84,8 @@ const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
> const S3_DELETE_BATCH_LIMIT: usize = 100;
> // max defer time for s3 batch deletions
> const S3_DELETE_DEFER_LIMIT_SECONDS: Duration = Duration::from_secs(60 * 5);
> +// timeout for acquiring shared namespace locks in worker tasks
> +const NS_SHARED_LOCK_TIMEOUT: Duration = Duration::from_secs(2);
>
> /// checks if auth_id is owner, or, if owner is a token, if
> /// auth_id is the user of the token
> @@ -1123,18 +1126,46 @@ impl DataStore {
> Ok(())
> }
>
> - /// Create (if it does not already exists) and lock a backup group
> + /// Acquires an exclusive lock on a backup namespace and shared locks on all its ancestors.
> ///
> - /// And set the owner to 'userid'. If the group already exists, it returns the
> - /// current owner (instead of setting the owner).
> + /// Operations that structurally modify a namespace (move, delete) must hold this for their
> + /// duration to prevent concurrent readers or writers from accessing the namespace while it is
> + /// being relocated or destroyed.
> + pub fn lock_namespace(
> + self: &Arc<Self>,
> + ns: &BackupNamespace,
> + ) -> Result<NamespaceLockGuard, Error> {
> + lock_namespace(self.name(), ns)
> + }
this one makes sense as a standalone interface, since we need to acquire
this lock on its own when modifying the namespace hierarchy..
> +
> + /// Acquires shared locks on a backup namespace and all its non-root ancestors.
> ///
> - /// This also acquires an exclusive lock on the directory and returns the lock guard.
> + /// Operations that read from or write into a namespace (backup, sync, verify, prune) must hold
> + /// this for their duration to prevent a concurrent `move_namespace` or `delete_namespace` from
> + /// relocating or destroying the namespace under them.
> + pub fn lock_namespace_shared(
> + self: &Arc<Self>,
> + ns: &BackupNamespace,
> + ) -> Result<NamespaceLockGuard, Error> {
> + lock_namespace_shared(self.name(), ns, Some(NS_SHARED_LOCK_TIMEOUT))
> + }
but this one doesn't make a lot of sense? we need this lock whenever we
obtain a shared lock on a group or snapshot.. so we should obtain it
whenever we lock a group or snapshot, and not separately, or else we
risk forgetting to do so (quickly stepping through some code paths with
this series applied already shows problematic ones, e.g. deleting a
group via the API will lock the group, then the snapshot, then the
manifest, and then remove the snapshot's contents, all without locking
any namespace..)
> + /// Create (if it does not already exist) and lock a backup group.
> + ///
> + /// Sets the owner to `auth_id`. If the group already exists, returns the current owner.
> + ///
> + /// Returns `(owner, ns_lock, group_lock)`. Both locks must be kept alive for the duration of
> + /// the backup session: the shared namespace lock prevents concurrent namespace moves or
> + /// deletions, and the exclusive group lock prevents concurrent backups to the same group.
> pub fn create_locked_backup_group(
> self: &Arc<Self>,
> ns: &BackupNamespace,
> backup_group: &pbs_api_types::BackupGroup,
> auth_id: &Authid,
> - ) -> Result<(Authid, BackupLockGuard), Error> {
> + ) -> Result<(Authid, NamespaceLockGuard, BackupLockGuard), Error> {
which in practice would mean that this should no longer return a
BackupLockGuard, but some new type?
> + let ns_guard = lock_namespace_shared(self.name(), ns, Some(NS_SHARED_LOCK_TIMEOUT))
> + .with_context(|| format!("failed to acquire shared namespace lock for '{ns}'"))?;
> +
> let backup_group = self.backup_group(ns.clone(), backup_group.clone());
>
> // create intermediate path first
> @@ -1155,14 +1186,14 @@ impl DataStore {
> return Err(err);
> }
> let owner = self.get_owner(ns, backup_group.group())?; // just to be sure
> - Ok((owner, guard))
> + Ok((owner, ns_guard, guard))
> }
> Err(ref err) if err.kind() == io::ErrorKind::AlreadyExists => {
> let guard = backup_group.lock().with_context(|| {
> format!("while creating locked backup group '{backup_group:?}'")
> })?;
> let owner = self.get_owner(ns, backup_group.group())?; // just to be sure
> - Ok((owner, guard))
> + Ok((owner, ns_guard, guard))
> }
> Err(err) => bail!("unable to create backup group {:?} - {}", full_path, err),
> }
> [.. snip ..]
next prev parent reply other threads:[~2026-03-24 13:00 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-19 16:13 [PATCH proxmox-backup v5 0/9] fixes #6195: add support for moving groups and namespaces Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 1/9] ui: show empty groups Hannes Laimer
2026-03-24 10:28 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 2/9] datastore: add namespace-level locking Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler [this message]
2026-03-19 16:13 ` [PATCH proxmox-backup v5 3/9] datastore: add move_group Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler
2026-03-25 6:20 ` Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 4/9] datastore: add move_namespace Hannes Laimer
2026-03-24 13:00 ` Fabian Grünbichler
2026-03-25 6:32 ` Hannes Laimer
2026-03-25 8:55 ` Fabian Grünbichler
2026-03-19 16:13 ` [PATCH proxmox-backup v5 5/9] api: add PUT endpoint for move_group Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 6/9] api: add PUT endpoint for move_namespace Hannes Laimer
2026-03-19 16:13 ` [PATCH proxmox-backup v5 7/9] ui: add move group action Hannes Laimer
2026-03-24 10:41 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 8/9] ui: add move namespace action Hannes Laimer
2026-03-24 10:56 ` Dominik Csapak
2026-03-19 16:13 ` [PATCH proxmox-backup v5 9/9] cli: add move-namespace and move-group commands Hannes Laimer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1774353503.z2ym4rt8ui.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=h.laimer@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox