Date: Tue, 24 Mar 2026 14:00:16 +0100
From: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Subject: Re: [PATCH proxmox-backup v5 2/9] datastore: add namespace-level locking
To: Hannes Laimer, pbs-devel@lists.proxmox.com
Message-Id: <1774353503.z2ym4rt8ui.astroid@yuna.none>
In-Reply-To: <20260319161325.206846-3-h.laimer@proxmox.com>
References: <20260319161325.206846-1-h.laimer@proxmox.com> <20260319161325.206846-3-h.laimer@proxmox.com>
List-Id: Proxmox Backup Server development discussion
On March 19, 2026 5:13 pm, Hannes Laimer wrote:
> Add exclusive/shared namespace locking keyed at
> `/run/proxmox-backup/locks/{store}/{ns}/`
>
> Operations that read from or write into a namespace hold a shared lock
> for their duration. Structural operations (move, delete) hold an
> exclusive lock. The shared lock is hierarchical: locking a/b/c also
> locks a/b and a, so an exclusive lock on any ancestor blocks all
> active operations below it. Walking up the ancestor chain costs
> O(depth), which is bounded by the maximum namespace depth of 8,
> whereas locking all descendants would be arbitrarily expensive.

but locking all ancestors also makes the number of locks go up quite a
bit on the hot path, if you access a deeply nested namespace. whereas
locking the children would only be required on the "cold" path (when
changing the namespace tree), which is a much rarer operation.

some more comments below, all conditioned on introducing these new
locks in the first place!

> Backup jobs and pull/push sync acquire the shared lock via
> create_locked_backup_group and pull_ns/push_namespace respectively.
> Verify and prune acquire it per snapshot/group and skip gracefully if
> the lock cannot be taken, since a concurrent move is a transient
> condition.
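for context, the per-access cost we're talking about can be modelled
roughly like this (untested sketch, only the lock path layout is taken
from the commit message, the function name is made up):

```rust
// model of the ancestor-chain cost: taking a shared lock on namespace
// "a/b/c" also touches "a/b" and "a", i.e. one lock per path component,
// O(depth), bounded by the maximum namespace depth of 8 - but paid on
// every single access, not just when the tree is modified.
fn ns_lock_paths(store: &str, ns: &str) -> Vec<String> {
    let mut paths = Vec::new();
    let mut cur = format!("/run/proxmox-backup/locks/{store}");
    for comp in ns.split('/').filter(|c| !c.is_empty()) {
        cur = format!("{cur}/{comp}");
        paths.push(cur.clone()); // shared lock on each ancestor + the target itself
    }
    paths
}

fn main() {
    let paths = ns_lock_paths("store1", "a/b/c");
    // hot path: every access to a/b/c acquires depth(=3) locks
    assert_eq!(paths.len(), 3);
    assert_eq!(paths[0], "/run/proxmox-backup/locks/store1/a");
    assert_eq!(paths[2], "/run/proxmox-backup/locks/store1/a/b/c");
    println!("{paths:#?}");
}
```

so every backup into a deep namespace pays for the whole chain, while
the inverse scheme would only pay when moving/deleting namespaces.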
>
> Signed-off-by: Hannes Laimer
> ---
>  pbs-datastore/src/backup_info.rs | 112 +++++++++++++++++++++++++++++++
>  pbs-datastore/src/datastore.rs   |  47 ++++++++++---
>  src/api2/admin/namespace.rs      |   6 +-
>  src/api2/backup/environment.rs   |   4 ++
>  src/api2/backup/mod.rs           |  14 ++--
>  src/api2/tape/restore.rs         |   9 +--
>  src/backup/verify.rs             |  19 +++++-
>  src/server/prune_job.rs          |  11 +++
>  src/server/pull.rs               |   8 ++-
>  src/server/push.rs               |   6 ++
>  10 files changed, 215 insertions(+), 21 deletions(-)
>
> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
> index c33eb307..57e0448f 100644
> --- a/pbs-datastore/src/backup_info.rs
> +++ b/pbs-datastore/src/backup_info.rs
> [..]
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index ef378c69..18712074 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -38,7 +38,8 @@ use pbs_config::{BackupLockGuard, ConfigVersionCache};
>  use proxmox_section_config::SectionConfigData;
>  
>  use crate::backup_info::{
> -    BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> +    lock_namespace, lock_namespace_shared, BackupDir, BackupGroup, BackupInfo, NamespaceLockGuard,
> +    OLD_LOCKING, PROTECTED_MARKER_FILENAME,
>  };
>  use crate::chunk_store::ChunkStore;
>  use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
> @@ -83,6 +84,8 @@ const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
>  const S3_DELETE_BATCH_LIMIT: usize = 100;
>  // max defer time for s3 batch deletions
>  const S3_DELETE_DEFER_LIMIT_SECONDS: Duration = Duration::from_secs(60 * 5);
> +// timeout for acquiring shared namespace locks in worker tasks
> +const NS_SHARED_LOCK_TIMEOUT: Duration = Duration::from_secs(2);
>  
>  /// checks if auth_id is owner, or, if owner is a token, if
>  /// auth_id is the user of the token
> @@ -1123,18 +1126,46 @@ impl DataStore {
>          Ok(())
>      }
>  
> -    /// Create (if it does not already exists) and lock a backup group
> +    /// Acquires an exclusive lock on a backup namespace and shared locks on all its ancestors.
>      ///
> -    /// And set the owner to 'userid'. If the group already exists, it returns the
> -    /// current owner (instead of setting the owner).
> +    /// Operations that structurally modify a namespace (move, delete) must hold this for their
> +    /// duration to prevent concurrent readers or writers from accessing the namespace while it is
> +    /// being relocated or destroyed.
> +    pub fn lock_namespace(
> +        self: &Arc<Self>,
> +        ns: &BackupNamespace,
> +    ) -> Result<NamespaceLockGuard, Error> {
> +        lock_namespace(self.name(), ns)
> +    }

this one makes sense as a standalone interface, since we need to acquire
this lock on its own when modifying the namespace hierarchy..

> +
> +    /// Acquires shared locks on a backup namespace and all its non-root ancestors.
>      ///
> -    /// This also acquires an exclusive lock on the directory and returns the lock guard.
> +    /// Operations that read from or write into a namespace (backup, sync, verify, prune) must hold
> +    /// this for their duration to prevent a concurrent `move_namespace` or `delete_namespace` from
> +    /// relocating or destroying the namespace under them.
> +    pub fn lock_namespace_shared(
> +        self: &Arc<Self>,
> +        ns: &BackupNamespace,
> +    ) -> Result<NamespaceLockGuard, Error> {
> +        lock_namespace_shared(self.name(), ns, Some(NS_SHARED_LOCK_TIMEOUT))
> +    }

but this one doesn't make a lot of sense? we need this lock whenever we
obtain a shared lock on a group or snapshot.. so we should obtain it
whenever we lock a group or snapshot, and not separately, or else we risk
forgetting to do so (quickly stepping through some code paths with this
series applied already shows problematic ones, e.g. deleting a group via
the API will lock the group, then the snapshot, then the manifest, and
then remove the snapshot's contents, all without locking any namespace..)

> +    /// Create (if it does not already exist) and lock a backup group.
> + /// > + /// Sets the owner to `auth_id`. If the group already exists, return= s the current owner. > + /// > + /// Returns `(owner, ns_lock, group_lock)`. Both locks must be kept = alive for the duration of > + /// the backup session: the shared namespace lock prevents concurren= t namespace moves or > + /// deletions, and the exclusive group lock prevents concurrent back= ups to the same group. > pub fn create_locked_backup_group( > self: &Arc, > ns: &BackupNamespace, > backup_group: &pbs_api_types::BackupGroup, > auth_id: &Authid, > - ) -> Result<(Authid, BackupLockGuard), Error> { > + ) -> Result<(Authid, NamespaceLockGuard, BackupLockGuard), Error> { which in practice would mean that this should no longer return a BackupLockGuard, but some new type? > + let ns_guard =3D lock_namespace_shared(self.name(), ns, Some(NS_= SHARED_LOCK_TIMEOUT)) > + .with_context(|| format!("failed to acquire shared namespace= lock for '{ns}'"))?; > + > let backup_group =3D self.backup_group(ns.clone(), backup_group.= clone()); > =20 > // create intermediate path first > @@ -1155,14 +1186,14 @@ impl DataStore { > return Err(err); > } > let owner =3D self.get_owner(ns, backup_group.group())?;= // just to be sure > - Ok((owner, guard)) > + Ok((owner, ns_guard, guard)) > } > Err(ref err) if err.kind() =3D=3D io::ErrorKind::AlreadyExis= ts =3D> { > let guard =3D backup_group.lock().with_context(|| { > format!("while creating locked backup group '{backup= _group:?}'") > })?; > let owner =3D self.get_owner(ns, backup_group.group())?;= // just to be sure > - Ok((owner, guard)) > + Ok((owner, ns_guard, guard)) > } > Err(err) =3D> bail!("unable to create backup group {:?} - {}= ", full_path, err), > } > [.. snip ..]