From: Christian Ebner
To: Hannes Laimer , pbs-devel@lists.proxmox.com
Date: Tue, 17 Mar 2026 14:03:09 +0100
Subject: Re: [PATCH proxmox-backup v4 1/7] datastore: add namespace-level locking
References: <20260311151315.133637-1-h.laimer@proxmox.com> <20260311151315.133637-2-h.laimer@proxmox.com> <02a6d817-c785-4a03-a945-441397a11d0f@proxmox.com>
In-Reply-To: <02a6d817-c785-4a03-a945-441397a11d0f@proxmox.com>
List-Id: Proxmox Backup Server development discussion

On 3/12/26 4:43 PM, Christian Ebner wrote:
> One high-level comment: I think it would make sense to allow setting an
> overall timeout when acquiring the namespace locks. Most move operations
> are probably rather fast anyway, so waiting a few seconds in, e.g., a
> prune job will probably be preferable to parts of the prune job being
> skipped.
>
> Another comment inline.
>
> On 3/11/26 4:13 PM, Hannes Laimer wrote:
>> Add exclusive/shared namespace locking keyed at
>> /run/proxmox-backup/locks/{store}/{ns}/.ns-lock.
>>
>> Operations that read from or write into a namespace hold a shared lock
>> for their duration. Structural operations (move, delete) hold an
>> exclusive lock. The shared lock is hierarchical: locking a/b/c also
>> locks a/b and a, so an exclusive lock on any ancestor blocks all
>> active operations below it. Walking up the ancestor chain costs
>> O(depth), which is bounded by the maximum namespace depth of 8,
>> whereas locking all descendants would be arbitrarily expensive.
>>
>> Backup jobs and pull/push sync acquire the shared lock via
>> create_locked_backup_group and pull_ns/push_namespace respectively.
>> Verify and prune acquire it per snapshot/group and skip gracefully if
>> the lock cannot be taken, since a concurrent move is a transient
>> condition.
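
For illustration only, the ancestor chain described in the commit message
could be sketched as follows. `ns_with_ancestors` is a hypothetical helper
and not part of this patch; it takes the namespace as a plain '/'-separated
string, and leaving the root namespace out of the chain is an assumption.

```rust
/// Hypothetical helper (not from the patch): returns the namespace itself
/// followed by all of its ancestors, deepest first.
/// Assumption: the root namespace ("") is not part of the chain.
fn ns_with_ancestors(ns: &str) -> Vec<String> {
    // Split "a/b/c" into ["a", "b", "c"], ignoring empty components.
    let components: Vec<&str> = ns.split('/').filter(|c| !c.is_empty()).collect();

    // Build "a/b/c", "a/b", "a" by taking ever shorter prefixes.
    (1..=components.len())
        .rev()
        .map(|depth| components[..depth].join("/"))
        .collect()
}
```

The number of entries equals the namespace depth, so a shared lock taken
on each entry stays within the O(depth) bound mentioned above.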
>>
>> Signed-off-by: Hannes Laimer
>> ---
>>   pbs-datastore/src/backup_info.rs | 92 ++++++++++++++++++++++++++++++++
>>   pbs-datastore/src/datastore.rs   | 45 +++++++++++++---
>>   src/api2/admin/namespace.rs      |  6 ++-
>>   src/api2/backup/environment.rs   |  4 ++
>>   src/api2/backup/mod.rs           | 14 +++--
>>   src/api2/tape/restore.rs         |  9 ++--
>>   src/backup/verify.rs             | 19 ++++++-
>>   src/server/prune_job.rs          | 11 ++++
>>   src/server/pull.rs               |  8 ++-
>>   src/server/push.rs               |  6 +++
>>   10 files changed, 193 insertions(+), 21 deletions(-)
>>
>> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
>> index c33eb307..476daa61 100644
>> --- a/pbs-datastore/src/backup_info.rs
>> +++ b/pbs-datastore/src/backup_info.rs
>> @@ -937,6 +937,98 @@ fn lock_file_path_helper(ns: &BackupNamespace, path: PathBuf) -> PathBuf {
>>       to_return.join(format!("{first_eigthy}...{last_eighty}-{hash}"))
>>   }
>> +/// Returns the lock file path for a backup namespace.
>> +///
>> +/// The lock file will be located at:
>> +/// `${DATASTORE_LOCKS_DIR}/${store_name}/${ns_colon_encoded}/.ns-lock`
>> +pub(crate) fn ns_lock_path(store_name: &str, ns: &BackupNamespace) -> PathBuf {
>> +    let ns_part = ns
>> +        .components()
>> +        .map(String::from)
>> +        .reduce(|acc, n| format!("{acc}:{n}"))
>> +        .unwrap_or_default();
>> +    Path::new(DATASTORE_LOCKS_DIR)
>> +        .join(store_name)
>> +        .join(ns_part)

As discussed off-list, the namespace directory itself could be used for
locking instead of a dedicated file, which would reduce the number of
inodes on this tmpfs.

>> +        .join(".ns-lock")
>> +}
>> +
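
To make the two variants concrete, here is a minimal sketch of the
resulting lock targets. `ns_lock_target` and its `dedicated_file` flag are
hypothetical and only mirror the path construction quoted above; the
namespace is passed as plain string components rather than a
`BackupNamespace`, and the value of `LOCKS_DIR` is taken from the path in
the commit message.

```rust
use std::path::{Path, PathBuf};

// Assumption: base directory as given in the commit message.
const LOCKS_DIR: &str = "/run/proxmox-backup/locks";

/// Hypothetical stand-in for `ns_lock_path`: builds the lock target for a
/// namespace given as plain components, either as a dedicated `.ns-lock`
/// file or as the colon-encoded namespace directory itself.
fn ns_lock_target(store_name: &str, components: &[&str], dedicated_file: bool) -> PathBuf {
    // Colon-encode the namespace, e.g. ["a", "b"] -> "a:b".
    let ns_part = components.join(":");
    let dir = Path::new(LOCKS_DIR).join(store_name).join(ns_part);

    if dedicated_file {
        // Variant from the patch: directory plus a lock file inside it.
        dir.join(".ns-lock")
    } else {
        // Variant discussed off-list: lock the directory itself.
        dir
    }
}
```

With the directory variant, each namespace needs only the directory entry
on the tmpfs, not an additional file below it.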