From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 660F01FF136 for ; Mon, 04 May 2026 21:23:06 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 3412CFC59; Mon, 4 May 2026 21:23:05 +0200 (CEST) Message-ID: Date: Mon, 4 May 2026 21:22:57 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Beta Subject: Re: [PATCH proxmox-backup] api: backup: cleanup backup group created by benchmark To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= , Christian Ebner , pbs-devel@lists.proxmox.com References: <20260430135931.722979-1-c.ebner@proxmox.com> <1777881516.a5cvx4evyt.astroid@yuna.none> Content-Language: en-US From: Thomas Lamprecht In-Reply-To: <1777881516.a5cvx4evyt.astroid@yuna.none> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1777922472514 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.003 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: BTHAEVSN4CBAKA5ZVT2JIBNRYAVBIEPV X-Message-ID-Hash: BTHAEVSN4CBAKA5ZVT2JIBNRYAVBIEPV X-MailFrom: t.lamprecht@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox Backup Server development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Am 04.05.26 um 10:22 schrieb Fabian Grünbichler: >> diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs >> index 86ec49487..8848ca99c 100644 >> --- a/src/api2/backup/mod.rs >> +++ b/src/api2/backup/mod.rs >> @@ -288,9 +288,14 @@ fn upgrade_to_backup_protocol( >> if benchmark { >> env.log("benchmark finished successfully"); >> proxmox_async::runtime::block_in_place(|| { >> - env.datastore.remove_backup_dir( >> - env.backup_dir.backup_ns(), >> - env.backup_dir.as_ref(), >> + let datastore = env.datastore.clone(); >> + let namespace = env.backup_dir.backup_ns().clone(); >> + let snapshot = env.backup_dir.dir().clone(); >> + // draps all locks > > nit: `draps` ;) That I fixed. > >> + drop(env); >> + datastore.remove_backup_dir( >> + &namespace, >> + &snapshot, >> true, >> ) > > doesn't this also affect the "cleanup-on-error" paths a few lines below > this? > > dropping the full env is also a bit problematic because it opens up a > race condition if there are back-to-back benchmarks (or backups): > - benchmark starts > - benchmark is finished, drops env > - next benchmark starts and locks group and previous "snapshot" > - cleanup fails to obtain lock(s) and doesn't run > > for benchmarks that is not so bad, but for backups it would leave > half-written backup snapshots around (until they are cleaned up by other > means?). > > ideally, we would not drop the locks here but just run the cleanup using > the locks we already have, which is what "force" is doing. > > we currently only set force > - for the three calls here in the backup env > - when cleaning up a newly created snapshot as part of pull error > handling > > in all those case we are holding an exclusive lock on the group and on > the snapshot already. so we could just skip the group locking as well > when force is set? (ideally we'd find a way to actually encode this in > the signature, e.g. by replacing `force` with references to the lock > guards?) > Argh, missed your reply before pushing this out, sorry. Will address in a follow-up with your proposed approach - extend `force` to also skip the group lock and revert the env-drop on top, so all four force=true sites use the same approach. The signature refactor (lock-guard references instead of the bool) makes sense too, but skipping that for now to get the more minimal fix to the repo.