From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Thu, 6 Nov 2025 18:13:55 +0100
Message-ID: <20251106171358.865503-1-c.ebner@proxmox.com>
Subject: [pbs-devel] [PATCH proxmox-backup v2 0/3] fix GC atime update race window

When sweeping unused chunks, garbage collection checks each chunk's
atime to distinguish chunks that are in use from chunks that are no
longer used. Garbage collection locks the chunk store mutex before
reading file stats and deleting unused chunks, but the conditional
touch did not take that lock before updating a chunk's atime (which
also serves as the presence check).
Therefore, there is a race window: a chunk can be touched after garbage
collection has read its metadata but before it removes the chunk, so a
chunk that was just marked as in use is still deleted. The race is rare,
however: for it to happen, the chunk must be older than the cutoff time
and not be referenced by any index file, as otherwise its atime would
already have been updated during phase 1.

Fix this by taking the chunk store mutex before touching a chunk.

Also make sure that chunk marker inserts and atime updates on bad
chunks are performed in a locked context.

Changes since version 1 (thanks @Fabian for swiftly seeing the issues):
- Limit the helpers' scope for better encapsulation
- Make sure internal helpers do not try to lock the chunk store again
- Ensure the chunk store is locked for s3 local store cache marker file
  insertion and atime updates on bad chunks

Christian Ebner (3):
  chunk store: limit scope for atime update helper methods
  chunk store: fix race window between chunk stat and gc cleanup
  datastore: insert chunk marker and touch bad chunks in locked context

 pbs-datastore/src/chunk_store.rs | 48 +++++++++++++++++++++++++++-----
 pbs-datastore/src/datastore.rs   | 42 +++++++++++++++++-----------
 2 files changed, 66 insertions(+), 24 deletions(-)

-- 
2.47.3
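
For readers unfamiliar with the locking scheme, the idea behind the fix can be sketched with a toy, in-memory model. This is not the code from the patches: `ToyChunkStore`, `cond_touch`, `touch_locked` and `sweep` are hypothetical names, and a `HashMap` of digest to atime stands in for the chunk files on disk. It only illustrates that the touch and the sweep take the same mutex, and that an internal helper operating on already-locked state does not lock again.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Toy stand-in for a chunk store (hypothetical, not the PBS type):
/// maps a chunk digest to its atime in seconds. The mutex plays the
/// role of the chunk store mutex described in the cover letter.
struct ToyChunkStore {
    chunks: Mutex<HashMap<String, u64>>,
}

impl ToyChunkStore {
    fn new() -> Self {
        Self { chunks: Mutex::new(HashMap::new()) }
    }

    fn insert(&self, digest: &str, atime: u64) {
        self.chunks.lock().unwrap().insert(digest.to_string(), atime);
    }

    /// Internal helper: the caller already holds the lock, so this
    /// function works on the locked state and never locks again.
    fn touch_locked(chunks: &mut HashMap<String, u64>, digest: &str, now: u64) -> bool {
        match chunks.get_mut(digest) {
            Some(atime) => {
                *atime = now;
                true
            }
            None => false,
        }
    }

    /// Conditional touch: update the atime if the chunk exists and
    /// report its presence, all while holding the store mutex.
    fn cond_touch(&self, digest: &str, now: u64) -> bool {
        let mut chunks = self.chunks.lock().unwrap();
        Self::touch_locked(&mut chunks, digest, now)
    }

    /// Sweep: the atime check and the removal happen under one lock
    /// acquisition, so no touch can slip in between the two steps.
    /// Returns the number of removed chunks.
    fn sweep(&self, cutoff: u64) -> usize {
        let mut chunks = self.chunks.lock().unwrap();
        let before = chunks.len();
        chunks.retain(|_, atime| *atime >= cutoff);
        before - chunks.len()
    }
}
```

In this model, a chunk touched before the sweep has a fresh atime and survives, while a touch issued during the sweep blocks until the sweep has finished and then reports the chunk as missing instead of succeeding on a chunk that is about to be deleted.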