From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) by lore.proxmox.com (Postfix) with ESMTPS id F22D21FF183 for ; Wed, 5 Nov 2025 13:22:56 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 9FDDE225E3; Wed, 5 Nov 2025 13:23:36 +0100 (CET) From: Christian Ebner To: pbs-devel@lists.proxmox.com Date: Wed, 5 Nov 2025 13:22:10 +0100 Message-ID: <20251105122233.439382-1-c.ebner@proxmox.com> X-Mailer: git-send-email 2.47.3 MIME-Version: 1.0 X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1762345364001 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.046 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [proxmox.com, restore.rs, environment.rs, pull.rs, verify.rs, datastore.rs] Subject: [pbs-devel] [PATCH proxmox-backup v3 00/23] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Proxmox Backup Server development discussion Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pbs-devel-bounces@lists.proxmox.com Sender: "pbs-devel" These patches fix possible race conditions on datastores with s3 backend for chunk insert, renaming of corrupt chunks during verification and cleanup during garbage collection. Further, the patches assure consistency between the chunk marker file of the local datastore cache, the s3 object store and the in-memory LRU cache during state changes occurring by one of the above mentioned operations. Consistency is achieved by using a per-chunk file locking mechanism. File locks are stored on the predefined location for datastore file locks, using the same `.chunks/prefix/digest` folder layout for consistency and to keep readdir and other fs operations performant. Before introducing the file locking mechanism, the patches refactor pre-existing code to move most of the backend related logic away from the api code to the datastore implementation, in order to have a common interface especially for chunk insert. As part of the series it is now also assured that chunks which are removed from the local datastore cache, are also dropped from it's in-memory LRU cache and therefore a consistent state is achieved. Changes since version 2: - Incorporate additional race fix as discussed in https://lore.proxmox.com/pbs-devel/8ab74557-9592-43e7-8706-10fceaae31b7@proxmox.com/T/ and suggested offlist. Changes since version 1 (thanks @Fabian for review): - Fix lock inversion for rename corrup chunk. - Inline the chunk lock helper, making it explicit and thereby avoid calling the helper for regular datastores. - Pass the backend to the add_blob datastore helper, so it can be reused for the backup session and pull sync job. - Move also the s3 index upload helper from the backup env to the datastore, and reuse it for the sync job as well. This patch series obsoletes two previous patch series with unfortunately incomplete bugfix attempts found at: - https://lore.proxmox.com/pbs-devel/8d711a20-b193-47a9-8f38-6ce800e6d0e8@proxmox.com/T/ - https://lore.proxmox.com/pbs-devel/20251015164008.975591-1-c.ebner@proxmox.com/T/ proxmox-backup: Christian Ebner (23): sync: pull: instantiate backend only once per sync job api/datastore: move group notes setting to the datastore api/datastore: move snapshot deletion into dedicated datastore helper api/datastore: move backup log upload by implementing datastore helper api: backup: use datastore add_blob helper for backup session api/datastore: add dedicated datastore helper to set snapshot notes api/datastore: move s3 index upload helper to datastore backend datastore: refactor chunk insert based on backend verify: rename corrupted to corrupt in log output and function names verify/datastore: make rename corrupt chunk a datastore helper method datastore: refactor rename_corrupt_chunk error handling chunk store: implement per-chunk file locking helper for s3 backend datastore: acquire chunk store mutex lock when renaming corrupt chunk datastore: get per-chunk file lock for chunk rename on s3 backend fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache datastore: add locking to protect against races on chunk insert for s3 GC: fix race with chunk upload/insert on s3 backends GC: lock chunk marker before cleanup in phase 3 on s3 backends datastore: GC: drop overly verbose info message during s3 chunk sweep chunk store: reduce exposure of clear_chunk() to crate only chunk store: make chunk removal a helper method of the chunk store GC: fix deadlock for cache eviction and garbage collection chunk store: never fail when trying to remove missing chunk file pbs-datastore/src/backup_info.rs | 2 +- pbs-datastore/src/chunk_store.rs | 86 +++++- pbs-datastore/src/datastore.rs | 290 +++++++++++++++++- .../src/local_datastore_lru_cache.rs | 6 +- src/api2/admin/datastore.rs | 77 ++--- src/api2/backup/environment.rs | 53 +--- src/api2/backup/upload_chunk.rs | 64 +--- src/api2/tape/restore.rs | 6 +- src/backup/verify.rs | 83 +---- src/server/pull.rs | 61 ++-- 10 files changed, 442 insertions(+), 286 deletions(-) Summary over all repositories: 10 files changed, 442 insertions(+), 286 deletions(-) -- Generated by git-murpp 0.8.1 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel