From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Tue, 15 Apr 2025 16:49:30 +0200
Message-Id: <20250415144930.426824-1-c.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Subject: [pbs-devel] [PATCH v3 proxmox-backup] garbage collection: fix rare race in chunk marking phase
List-Id: Proxmox Backup Server development discussion <pbs-devel.lists.proxmox.com>
Reply-To: Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

During phase 1 of garbage collection, referenced chunks are marked as in
use by iterating over all index files and updating the atime of the
chunks they reference. For long running garbage collection jobs there is
an edge case: a newly added snapshot (created after the start of GC)
reuses known chunks from a previous snapshot, but the index of that
previous snapshot disappears before the marking phase can reach it (e.g.
it is pruned because the retention setting keeps only 1 snapshot). The
known chunks from that previous index file might then never get marked,
given that none of the other index files reference them.
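To make the marking step concrete, the following is a minimal standalone
sketch of the idea, not the datastore code itself: it assumes the usual
`.chunks/<4-hex-prefix>/<digest>` chunk store layout and uses the
`filetime` crate in place of the datastore's own atime helper; the store
path and digest are made up for illustration.

    use filetime::{set_file_atime, FileTime};
    use std::fs;
    use std::path::Path;

    /// Bump the atime of every chunk referenced by an index file, so the
    /// later sweep phase treats these chunks as still in use.
    fn mark_chunks(chunk_store: &Path, digests: &[&str]) -> std::io::Result<()> {
        let now = FileTime::now();
        for digest in digests {
            // Chunks live in a prefix directory named after the first hex
            // digits of their digest.
            let path = chunk_store.join(&digest[..4]).join(digest);
            set_file_atime(&path, now)?;
        }
        Ok(())
    }

    fn main() -> std::io::Result<()> {
        // Create one fake chunk so the example runs standalone.
        let store = Path::new("example-chunk-store");
        let digest = "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";
        fs::create_dir_all(store.join(&digest[..4]))?;
        fs::write(store.join(&digest[..4]).join(digest), b"chunk data")?;

        mark_chunks(store, &[digest])?;
        println!("marked 1 chunk");
        Ok(())
    }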
Since commit 74361da8 ("garbage collection: generate index file list via
datastore iterators") this is even less likely, as the iteration now also
picks up index files added during phase 1, so either the new or the
previous index file will account for these chunks (the previous backup
snapshot can only be pruned after the new one has finished, since it is
locked until then). There remains, however, a small race window between
listing the snapshots of a backup group and reading the actual index
files for marking.

Fix this race by:
1. checking whether the last snapshot of a group disappeared, and if so
2. generating the list again, looking for new index files previously not
   accounted for,
3. locking the group if the snapshot list changed even after the 10th
   retry, to avoid possible endless looping (which will cause concurrent
   operations on this group to fail).

A simplified, standalone sketch of this retry-then-lock pattern follows
after the patch.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- replace the needs_retry variable by labeled loop operations
- check if the lock could actually be acquired in the last-resort branch,
  fail otherwise
- include a catchall case for retry counter overrun
- catch the case where the snapshot listing fails because the group
  vanished, only fail if the group still exists on the filesystem

 pbs-datastore/src/datastore.rs | 116 +++++++++++++++++++++++----------
 1 file changed, 80 insertions(+), 36 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index aa38e2ac1..8ef5c6860 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1143,47 +1143,91 @@ impl DataStore {
             let namespace = namespace.context("iterating namespaces failed")?;
             for group in arc_self.iter_backup_groups(namespace)? {
                 let group = group.context("iterating backup groups failed")?;
-                let mut snapshots = group.list_backups().context("listing snapshots failed")?;
-                // Sort by snapshot timestamp to iterate over consecutive snapshots for each image.
-                BackupInfo::sort_list(&mut snapshots, true);
-                for snapshot in snapshots {
-                    for file in snapshot.files {
-                        worker.check_abort()?;
-                        worker.fail_on_shutdown()?;
-
-                        let mut path = snapshot.backup_dir.full_path();
-                        path.push(file);
-
-                        let index = match self.open_index_reader(&path)? {
-                            Some(index) => index,
-                            None => {
-                                unprocessed_index_list.remove(&path);
-                                continue;
+
+                // Avoid race between listing/marking of snapshots by GC and pruning the last
+                // snapshot in the group, following a new snapshot creation. Otherwise known chunks
+                // might only be referenced by the new snapshot, so it must be read as well.
+                let mut retry_counter = 0;
+                let mut processed_group_indices = HashSet::new();
+                'retry: loop {
+                    let _lock = match retry_counter {
+                        0..=9 => None,
+                        10 => Some(
+                            group
+                                .lock()
+                                .context("exhausted retries and failed to lock group")?,
+                        ),
+                        _ => bail!("exhausted retries and unexpected counter overrun"),
+                    };
+
+                    let mut snapshots = match group.list_backups() {
+                        Ok(snapshots) => snapshots,
+                        Err(err) => {
+                            if group.exists() {
+                                return Err(err).context("listing snapshots failed")?;
                             }
-                        };
-                        self.index_mark_used_chunks(
-                            index,
-                            &path,
-                            &mut chunk_lru_cache,
-                            status,
-                            worker,
-                        )?;
-
-                        if !unprocessed_index_list.remove(&path) {
-                            info!("Encountered new index file '{path:?}', increment total index file count");
-                            index_count += 1;
+                            break 'retry;
                         }
+                    };
+
+                    let snapshot_count = snapshots.len();
+                    BackupInfo::sort_list(&mut snapshots, true);
+                    for (count, snapshot) in snapshots.into_iter().enumerate() {
+                        for file in snapshot.files {
+                            worker.check_abort()?;
+                            worker.fail_on_shutdown()?;
+
+                            match ArchiveType::from_path(&file) {
+                                Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
+                                Ok(ArchiveType::Blob) | Err(_) => continue,
+                            };
+
+                            let mut path = snapshot.backup_dir.full_path();
+                            path.push(file);
+
+                            // Avoid reprocessing of already seen index files on retry
+                            if retry_counter > 0 && processed_group_indices.contains(&path) {
+                                continue;
+                            }
+
+                            let index = match self.open_index_reader(&path)? {
+                                Some(index) => index,
+                                None => {
+                                    unprocessed_index_list.remove(&path);
+                                    if count + 1 == snapshot_count {
+                                        retry_counter += 1;
+                                        continue 'retry;
+                                    }
+                                    continue;
+                                }
+                            };
+                            self.index_mark_used_chunks(
+                                index,
+                                &path,
+                                &mut chunk_lru_cache,
+                                status,
+                                worker,
+                            )?;
+                            processed_group_indices.insert(path.clone());
+
+                            if !unprocessed_index_list.remove(&path) {
+                                info!("Encountered new index file '{path:?}', increment total index file count");
+                                index_count += 1;
+                            }
 
-                        let percentage = (processed_index_files + 1) * 100 / index_count;
-                        if percentage > last_percentage {
-                            info!(
-                                "marked {percentage}% ({} of {index_count} index files)",
-                                processed_index_files + 1,
-                            );
-                            last_percentage = percentage;
+                            let percentage = (processed_index_files + 1) * 100 / index_count;
+                            if percentage > last_percentage {
+                                info!(
+                                    "marked {percentage}% ({} of {index_count} index files)",
+                                    processed_index_files + 1,
+                                );
+                                last_percentage = percentage;
+                            }
+                            processed_index_files += 1;
                         }
-                        processed_index_files += 1;
                     }
+
+                    break;
                 }
             }
         }
-- 
2.39.5
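Below is the simplified sketch of the retry-then-lock pattern referenced
above, reduced to its control flow. `Group`, `GroupGuard`,
`list_snapshots()` and the simulated vanishing index are made-up
stand-ins, not the proxmox-backup API; only the retry counter, the lock
taken as a last resort on the 10th retry and the labeled `continue
'retry` correspond to the patch.

    use std::collections::HashSet;
    use std::path::PathBuf;

    // Hypothetical stand-in for a backup group; not the proxmox-backup API.
    struct Group;
    // Guard that would block concurrent prune/backup operations while held.
    struct GroupGuard;

    impl Group {
        fn list_snapshots(&self) -> Result<Vec<PathBuf>, String> {
            Ok(vec![PathBuf::from("snapshot-1/drive.fidx")])
        }
        fn lock(&self) -> Result<GroupGuard, String> {
            Ok(GroupGuard)
        }
    }

    fn mark_group(group: &Group) -> Result<(), String> {
        let mut retry_counter = 0;
        // Index files already marked in a previous pass are skipped on retry.
        let mut processed: HashSet<PathBuf> = HashSet::new();

        'retry: loop {
            // After 10 passes where the last snapshot kept vanishing, take the
            // group lock so the snapshot list can no longer change underneath us.
            let _lock = match retry_counter {
                0..=9 => None,
                10 => Some(group.lock()?),
                _ => return Err("exhausted retries".into()),
            };

            let snapshots = group.list_snapshots()?;
            let last = snapshots.len().saturating_sub(1);

            for (i, path) in snapshots.into_iter().enumerate() {
                if processed.contains(&path) {
                    continue; // already marked in an earlier pass
                }
                // Simulate the index lookup: the last snapshot's index is gone
                // on the first pass, which forces one retry of the whole group.
                let index_exists = !(retry_counter == 0 && i == last);
                if !index_exists {
                    if i == last {
                        retry_counter += 1;
                        continue 'retry; // last snapshot vanished: re-list the group
                    }
                    continue;
                }
                // ... mark all chunks referenced by this index ...
                processed.insert(path);
            }

            break; // every index file of the group has been marked
        }
        Ok(())
    }

    fn main() {
        mark_group(&Group).expect("marking failed");
        println!("group marked");
    }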