From: Fabian Grünbichler <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
Date: Thu, 17 Apr 2025 11:29:30 +0200
Subject: Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash
Message-Id: <1744881430.ds1jf5x2pr.astroid@yuna.none>
In-Reply-To: <20250416141803.479125-5-c.ebner@proxmox.com>
References: <20250416141803.479125-1-c.ebner@proxmox.com> <20250416141803.479125-5-c.ebner@proxmox.com>

On April 16, 2025 4:18 pm, Christian Ebner wrote:
> Snapshots pruned during phase 1 are now also assured to be included
> in the marking phase by reading the index files from trash and
> touching these chunks as well.
>
> Clear any trash before starting with phase 1, so that only snapshots
> pruned during GC are considered.
>
> Further, drop the retry logic used before to assure eventually newly
> written index files are included in the marking phase, if the
> previously last one was pruned. This is not necessary anymore as the
> previously last one will be read from trash.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------
>  1 file changed, 65 insertions(+), 76 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 97b78f000..688e65247 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1137,7 +1137,13 @@ impl DataStore {
>          //
>          // By this it is assured that all index files are used, even if they would not have been
>          // seen by the regular logic and the user is informed by the garbage collection run about
> -        // the detected index files not following the iterators logic.
> +        // the detected index files not following the iterators logic. Further, include trashed
> +        // snapshots which have been pruned during garbage collections marking phase.
> +
> +        let trash = PathBuf::from(".trash/");
> +        let mut full_trash_path = self.base_path();
> +        full_trash_path.push(&trash);
> +        let _ = std::fs::remove_dir_all(full_trash_path);

I think this would need some locking, else we start recursively deleting
here while, at the same time, a prune operation is moving something into
the trash (rough sketch of one possible approach at the end of this mail).

>
>          let mut unprocessed_index_list = self.list_index_files(None)?;
>          let mut index_count = unprocessed_index_list.len();
> @@ -1154,88 +1160,63 @@ impl DataStore {
>              let namespace = namespace.context("iterating namespaces failed")?;
>              for group in arc_self.iter_backup_groups(namespace)? {
>                  let group = group.context("iterating backup groups failed")?;
> +                let mut snapshots = match group.list_backups() {
> +                    Ok(snapshots) => snapshots,
> +                    Err(err) => {
> +                        if group.exists() {
> +                            return Err(err).context("listing snapshots failed")?;
> +                        }
> +                        // vanished, will be covered by trashed list below to avoid
> +                        // not touching known chunks.
> +                        continue;
> +                    }
> +                };
>
> -                // Avoid race between listing/marking of snapshots by GC and pruning the last
> -                // snapshot in the group, following a new snapshot creation. Otherwise known chunks
> -                // might only be referenced by the new snapshot, so it must be read as well.
> -                let mut retry_counter = 0;
> -                'retry: loop {
> -                    let _lock = match retry_counter {
> -                        0..=9 => None,
> -                        10 => Some(
> -                            group
> -                                .lock()
> -                                .context("exhausted retries and failed to lock group")?,
> -                        ),
> -                        _ => bail!("exhausted retries and unexpected counter overrun"),
> -                    };
> -
> -                    let mut snapshots = match group.list_backups() {
> -                        Ok(snapshots) => snapshots,
> -                        Err(err) => {
> -                            if group.exists() {
> -                                return Err(err).context("listing snapshots failed")?;
> -                            }
> -                            break 'retry;
> +                BackupInfo::sort_list(&mut snapshots, true);
> +                for snapshot in snapshots.into_iter() {
> +                    for file in snapshot.files {
> +                        worker.check_abort()?;
> +                        worker.fail_on_shutdown()?;
> +
> +                        match ArchiveType::from_path(&file) {
> +                            Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
> +                            Ok(ArchiveType::Blob) | Err(_) => continue,
>                          }
> -                    };
> -
> -                    // Always start iteration with the last snapshot of the group to reduce race
> -                    // window with concurrent backup+prune previous last snapshot. Allows to retry
> -                    // without the need to keep track of already processed index files for the
> -                    // current group.
> -                    BackupInfo::sort_list(&mut snapshots, true);
> -                    for (count, snapshot) in snapshots.into_iter().rev().enumerate() {
> -                        for file in snapshot.files {
> -                            worker.check_abort()?;
> -                            worker.fail_on_shutdown()?;
> -
> -                            match ArchiveType::from_path(&file) {
> -                                Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
> -                                Ok(ArchiveType::Blob) | Err(_) => continue,
> -                            }
>
> -                            let mut path = snapshot.backup_dir.full_path();
> -                            path.push(file);
> -
> -                            let index = match self.open_index_reader(&path)? {
> -                                Some(index) => index,
> -                                None => {
> -                                    unprocessed_index_list.remove(&path);
> -                                    if count == 0 {
> -                                        retry_counter += 1;
> -                                        continue 'retry;
> -                                    }
> -                                    continue;
> -                                }
> -                            };
> -
> -                            self.index_mark_used_chunks(
> -                                index,
> -                                &path,
> -                                &mut chunk_lru_cache,
> -                                status,
> -                                worker,
> -                            )?;
> -
> -                            if !unprocessed_index_list.remove(&path) {
> -                                info!("Encountered new index file '{path:?}', increment total index file count");
> -                                index_count += 1;
> -                            }
> +                        let mut path = snapshot.backup_dir.full_path();
> +                        path.push(file);
>
> -                            let percentage = (processed_index_files + 1) * 100 / index_count;
> -                            if percentage > last_percentage {
> -                                info!(
> -                                    "marked {percentage}% ({} of {index_count} index files)",
> -                                    processed_index_files + 1,
> -                                );
> -                                last_percentage = percentage;
> +                        let index = match self.open_index_reader(&path)? {
> +                            Some(index) => index,
> +                            None => {
> +                                unprocessed_index_list.remove(&path);
> +                                continue;
>                             }
> -                            processed_index_files += 1;
> +                        };
> +
> +                        self.index_mark_used_chunks(
> +                            index,
> +                            &path,
> +                            &mut chunk_lru_cache,
> +                            status,
> +                            worker,
> +                        )?;
> +
> +                        if !unprocessed_index_list.remove(&path) {
> +                            info!("Encountered new index file '{path:?}', increment total index file count");
> +                            index_count += 1;
>                         }
> -                    }
>
> -                    break;
> +                        let percentage = (processed_index_files + 1) * 100 / index_count;
> +                        if percentage > last_percentage {
> +                            info!(
> +                                "marked {percentage}% ({} of {index_count} index files)",
> +                                processed_index_files + 1,
> +                            );
> +                            last_percentage = percentage;
> +                        }
> +                        processed_index_files += 1;
> +                    }
>                 }
>             }
>         }
> @@ -1257,6 +1238,14 @@ impl DataStore {
>             warn!("Found {strange_paths_count} index files outside of expected directory scheme");
>         }
>
> +        let trashed_index_list = self.list_index_files(Some(trash))?;
> +        for path in trashed_index_list {
> +            if let Some(index) = self.open_index_reader(&path)? {
> +                info!("Mark chunks for pruned index file found in {path:?}");
> +                self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
> +            };
> +        }
> +

I think we'd want to support undoing moving a snapshot to trash, if we
have a trash can (that's the other half of wanting this feature after
all). If we do so, we need to be careful not to re-introduce a race
here, e.g. by keeping a copy in the trash can when undoing, or by using
a trash can mechanism that doesn't require separate iteration over
regular and trashed snapshots (see the second sketch at the end of this
mail).

>         Ok(())
>     }
>
> --
> 2.39.5
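
To illustrate the locking I have in mind, a rough and untested sketch.
The `.trash.lock` file name, the `clear_trash_locked` helper and the use
of the fs2 crate are placeholders for whatever mechanism we'd actually
pick, not existing PBS APIs:

use std::fs::OpenOptions;
use std::path::Path;

use anyhow::Error;
use fs2::FileExt;

// hypothetical lock file next to the datastore's .trash/ directory
const TRASH_LOCK_FILE: &str = ".trash.lock";

/// Remove the trash directory while holding an exclusive lock, so that a
/// concurrent prune (which holds the same lock shared while renaming a
/// snapshot into the trash) can never race against the recursive delete.
fn clear_trash_locked(base: &Path) -> Result<(), Error> {
    let lock = OpenOptions::new()
        .create(true)
        .write(true)
        .open(base.join(TRASH_LOCK_FILE))?;
    // blocks until all shared holders (prune operations) are done
    lock.lock_exclusive()?;

    let _ = std::fs::remove_dir_all(base.join(".trash/"));

    lock.unlock()?;
    Ok(())
}

The prune side would take the same lock shared (lock_shared()) around the
rename into the trash, so concurrent prunes still don't serialize against
each other, only against GC's trash clearing.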
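And for the undo direction, a sketch of the "keep a copy in the trash
when undoing" variant mentioned above. The `undo_trash` name and the
assumption that snapshot directories stay flat are for illustration
only:

use std::fs;
use std::path::Path;

use anyhow::{bail, Error};

/// Restore a trashed snapshot by hard-linking its files back into place
/// instead of renaming the directory. The trashed copy stays behind
/// until the next GC clears it, so the GC marking phase never loses
/// sight of these index files.
fn undo_trash(trashed: &Path, restore_to: &Path) -> Result<(), Error> {
    if restore_to.exists() {
        bail!("snapshot {restore_to:?} exists, refusing to overwrite");
    }
    fs::create_dir_all(restore_to)?;
    for entry in fs::read_dir(trashed)? {
        let entry = entry?;
        // snapshot directories are flat (indexes, blobs, manifest), so
        // no recursion should be needed here
        fs::hard_link(entry.path(), restore_to.join(entry.file_name()))?;
    }
    Ok(())
}

Because the index files stay linked in the trash until GC clears it, a
GC run that already listed the trash still reads them, and restoring a
snapshot can't reopen the race window.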