* [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots @ 2025-04-16 14:17 Christian Ebner 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner ` (4 more replies) 0 siblings, 5 replies; 21+ messages in thread From: Christian Ebner @ 2025-04-16 14:17 UTC (permalink / raw) To: pbs-devel In an effort to simplify the GC phase 1 logic introduced by commit cb9814e3 ("garbage collection: fix rare race in chunk marking phase"), this patch series implements trash can functionality for snapshots. The main intention is to allow index files of snapshots pruned while phase 1 of garbage collection is ongoing to still be read and their chunks marked as in use. This allows getting rid of the currently implemented, rather complex retry loop logic, which could in theory lead to failing GC or backups when trying to lock the whole group exclusively after the 10th retry. To achieve this, pruning a snapshot does not remove it immediately, but rather moves it to a `.trash` subfolder in the datastore's base directory. This directory is then cleared before GC phase 1 starts, meaning that any index file can be restored until the next GC run. This however comes with its own set of issues, therefore these patches are sent as RFC for now. Open questions and known limitations are: - Pruning does not free up any space; on the contrary, it might require additional space on COW filesystems. Should there be a flag to bypass the trash, also given that sometimes users truly want to remove a snapshot immediately? Although that would re-introduce the issue with new snapshot creation and concurrent GC on a last snapshot. - Prune + sync + prune might lead to the same snapshot being pruned multiple times; currently any second prune on a snapshot would fail. Should this overwrite the trashed snapshot? - GC might now read the same index twice: once before it was pruned (if the prune happens while phase 1 is still ongoing) and a second time when read from the trash. Not really an issue, but rather a limitation. - Further issues I'm currently overlooking Christian Ebner (4): datastore: always skip over base directory when listing index files datastore: allow to specify sub-directory for index file listing datastore: move snapshots to trash folder on destroy garbage collection: read pruned snapshot index files from trash pbs-datastore/src/backup_info.rs | 14 ++- pbs-datastore/src/datastore.rs | 158 +++++++++++++++---------------- 2 files changed, 89 insertions(+), 83 deletions(-) -- 2.39.5
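As a quick illustration of the layout the cover letter describes, a pruned snapshot directory would simply be re-parented below the datastore's hidden `.trash` folder, keeping its relative path so its index files stay easy to locate until the next GC run. The paths and the helper below are purely hypothetical and not taken from the patches:

    use std::path::{Path, PathBuf};

    /// Map a snapshot directory, given relative to the datastore base, to its
    /// location below the hidden `.trash` subfolder (illustrative only).
    fn trash_path(base: &Path, relative_snapshot_path: &Path) -> PathBuf {
        let mut trash = base.to_path_buf();
        trash.push(".trash");
        trash.push(relative_snapshot_path);
        trash
    }

    fn main() {
        let base = Path::new("/datastore/store1");
        let rel = Path::new("ns/dev/vm/100/2025-04-16T14:17:00Z");
        // prints "/datastore/store1/.trash/ns/dev/vm/100/2025-04-16T14:17:00Z"
        println!("{}", trash_path(base, rel).display());
    }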
* [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files 2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner @ 2025-04-16 14:18 ` Christian Ebner 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner ` (3 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-16 14:18 UTC (permalink / raw) To: pbs-devel The base is a directory and not an index file anyway, so there is no need to apply the filtering and archive type matching on it. Further, this allows using a hidden folder as base, which is otherwise not possible since it would be excluded by the filtering; this will be useful when listing index files in a `.trash` folder. No functional change intended. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- pbs-datastore/src/datastore.rs | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs index 309137e00..3fde8b871 100644 --- a/pbs-datastore/src/datastore.rs +++ b/pbs-datastore/src/datastore.rs @@ -979,7 +979,9 @@ impl DataStore { use walkdir::WalkDir; - let walker = WalkDir::new(base).into_iter(); + let mut walker = WalkDir::new(base).into_iter(); + // always ignore the base directory itself, so a hidden folder can be used as base as well + walker.next(); // make sure we skip .chunks (and other hidden files to keep it simple) fn is_hidden(entry: &walkdir::DirEntry) -> bool { -- 2.39.5
* Re: [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner @ 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-17 10:27 ` Christian Ebner 0 siblings, 1 reply; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-17 9:29 UTC (permalink / raw) To: Proxmox Backup Server development discussion the commit subject could maybe indicate somehow that this is for the GC-specific helper, and not some generic index listing code ;) On April 16, 2025 4:18 pm, Christian Ebner wrote: > The base is a directory and not an index file anyways, so there is no > need to apply the filtering and archive type matching on it. > Further, this will allow to use a hidden folder as base, otherwise > not possible as excluded by the filtering, which will be useful when > listing index files in a `.trash` folder. > > No functional change intended. > > Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > --- > pbs-datastore/src/datastore.rs | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs > index 309137e00..3fde8b871 100644 > --- a/pbs-datastore/src/datastore.rs > +++ b/pbs-datastore/src/datastore.rs > @@ -979,7 +979,9 @@ impl DataStore { > > use walkdir::WalkDir; > > - let walker = WalkDir::new(base).into_iter(); > + let mut walker = WalkDir::new(base).into_iter(); > + // always ignore the base directory itself, so a hidden folder can be used as base as well > + walker.next(); this should check for errors? > > // make sure we skip .chunks (and other hidden files to keep it simple) > fn is_hidden(entry: &walkdir::DirEntry) -> bool { > -- > 2.39.5 > > > > _______________________________________________ > pbs-devel mailing list > pbs-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel > > > _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files 2025-04-17 9:29 ` Fabian Grünbichler @ 2025-04-17 10:27 ` Christian Ebner 0 siblings, 0 replies; 21+ messages in thread From: Christian Ebner @ 2025-04-17 10:27 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Fabian Grünbichler On 4/17/25 11:29, Fabian Grünbichler wrote: > the commit subject could maybe indicate somehow that this is for the > GC-specific helper, and not some generic index listing code ;) > > On April 16, 2025 4:18 pm, Christian Ebner wrote: >> The base is a directory and not an index file anyways, so there is no >> need to apply the filtering and archive type matching on it. >> Further, this will allow to use a hidden folder as base, otherwise >> not possible as excluded by the filtering, which will be useful when >> listing index files in a `.trash` folder. >> >> No functional change intended. >> >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> >> --- >> pbs-datastore/src/datastore.rs | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs >> index 309137e00..3fde8b871 100644 >> --- a/pbs-datastore/src/datastore.rs >> +++ b/pbs-datastore/src/datastore.rs >> @@ -979,7 +979,9 @@ impl DataStore { >> >> use walkdir::WalkDir; >> >> - let walker = WalkDir::new(base).into_iter(); >> + let mut walker = WalkDir::new(base).into_iter(); >> + // always ignore the base directory itself, so a hidden folder can be used as base as well >> + walker.next(); > > this should check for errors? Agreed, makes sense as otherwise the index file listing might return without error and an empty list if e.g. the directory cannot be read for some reason. Will adapt accordingly, thanks! _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
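One possible shape for the error check agreed on above, simplified and written outside the real DataStore helper, so the function name and the use of anyhow here are assumptions for illustration only:

    use std::path::Path;

    use anyhow::{Context, Error};
    use walkdir::WalkDir;

    // Skip the base directory entry itself, but still surface an error if even
    // that first entry cannot be read (e.g. permission problems on the base).
    fn walker_skipping_base(base: &Path) -> Result<walkdir::IntoIter, Error> {
        let mut walker = WalkDir::new(base).into_iter();
        if let Some(first) = walker.next() {
            first.with_context(|| format!("failed to read base directory {base:?}"))?;
        }
        Ok(walker)
    }

This way the listing cannot silently return an empty set just because the base directory itself was unreadable.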
* [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing 2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner @ 2025-04-16 14:18 ` Christian Ebner 2025-04-18 9:38 ` Thomas Lamprecht 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner ` (2 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-16 14:18 UTC (permalink / raw) To: pbs-devel Extend the list_index_files helper to allow passing a sub-directory to it. This will be used to fetch only index files from the `.trash` directory, ignored by regular listing since hidden folders are not considered. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- pbs-datastore/src/datastore.rs | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs index 3fde8b871..97b78f000 100644 --- a/pbs-datastore/src/datastore.rs +++ b/pbs-datastore/src/datastore.rs @@ -972,8 +972,15 @@ impl DataStore { /// The filesystem is walked recursevly to detect index files based on their archive type based /// on the filename. This however excludes the chunks folder, hidden files and does not follow /// symlinks. - fn list_index_files(&self) -> Result<HashSet<PathBuf>, Error> { - let base = self.base_path(); + /// If a subdirectory is provided, only that is scanned for index files. + fn list_index_files(&self, subdir: Option<PathBuf>) -> Result<HashSet<PathBuf>, Error> { + let mut base = self.base_path(); + if let Some(subdir) = subdir { + base.push(subdir); + if !base.exists() { + return Ok(HashSet::new()); + } + } let mut list = HashSet::new(); @@ -1132,7 +1139,7 @@ impl DataStore { // seen by the regular logic and the user is informed by the garbage collection run about // the detected index files not following the iterators logic. - let mut unprocessed_index_list = self.list_index_files()?; + let mut unprocessed_index_list = self.list_index_files(None)?; let mut index_count = unprocessed_index_list.len(); let mut chunk_lru_cache = LruCache::new(cache_capacity); -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner @ 2025-04-18 9:38 ` Thomas Lamprecht 2025-04-18 9:55 ` Christian Ebner 0 siblings, 1 reply; 21+ messages in thread From: Thomas Lamprecht @ 2025-04-18 9:38 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Christian Ebner On 16.04.25 at 16:18, Christian Ebner wrote: > Extend the list_index_files helper to allow passing a sub-directory > to it. This will be used to fetch only index files from the `.trash` > directory, ignored by regular listing since hidden folders are not > considered. > > Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > --- > pbs-datastore/src/datastore.rs | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs > index 3fde8b871..97b78f000 100644 > --- a/pbs-datastore/src/datastore.rs > +++ b/pbs-datastore/src/datastore.rs > @@ -972,8 +972,15 @@ impl DataStore { > /// The filesystem is walked recursevly to detect index files based on their archive type based > /// on the filename. This however excludes the chunks folder, hidden files and does not follow > /// symlinks. > - fn list_index_files(&self) -> Result<HashSet<PathBuf>, Error> { > - let base = self.base_path(); > + /// If a subdirectory is provided, only that is scanned for index files. > + fn list_index_files(&self, subdir: Option<PathBuf>) -> Result<HashSet<PathBuf>, Error> { it might be a tiny bit nicer to add a new list_index_files_from_path that shares the implementation with this one here. While it's not pub and there is only one use site, it's still minimally nicer to have this encoded in the method name, at least when reading the call site and wondering what None means here. > + let mut base = self.base_path(); > + if let Some(subdir) = subdir { > + base.push(subdir); > + if !base.exists() { > + return Ok(HashSet::new()); > + } > + } > > let mut list = HashSet::new(); > > @@ -1132,7 +1139,7 @@ impl DataStore { > // seen by the regular logic and the user is informed by the garbage collection run about > // the detected index files not following the iterators logic. > > - let mut unprocessed_index_list = self.list_index_files()?; > + let mut unprocessed_index_list = self.list_index_files(None)?; > let mut index_count = unprocessed_index_list.len(); > > let mut chunk_lru_cache = LruCache::new(cache_capacity);
* Re: [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing 2025-04-18 9:38 ` Thomas Lamprecht @ 2025-04-18 9:55 ` Christian Ebner 0 siblings, 0 replies; 21+ messages in thread From: Christian Ebner @ 2025-04-18 9:55 UTC (permalink / raw) To: Thomas Lamprecht, Proxmox Backup Server development discussion On 4/18/25 11:38, Thomas Lamprecht wrote: > Am 16.04.25 um 16:18 schrieb Christian Ebner: >> Extend the list_index_files helper to allow passing a sub-directory >> to it. This will be used to fetch only index files from the `.trash` >> directory, ignored by regular listing since hidden folders are not >> considered. >> >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> >> --- >> pbs-datastore/src/datastore.rs | 13 ++++++++++--- >> 1 file changed, 10 insertions(+), 3 deletions(-) >> >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs >> index 3fde8b871..97b78f000 100644 >> --- a/pbs-datastore/src/datastore.rs >> +++ b/pbs-datastore/src/datastore.rs >> @@ -972,8 +972,15 @@ impl DataStore { >> /// The filesystem is walked recursevly to detect index files based on their archive type based >> /// on the filename. This however excludes the chunks folder, hidden files and does not follow >> /// symlinks. >> - fn list_index_files(&self) -> Result<HashSet<PathBuf>, Error> { >> - let base = self.base_path(); >> + /// If a subdirectory is provided, only that is scanned for index files. >> + fn list_index_files(&self, subdir: Option<PathBuf>) -> Result<HashSet<PathBuf>, Error> { > > it might be a tiny bit nicer to add a new list_index_files_from_path > that shares the implementation with this one here. While it's not pub > and there is only one use size, it's still minimally nicer to have this > encoded in the method name, at least when reading the call site and > wondering what None means here. Okay, will adapt this accordingly, thanks! _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
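Roughly what the split suggested by Thomas could look like; the struct is stubbed out and the signatures are simplified, so treat this purely as a sketch of the naming, not the actual DataStore code:

    use std::collections::HashSet;
    use std::path::{Path, PathBuf};

    use anyhow::Error;

    struct DataStore;

    impl DataStore {
        fn base_path(&self) -> PathBuf {
            PathBuf::from("/datastore/store1")
        }

        /// Existing entry point: scan the whole datastore base directory.
        fn list_index_files(&self) -> Result<HashSet<PathBuf>, Error> {
            self.list_index_files_impl(self.base_path())
        }

        /// New, explicitly named entry point, e.g. for the `.trash` subfolder.
        fn list_index_files_from_path(&self, subdir: &Path) -> Result<HashSet<PathBuf>, Error> {
            self.list_index_files_impl(self.base_path().join(subdir))
        }

        /// Shared implementation; the real helper walks `base` with walkdir.
        fn list_index_files_impl(&self, base: PathBuf) -> Result<HashSet<PathBuf>, Error> {
            if !base.exists() {
                return Ok(HashSet::new());
            }
            // ... walkdir-based scan filtering for index file extensions ...
            Ok(HashSet::new())
        }
    }

With that split, call sites read as either a plain datastore-wide listing or an explicit listing below a given path, without an opaque None argument.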
* [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner ` (2 preceding siblings ...) 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner @ 2025-04-16 14:18 ` Christian Ebner 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner ` (2 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-16 14:18 UTC (permalink / raw) To: pbs-devel Instead of directly deleting the snapshot directory and its contents on a prune, move the snapshot directory into the `.trash` subfolder of the datastore. This allows marking chunks which were used by these index files if the snapshot was pruned during an ongoing garbage collection. Garbage collection will clean up these files before starting marking phase 1 and read all index files after completing that phase, touching these chunks as well. Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- pbs-datastore/src/backup_info.rs | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs index d4732fdd9..cd0d521c9 100644 --- a/pbs-datastore/src/backup_info.rs +++ b/pbs-datastore/src/backup_info.rs @@ -588,11 +588,19 @@ impl BackupDir { bail!("cannot remove protected snapshot"); // use special error type? } + let relative_path = self.relative_path(); + let mut trash_path = self.store.base_path(); + trash_path.push(".trash/"); + trash_path.push(relative_path); + if let Some(parent) = trash_path.parent() { + std::fs::create_dir_all(&parent) + .with_context(|| format!("creating trash folders for {trash_path:?} failed"))?; + } + let full_path = self.full_path(); log::info!("removing backup snapshot {:?}", full_path); - std::fs::remove_dir_all(&full_path).map_err(|err| { - format_err!("removing backup snapshot {:?} failed - {}", full_path, err,) - })?; + std::fs::rename(&full_path, trash_path) + .with_context(|| format!("moving backup snapshot {full_path:?} to trash failed"))?; // remove no longer needed lock files let _ = std::fs::remove_file(self.manifest_lock_path()); // ignore errors -- 2.39.5
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner @ 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-18 11:06 ` Thomas Lamprecht 0 siblings, 1 reply; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-17 9:29 UTC (permalink / raw) To: Proxmox Backup Server development discussion On April 16, 2025 4:18 pm, Christian Ebner wrote: > Instead of directly deleting the snapshot directory and it's contents > on a prune, move the snapshot directory into the `.trash` subfolder > of the datastore. > > This allows to mark chunks which were used by these index files if > the snapshot was pruned during an ongoing garbage collection. > Garbage collection will clean up these files before starting with the > marking phase 1 and read all index files after completing that phase, > touching these chunks as well. some other variants to maybe consider: marking the snapshot itself as trash (in the manifest, or by adding a trash marker file inside the dir) - this would mean that there is no iterator race issue when undoing a prune, no double-pruning collisions, .. - but it also means we need to adapt all call sites that should skip trashed snapshots (most existing ones), which is more churn. having a trash dir per group instead of a global one for the whole datastore (less likely to incur extra costs in case somebody has a weird setup where namespaces/.. are symlinked or bindmounted or similar shenanigans). would need to postpone group removal to GC in case all snapshots are pruned. > > Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> > Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > --- > pbs-datastore/src/backup_info.rs | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs > index d4732fdd9..cd0d521c9 100644 > --- a/pbs-datastore/src/backup_info.rs > +++ b/pbs-datastore/src/backup_info.rs > @@ -588,11 +588,19 @@ impl BackupDir { > bail!("cannot remove protected snapshot"); // use special error type? > } > > + let relative_path = self.relative_path(); > + let mut trash_path = self.store.base_path(); > + trash_path.push(".trash/"); > + trash_path.push(relative_path); > + if let Some(parent) = trash_path.parent() { > + std::fs::create_dir_all(&parent) > + .with_context(|| format!("creating trash folders for {trash_path:?} failed"))?; > + } > + > let full_path = self.full_path(); > log::info!("removing backup snapshot {:?}", full_path); > - std::fs::remove_dir_all(&full_path).map_err(|err| { > - format_err!("removing backup snapshot {:?} failed - {}", full_path, err,) > - })?; > + std::fs::rename(&full_path, trash_path) > + .with_context(|| format!("moving backup snapshot {full_path:?} to trash failed"))?; > > // remove no longer needed lock files > let _ = std::fs::remove_file(self.manifest_lock_path()); // ignore errors > -- > 2.39.5 > > > > _______________________________________________ > pbs-devel mailing list > pbs-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel > > > _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
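For the second variant mentioned above (a trash directory per group rather than one per datastore), the move would stay inside the group directory; a minimal sketch, with layout and names assumed for illustration only:

    use std::path::{Path, PathBuf};

    /// Hypothetical per-group trash location: pruned snapshots stay below the
    /// group directory, so the rename never crosses a namespace (and thus never
    /// a potential bind-mount or symlink) boundary.
    fn group_trash_path(group_dir: &Path, snapshot_name: &str) -> PathBuf {
        group_dir.join(".trash").join(snapshot_name)
    }

    fn main() {
        let group = Path::new("/datastore/store1/ns/dev/vm/100");
        // prints "/datastore/store1/ns/dev/vm/100/.trash/2025-04-16T14:17:00Z"
        println!("{}", group_trash_path(group, "2025-04-16T14:17:00Z").display());
    }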
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-17 9:29 ` Fabian Grünbichler @ 2025-04-18 11:06 ` Thomas Lamprecht 2025-04-18 11:49 ` Christian Ebner 2025-04-18 11:51 ` Fabian Grünbichler 0 siblings, 2 replies; 21+ messages in thread From: Thomas Lamprecht @ 2025-04-18 11:06 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Fabian Grünbichler On 17.04.25 at 11:29, Fabian Grünbichler wrote: > On April 16, 2025 4:18 pm, Christian Ebner wrote: >> Instead of directly deleting the snapshot directory and it's contents >> on a prune, move the snapshot directory into the `.trash` subfolder >> of the datastore. >> >> This allows to mark chunks which were used by these index files if >> the snapshot was pruned during an ongoing garbage collection. >> Garbage collection will clean up these files before starting with the >> marking phase 1 and read all index files after completing that phase, >> touching these chunks as well. > > some other variants to maybe consider: > > marking the snapshot itself as trash (in the manifest, or by adding a > trash marker file inside the dir) - this would mean that there is no > iterator race issue when undoing a prune, no double-pruning collisions, > .. - but it also means we need to adapt all call sites that should skip > trashed snapshots (most existing ones), which is more churn. Shouldn't we use the central iterator implementations to query indexes? I.e., any call site that doesn't should probably be switched over to those, just like Chris did for GC recently. Then it could be defined if trashed indexes should be skipped – the default – or included when instantiating that iterator, e.g. through a parameter or probably a dedicated "with_trash" fn – but that's details. > > having a trash dir per group instead of a global one for the whole > datastore (less likely to incur extra costs in case somebody has a weird > setup where namespaces/.. are symlinked or bindmounted or similar > shenanigans). would need to postpone group removal to GC in case all > snapshots are pruned. Both of those variants would make restore simpler too. IMO it depends a bit on whether we already read the manifest everywhere the info is needed; in that case I'd slightly favor that as the place to store whether a backup is trash or not, as it would avoid the need for another directory or file to store (inode exhaustion) and manage.
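The opt-in iterator Thomas sketches might end up looking something like this; the type and method names are made up here to show the idea, not an existing API:

    #[derive(Clone, Copy, PartialEq, Eq)]
    enum ListMode {
        /// Default for all existing call sites: trashed snapshots are invisible.
        SkipTrash,
        /// Opt-in, e.g. for GC marking or a future "restore from trash" view.
        IncludeTrash,
    }

    struct SnapshotIter {
        mode: ListMode,
        // ... directory walking state ...
    }

    impl SnapshotIter {
        fn new() -> Self {
            SnapshotIter { mode: ListMode::SkipTrash }
        }

        /// Builder-style switch to also yield trashed snapshots.
        fn with_trash(mut self) -> Self {
            self.mode = ListMode::IncludeTrash;
            self
        }
    }

Existing call sites would keep using SnapshotIter::new() and therefore never see trashed snapshots unless they explicitly opt in via with_trash().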
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-18 11:06 ` Thomas Lamprecht @ 2025-04-18 11:49 ` Christian Ebner 2025-04-18 12:03 ` Fabian Grünbichler 0 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-18 11:49 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Thomas Lamprecht, Fabian Grünbichler On 4/18/25 13:06, Thomas Lamprecht wrote: > Am 17.04.25 um 11:29 schrieb Fabian Grünbichler: >> On April 16, 2025 4:18 pm, Christian Ebner wrote: >>> Instead of directly deleting the snapshot directory and it's contents >>> on a prune, move the snapshot directory into the `.trash` subfolder >>> of the datastore. >>> >>> This allows to mark chunks which were used by these index files if >>> the snapshot was pruned during an ongoing garbage collection. >>> Garbage collection will clean up these files before starting with the >>> marking phase 1 and read all index files after completing that phase, >>> touching these chunks as well. >> >> some other variants to maybe consider: >> >> marking the snapshot itself as trash (in the manifest, or by adding a >> trash marker file inside the dir) - this would mean that there is no >> iterator race issue when undoing a prune, no double-pruning collisions, >> .. - but it also means we need to adapt all call sites that should skip >> trashed snapshots (most existing ones), which is more churn. > > Shouldn't we use the central iterators implementations to query indexes? Yes, correct me if I'm wrong (I have not checked all call sites yet), but index files are mostly accessed by going through the manifest, either via BackupManifest::files or at least verifying it via BackupManifest::verify_file, as that's also where encryption and verification state are stored. So adding a label to store a trashed state there would work out just fine, and filtering these snapshots for listing, sync jobs, etc. is then fine as well. Also, fetching the previous backup snapshot for fast incremental mode will work, although it requires additional filtering. However, I'm a bit concerned about performance for the content listing if we keep and iterate all of the pruned snapshots. After all they will persist until the next GC, which could lead to a lot of accumulated snapshots. One further issue I see with that approach is again sync jobs, which now do not see the trashed snapshot on the target and try to re-sync it? Or would we include that information for the sync jobs to skip over? It would be a bit strange however if the snapshot is not trashed on the source side. Also, thinking about UI to recover from trash: might it be good to still show the snapshots while listing, but marked with an icon, just like for e.g. the encryption state? Or create a dedicated window/tab to only show trashed items. All in all, storing the trash information in the manifest might not be the better option. Given the above issues, I'm leaning more towards a separate folder structure for this. What's your opinion on these concerns?
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-18 11:49 ` Christian Ebner @ 2025-04-18 12:03 ` Fabian Grünbichler 2025-04-18 12:45 ` Christian Ebner 0 siblings, 1 reply; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-18 12:03 UTC (permalink / raw) To: Christian Ebner, Proxmox Backup Server development discussion, Thomas Lamprecht On April 18, 2025 1:49 pm, Christian Ebner wrote: > On 4/18/25 13:06, Thomas Lamprecht wrote: >> Am 17.04.25 um 11:29 schrieb Fabian Grünbichler: >>> On April 16, 2025 4:18 pm, Christian Ebner wrote: >>>> Instead of directly deleting the snapshot directory and it's contents >>>> on a prune, move the snapshot directory into the `.trash` subfolder >>>> of the datastore. >>>> >>>> This allows to mark chunks which were used by these index files if >>>> the snapshot was pruned during an ongoing garbage collection. >>>> Garbage collection will clean up these files before starting with the >>>> marking phase 1 and read all index files after completing that phase, >>>> touching these chunks as well. >>> >>> some other variants to maybe consider: >>> >>> marking the snapshot itself as trash (in the manifest, or by adding a >>> trash marker file inside the dir) - this would mean that there is no >>> iterator race issue when undoing a prune, no double-pruning collisions, >>> .. - but it also means we need to adapt all call sites that should skip >>> trashed snapshots (most existing ones), which is more churn. >> >> Shouldn't we use the central iterators implementations to query indexes? > > Yes, correct me if I'm wrong, have not checked all call sites yet but > index files are mostly accessed by going trough the manifest, either via > BackupManifest::files or at least verifying it via > BackupManifest::verfiy_file, as that's also were encryption and > verification state are stored. > > So adding a label to store a trashed state there would work out just > fine, filtering these snapshots for listing, sync job, ecc. is then fine > as well. Also, fetching the previous backup snapshot for fast > incremental mode will work, although require additional filtering. > > Although, I'm a bit concerned about performance for the content listing > if we keep and iterate all of the pruned snapshots. After all they will > persist until next GC, which could lead to a lot of accumulated snapshots. that's a fair point, in some environments this might be cumbersome.. OTOH, those are exactly the environments that would/should run GC often I guess, so maybe it's not that bad? > One further issue I see with that approach is again sync jobs, which now > do not see the trashed snapshot on the target and try to re-sync it? Or > would we include that information for the sync jobs to skip over? Would > be a bit strange however if the snapshot is not trashed on the source side. they'd only resync it if it's after the last local one, unless it's a "sync missing" special sync, so this is not different to the current state? at least, if syncing a trashed snapshot using the same snapshot is allowed and just undoes the trashing? > Also, thinking about UI to recover from trash: Might it be good to still > show the snapshots while listing, but marked with an icon, just like for > e.g. encryption state? Or create a dedicated window/tab to only show > trashed items. 
yes, the snapshot list needs to get an option to include trashed ones, and the UI should set and handle that appropriately ;) > All in all storing the trash information on the manifest might not be > the better option. Give above issues, I'm leaning more towards a > separate folder structure for this. most of the above issues apply to both variants anyway - the main difference is that with the separate folder iterating access needs to opt-into including trashed snapshots, so only does extra work in case that is desired, whereas in the manifest one the extra work is already done by the time we can decide to skip/filter out a snapshot because it's trash. maybe a summary would be: pro separate folder: - less work when only iterating over non-trash or only over trash - no need to parse manifest where it is currently not parsed con separate folder: - more work/complicated handling when iterating over both trash and non-trash - more work to put something into/out of the trash ? _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-18 12:03 ` Fabian Grünbichler @ 2025-04-18 12:45 ` Christian Ebner 2025-04-22 7:54 ` Fabian Grünbichler 0 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-18 12:45 UTC (permalink / raw) To: Fabian Grünbichler, Proxmox Backup Server development discussion, Thomas Lamprecht On 4/18/25 14:03, Fabian Grünbichler wrote: > On April 18, 2025 1:49 pm, Christian Ebner wrote: >> On 4/18/25 13:06, Thomas Lamprecht wrote: >>> Am 17.04.25 um 11:29 schrieb Fabian Grünbichler: >>>> On April 16, 2025 4:18 pm, Christian Ebner wrote: >>>>> Instead of directly deleting the snapshot directory and it's contents >>>>> on a prune, move the snapshot directory into the `.trash` subfolder >>>>> of the datastore. >>>>> >>>>> This allows to mark chunks which were used by these index files if >>>>> the snapshot was pruned during an ongoing garbage collection. >>>>> Garbage collection will clean up these files before starting with the >>>>> marking phase 1 and read all index files after completing that phase, >>>>> touching these chunks as well. >>>> >>>> some other variants to maybe consider: >>>> >>>> marking the snapshot itself as trash (in the manifest, or by adding a >>>> trash marker file inside the dir) - this would mean that there is no >>>> iterator race issue when undoing a prune, no double-pruning collisions, >>>> .. - but it also means we need to adapt all call sites that should skip >>>> trashed snapshots (most existing ones), which is more churn. >>> >>> Shouldn't we use the central iterators implementations to query indexes? >> >> Yes, correct me if I'm wrong, have not checked all call sites yet but >> index files are mostly accessed by going trough the manifest, either via >> BackupManifest::files or at least verifying it via >> BackupManifest::verfiy_file, as that's also were encryption and >> verification state are stored. >> >> So adding a label to store a trashed state there would work out just >> fine, filtering these snapshots for listing, sync job, ecc. is then fine >> as well. Also, fetching the previous backup snapshot for fast >> incremental mode will work, although require additional filtering. >> >> Although, I'm a bit concerned about performance for the content listing >> if we keep and iterate all of the pruned snapshots. After all they will >> persist until next GC, which could lead to a lot of accumulated snapshots. > > that's a fair point, in some environments this might be cumbersome.. > OTOH, those are exactly the environments that would/should run GC often > I guess, so maybe it's not that bad? Well, the content listing performance is already problematic for some setups, so I would like to avoid adding to that problem :) >> One further issue I see with that approach is again sync jobs, which now >> do not see the trashed snapshot on the target and try to re-sync it? Or >> would we include that information for the sync jobs to skip over? Would >> be a bit strange however if the snapshot is not trashed on the source side. > > they'd only resync it if it's after the last local one, unless it's a > "sync missing" special sync, so this is not different to the current > state? at least, if syncing a trashed snapshot using the same snapshot > is allowed and just undoes the trashing? > >> Also, thinking about UI to recover from trash: Might it be good to still >> show the snapshots while listing, but marked with an icon, just like for >> e.g. encryption state? 
Or create a dedicated window/tab to only show >> trashed items. > > yes, the snapshot list needs to get an option to include trashed ones, > and the UI should set and handle that appropriately ;) > >> All in all storing the trash information on the manifest might not be >> the better option. Give above issues, I'm leaning more towards a >> separate folder structure for this. > > most of the above issues apply to both variants anyway - the main > difference is that with the separate folder iterating access needs to > opt-into including trashed snapshots, so only does extra work in case > that is desired, whereas in the manifest one the extra work is already > done by the time we can decide to skip/filter out a snapshot because > it's trash. > > maybe a summary would be: > > pro separate folder: > - less work when only iterating over non-trash or only over trash > - no need to parse manifest where it is currently not parsed > con separate folder: > - more work/complicated handling when iterating over both trash and non-trash > - more work to put something into/out of the trash > > ? Hmm, maybe a variant where we do not rely on the manifest or a dedicated folder marker at all, but rather change the snapshot folder's name to either be hidden or have a dedicated pre-/postfix? Similar to your marker file suggestion, but without the marker, and with an easy way to skip reading such snapshot folders to begin with. This would then require the snapshot creation to perform some additional checking and the iterators to be adapted to have variants with and without the hidden structure, but it should reduce the issues from both variants discussed above?
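The rename variant floated here could be as small as prefixing the final path component; again just a sketch with an assumed prefix, since the exact naming is one of the open questions:

    use std::path::{Path, PathBuf};

    /// Hypothetical: derive the "trashed" name of a snapshot directory by
    /// prefixing its last component, so plain directory listings (and the
    /// existing hidden-entry filtering) skip it without a marker file.
    fn trashed_dir_name(snapshot_dir: &Path) -> Option<PathBuf> {
        let name = snapshot_dir.file_name()?.to_str()?;
        Some(snapshot_dir.with_file_name(format!(".trash-{name}")))
    }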
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-18 12:45 ` Christian Ebner @ 2025-04-22 7:54 ` Fabian Grünbichler 2025-04-29 11:27 ` Christian Ebner 0 siblings, 1 reply; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-22 7:54 UTC (permalink / raw) To: Christian Ebner, Proxmox Backup Server development discussion, Thomas Lamprecht > Christian Ebner <c.ebner@proxmox.com> hat am 18.04.2025 14:45 CEST geschrieben: > > > On 4/18/25 14:03, Fabian Grünbichler wrote: > > On April 18, 2025 1:49 pm, Christian Ebner wrote: > >> On 4/18/25 13:06, Thomas Lamprecht wrote: > >>> Am 17.04.25 um 11:29 schrieb Fabian Grünbichler: > >>>> On April 16, 2025 4:18 pm, Christian Ebner wrote: > >>>>> Instead of directly deleting the snapshot directory and it's contents > >>>>> on a prune, move the snapshot directory into the `.trash` subfolder > >>>>> of the datastore. > >>>>> > >>>>> This allows to mark chunks which were used by these index files if > >>>>> the snapshot was pruned during an ongoing garbage collection. > >>>>> Garbage collection will clean up these files before starting with the > >>>>> marking phase 1 and read all index files after completing that phase, > >>>>> touching these chunks as well. > >>>> > >>>> some other variants to maybe consider: > >>>> > >>>> marking the snapshot itself as trash (in the manifest, or by adding a > >>>> trash marker file inside the dir) - this would mean that there is no > >>>> iterator race issue when undoing a prune, no double-pruning collisions, > >>>> .. - but it also means we need to adapt all call sites that should skip > >>>> trashed snapshots (most existing ones), which is more churn. > >>> > >>> Shouldn't we use the central iterators implementations to query indexes? > >> > >> Yes, correct me if I'm wrong, have not checked all call sites yet but > >> index files are mostly accessed by going trough the manifest, either via > >> BackupManifest::files or at least verifying it via > >> BackupManifest::verfiy_file, as that's also were encryption and > >> verification state are stored. > >> > >> So adding a label to store a trashed state there would work out just > >> fine, filtering these snapshots for listing, sync job, ecc. is then fine > >> as well. Also, fetching the previous backup snapshot for fast > >> incremental mode will work, although require additional filtering. > >> > >> Although, I'm a bit concerned about performance for the content listing > >> if we keep and iterate all of the pruned snapshots. After all they will > >> persist until next GC, which could lead to a lot of accumulated snapshots. > > > > that's a fair point, in some environments this might be cumbersome.. > > OTOH, those are exactly the environments that would/should run GC often > > I guess, so maybe it's not that bad? > > Well, the content listing performance is already problematic for some > setups, so I would like to avoid adding to that problem :) sure, but that is a two-fold problem: - our UI code (PBS and PVE) not caching the result properly/using a streaming API to access the content tree - slow storage and big datastores > >> One further issue I see with that approach is again sync jobs, which now > >> do not see the trashed snapshot on the target and try to re-sync it? Or > >> would we include that information for the sync jobs to skip over? Would > >> be a bit strange however if the snapshot is not trashed on the source side. 
> > > > they'd only resync it if it's after the last local one, unless it's a > > "sync missing" special sync, so this is not different to the current > > state? at least, if syncing a trashed snapshot using the same snapshot > > is allowed and just undoes the trashing? > > > >> Also, thinking about UI to recover from trash: Might it be good to still > >> show the snapshots while listing, but marked with an icon, just like for > >> e.g. encryption state? Or create a dedicated window/tab to only show > >> trashed items. > > > > yes, the snapshot list needs to get an option to include trashed ones, > > and the UI should set and handle that appropriately ;) > > > >> All in all storing the trash information on the manifest might not be > >> the better option. Give above issues, I'm leaning more towards a > >> separate folder structure for this. > > > > most of the above issues apply to both variants anyway - the main > > difference is that with the separate folder iterating access needs to > > opt-into including trashed snapshots, so only does extra work in case > > that is desired, whereas in the manifest one the extra work is already > > done by the time we can decide to skip/filter out a snapshot because > > it's trash. > > > > maybe a summary would be: > > > > pro separate folder: > > - less work when only iterating over non-trash or only over trash > > - no need to parse manifest where it is currently not parsed > > con separate folder: > > - more work/complicated handling when iterating over both trash and non-trash > > - more work to put something into/out of the trash > > > > ? > > Hmm, maybe a variant where we do not rely on the manifest or a dedicated > folder marker at all, but rather change the folder name for the folder > to either be hidden or have a dedicated pre/postfix? Similar to your > marker file suggestion, but without the marker and an easy way to skip > reading such snapshot folders to begin with. Would then require the > snapshot creation to perform some additional checking and adapt the > iterators to have variants with and without the hidden structure, but > should reduce the issues from both variants discussed above? renaming the snapdir or moving it to an explicit (per-group?) trash dir are the same solution with all the same upsides and downsides, AFAICT. you still need to carefully reason about both directions (moving to trash and undoing), in particular about interactions with GC. you still have the issue of multiple cycles of moving to trash, adding the snapshot again, moving to trash again (which can be handled, but is more involved compared to the flag approach), when to lock both the trashed and the regular snapdir (or whether there is a single lock covering both?), etc.pp. I don't think there is any strict technical blocker for either approach, and both have pros and cons that need to be weighed. I don't see a clear winner (yet), maybe others have a strong(er) opinion? _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-22 7:54 ` Fabian Grünbichler @ 2025-04-29 11:27 ` Christian Ebner 0 siblings, 0 replies; 21+ messages in thread From: Christian Ebner @ 2025-04-29 11:27 UTC (permalink / raw) To: Fabian Grünbichler, Proxmox Backup Server development discussion, Thomas Lamprecht On 4/22/25 09:54, Fabian Grünbichler wrote: > > renaming the snapdir or moving it to an explicit (per-group?) trash dir are > the same solution with all the same upsides and downsides, AFAICT. you still > need to carefully reason about both directions (moving to trash and undoing), > in particular about interactions with GC. you still have the issue of multiple > cycles of moving to trash, adding the snapshot again, moving to trash again > (which can be handled, but is more involved compared to the flag approach), > when to lock both the trashed and the regular snapdir (or whether there is > a single lock covering both?), etc.pp. > > I don't think there is any strict technical blocker for either approach, and > both have pros and cons that need to be weighed. I don't see a clear winner > (yet), maybe others have a strong(er) opinion? Since it is also required to keep the backup groups and namespaces around when pruning these, the suggested approach of using a marker file for trashed snapshots seems best. This allows applying the same marker approach to these as well. Re-creating the same group would however require clearing all the trash content for these backup groups in order not to run into ownership/consistency issues.
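A minimal sketch of the marker-file approach settled on here; the marker name `.trashed` and the helper names are assumptions, not the final implementation:

    use std::path::Path;

    use anyhow::{Context, Error};

    const TRASH_MARKER: &str = ".trashed";

    /// Prune: mark the snapshot directory as trash instead of removing it, so
    /// its index files stay readable for GC phase 1.
    fn trash_snapshot(snapshot_dir: &Path) -> Result<(), Error> {
        let marker = snapshot_dir.join(TRASH_MARKER);
        std::fs::File::create(&marker)
            .with_context(|| format!("failed to create trash marker {marker:?}"))?;
        Ok(())
    }

    /// Undo a prune by dropping the marker again.
    fn restore_snapshot(snapshot_dir: &Path) -> Result<(), Error> {
        let marker = snapshot_dir.join(TRASH_MARKER);
        std::fs::remove_file(&marker)
            .with_context(|| format!("failed to remove trash marker {marker:?}"))?;
        Ok(())
    }

    /// Iterators and listings can cheaply skip trashed snapshots.
    fn is_trashed(snapshot_dir: &Path) -> bool {
        snapshot_dir.join(TRASH_MARKER).exists()
    }

Since the directory never moves, the group and namespace hierarchy stays intact and undoing a prune is a single unlink of the marker.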
* Re: [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy 2025-04-18 11:06 ` Thomas Lamprecht 2025-04-18 11:49 ` Christian Ebner @ 2025-04-18 11:51 ` Fabian Grünbichler 1 sibling, 0 replies; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-18 11:51 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Thomas Lamprecht On April 18, 2025 1:06 pm, Thomas Lamprecht wrote: > Am 17.04.25 um 11:29 schrieb Fabian Grünbichler: >> On April 16, 2025 4:18 pm, Christian Ebner wrote: >>> Instead of directly deleting the snapshot directory and it's contents >>> on a prune, move the snapshot directory into the `.trash` subfolder >>> of the datastore. >>> >>> This allows to mark chunks which were used by these index files if >>> the snapshot was pruned during an ongoing garbage collection. >>> Garbage collection will clean up these files before starting with the >>> marking phase 1 and read all index files after completing that phase, >>> touching these chunks as well. >> >> some other variants to maybe consider: >> >> marking the snapshot itself as trash (in the manifest, or by adding a >> trash marker file inside the dir) - this would mean that there is no >> iterator race issue when undoing a prune, no double-pruning collisions, >> .. - but it also means we need to adapt all call sites that should skip >> trashed snapshots (most existing ones), which is more churn. > > Shouldn't we use the central iterators implementations to query indexes? > I.e., any call site that doesn't should probably be switched over to > those, just like Chris did for GC recently. > Then it could be defined if trashed indexes should be skipped – the > default – or included when instantiating that iterator, e.g. through a > parameter or probably a dedicated "with_trash" fn – but that's details. yes, the parts where we iterate are fairly easy to handle, but we do have quite a few where we access a snapshot directly and might not want to treat a trashed one like a non-trashed one ;) >> having a trash dir per group instead of a global one for the whole >> datastore (less likely to incur extra costs in case somebody has a weird >> setup where namespaces/.. are symlinked or bindmounted or similar >> shenanigans). would need to postpone group removal to GC in case all >> snapshots are pruned. > > Both of those variants would make restore simpler too, depends IMO a > bit if we already read the manifest everywhere anyway where the info > is needed, in that case I'd slightly favor that as place to store the > info if a backup is trash or not, as that would avoid the need for > another directory or file to store (inode exhaustion) and manage. the iterator itself doesn't yet read the manifest, neither does the group's `list_backups` fn.. so the "manifest-only" part would potentially add quite a bit of overhead there.. but of course, things like verification and snapshot listing over the API already do read the manifest later... another variant would be to have a per-group list of trashed snapshots *and* a marker in the manifest - that way both use cases would be supported, with the caveat that they'd have to be kept in sync (but that only affects prune/undo_prune itself, so might not be *that* bad?) _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash 2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner ` (2 preceding siblings ...) 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner @ 2025-04-16 14:18 ` Christian Ebner 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-17 9:29 ` [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Fabian Grünbichler 4 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-16 14:18 UTC (permalink / raw) To: pbs-devel Snapshots pruned during phase 1 are now also assured to be included in the marking phase by reading the index files from trash and touching these chunks as well. Clear any trash before starting with phase 1, so that only snapshots pruned during GC are consided. Further, drop the retry logic used before to assure eventually newly written index files are included in the marking phase, if the previously last one was pruned. This is not necessary anymore as the previously last one will be read from trash. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------ 1 file changed, 65 insertions(+), 76 deletions(-) diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs index 97b78f000..688e65247 100644 --- a/pbs-datastore/src/datastore.rs +++ b/pbs-datastore/src/datastore.rs @@ -1137,7 +1137,13 @@ impl DataStore { // // By this it is assured that all index files are used, even if they would not have been // seen by the regular logic and the user is informed by the garbage collection run about - // the detected index files not following the iterators logic. + // the detected index files not following the iterators logic. Further, include trashed + // snapshots which have been pruned during garbage collections marking phase. + + let trash = PathBuf::from(".trash/"); + let mut full_trash_path = self.base_path(); + full_trash_path.push(&trash); + let _ = std::fs::remove_dir_all(full_trash_path); let mut unprocessed_index_list = self.list_index_files(None)?; let mut index_count = unprocessed_index_list.len(); @@ -1154,88 +1160,63 @@ impl DataStore { let namespace = namespace.context("iterating namespaces failed")?; for group in arc_self.iter_backup_groups(namespace)? { let group = group.context("iterating backup groups failed")?; + let mut snapshots = match group.list_backups() { + Ok(snapshots) => snapshots, + Err(err) => { + if group.exists() { + return Err(err).context("listing snapshots failed")?; + } + // vanished, will be covered by trashed list below to avoid + // not touching known chunks. + continue; + } + }; - // Avoid race between listing/marking of snapshots by GC and pruning the last - // snapshot in the group, following a new snapshot creation. Otherwise known chunks - // might only be referenced by the new snapshot, so it must be read as well. 
- let mut retry_counter = 0; - 'retry: loop { - let _lock = match retry_counter { - 0..=9 => None, - 10 => Some( - group - .lock() - .context("exhausted retries and failed to lock group")?, - ), - _ => bail!("exhausted retries and unexpected counter overrun"), - }; - - let mut snapshots = match group.list_backups() { - Ok(snapshots) => snapshots, - Err(err) => { - if group.exists() { - return Err(err).context("listing snapshots failed")?; - } - break 'retry; + BackupInfo::sort_list(&mut snapshots, true); + for snapshot in snapshots.into_iter() { + for file in snapshot.files { + worker.check_abort()?; + worker.fail_on_shutdown()?; + + match ArchiveType::from_path(&file) { + Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), + Ok(ArchiveType::Blob) | Err(_) => continue, } - }; - - // Always start iteration with the last snapshot of the group to reduce race - // window with concurrent backup+prune previous last snapshot. Allows to retry - // without the need to keep track of already processed index files for the - // current group. - BackupInfo::sort_list(&mut snapshots, true); - for (count, snapshot) in snapshots.into_iter().rev().enumerate() { - for file in snapshot.files { - worker.check_abort()?; - worker.fail_on_shutdown()?; - - match ArchiveType::from_path(&file) { - Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), - Ok(ArchiveType::Blob) | Err(_) => continue, - } - let mut path = snapshot.backup_dir.full_path(); - path.push(file); - - let index = match self.open_index_reader(&path)? { - Some(index) => index, - None => { - unprocessed_index_list.remove(&path); - if count == 0 { - retry_counter += 1; - continue 'retry; - } - continue; - } - }; - - self.index_mark_used_chunks( - index, - &path, - &mut chunk_lru_cache, - status, - worker, - )?; - - if !unprocessed_index_list.remove(&path) { - info!("Encountered new index file '{path:?}', increment total index file count"); - index_count += 1; - } + let mut path = snapshot.backup_dir.full_path(); + path.push(file); - let percentage = (processed_index_files + 1) * 100 / index_count; - if percentage > last_percentage { - info!( - "marked {percentage}% ({} of {index_count} index files)", - processed_index_files + 1, - ); - last_percentage = percentage; + let index = match self.open_index_reader(&path)? { + Some(index) => index, + None => { + unprocessed_index_list.remove(&path); + continue; } - processed_index_files += 1; + }; + + self.index_mark_used_chunks( + index, + &path, + &mut chunk_lru_cache, + status, + worker, + )?; + + if !unprocessed_index_list.remove(&path) { + info!("Encountered new index file '{path:?}', increment total index file count"); + index_count += 1; } - } - break; + let percentage = (processed_index_files + 1) * 100 / index_count; + if percentage > last_percentage { + info!( + "marked {percentage}% ({} of {index_count} index files)", + processed_index_files + 1, + ); + last_percentage = percentage; + } + processed_index_files += 1; + } } } } @@ -1257,6 +1238,14 @@ impl DataStore { warn!("Found {strange_paths_count} index files outside of expected directory scheme"); } + let trashed_index_list = self.list_index_files(Some(trash))?; + for path in trashed_index_list { + if let Some(index) = self.open_index_reader(&path)? 
{ + info!("Mark chunks for pruned index file found in {path:?}"); + self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?; + }; + } + Ok(()) } -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner @ 2025-04-17 9:29 ` Fabian Grünbichler 2025-04-17 10:38 ` Christian Ebner 0 siblings, 1 reply; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-17 9:29 UTC (permalink / raw) To: Proxmox Backup Server development discussion On April 16, 2025 4:18 pm, Christian Ebner wrote: > Snapshots pruned during phase 1 are now also assured to be included > in the marking phase by reading the index files from trash and > touching these chunks as well. > > Clear any trash before starting with phase 1, so that only snapshots > pruned during GC are consided. > > Further, drop the retry logic used before to assure eventually newly > written index files are included in the marking phase, if the > previously last one was pruned. This is not necessary anymore as the > previously last one will be read from trash. > > Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > --- > pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------ > 1 file changed, 65 insertions(+), 76 deletions(-) > > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs > index 97b78f000..688e65247 100644 > --- a/pbs-datastore/src/datastore.rs > +++ b/pbs-datastore/src/datastore.rs > @@ -1137,7 +1137,13 @@ impl DataStore { > // > // By this it is assured that all index files are used, even if they would not have been > // seen by the regular logic and the user is informed by the garbage collection run about > - // the detected index files not following the iterators logic. > + // the detected index files not following the iterators logic. Further, include trashed > + // snapshots which have been pruned during garbage collections marking phase. > + > + let trash = PathBuf::from(".trash/"); > + let mut full_trash_path = self.base_path(); > + full_trash_path.push(&trash); > + let _ = std::fs::remove_dir_all(full_trash_path); I think this would need some locking, else we start recursively deleting here while at the same time a prune operation is moving something into the trash.. > > let mut unprocessed_index_list = self.list_index_files(None)?; > let mut index_count = unprocessed_index_list.len(); > @@ -1154,88 +1160,63 @@ impl DataStore { > let namespace = namespace.context("iterating namespaces failed")?; > for group in arc_self.iter_backup_groups(namespace)? { > let group = group.context("iterating backup groups failed")?; > + let mut snapshots = match group.list_backups() { > + Ok(snapshots) => snapshots, > + Err(err) => { > + if group.exists() { > + return Err(err).context("listing snapshots failed")?; > + } > + // vanished, will be covered by trashed list below to avoid > + // not touching known chunks. > + continue; > + } > + }; > > - // Avoid race between listing/marking of snapshots by GC and pruning the last > - // snapshot in the group, following a new snapshot creation. Otherwise known chunks > - // might only be referenced by the new snapshot, so it must be read as well. 
> - let mut retry_counter = 0; > - 'retry: loop { > - let _lock = match retry_counter { > - 0..=9 => None, > - 10 => Some( > - group > - .lock() > - .context("exhausted retries and failed to lock group")?, > - ), > - _ => bail!("exhausted retries and unexpected counter overrun"), > - }; > - > - let mut snapshots = match group.list_backups() { > - Ok(snapshots) => snapshots, > - Err(err) => { > - if group.exists() { > - return Err(err).context("listing snapshots failed")?; > - } > - break 'retry; > + BackupInfo::sort_list(&mut snapshots, true); > + for snapshot in snapshots.into_iter() { > + for file in snapshot.files { > + worker.check_abort()?; > + worker.fail_on_shutdown()?; > + > + match ArchiveType::from_path(&file) { > + Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), > + Ok(ArchiveType::Blob) | Err(_) => continue, > } > - }; > - > - // Always start iteration with the last snapshot of the group to reduce race > - // window with concurrent backup+prune previous last snapshot. Allows to retry > - // without the need to keep track of already processed index files for the > - // current group. > - BackupInfo::sort_list(&mut snapshots, true); > - for (count, snapshot) in snapshots.into_iter().rev().enumerate() { > - for file in snapshot.files { > - worker.check_abort()?; > - worker.fail_on_shutdown()?; > - > - match ArchiveType::from_path(&file) { > - Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), > - Ok(ArchiveType::Blob) | Err(_) => continue, > - } > > - let mut path = snapshot.backup_dir.full_path(); > - path.push(file); > - > - let index = match self.open_index_reader(&path)? { > - Some(index) => index, > - None => { > - unprocessed_index_list.remove(&path); > - if count == 0 { > - retry_counter += 1; > - continue 'retry; > - } > - continue; > - } > - }; > - > - self.index_mark_used_chunks( > - index, > - &path, > - &mut chunk_lru_cache, > - status, > - worker, > - )?; > - > - if !unprocessed_index_list.remove(&path) { > - info!("Encountered new index file '{path:?}', increment total index file count"); > - index_count += 1; > - } > + let mut path = snapshot.backup_dir.full_path(); > + path.push(file); > > - let percentage = (processed_index_files + 1) * 100 / index_count; > - if percentage > last_percentage { > - info!( > - "marked {percentage}% ({} of {index_count} index files)", > - processed_index_files + 1, > - ); > - last_percentage = percentage; > + let index = match self.open_index_reader(&path)? { > + Some(index) => index, > + None => { > + unprocessed_index_list.remove(&path); > + continue; > } > - processed_index_files += 1; > + }; > + > + self.index_mark_used_chunks( > + index, > + &path, > + &mut chunk_lru_cache, > + status, > + worker, > + )?; > + > + if !unprocessed_index_list.remove(&path) { > + info!("Encountered new index file '{path:?}', increment total index file count"); > + index_count += 1; > } > - } > > - break; > + let percentage = (processed_index_files + 1) * 100 / index_count; > + if percentage > last_percentage { > + info!( > + "marked {percentage}% ({} of {index_count} index files)", > + processed_index_files + 1, > + ); > + last_percentage = percentage; > + } > + processed_index_files += 1; > + } > } > } > } > @@ -1257,6 +1238,14 @@ impl DataStore { > warn!("Found {strange_paths_count} index files outside of expected directory scheme"); > } > > + let trashed_index_list = self.list_index_files(Some(trash))?; > + for path in trashed_index_list { > + if let Some(index) = self.open_index_reader(&path)? 
{ > + info!("Mark chunks for pruned index file found in {path:?}"); > + self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?; > + }; > + } > + I think we'd want to support undoing moving a snapshot to trash, if we have a trash can (that's the other half of wanting this feature after all). if we do so, we need to be careful to not re-introduce a race here (e.g., by keeping a copy in the trash can when undoing, or by using a trash can mechanism that doesn't require separate iteration over regular and trashed snapshots). > Ok(()) > } > > -- > 2.39.5 > > > > _______________________________________________ > pbs-devel mailing list > pbs-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel > > > _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash 2025-04-17 9:29 ` Fabian Grünbichler @ 2025-04-17 10:38 ` Christian Ebner 2025-04-17 11:27 ` Fabian Grünbichler 0 siblings, 1 reply; 21+ messages in thread From: Christian Ebner @ 2025-04-17 10:38 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Fabian Grünbichler On 4/17/25 11:29, Fabian Grünbichler wrote: > On April 16, 2025 4:18 pm, Christian Ebner wrote: >> Snapshots pruned during phase 1 are now also assured to be included >> in the marking phase by reading the index files from trash and >> touching these chunks as well. >> >> Clear any trash before starting with phase 1, so that only snapshots >> pruned during GC are consided. >> >> Further, drop the retry logic used before to assure eventually newly >> written index files are included in the marking phase, if the >> previously last one was pruned. This is not necessary anymore as the >> previously last one will be read from trash. >> >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> >> --- >> pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------ >> 1 file changed, 65 insertions(+), 76 deletions(-) >> >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs >> index 97b78f000..688e65247 100644 >> --- a/pbs-datastore/src/datastore.rs >> +++ b/pbs-datastore/src/datastore.rs >> @@ -1137,7 +1137,13 @@ impl DataStore { >> // >> // By this it is assured that all index files are used, even if they would not have been >> // seen by the regular logic and the user is informed by the garbage collection run about >> - // the detected index files not following the iterators logic. >> + // the detected index files not following the iterators logic. Further, include trashed >> + // snapshots which have been pruned during garbage collections marking phase. >> + >> + let trash = PathBuf::from(".trash/"); >> + let mut full_trash_path = self.base_path(); >> + full_trash_path.push(&trash); >> + let _ = std::fs::remove_dir_all(full_trash_path); > > I think this would need some locking, else we start recursively deleting > here while at the same time a prune operation is moving something into > the trash.. True, if there is a concurrent insert, then deletion will probably fail because the sub-directory is then not empty anymore. Instead of locking, I could do an atomic rename of the whole trash can instead, and cleanup the renamed one? Should be not only more efficient but also easier to implement. > >> >> let mut unprocessed_index_list = self.list_index_files(None)?; >> let mut index_count = unprocessed_index_list.len(); >> @@ -1154,88 +1160,63 @@ impl DataStore { >> let namespace = namespace.context("iterating namespaces failed")?; >> for group in arc_self.iter_backup_groups(namespace)? { >> let group = group.context("iterating backup groups failed")?; >> + let mut snapshots = match group.list_backups() { >> + Ok(snapshots) => snapshots, >> + Err(err) => { >> + if group.exists() { >> + return Err(err).context("listing snapshots failed")?; >> + } >> + // vanished, will be covered by trashed list below to avoid >> + // not touching known chunks. >> + continue; >> + } >> + }; >> >> - // Avoid race between listing/marking of snapshots by GC and pruning the last >> - // snapshot in the group, following a new snapshot creation. Otherwise known chunks >> - // might only be referenced by the new snapshot, so it must be read as well. 
>> - let mut retry_counter = 0; >> - 'retry: loop { >> - let _lock = match retry_counter { >> - 0..=9 => None, >> - 10 => Some( >> - group >> - .lock() >> - .context("exhausted retries and failed to lock group")?, >> - ), >> - _ => bail!("exhausted retries and unexpected counter overrun"), >> - }; >> - >> - let mut snapshots = match group.list_backups() { >> - Ok(snapshots) => snapshots, >> - Err(err) => { >> - if group.exists() { >> - return Err(err).context("listing snapshots failed")?; >> - } >> - break 'retry; >> + BackupInfo::sort_list(&mut snapshots, true); >> + for snapshot in snapshots.into_iter() { >> + for file in snapshot.files { >> + worker.check_abort()?; >> + worker.fail_on_shutdown()?; >> + >> + match ArchiveType::from_path(&file) { >> + Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), >> + Ok(ArchiveType::Blob) | Err(_) => continue, >> } >> - }; >> - >> - // Always start iteration with the last snapshot of the group to reduce race >> - // window with concurrent backup+prune previous last snapshot. Allows to retry >> - // without the need to keep track of already processed index files for the >> - // current group. >> - BackupInfo::sort_list(&mut snapshots, true); >> - for (count, snapshot) in snapshots.into_iter().rev().enumerate() { >> - for file in snapshot.files { >> - worker.check_abort()?; >> - worker.fail_on_shutdown()?; >> - >> - match ArchiveType::from_path(&file) { >> - Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (), >> - Ok(ArchiveType::Blob) | Err(_) => continue, >> - } >> >> - let mut path = snapshot.backup_dir.full_path(); >> - path.push(file); >> - >> - let index = match self.open_index_reader(&path)? { >> - Some(index) => index, >> - None => { >> - unprocessed_index_list.remove(&path); >> - if count == 0 { >> - retry_counter += 1; >> - continue 'retry; >> - } >> - continue; >> - } >> - }; >> - >> - self.index_mark_used_chunks( >> - index, >> - &path, >> - &mut chunk_lru_cache, >> - status, >> - worker, >> - )?; >> - >> - if !unprocessed_index_list.remove(&path) { >> - info!("Encountered new index file '{path:?}', increment total index file count"); >> - index_count += 1; >> - } >> + let mut path = snapshot.backup_dir.full_path(); >> + path.push(file); >> >> - let percentage = (processed_index_files + 1) * 100 / index_count; >> - if percentage > last_percentage { >> - info!( >> - "marked {percentage}% ({} of {index_count} index files)", >> - processed_index_files + 1, >> - ); >> - last_percentage = percentage; >> + let index = match self.open_index_reader(&path)? 
{ >> + Some(index) => index, >> + None => { >> + unprocessed_index_list.remove(&path); >> + continue; >> } >> - processed_index_files += 1; >> + }; >> + >> + self.index_mark_used_chunks( >> + index, >> + &path, >> + &mut chunk_lru_cache, >> + status, >> + worker, >> + )?; >> + >> + if !unprocessed_index_list.remove(&path) { >> + info!("Encountered new index file '{path:?}', increment total index file count"); >> + index_count += 1; >> } >> - } >> >> - break; >> + let percentage = (processed_index_files + 1) * 100 / index_count; >> + if percentage > last_percentage { >> + info!( >> + "marked {percentage}% ({} of {index_count} index files)", >> + processed_index_files + 1, >> + ); >> + last_percentage = percentage; >> + } >> + processed_index_files += 1; >> + } >> } >> } >> } >> @@ -1257,6 +1238,14 @@ impl DataStore { >> warn!("Found {strange_paths_count} index files outside of expected directory scheme"); >> } >> >> + let trashed_index_list = self.list_index_files(Some(trash))?; >> + for path in trashed_index_list { >> + if let Some(index) = self.open_index_reader(&path)? { >> + info!("Mark chunks for pruned index file found in {path:?}"); >> + self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?; >> + }; >> + } >> + > > I think we'd want to support undoing moving a snapshot to trash, if we > have a trash can (that's the other half of wanting this feature after > all). if we do so, we need to be careful to not re-introduce a race here > (e.g., by keeping a copy in the trash can when undoing, or by using a > trash can mechanism that doesn't require separate iteration over regular > and trashed snapshots). Hmm, that's right. So keeping a copy in the trash might be a good approach here. That is if one sticks to the trash folder mechanism. Must first see the implications of your other suggested approaches. _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
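The rename-based variant Christian floats above could, very roughly, look like the following sketch. None of it is taken from the series, and the staging directory name is made up for illustration.

use std::io::ErrorKind;
use std::path::Path;

// GC, before phase 1: atomically move `.trash` aside and delete the renamed
// copy. Prune operations started afterwards recreate and fill a fresh
// `.trash`, so the recursive delete no longer races inserts done by path.
fn rotate_and_clear_trash(base: &Path) -> std::io::Result<()> {
    let trash = base.join(".trash");
    let staging = base.join(format!(".trash-removing-{}", std::process::id()));

    match std::fs::rename(&trash, &staging) {
        Ok(()) => {}
        // nothing was trashed since the last run
        Err(err) if err.kind() == ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    }

    std::fs::remove_dir_all(&staging)
}

As the next reply points out, this only closes the race for writers that resolve `.trash` by path; anything still holding an open file descriptor into the old directory keeps operating on the renamed copy.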
* Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash 2025-04-17 10:38 ` Christian Ebner @ 2025-04-17 11:27 ` Fabian Grünbichler 0 siblings, 0 replies; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-17 11:27 UTC (permalink / raw) To: Christian Ebner, Proxmox Backup Server development discussion > Christian Ebner <c.ebner@proxmox.com> hat am 17.04.2025 12:38 CEST geschrieben: > On 4/17/25 11:29, Fabian Grünbichler wrote: > > On April 16, 2025 4:18 pm, Christian Ebner wrote: > >> Snapshots pruned during phase 1 are now also assured to be included > >> in the marking phase by reading the index files from trash and > >> touching these chunks as well. > >> > >> Clear any trash before starting with phase 1, so that only snapshots > >> pruned during GC are consided. > >> > >> Further, drop the retry logic used before to assure eventually newly > >> written index files are included in the marking phase, if the > >> previously last one was pruned. This is not necessary anymore as the > >> previously last one will be read from trash. > >> > >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > >> --- > >> pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------ > >> 1 file changed, 65 insertions(+), 76 deletions(-) > >> > >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs > >> index 97b78f000..688e65247 100644 > >> --- a/pbs-datastore/src/datastore.rs > >> +++ b/pbs-datastore/src/datastore.rs > >> @@ -1137,7 +1137,13 @@ impl DataStore { > >> // > >> // By this it is assured that all index files are used, even if they would not have been > >> // seen by the regular logic and the user is informed by the garbage collection run about > >> - // the detected index files not following the iterators logic. > >> + // the detected index files not following the iterators logic. Further, include trashed > >> + // snapshots which have been pruned during garbage collections marking phase. > >> + > >> + let trash = PathBuf::from(".trash/"); > >> + let mut full_trash_path = self.base_path(); > >> + full_trash_path.push(&trash); > >> + let _ = std::fs::remove_dir_all(full_trash_path); > > > > I think this would need some locking, else we start recursively deleting > > here while at the same time a prune operation is moving something into > > the trash.. > > True, if there is a concurrent insert, then deletion will probably fail > because the sub-directory is then not empty anymore. Instead of locking, > I could do an atomic rename of the whole trash can instead, and cleanup > the renamed one? Should be not only more efficient but also easier to > implement. that only works if you assume that nothing operates on (or rather, inside of) the trash dir via an already opened FD.. which seems like a dangerous and easy to miss assumption.. _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
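Fabian's concern can be reproduced with a few lines of illustrative code: a directory file descriptor follows the rename, so a writer that opened `.trash` before GC rotated it away keeps creating entries in the copy that is about to be deleted. The directory and index file names below are hypothetical; this is a standalone demo, not code from the series.

use std::ffi::CString;
use std::fs;

fn main() -> std::io::Result<()> {
    fs::create_dir_all(".trash")?;

    // a prune task opened the trash directory before GC runs
    let path = CString::new(".trash").unwrap();
    let dirfd = unsafe { libc::open(path.as_ptr(), libc::O_DIRECTORY | libc::O_RDONLY) };
    assert!(dirfd >= 0);

    // GC rotates the trash away and recreates an empty one
    fs::rename(".trash", ".trash-removing")?;
    fs::create_dir(".trash")?;

    // the prune task keeps using its old fd: the entry ends up in the
    // directory scheduled for deletion, not in the fresh `.trash`
    let name = CString::new("drive-scsi0.img.fidx").unwrap();
    let fd = unsafe {
        libc::openat(
            dirfd,
            name.as_ptr(),
            libc::O_CREAT | libc::O_WRONLY,
            0o644 as libc::c_uint,
        )
    };
    assert!(fd >= 0);

    assert!(fs::metadata(".trash-removing/drive-scsi0.img.fidx").is_ok());
    assert!(fs::metadata(".trash/drive-scsi0.img.fidx").is_err());

    unsafe {
        libc::close(fd);
        libc::close(dirfd);
    }
    Ok(())
}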
* Re: [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots 2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner ` (3 preceding siblings ...) 2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner @ 2025-04-17 9:29 ` Fabian Grünbichler 4 siblings, 0 replies; 21+ messages in thread From: Fabian Grünbichler @ 2025-04-17 9:29 UTC (permalink / raw) To: Proxmox Backup Server development discussion On April 16, 2025 4:17 pm, Christian Ebner wrote: > In an effort to simplify the GC phase 1 logic introduced by commit > cb9814e3 ("garbage collection: fix rare race in chunk marking phase") > this patch series implement a trash can functionality for snapshots. that was fast ;) > The main intention is to allow snapshot's index files, pruned while > ongoing phase 1 of garbage collection, to be read and their chunks > marked as in use as well. This will allow to get rid of the currently > implemented and rather complex retry looping logic, which could in > theory lead to failing GC or backups when trying to lock the whole > group exclusively following the 10-th retry. I think the other, not really smaller intention is to allow undoing an accidental/premature deletion/pruning. So we need to consider this usecase as well when designing the trash can semantics, and ideally introduce that at the same time so we can properly rule out problems.. > To achieve this, pruning of snapshots does not remove them > immediately, but rather moves them to a `.trash` subfolder in the > datastores base directory. This directory will then be cleared before > starting of GC phase 1, meaning that any index file could be restored > until the next GC run. see my comment on patch #3 > This however comes with it's own set of issues, therefore sending > these patches as RFC for now. Open questions and known limitations > are: > - Pruning does not cleanup any space, on the contrary it might > require additional space on COW filesystem. Should there be a flag > to bypass the trash, also given that sometimes users truly want to > remove a snapshot immediately? Although that would re-introduce the > issue with new snapshot ceration and concurrent GC on a last > snapshot. I think it might make sense, but I am not sure how we could avoid the GC issue (but I think we could design the trash can feature in a way that we keep the retry logic in GC, but that it only ever triggers in case such a skip-trash pruning took place in a group). > - Prune + sync + prune might lead to the same snapshot being pruned > multiple times, currently any second prune on a snapshot would > fail. Should this overwrite the trashed snapshot? this depends on how the trash feature is implemented: - if it's a mark on the snapshot, then attempting to write the snapshot again could either fail or overwrite the trashed snapshot - if the snapshot is moved to a trash can, then we could keep multiple copies there > - GC might now read the same index twice, once before it was pruned > followed by a prune while phase 1 is still ongoing and the second > time as read from the trash. Not really an issue, but rather a > limitation. reading twice is a lot better than never reading ;) I don't think this should be particularly problematic. 
> - Further issues I'm currently overlooking > > Christian Ebner (4): > datastore: always skip over base directory when listing index files > datastore: allow to specify sub-directory for index file listing > datastore: move snapshots to trash folder on destroy > garbage collection: read pruned snapshot index files from trash > > pbs-datastore/src/backup_info.rs | 14 ++- > pbs-datastore/src/datastore.rs | 158 +++++++++++++++---------------- > 2 files changed, 89 insertions(+), 83 deletions(-) > > -- > 2.39.5 > > > > _______________________________________________ > pbs-devel mailing list > pbs-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel > > > _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 21+ messages in thread
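The "mark on the snapshot" variant Fabian mentions could be as simple as the hypothetical sketch below: pruning drops a marker file into the snapshot directory instead of moving it, GC's regular group iteration still sees the directory in place (so no separate trash pass is needed), and undoing a prune just removes the marker. The marker name and helper functions are made up for illustration and are not part of the series.

use std::io::ErrorKind;
use std::path::Path;

const TRASH_MARKER: &str = ".trashed";

// prune: keep the snapshot directory in place, only mark it as trashed
fn trash_snapshot(snapshot_dir: &Path) -> std::io::Result<()> {
    std::fs::File::create(snapshot_dir.join(TRASH_MARKER))?;
    Ok(())
}

// listing/API: hide marked snapshots; GC can still read their indexes in place
fn is_trashed(snapshot_dir: &Path) -> bool {
    snapshot_dir.join(TRASH_MARKER).exists()
}

// undo a prune without any copy ever leaving the group directory
fn undo_trash(snapshot_dir: &Path) -> std::io::Result<()> {
    match std::fs::remove_file(snapshot_dir.join(TRASH_MARKER)) {
        Err(err) if err.kind() == ErrorKind::NotFound => Ok(()),
        other => other,
    }
}

A re-prune after the prune + sync + prune sequence from the cover letter would then find the marker already present and could either fail or leave it in place; actually reclaiming the space still has to happen in GC (or an explicit empty-trash action), matching the cover letter's note that pruning no longer frees space immediately.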
Thread overview: 21+ messages

2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner
2025-04-17 9:29 ` Fabian Grünbichler
2025-04-17 10:27 ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner
2025-04-18 9:38 ` Thomas Lamprecht
2025-04-18 9:55 ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner
2025-04-17 9:29 ` Fabian Grünbichler
2025-04-18 11:06 ` Thomas Lamprecht
2025-04-18 11:49 ` Christian Ebner
2025-04-18 12:03 ` Fabian Grünbichler
2025-04-18 12:45 ` Christian Ebner
2025-04-22 7:54 ` Fabian Grünbichler
2025-04-29 11:27 ` Christian Ebner
2025-04-18 11:51 ` Fabian Grünbichler
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner
2025-04-17 9:29 ` Fabian Grünbichler
2025-04-17 10:38 ` Christian Ebner
2025-04-17 11:27 ` Fabian Grünbichler
2025-04-17 9:29 ` [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Fabian Grünbichler