From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash
Date: Thu, 17 Apr 2025 12:38:14 +0200	[thread overview]
Message-ID: <fdfc7c7d-49ee-4c13-811b-1a3036bda2a9@proxmox.com> (raw)
In-Reply-To: <1744881430.ds1jf5x2pr.astroid@yuna.none>

On 4/17/25 11:29, Fabian Grünbichler wrote:
> On April 16, 2025 4:18 pm, Christian Ebner wrote:
>> Snapshots pruned during phase 1 are now also guaranteed to be included
>> in the marking phase by reading their index files from trash and
>> touching their chunks as well.
>>
>> Clear any trash before starting with phase 1, so that only snapshots
>> pruned during GC are considered.
>>
>> Further, drop the retry logic previously used to ensure that newly
>> written index files are included in the marking phase if the
>> previously last snapshot was pruned. This is no longer necessary, as
>> that snapshot will now be read from trash.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>>   pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------
>>   1 file changed, 65 insertions(+), 76 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 97b78f000..688e65247 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1137,7 +1137,13 @@ impl DataStore {
>>           //
>>           // By this it is assured that all index files are used, even if they would not have been
>>           // seen by the regular logic and the user is informed by the garbage collection run about
>> -        // the detected index files not following the iterators logic.
>> +        // the detected index files not following the iterators logic. Further, include trashed
>> +        // snapshots which have been pruned during garbage collections marking phase.
>> +
>> +        let trash = PathBuf::from(".trash/");
>> +        let mut full_trash_path = self.base_path();
>> +        full_trash_path.push(&trash);
>> +        let _ = std::fs::remove_dir_all(full_trash_path);
> 
> I think this would need some locking, else we start recursively deleting
> here while at the same time a prune operation is moving something into
> the trash..

True, if there is a concurrent insert, deletion will probably fail 
because the sub-directory is no longer empty. Instead of locking, I 
could atomically rename the whole trash can and clean up the renamed 
one? That should be not only more efficient but also easier to 
implement.
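To illustrate the idea, here is a minimal, standalone sketch (not the actual PBS implementation; `clear_trash` and the `.trash.remove` staging name are made up for this example). The trash directory is atomically renamed aside first, so a concurrent prune either completes its move before the rename or recreates a fresh `.trash/` afterwards; the recursive delete then only ever touches the renamed copy:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: atomically move `.trash/` aside, then delete the
/// renamed copy. Assumes no previous `.trash.remove` is left behind
/// non-empty (a real implementation would need to handle that case).
fn clear_trash(base: &Path) -> io::Result<()> {
    let trash = base.join(".trash");
    let in_removal = base.join(".trash.remove");
    match fs::rename(&trash, &in_removal) {
        // rename succeeded: nothing can race us on the staged copy now
        Ok(()) => fs::remove_dir_all(&in_removal),
        // no trash directory present: nothing to do
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir().join("gc-trash-demo");
    let _ = fs::remove_dir_all(&base); // clean slate for the demo
    fs::create_dir_all(base.join(".trash/ns/group"))?;
    clear_trash(&base)?;
    assert!(!base.join(".trash").exists());
    assert!(!base.join(".trash.remove").exists());
    println!("trash cleared");
    Ok(())
}
```

Since `rename(2)` is atomic on POSIX filesystems, a prune moving a snapshot into `.trash/` can never observe a half-deleted trash can, which is the race the locking comment above is about.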

> 
>>   
>>           let mut unprocessed_index_list = self.list_index_files(None)?;
>>           let mut index_count = unprocessed_index_list.len();
>> @@ -1154,88 +1160,63 @@ impl DataStore {
>>               let namespace = namespace.context("iterating namespaces failed")?;
>>               for group in arc_self.iter_backup_groups(namespace)? {
>>                   let group = group.context("iterating backup groups failed")?;
>> +                let mut snapshots = match group.list_backups() {
>> +                    Ok(snapshots) => snapshots,
>> +                    Err(err) => {
>> +                        if group.exists() {
>> +                            return Err(err).context("listing snapshots failed")?;
>> +                        }
>> +                        // vanished, will be covered by trashed list below to avoid
>> +                        // not touching known chunks.
>> +                        continue;
>> +                    }
>> +                };
>>   
>> -                // Avoid race between listing/marking of snapshots by GC and pruning the last
>> -                // snapshot in the group, following a new snapshot creation. Otherwise known chunks
>> -                // might only be referenced by the new snapshot, so it must be read as well.
>> -                let mut retry_counter = 0;
>> -                'retry: loop {
>> -                    let _lock = match retry_counter {
>> -                        0..=9 => None,
>> -                        10 => Some(
>> -                            group
>> -                                .lock()
>> -                                .context("exhausted retries and failed to lock group")?,
>> -                        ),
>> -                        _ => bail!("exhausted retries and unexpected counter overrun"),
>> -                    };
>> -
>> -                    let mut snapshots = match group.list_backups() {
>> -                        Ok(snapshots) => snapshots,
>> -                        Err(err) => {
>> -                            if group.exists() {
>> -                                return Err(err).context("listing snapshots failed")?;
>> -                            }
>> -                            break 'retry;
>> +                BackupInfo::sort_list(&mut snapshots, true);
>> +                for snapshot in snapshots.into_iter() {
>> +                    for file in snapshot.files {
>> +                        worker.check_abort()?;
>> +                        worker.fail_on_shutdown()?;
>> +
>> +                        match ArchiveType::from_path(&file) {
>> +                            Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
>> +                            Ok(ArchiveType::Blob) | Err(_) => continue,
>>                           }
>> -                    };
>> -
>> -                    // Always start iteration with the last snapshot of the group to reduce race
>> -                    // window with concurrent backup+prune previous last snapshot. Allows to retry
>> -                    // without the need to keep track of already processed index files for the
>> -                    // current group.
>> -                    BackupInfo::sort_list(&mut snapshots, true);
>> -                    for (count, snapshot) in snapshots.into_iter().rev().enumerate() {
>> -                        for file in snapshot.files {
>> -                            worker.check_abort()?;
>> -                            worker.fail_on_shutdown()?;
>> -
>> -                            match ArchiveType::from_path(&file) {
>> -                                Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
>> -                                Ok(ArchiveType::Blob) | Err(_) => continue,
>> -                            }
>>   
>> -                            let mut path = snapshot.backup_dir.full_path();
>> -                            path.push(file);
>> -
>> -                            let index = match self.open_index_reader(&path)? {
>> -                                Some(index) => index,
>> -                                None => {
>> -                                    unprocessed_index_list.remove(&path);
>> -                                    if count == 0 {
>> -                                        retry_counter += 1;
>> -                                        continue 'retry;
>> -                                    }
>> -                                    continue;
>> -                                }
>> -                            };
>> -
>> -                            self.index_mark_used_chunks(
>> -                                index,
>> -                                &path,
>> -                                &mut chunk_lru_cache,
>> -                                status,
>> -                                worker,
>> -                            )?;
>> -
>> -                            if !unprocessed_index_list.remove(&path) {
>> -                                info!("Encountered new index file '{path:?}', increment total index file count");
>> -                                index_count += 1;
>> -                            }
>> +                        let mut path = snapshot.backup_dir.full_path();
>> +                        path.push(file);
>>   
>> -                            let percentage = (processed_index_files + 1) * 100 / index_count;
>> -                            if percentage > last_percentage {
>> -                                info!(
>> -                                    "marked {percentage}% ({} of {index_count} index files)",
>> -                                    processed_index_files + 1,
>> -                                );
>> -                                last_percentage = percentage;
>> +                        let index = match self.open_index_reader(&path)? {
>> +                            Some(index) => index,
>> +                            None => {
>> +                                unprocessed_index_list.remove(&path);
>> +                                continue;
>>                               }
>> -                            processed_index_files += 1;
>> +                        };
>> +
>> +                        self.index_mark_used_chunks(
>> +                            index,
>> +                            &path,
>> +                            &mut chunk_lru_cache,
>> +                            status,
>> +                            worker,
>> +                        )?;
>> +
>> +                        if !unprocessed_index_list.remove(&path) {
>> +                            info!("Encountered new index file '{path:?}', increment total index file count");
>> +                            index_count += 1;
>>                           }
>> -                    }
>>   
>> -                    break;
>> +                        let percentage = (processed_index_files + 1) * 100 / index_count;
>> +                        if percentage > last_percentage {
>> +                            info!(
>> +                                "marked {percentage}% ({} of {index_count} index files)",
>> +                                processed_index_files + 1,
>> +                            );
>> +                            last_percentage = percentage;
>> +                        }
>> +                        processed_index_files += 1;
>> +                    }
>>                   }
>>               }
>>           }
>> @@ -1257,6 +1238,14 @@ impl DataStore {
>>               warn!("Found {strange_paths_count} index files outside of expected directory scheme");
>>           }
>>   
>> +        let trashed_index_list = self.list_index_files(Some(trash))?;
>> +        for path in trashed_index_list {
>> +            if let Some(index) = self.open_index_reader(&path)? {
>> +                info!("Mark chunks for pruned index file found in {path:?}");
>> +                self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
>> +            };
>> +        }
>> +
> 
> I think we'd want to support undoing moving a snapshot to trash, if we
> have a trash can (that's the other half of wanting this feature after
> all). if we do so, we need to be careful to not re-introduce a race here
> (e.g., by keeping a copy in the trash can when undoing, or by using a
> trash can mechanism that doesn't require separate iteration over regular
> and trashed snapshots).

Hmm, that's right. Keeping a copy in the trash might be a good approach 
here, at least if one sticks to the trash folder mechanism. I first have 
to look into the implications of your other suggested approaches.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

Thread overview: 21+ messages
2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-17 10:27     ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner
2025-04-18  9:38   ` Thomas Lamprecht
2025-04-18  9:55     ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-18 11:06     ` Thomas Lamprecht
2025-04-18 11:49       ` Christian Ebner
2025-04-18 12:03         ` Fabian Grünbichler
2025-04-18 12:45           ` Christian Ebner
2025-04-22  7:54             ` Fabian Grünbichler
2025-04-29 11:27               ` Christian Ebner
2025-04-18 11:51       ` Fabian Grünbichler
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-17 10:38     ` Christian Ebner [this message]
2025-04-17 11:27       ` Fabian Grünbichler
2025-04-17  9:29 ` [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Fabian Grünbichler
