public inbox for pbs-devel@lists.proxmox.com
From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash
Date: Thu, 17 Apr 2025 12:38:14 +0200	[thread overview]
Message-ID: <fdfc7c7d-49ee-4c13-811b-1a3036bda2a9@proxmox.com> (raw)
In-Reply-To: <1744881430.ds1jf5x2pr.astroid@yuna.none>

On 4/17/25 11:29, Fabian Grünbichler wrote:
> On April 16, 2025 4:18 pm, Christian Ebner wrote:
>> Snapshots pruned during phase 1 are now also guaranteed to be included
>> in the marking phase, by reading the index files from trash and
>> touching their chunks as well.
>>
>> Clear any trash before starting with phase 1, so that only snapshots
>> pruned during GC are considered.
>>
>> Further, drop the retry logic previously used to ensure that newly
>> written index files are included in the marking phase if the
>> previously last snapshot was pruned. This is no longer necessary, as
>> that snapshot's index will now be read from trash.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>>   pbs-datastore/src/datastore.rs | 141 +++++++++++++++------------------
>>   1 file changed, 65 insertions(+), 76 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 97b78f000..688e65247 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1137,7 +1137,13 @@ impl DataStore {
>>           //
>>           // By this it is assured that all index files are used, even if they would not have been
>>           // seen by the regular logic and the user is informed by the garbage collection run about
>> -        // the detected index files not following the iterators logic.
>> +        // the detected index files not following the iterators logic. Further, include trashed
>> +        // snapshots which have been pruned during garbage collection's marking phase.
>> +
>> +        let trash = PathBuf::from(".trash/");
>> +        let mut full_trash_path = self.base_path();
>> +        full_trash_path.push(&trash);
>> +        let _ = std::fs::remove_dir_all(full_trash_path);
> 
> I think this would need some locking, else we start recursively deleting
> here while at the same time a prune operation is moving something into
> the trash..

True, if there is a concurrent insert, the deletion will probably fail 
because the sub-directory is no longer empty by then. Instead of 
locking, I could atomically rename the whole trash can and clean up the 
renamed one? That should be not only more efficient but also easier to 
implement.
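Roughly along these lines (untested sketch; the `clear_trash` helper and 
the `.trash.old` name are just for illustration, a unique suffix would 
avoid clashing with a previous interrupted cleanup):

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

// Rename the trash directory aside, then delete the renamed copy.
// rename(2) within one filesystem is atomic, so a concurrent prune
// either lands in the renamed-away directory or in a freshly created
// .trash, never in a half-deleted tree.
fn clear_trash(base_path: &Path) -> std::io::Result<()> {
    let trash = base_path.join(".trash");
    let old_trash = base_path.join(".trash.old");
    match fs::rename(&trash, &old_trash) {
        Ok(()) => fs::remove_dir_all(&old_trash),
        // no trash present yet, nothing to clear
        Err(err) if err.kind() == ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```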

> 
>>   
>>           let mut unprocessed_index_list = self.list_index_files(None)?;
>>           let mut index_count = unprocessed_index_list.len();
>> @@ -1154,88 +1160,63 @@ impl DataStore {
>>               let namespace = namespace.context("iterating namespaces failed")?;
>>               for group in arc_self.iter_backup_groups(namespace)? {
>>                   let group = group.context("iterating backup groups failed")?;
>> +                let mut snapshots = match group.list_backups() {
>> +                    Ok(snapshots) => snapshots,
>> +                    Err(err) => {
>> +                        if group.exists() {
>> +                            return Err(err).context("listing snapshots failed")?;
>> +                        }
>> +                        // vanished, will be covered by the trashed list below so
>> +                        // that known chunks are still touched.
>> +                        continue;
>> +                    }
>> +                };
>>   
>> -                // Avoid race between listing/marking of snapshots by GC and pruning the last
>> -                // snapshot in the group, following a new snapshot creation. Otherwise known chunks
>> -                // might only be referenced by the new snapshot, so it must be read as well.
>> -                let mut retry_counter = 0;
>> -                'retry: loop {
>> -                    let _lock = match retry_counter {
>> -                        0..=9 => None,
>> -                        10 => Some(
>> -                            group
>> -                                .lock()
>> -                                .context("exhausted retries and failed to lock group")?,
>> -                        ),
>> -                        _ => bail!("exhausted retries and unexpected counter overrun"),
>> -                    };
>> -
>> -                    let mut snapshots = match group.list_backups() {
>> -                        Ok(snapshots) => snapshots,
>> -                        Err(err) => {
>> -                            if group.exists() {
>> -                                return Err(err).context("listing snapshots failed")?;
>> -                            }
>> -                            break 'retry;
>> +                BackupInfo::sort_list(&mut snapshots, true);
>> +                for snapshot in snapshots.into_iter() {
>> +                    for file in snapshot.files {
>> +                        worker.check_abort()?;
>> +                        worker.fail_on_shutdown()?;
>> +
>> +                        match ArchiveType::from_path(&file) {
>> +                            Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
>> +                            Ok(ArchiveType::Blob) | Err(_) => continue,
>>                           }
>> -                    };
>> -
>> -                    // Always start iteration with the last snapshot of the group to reduce race
>> -                    // window with concurrent backup+prune previous last snapshot. Allows to retry
>> -                    // without the need to keep track of already processed index files for the
>> -                    // current group.
>> -                    BackupInfo::sort_list(&mut snapshots, true);
>> -                    for (count, snapshot) in snapshots.into_iter().rev().enumerate() {
>> -                        for file in snapshot.files {
>> -                            worker.check_abort()?;
>> -                            worker.fail_on_shutdown()?;
>> -
>> -                            match ArchiveType::from_path(&file) {
>> -                                Ok(ArchiveType::FixedIndex) | Ok(ArchiveType::DynamicIndex) => (),
>> -                                Ok(ArchiveType::Blob) | Err(_) => continue,
>> -                            }
>>   
>> -                            let mut path = snapshot.backup_dir.full_path();
>> -                            path.push(file);
>> -
>> -                            let index = match self.open_index_reader(&path)? {
>> -                                Some(index) => index,
>> -                                None => {
>> -                                    unprocessed_index_list.remove(&path);
>> -                                    if count == 0 {
>> -                                        retry_counter += 1;
>> -                                        continue 'retry;
>> -                                    }
>> -                                    continue;
>> -                                }
>> -                            };
>> -
>> -                            self.index_mark_used_chunks(
>> -                                index,
>> -                                &path,
>> -                                &mut chunk_lru_cache,
>> -                                status,
>> -                                worker,
>> -                            )?;
>> -
>> -                            if !unprocessed_index_list.remove(&path) {
>> -                                info!("Encountered new index file '{path:?}', increment total index file count");
>> -                                index_count += 1;
>> -                            }
>> +                        let mut path = snapshot.backup_dir.full_path();
>> +                        path.push(file);
>>   
>> -                            let percentage = (processed_index_files + 1) * 100 / index_count;
>> -                            if percentage > last_percentage {
>> -                                info!(
>> -                                    "marked {percentage}% ({} of {index_count} index files)",
>> -                                    processed_index_files + 1,
>> -                                );
>> -                                last_percentage = percentage;
>> +                        let index = match self.open_index_reader(&path)? {
>> +                            Some(index) => index,
>> +                            None => {
>> +                                unprocessed_index_list.remove(&path);
>> +                                continue;
>>                               }
>> -                            processed_index_files += 1;
>> +                        };
>> +
>> +                        self.index_mark_used_chunks(
>> +                            index,
>> +                            &path,
>> +                            &mut chunk_lru_cache,
>> +                            status,
>> +                            worker,
>> +                        )?;
>> +
>> +                        if !unprocessed_index_list.remove(&path) {
>> +                            info!("Encountered new index file '{path:?}', increment total index file count");
>> +                            index_count += 1;
>>                           }
>> -                    }
>>   
>> -                    break;
>> +                        let percentage = (processed_index_files + 1) * 100 / index_count;
>> +                        if percentage > last_percentage {
>> +                            info!(
>> +                                "marked {percentage}% ({} of {index_count} index files)",
>> +                                processed_index_files + 1,
>> +                            );
>> +                            last_percentage = percentage;
>> +                        }
>> +                        processed_index_files += 1;
>> +                    }
>>                   }
>>               }
>>           }
>> @@ -1257,6 +1238,14 @@ impl DataStore {
>>               warn!("Found {strange_paths_count} index files outside of expected directory scheme");
>>           }
>>   
>> +        let trashed_index_list = self.list_index_files(Some(trash))?;
>> +        for path in trashed_index_list {
>> +            if let Some(index) = self.open_index_reader(&path)? {
>> +                info!("Mark chunks for pruned index file found in {path:?}");
>> +                self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
>> +            };
>> +        }
>> +
> 
> I think we'd want to support undoing moving a snapshot to trash, if we
> have a trash can (that's the other half of wanting this feature after
> all). if we do so, we need to be careful to not re-introduce a race here
> (e.g., by keeping a copy in the trash can when undoing, or by using a
> trash can mechanism that doesn't require separate iteration over regular
> and trashed snapshots).

Hmm, that's right. So keeping a copy in the trash might be a good 
approach here, at least if one sticks to the trash folder mechanism. I 
must first look into the implications of your other suggested 
approaches.
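
For reference, the keep-a-copy variant could look roughly like this 
(untested sketch; `undo_trash` and the flat snapshot directory layout 
are illustrative assumptions):

```rust
use std::fs;
use std::path::Path;

// Restore a trashed snapshot by hard-linking its files back into
// place, leaving the trash entry untouched so a concurrent GC still
// finds the index files via the trash listing.
fn undo_trash(trash_dir: &Path, snapshot_dir: &Path) -> std::io::Result<()> {
    fs::create_dir_all(snapshot_dir)?;
    for entry in fs::read_dir(trash_dir)? {
        let entry = entry?;
        // hard links share the inode, so no data is copied and the
        // trashed path stays readable for GC until the next cleanup
        fs::hard_link(entry.path(), snapshot_dir.join(entry.file_name()))?;
    }
    Ok(())
}
```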


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

Thread overview: 21+ messages
2025-04-16 14:17 [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 1/4] datastore: always skip over base directory when listing index files Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-17 10:27     ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 2/4] datastore: allow to specify sub-directory for index file listing Christian Ebner
2025-04-18  9:38   ` Thomas Lamprecht
2025-04-18  9:55     ` Christian Ebner
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-18 11:06     ` Thomas Lamprecht
2025-04-18 11:49       ` Christian Ebner
2025-04-18 12:03         ` Fabian Grünbichler
2025-04-18 12:45           ` Christian Ebner
2025-04-22  7:54             ` Fabian Grünbichler
2025-04-29 11:27               ` Christian Ebner
2025-04-18 11:51       ` Fabian Grünbichler
2025-04-16 14:18 ` [pbs-devel] [RFC proxmox-backup 4/4] garbage collection: read pruned snapshot index files from trash Christian Ebner
2025-04-17  9:29   ` Fabian Grünbichler
2025-04-17 10:38     ` Christian Ebner [this message]
2025-04-17 11:27       ` Fabian Grünbichler
2025-04-17  9:29 ` [pbs-devel] [RFC proxmox-backup 0/4] implement trash can for snapshots Fabian Grünbichler
