public inbox for pbs-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v4 proxmox-backup 2/2] garbage collection: fix rare race in chunk marking phase
Date: Wed, 16 Apr 2025 12:00:04 +0200
Message-ID: <1744797358.cnv8yb7tao.astroid@yuna.none>
In-Reply-To: <20250416091718.182071-3-c.ebner@proxmox.com>

On April 16, 2025 11:17 am, Christian Ebner wrote:
> During phase 1 of garbage collection, referenced chunks are marked as
> in use by iterating over all index files and updating the atime of the
> chunks they reference.
> 
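
(small aside for anyone following along on the list: the "marking" here is
just an atime update on the chunk files. a rough, self-contained sketch of
the idea, where `chunk_path`, `mark_referenced_chunks` and the use of the
`filetime`/`hex` crates are stand-ins and not the actual datastore code:

    use std::io;
    use std::path::{Path, PathBuf};

    use filetime::{set_file_atime, FileTime};

    // hypothetical helper: map a digest to its chunk file (the real store
    // shards chunks into subdirectories by a hex prefix of the digest)
    fn chunk_path(store: &Path, digest: &[u8; 32]) -> PathBuf {
        let hex = hex::encode(digest);
        store.join(".chunks").join(&hex[..4]).join(hex)
    }

    // bump the atime of every chunk referenced by one index file; that is
    // the whole "mark": phase 2 later only sweeps chunks whose atime still
    // lies before the GC cutoff
    fn mark_referenced_chunks(store: &Path, digests: &[[u8; 32]]) -> io::Result<()> {
        for digest in digests {
            set_file_atime(chunk_path(store, digest), FileTime::now())?;
        }
        Ok(())
    }

this is why an index file that vanishes before it was read matters at all:
if no other index still references those chunks, nothing ever touches their
atime and phase 2 would consider them unused.)
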
> In an edge case for long running garbage collection jobs, where a
> newly added snapshot (created after the start of GC) reused known
> chunks from a previous snapshot, but the index of that previous
> snapshot disappeared before the marking phase could reach it (e.g.
> pruned because the retention setting keeps only 1 snapshot), the known
> chunks from that previous index file might not get marked (given that
> no other index file references them).
> 
> Since commit 74361da8 ("garbage collection: generate index file list
> via datastore iterators") this is even less likely, as the iteration
> now also reads index files added during phase 1, and therefore either
> the new or the previous index file will account for these chunks (the
> previous backup snapshot can only be pruned after the new one has
> finished, since it is locked until then). There remains, however, a
> small race window between the reading of the snapshots in the backup
> group and the reading of the actual index files for marking.
> 
> Fix this race by:
> 1. checking whether the last snapshot of a group disappeared, and if so
> 2. generating the snapshot list again, looking for new index files not
>    previously accounted for,
> 3. and, to avoid possible endless looping, locking the group if the
>    snapshot list still changed after the 10th retry (which will cause
>    concurrent operations on this group to fail).
> 
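
the retry-then-lock escalation from the list above, taken in isolation,
looks roughly like this (self-contained sketch, `Group`, `MarkOutcome` and
`mark_all_snapshots` are hypothetical stand-ins, the real code is in the
hunk below):

    use anyhow::{bail, Context, Error};

    enum MarkOutcome {
        Done,
        // the index of the newest snapshot vanished while reading it
        LastSnapshotVanished,
    }

    trait Group {
        type Guard;
        fn lock(&self) -> Result<Self::Guard, Error>;
        fn mark_all_snapshots(&self) -> Result<MarkOutcome, Error>;
    }

    fn mark_group_with_retries<G: Group>(group: &G) -> Result<(), Error> {
        let mut retry_counter = 0;
        loop {
            // up to 10 lock-free attempts; afterwards take the group lock
            // so no concurrent backup/prune can change the snapshot list
            let _lock = match retry_counter {
                0..=9 => None,
                10 => Some(group.lock().context("failed to lock group")?),
                _ => bail!("unexpected retry counter overrun"),
            };
            match group.mark_all_snapshots()? {
                MarkOutcome::Done => return Ok(()),
                MarkOutcome::LastSnapshotVanished => retry_counter += 1,
            }
        }
    }

keeping the guard alive in `_lock` for the rest of the group iteration is
what makes concurrent operations on the group fail, which is why it only
kicks in after ten lock-free attempts.
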
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 112 ++++++++++++++++++++++-----------
>  1 file changed, 76 insertions(+), 36 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 4630b1b39..8a5075786 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1148,47 +1148,87 @@ impl DataStore {
>              let namespace = namespace.context("iterating namespaces failed")?;
>              for group in arc_self.iter_backup_groups(namespace)? {
>                  let group = group.context("iterating backup groups failed")?;
> -                let mut snapshots = group.list_backups().context("listing snapshots failed")?;
> -                // Sort by snapshot timestamp to iterate over consecutive snapshots for each image.
> -                BackupInfo::sort_list(&mut snapshots, true);
> -                for snapshot in snapshots {
> -                    for file in snapshot.files {
> -                        worker.check_abort()?;
> -                        worker.fail_on_shutdown()?;
> -
> -                        let mut path = snapshot.backup_dir.full_path();
> -                        path.push(file);
> -
> -                        let index = match self.open_index_reader(&path)? {
> -                            IndexReaderOption::Some(index) => index,
> -                            IndexReaderOption::NoneWrongType | IndexReaderOption::NoneNotFound => {
> -                                unprocessed_index_list.remove(&path);
> -                                continue;
> +
> +                // Avoid race between listing/marking of snapshots by GC and pruning the last
> +                // snapshot in the group, following a new snapshot creation. Otherwise known chunks
> +                // might only be referenced by the new snapshot, so it must be read as well.
> +                let mut retry_counter = 0;
> +                'retry: loop {
> +                    let _lock = match retry_counter {
> +                        0..=9 => None,
> +                        10 => Some(
> +                            group
> +                                .lock()
> +                                .context("exhausted retries and failed to lock group")?,
> +                        ),
> +                        _ => bail!("exhausted retires and unexpected counter overrun"),

typo 'retires'

other than that and the comments on patch #1, this looks good to me, so
consider this

Acked-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

> +                    };
> +
> +                    let mut snapshots = match group.list_backups() {
> +                        Ok(snapshots) => snapshots,
> +                        Err(err) => {
> +                            if group.exists() {
> +                                return Err(err).context("listing snapshots failed")?;
>                              }
> -                        };
> -                        self.index_mark_used_chunks(
> -                            index,
> -                            &path,
> -                            &mut chunk_lru_cache,
> -                            status,
> -                            worker,
> -                        )?;
> -
> -                        if !unprocessed_index_list.remove(&path) {
> -                            info!("Encountered new index file '{path:?}', increment total index file count");
> -                            index_count += 1;
> +                            break 'retry;
>                          }
> +                    };
> +
> +                    // Always start iteration with the last snapshot of the group to reduce race
> +                    // window with concurrent backup+prune previous last snapshot. Allows to retry
> +                    // without the need to keep track of already processed index files for the
> +                    // current group.
> +                    BackupInfo::sort_list(&mut snapshots, true);
> +                    for (count, snapshot) in snapshots.into_iter().rev().enumerate() {
> +                        for file in snapshot.files {
> +                            worker.check_abort()?;
> +                            worker.fail_on_shutdown()?;
> +
> +                            let mut path = snapshot.backup_dir.full_path();
> +                            path.push(file);
> +
> +                            let index = match self.open_index_reader(&path)? {
> +                                IndexReaderOption::Some(index) => index,
> +                                IndexReaderOption::NoneWrongType => {
> +                                    unprocessed_index_list.remove(&path);
> +                                    continue;
> +                                }
> +                                IndexReaderOption::NoneNotFound => {
> +                                    if count == 0 {
> +                                        retry_counter += 1;
> +                                        continue 'retry;
> +                                    }
> +                                    unprocessed_index_list.remove(&path);
> +                                    continue;
> +                                }
> +                            };
> +
> +                            self.index_mark_used_chunks(
> +                                index,
> +                                &path,
> +                                &mut chunk_lru_cache,
> +                                status,
> +                                worker,
> +                            )?;
> +
> +                            if !unprocessed_index_list.remove(&path) {
> +                                info!("Encountered new index file '{path:?}', increment total index file count");
> +                                index_count += 1;
> +                            }
>  
> -                        let percentage = (processed_index_files + 1) * 100 / index_count;
> -                        if percentage > last_percentage {
> -                            info!(
> -                                "marked {percentage}% ({} of {index_count} index files)",
> -                                processed_index_files + 1,
> -                            );
> -                            last_percentage = percentage;
> +                            let percentage = (processed_index_files + 1) * 100 / index_count;
> +                            if percentage > last_percentage {
> +                                info!(
> +                                    "marked {percentage}% ({} of {index_count} index files)",
> +                                    processed_index_files + 1,
> +                                );
> +                                last_percentage = percentage;
> +                            }
> +                            processed_index_files += 1;
>                          }
> -                        processed_index_files += 1;
>                      }
> +
> +                    break;
>                  }
>              }
>          }
> -- 
> 2.39.5


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
