public inbox for pbs-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 1/2] fix #6750: api: avoid possible deadlock on datastores with s3 backend
Date: Thu, 25 Sep 2025 14:41:19 +0200
Message-ID: <1758803678.fo1e90c9lf.astroid@yuna.none>
In-Reply-To: <20250924145612.188579-2-c.ebner@proxmox.com>

On September 24, 2025 4:56 pm, Christian Ebner wrote:
> Closing of the fixed or dynamic index files with s3 backend will call
> async code, which must be avoided because of possible deadlocks [0].
> Therefore, perform all changes on the shared backup state and drop the
> guard before uploading the fixed index file to the s3 backend.
> 
> Account for active backend operations and check consistency, since it
> must be assured that all active backend operations are finished before
> the finish call can succeed.
> 
> [0] https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html#which-kind-of-mutex-should-you-use
> 
> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6750
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  src/api2/backup/environment.rs | 77 +++++++++++++++++++++++-----------
>  1 file changed, 53 insertions(+), 24 deletions(-)
> 
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index d5e6869cd..e535891a4 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -82,6 +82,7 @@ struct SharedBackupState {
>      finished: bool,
>      uid_counter: usize,
>      file_counter: usize, // successfully uploaded files
> +    active_backend_operations: usize,
>      dynamic_writers: HashMap<usize, DynamicWriterState>,
>      fixed_writers: HashMap<usize, FixedWriterState>,
>      known_chunks: KnownChunksMap,
> @@ -135,6 +136,7 @@ impl BackupEnvironment {
>              finished: false,
>              uid_counter: 0,
>              file_counter: 0,
> +            active_backend_operations: 0,
>              dynamic_writers: HashMap::new(),
>              fixed_writers: HashMap::new(),
>              known_chunks: HashMap::new(),
> @@ -483,15 +485,10 @@ impl BackupEnvironment {
>              );
>          }
>  
> -        // For S3 backends, upload the index file to the object store after closing
> -        if let DatastoreBackend::S3(s3_client) = &self.backend {
> -            self.s3_upload_index(s3_client, &data.name)
> -                .context("failed to upload dynamic index to s3 backend")?;
> -            self.log(format!(
> -                "Uploaded dynamic index file to s3 backend: {}",
> -                data.name
> -            ))
> -        }
> +        state.file_counter += 1;
> +        state.backup_size += size;
> +        state.backup_stat = state.backup_stat + data.upload_stat;
> +        state.active_backend_operations += 1;
>  
>          self.log_upload_stat(
>              &data.name,
> @@ -502,9 +499,21 @@ impl BackupEnvironment {
>              &data.upload_stat,
>          );
>  
> -        state.file_counter += 1;
> -        state.backup_size += size;
> -        state.backup_stat = state.backup_stat + data.upload_stat;
> +        // never hold mutex guard during s3 upload due to possible deadlocks
> +        drop(state);
> +
> +        // For S3 backends, upload the index file to the object store after closing
> +        if let DatastoreBackend::S3(s3_client) = &self.backend {
> +            self.s3_upload_index(s3_client, &data.name)
> +                .context("failed to upload dynamic index to s3 backend")?;
> +            self.log(format!(
> +                "Uploaded dynamic index file to s3 backend: {}",
> +                data.name
> +            ))
> +        }
> +
> +        let mut state = self.state.lock().unwrap();
> +        state.active_backend_operations -= 1;

these two hunks are okay, although we could also reuse the registered
writers map to encode whether a writer is active, being processed/closed,
or doesn't exist? that would allow more fine-grained logging.
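
to illustrate the idea, a minimal standalone sketch (not the actual
proxmox-backup code): `WriterState`, `SharedState`, `Env` and all method
names below are hypothetical, and errors are plain strings instead of
anyhow. the map entry is switched to `Closing` while the lock is held, the
slow upload runs without the lock, and the entry is removed afterwards, so
the map itself tells you which writer is still busy.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical lifecycle of a registered writer; a missing map entry
// means "does not exist".
#[derive(Debug, Clone, PartialEq)]
enum WriterState {
    Active { name: String },
    Closing { name: String },
}

#[derive(Default)]
struct SharedState {
    writers: HashMap<usize, WriterState>,
    file_counter: usize,
}

#[derive(Default)]
struct Env {
    state: Mutex<SharedState>,
}

impl Env {
    fn register(&self, wid: usize, name: &str) {
        let mut state = self.state.lock().unwrap();
        state.writers.insert(wid, WriterState::Active { name: name.to_string() });
    }

    fn status(&self, wid: usize) -> &'static str {
        match self.state.lock().unwrap().writers.get(&wid) {
            Some(WriterState::Active { .. }) => "active",
            Some(WriterState::Closing { .. }) => "closing",
            None => "unknown",
        }
    }

    /// Mark the writer as closing under the lock, run `upload` without
    /// holding the lock, then remove the entry again.
    fn close<F>(&self, wid: usize, upload: F) -> Result<(), String>
    where
        F: FnOnce(&str) -> Result<(), String>,
    {
        let name = {
            let mut state = self.state.lock().unwrap();
            let current = state.writers.get(&wid).cloned();
            match current {
                Some(WriterState::Active { name }) => {
                    state.writers.insert(wid, WriterState::Closing { name: name.clone() });
                    name
                }
                Some(WriterState::Closing { name }) => {
                    return Err(format!("writer '{name}' is already being closed"));
                }
                None => return Err(format!("writer {wid} is not registered")),
            }
        }; // mutex guard is dropped here, before the upload starts

        let result = upload(&name);

        // Re-acquire the lock only to record the outcome. The entry is
        // removed on failure too, so an error cannot leave it stuck.
        let mut state = self.state.lock().unwrap();
        state.writers.remove(&wid);
        if result.is_ok() {
            state.file_counter += 1;
        }
        result
    }

    /// Finishing fails while any writer is still active or closing, and
    /// the error names the writer and its state.
    fn finish(&self) -> Result<(), String> {
        let state = self.state.lock().unwrap();
        if let Some((wid, writer)) = state.writers.iter().next() {
            return Err(format!("writer {wid} still registered: {writer:?}"));
        }
        if state.file_counter == 0 {
            return Err("backup does not contain valid files (file count == 0)".to_string());
        }
        Ok(())
    }
}
```

with this shape a separate counter is not needed: a non-empty map at finish
time covers both "still open" and "upload in flight", and the error message
can say which one it was.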

>          Ok(())
>      }
> @@ -567,15 +576,10 @@ impl BackupEnvironment {
>              );
>          }
>  
> -        // For S3 backends, upload the index file to the object store after closing
> -        if let DatastoreBackend::S3(s3_client) = &self.backend {
> -            self.s3_upload_index(s3_client, &data.name)
> -                .context("failed to upload fixed index to s3 backend")?;
> -            self.log(format!(
> -                "Uploaded fixed index file to object store: {}",
> -                data.name
> -            ))
> -        }
> +        state.file_counter += 1;
> +        state.backup_size += size;
> +        state.backup_stat = state.backup_stat + data.upload_stat;
> +        state.active_backend_operations += 1;
>  
>          self.log_upload_stat(
>              &data.name,
> @@ -586,9 +590,21 @@ impl BackupEnvironment {
>              &data.upload_stat,
>          );
>  
> -        state.file_counter += 1;
> -        state.backup_size += size;
> -        state.backup_stat = state.backup_stat + data.upload_stat;
> +        // never hold mutex guard during s3 upload due to possible deadlocks
> +        drop(state);
> +
> +        // For S3 backends, upload the index file to the object store after closing
> +        if let DatastoreBackend::S3(s3_client) = &self.backend {
> +            self.s3_upload_index(s3_client, &data.name)
> +                .context("failed to upload fixed index to s3 backend")?;
> +            self.log(format!(
> +                "Uploaded fixed index file to object store: {}",
> +                data.name
> +            ))
> +        }
> +
> +        let mut state = self.state.lock().unwrap();
> +        state.active_backend_operations -= 1;
>  
>          Ok(())
>      }
> @@ -645,6 +661,13 @@ impl BackupEnvironment {
>              bail!("found open index writer - unable to finish backup");
>          }
>  
> +        if state.active_backend_operations != 0 {
> +            bail!(
> +                "backup task still has {} active operations.",
> +                state.active_backend_operations,
> +            );
> +        }
> +
>          if state.file_counter == 0 {
>              bail!("backup does not contain valid files (file count == 0)");
>          }
> @@ -753,6 +776,12 @@ impl BackupEnvironment {
>          if !state.finished {
>              bail!("backup ended but finished flag is not set.");
>          }
> +        if state.active_backend_operations != 0 {
> +            bail!(
> +                "backup ended but {} active backend operations.",
> +                state.active_backend_operations,
> +            );
> +        }

there's now an inconsistency between ensure_finished(), which checks both
conditions, and finished() (used to determine whether a connection being
interrupted is benign or not!), which checks only the finished flag.
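
one way to keep the two in sync, as a rough standalone sketch: route both
through a single check on the shared state. the struct below only contains
the two fields relevant here, `check_finished()` is a hypothetical helper,
and the error type is a plain String rather than what the real code uses.

```rust
use std::sync::Mutex;

// Reduced stand-in for the shared state, limited to the two fields
// that matter for this check.
struct SharedBackupState {
    finished: bool,
    active_backend_operations: usize,
}

impl SharedBackupState {
    /// Hypothetical helper: the single place that decides whether the
    /// backup counts as finished.
    fn check_finished(&self) -> Result<(), String> {
        if !self.finished {
            return Err("backup ended but finished flag is not set.".to_string());
        }
        if self.active_backend_operations != 0 {
            return Err(format!(
                "backup ended but {} active backend operations.",
                self.active_backend_operations
            ));
        }
        Ok(())
    }
}

struct BackupEnvironment {
    state: Mutex<SharedBackupState>,
}

impl BackupEnvironment {
    /// Returns an error describing why the backup is not finished.
    fn ensure_finished(&self) -> Result<(), String> {
        self.state.lock().unwrap().check_finished()
    }

    /// Boolean variant; agrees with ensure_finished() by construction.
    fn finished(&self) -> bool {
        self.state.lock().unwrap().check_finished().is_ok()
    }
}
```

whether an interrupted connection with pending backend operations should
count as benign is a separate decision, but this way both callers at least
answer the same question.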

>          Ok(())
>      }
>  
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 



Thread overview: 7+ messages
2025-09-24 14:56 [pbs-devel] [RFC proxmox-backup 0/2] fix #6750: fix possible deadlock for s3 backed datastore backups Christian Ebner
2025-09-24 14:56 ` [pbs-devel] [PATCH proxmox-backup 1/2] fix #6750: api: avoid possible deadlock on datastores with s3 backend Christian Ebner
2025-09-25 12:41   ` Fabian Grünbichler [this message]
2025-09-25 13:08     ` Christian Ebner
2025-09-24 14:56 ` [pbs-devel] [PATCH proxmox-backup 2/2] api: backup: never hold mutex guard when doing manifest update Christian Ebner
2025-09-25 12:46   ` Fabian Grünbichler
2025-09-25 13:20     ` Christian Ebner
