From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2 0/2] fix #6750: fix possible deadlock for s3 backed datastore backups
Date: Fri, 26 Sep 2025 12:45:28 +0200
Message-ID: <1758883467.nn2vu5yd1x.astroid@yuna.none>
In-Reply-To: <7b56aa23-c632-4c77-ba85-6405ccba2209@proxmox.com>
On September 26, 2025 12:35 pm, Christian Ebner wrote:
> On 9/26/25 12:26 PM, Fabian Grünbichler wrote:
>> On September 26, 2025 10:42 am, Christian Ebner wrote:
>>> These patches aim to fix a deadlock which can occur during backup
>>> jobs to datastores backed by an S3 backend. The deadlock is most
>>> likely caused by the mutex guard for the backup shared state being
>>> held while entering the tokio::task::block_in_place context and
>>> executing async code, which can lead to deadlocks as described in [0].
>>>
>>> Therefore, these patches avoid holding the mutex guard for the shared
>>> backup state while performing the s3 backend operations, by dropping
>>> it early. To avoid inconsistencies, introduce flags to keep track of
>>> the index writers' closing state and add a transient `Finishing`
>>> state to be entered during manifest updates.
>>>
>>> Changes since version 1 (thanks @Fabian):
>>> - Use the shared backup state's writers together with a closed flag
>>> instead of counting active backend operations.
>>> - Replace the finished flag with a BackupState enum to introduce the
>>> new, transient `Finishing` state to be entered during manifest updates.
>>> - Add missing checks and refactor the code to use the now-mutable
>>> reference when accessing the shared backup state in the respective
>>> close calls.
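The approach described in the cover letter above can be sketched roughly as follows. This is only a minimal, std-only illustration and not the code from the patches: `SharedState`, `slow_backend_op` and `finish` are hypothetical names, a plain `std::sync::Mutex` stands in for the shared backup state, and the backend call is a placeholder that merely reports whether the lock was free while it ran.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the state enum mentioned in the cover letter.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BackupState {
    Active,
    Finishing,
    Finished,
}

/// Hypothetical shared backup state.
struct SharedState {
    state: BackupState,
}

/// Placeholder for a slow backend operation (assumption: in the real code
/// this would be an S3 request). Returns true if the shared lock was free
/// while the operation ran.
fn slow_backend_op(shared: &Arc<Mutex<SharedState>>) -> bool {
    shared.try_lock().is_ok()
}

/// Mark the session as `Finishing`, release the lock, run the backend
/// operation without holding the guard, then record `Finished`.
fn finish(shared: &Arc<Mutex<SharedState>>) -> Result<bool, String> {
    {
        let mut guard = shared.lock().unwrap();
        if guard.state != BackupState::Active {
            return Err(format!("unexpected state {:?}", guard.state));
        }
        // The transient state keeps concurrent callers from starting a
        // second finish while the lock is released.
        guard.state = BackupState::Finishing;
    } // guard is dropped here, before the backend call

    let lock_was_free = slow_backend_op(shared);

    let mut guard = shared.lock().unwrap();
    guard.state = BackupState::Finished;
    Ok(lock_was_free)
}
```

In this sketch a second call to `finish` on the same state returns an error, because the state is no longer `Active`; that is the kind of inconsistency check the transient state is meant to make possible.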
>>
>> this looks a lot better!
>>
>> but I think we both missed one more problematic code path:
>>
>> - env.remove_backup() (sync)
>> -- locks state
>> -- calls pbs_datastore::datastore::remove_backup() (sync)
>> --- calls pbs_datastore::backup_info::BackupDir::destroy (sync)
>> ---- calls proxmox_async_runtime::block_on(s3_client.delete_objects_by_prefix)
>
> Good catch!
>
>> this one is only called in mod.rs *after* the backup session processing
>> is completed, I am not even sure why we call into the env there (all we
>> do with it is set the state to finished, but that has no effect at that
>> point anymore AFAICT?)
>
> Must double check, but that might be related to allowing the client
> connection to disappear without further error?

I don't think so. That (ugly hack) happens as part of processing
requests, while the removal happens afterwards, *based on the result* of
that processing.

>> maybe we should just move the remove_backup fn from the env to mod.rs
>> and drop the state update from it?
>
> Okay, will check what the further implications of that are, thanks!
>
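
For illustration, the difference between the two shapes discussed above might look like the following std-only sketch. All names here (`Env`, `State`, `destroy_snapshot`, `remove_backup_locked`) are hypothetical and do not mirror the actual proxmox-backup code; the blocking backend cleanup is replaced by a placeholder that only reports whether the state lock was free while it ran.

```rust
use std::sync::Mutex;

/// Hypothetical session state guarded by a mutex.
struct State {
    finished: bool,
}

/// Hypothetical environment owning the shared state.
struct Env {
    state: Mutex<State>,
}

/// Placeholder for the blocking cleanup (assumption: in the real code this
/// would end up blocking on an S3 delete). Returns true if the state lock
/// was free while it ran.
fn destroy_snapshot(env: &Env) -> bool {
    env.state.try_lock().is_ok()
}

impl Env {
    /// Shape of the problematic path: the guard is still held while the
    /// blocking cleanup runs.
    fn remove_backup_locked(&self) -> bool {
        let mut state = self.state.lock().unwrap();
        state.finished = true;
        destroy_snapshot(self)
    }
}

/// Shape of the suggestion: a free function that neither takes the lock
/// nor updates the state.
fn remove_backup(env: &Env) -> bool {
    destroy_snapshot(env)
}
```

Here `remove_backup_locked` reports that the lock was held during the cleanup, while the free `remove_backup` runs it with the lock untouched. Whether dropping the state update is safe in the real code depends on the implications still to be checked.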