all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2] fix #6566: backup: api: conditionally drop group and snapshot locks
Date: Tue, 30 Sep 2025 11:21:11 +0200	[thread overview]
Message-ID: <f4e4c13e-0e9d-4a13-9237-847ef3676694@proxmox.com> (raw)
In-Reply-To: <1759221679.xvpl4bbm4p.astroid@yuna.none>

On 9/30/25 10:44 AM, Fabian Grünbichler wrote:
> On September 29, 2025 5:17 pm, Christian Ebner wrote:
>> To guarantee consistency by possible concurrent operations, the
>> backup protocol locks the backup group, the previous backup
>> snapshot (if any) and holds a lock for the newly created backup
>> snapshot. All of these are currently stored in the backup worker
>> task, only released on its destruction.
>>
>> The backup API however signals a successful backup via the return
>> status of the `finish` call, while still holding the locks.
>> Therefore, an immediate subsequent backup of the client to the same
>> group can fail because the locks cannot be acquired until the previous
>> backup task is completely destroyed, which can however outlive the
>> `finish` return for some time. This manifests in e.g. a push sync job
>> failing.
>>
>> To fix this, store the lock guards inside the RPC environments shared
>> state instead, allowing to selectively drop the locks on successful
>> backup finish. On error, hold the locks until the cleanup was
>> successful.
>>
>> Immediate verification of new snapshots already downgraded the lock
>> by dropping the exclusive lock and getting a shared lock. Since the
>> dropping is now already handled by the finish call, only gathering
>> the shared lock is required. While there is now a larger time window
>> for concurrent prunes, the underlying possible race between
>> verification and prune remains in place.
>>
>> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6566
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> 
> Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> 
> it should be very rare that a snapshot is attempted to be removed right
> as it was created, so the slightly increased race window there should be
> okay.
> 
> did you test this with vzdump's `protected` option and verify after
> completion?

I did not manage to race the backup finish <-> verify after finished.

But there is the (pre-existing) race between pruning and setting the 
protected marker. This can be triggered if timed right, e.g. just did 
with the latest rebased version. The backup task then fails with, e.g.:
```
...
INFO: stopping kvm after backup task
INFO: adding notes to backup
WARN: unable to add notes - proxmox-backup-client failed: Error: unable 
to update manifest blob - unable to load blob 
'"/devel/datastore/vm/100/2025-09-30T09:09:40Z/index.json.blob"' - No 
such file or directory (os error 2)
INFO: marking backup as protected
ERROR: Backup of VM 100 failed - unable to set protected flag - 400 Bad 
Request
INFO: Failed at 2025-09-30 11:09:52
...
```

To fix that, we would need to extend the backup upgrade api endpoint by 
a protected marker flag, so this can be set a-priori and written when 
the manifest is persisted to disk.



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

  reply	other threads:[~2025-09-30  9:21 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-29 15:17 Christian Ebner
2025-09-30  8:44 ` Fabian Grünbichler
2025-09-30  9:21   ` Christian Ebner [this message]
2025-10-01  8:59 ` [pbs-devel] applied: " Fabian Grünbichler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f4e4c13e-0e9d-4a13-9237-847ef3676694@proxmox.com \
    --to=c.ebner@proxmox.com \
    --cc=f.gruenbichler@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal