From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup 4/4] api/datastore: use maintenance-mode lock to protect against changes
Date: Wed, 06 May 2026 08:30:23 +0200 [thread overview]
Message-ID: <1778048819.s4fbqpt0x2.astroid@yuna.none> (raw)
In-Reply-To: <189d0714-90c1-4607-95d0-3ae74febef59@proxmox.com>
On May 5, 2026 3:02 pm, Christian Ebner wrote:
> On 5/5/26 2:12 PM, Fabian Grünbichler wrote:
>> On May 5, 2026 10:11 am, Christian Ebner wrote:
> [..]
>>> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
>>> index 48acba4c8..823f3bf2f 100644
>>> --- a/pbs-datastore/src/lib.rs
>>> +++ b/pbs-datastore/src/lib.rs
>>> @@ -159,8 +159,9 @@
>>>
>>> use std::os::unix::io::AsRawFd;
>>> use std::path::Path;
>>> +use std::time::Duration;
>>>
>>> -use anyhow::{bail, Error};
>>> +use anyhow::{bail, Context, Error};
>>>
>>> use pbs_config::BackupLockGuard;
>>>
>>> @@ -271,3 +272,11 @@ where
>>>
>>> Ok(lock)
>>> }
>>> +
>>> +/// Acquire an exclusive lock for the datastore's maintenance-mode
>>> +pub fn maintenance_mode_lock(store: &str) -> Result<BackupLockGuard, Error> {
>>> + lock_helper(store, Path::new("maintenance-mode.lck"), |p| {
>>> + pbs_config::open_backup_lockfile(p, Some(Duration::from_secs(0)), true)
>>
>> this might warrant a comment, usually we open config related locks with
>> a timeout to prevent flaky locking in case of concurrency, waiting a few
>> seconds is normally fine.
>>
>> but here, we must always first acquire the datastore config lock (should
>> we encode that in the signature??) anyway, which makes the 0
>> timeout/non-blocking behaviour okay, if I understand it correctly?
>
> Right, it makes sense to pass in the config lock to "prove" that it was
> acquired first.
>
> But it could make sense to add the short delay here nevertheless. After
> all, another operation might have dropped the config lock already, while
> still holding onto the maintenance-mode lock. OTOH the operations which
> do hold it for longer probably will outlive the timeout, so a bit torn
> as I'm not to sure if there is much benefit.
but the only time that happens is if the operation holding *only* the
maintainence mode lock is doing a long-running maintenance operation on
that datastore (the code modified below). and we don't want a long
timeout, because that means holding up the much more important global
datastore.cfg lock (that we must be holding to reach this fn).
I'd keep this non-blocking but add a comment *why*.
>>> + .context("unable to acquire exclusive datastore's maintenance-mode lock")
>>> + })
>>> +}
>>> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>>> index 1e0a78f3a..883091e6f 100644
>>> --- a/src/api2/admin/datastore.rs
>>> +++ b/src/api2/admin/datastore.rs
>>> @@ -61,8 +61,8 @@ use pbs_datastore::index::IndexFile;
>>> use pbs_datastore::manifest::BackupManifest;
>>> use pbs_datastore::prune::compute_prune_info;
>>> use pbs_datastore::{
>>> - check_backup_owner, ensure_datastore_is_mounted, task_tracking, BackupDir, DataStore,
>>> - LocalChunkReader, StoreProgress,
>>> + check_backup_owner, ensure_datastore_is_mounted, maintenance_mode_lock, task_tracking,
>>> + BackupDir, DataStore, LocalChunkReader, StoreProgress,
>>> };
>>> use pbs_tools::json::required_string_param;
>>> use proxmox_rest_server::{formatter, worker_is_active, WorkerTask};
>>> @@ -2705,9 +2705,13 @@ fn expect_maintenance_type(
>>> }
>>>
>>> fn unset_maintenance(
>>> - _lock: pbs_config::BackupLockGuard,
>>> + lock: Option<pbs_config::BackupLockGuard>,
>>
>> I think this should take Option<(lock, lock)>, and
>> expect_maintenance_type should acquire and return both locks, but it
>> needs to be adapted further, because atm it is broken:
>>
>> 1. start s3-refresh while active operations are running
>> 2. s3-refresh will be in loop waiting for operations to disappear
>> 3. roughly every second, it will call expect_maintenance_type
>> 4. if somebody else is holding the datastore config lock for more than
>> ten seconds (let's say you started a re-use existing datastore
>> creation with S3 ;)), expect_maintenance_type will error out!
>
> Yeah, point 4 is broken in any case... :/ But will fix according to the
> suggestions above.
it could also happen cause of lock contention, or something else
blocking for more than a few seconds, or a combination of those factors.
the datastore creation I just stumbled upon and seemed like an obvious
way to trigger it :)
next prev parent reply other threads:[~2026-05-06 6:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-05 8:11 [PATCH proxmox-backup 0/4] don't hold datastore config lock during s3-refresh Christian Ebner
2026-05-05 8:11 ` [PATCH proxmox-backup 1/4] pbs-config: reorganize crate use statements Christian Ebner
2026-05-05 12:15 ` applied: " Fabian Grünbichler
2026-05-05 8:11 ` [PATCH proxmox-backup 2/4] datastore: move lock files base path constant to central location Christian Ebner
2026-05-05 8:11 ` [PATCH proxmox-backup 3/4] datastore: move file lock helper to centralized place Christian Ebner
2026-05-05 12:15 ` Fabian Grünbichler
2026-05-05 8:11 ` [PATCH proxmox-backup 4/4] api/datastore: use maintenance-mode lock to protect against changes Christian Ebner
2026-05-05 12:14 ` Fabian Grünbichler
2026-05-05 13:02 ` Christian Ebner
2026-05-06 6:30 ` Fabian Grünbichler [this message]
2026-05-06 16:58 ` superseded: [PATCH proxmox-backup 0/4] don't hold datastore config lock during s3-refresh Christian Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1778048819.s4fbqpt0x2.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=c.ebner@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox