From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup 4/4] api/datastore: use maintenance-mode lock to protect against changes
Date: Wed, 06 May 2026 08:30:23 +0200 [thread overview]
Message-ID: <1778048819.s4fbqpt0x2.astroid@yuna.none> (raw)
In-Reply-To: <189d0714-90c1-4607-95d0-3ae74febef59@proxmox.com>
On May 5, 2026 3:02 pm, Christian Ebner wrote:
> On 5/5/26 2:12 PM, Fabian Grünbichler wrote:
>> On May 5, 2026 10:11 am, Christian Ebner wrote:
> [..]
>>> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
>>> index 48acba4c8..823f3bf2f 100644
>>> --- a/pbs-datastore/src/lib.rs
>>> +++ b/pbs-datastore/src/lib.rs
>>> @@ -159,8 +159,9 @@
>>>
>>> use std::os::unix::io::AsRawFd;
>>> use std::path::Path;
>>> +use std::time::Duration;
>>>
>>> -use anyhow::{bail, Error};
>>> +use anyhow::{bail, Context, Error};
>>>
>>> use pbs_config::BackupLockGuard;
>>>
>>> @@ -271,3 +272,11 @@ where
>>>
>>> Ok(lock)
>>> }
>>> +
>>> +/// Acquire an exclusive lock for the datastore's maintenance-mode
>>> +pub fn maintenance_mode_lock(store: &str) -> Result<BackupLockGuard, Error> {
>>> + lock_helper(store, Path::new("maintenance-mode.lck"), |p| {
>>> + pbs_config::open_backup_lockfile(p, Some(Duration::from_secs(0)), true)
>>
>> this might warrant a comment, usually we open config related locks with
>> a timeout to prevent flaky locking in case of concurrency, waiting a few
>> seconds is normally fine.
>>
>> but here, we must always first acquire the datastore config lock (should
>> we encode that in the signature??) anyway, which makes the 0
>> timeout/non-blocking behaviour okay, if I understand it correctly?
>
> Right, it makes sense to pass in the config lock to "prove" that it was
> acquired first.
>
> But it could make sense to add the short delay here nevertheless. After
> all, another operation might have dropped the config lock already, while
> still holding onto the maintenance-mode lock. OTOH the operations which
> do hold it for longer probably will outlive the timeout, so a bit torn
> as I'm not to sure if there is much benefit.
but the only time that happens is if the operation holding *only* the
maintainence mode lock is doing a long-running maintenance operation on
that datastore (the code modified below). and we don't want a long
timeout, because that means holding up the much more important global
datastore.cfg lock (that we must be holding to reach this fn).
I'd keep this non-blocking but add a comment *why*.
>>> + .context("unable to acquire exclusive datastore's maintenance-mode lock")
>>> + })
>>> +}
>>> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>>> index 1e0a78f3a..883091e6f 100644
>>> --- a/src/api2/admin/datastore.rs
>>> +++ b/src/api2/admin/datastore.rs
>>> @@ -61,8 +61,8 @@ use pbs_datastore::index::IndexFile;
>>> use pbs_datastore::manifest::BackupManifest;
>>> use pbs_datastore::prune::compute_prune_info;
>>> use pbs_datastore::{
>>> - check_backup_owner, ensure_datastore_is_mounted, task_tracking, BackupDir, DataStore,
>>> - LocalChunkReader, StoreProgress,
>>> + check_backup_owner, ensure_datastore_is_mounted, maintenance_mode_lock, task_tracking,
>>> + BackupDir, DataStore, LocalChunkReader, StoreProgress,
>>> };
>>> use pbs_tools::json::required_string_param;
>>> use proxmox_rest_server::{formatter, worker_is_active, WorkerTask};
>>> @@ -2705,9 +2705,13 @@ fn expect_maintenance_type(
>>> }
>>>
>>> fn unset_maintenance(
>>> - _lock: pbs_config::BackupLockGuard,
>>> + lock: Option<pbs_config::BackupLockGuard>,
>>
>> I think this should take Option<(lock, lock)>, and
>> expect_maintenance_type should acquire and return both locks, but it
>> needs to be adapted further, because atm it is broken:
>>
>> 1. start s3-refresh while active operations are running
>> 2. s3-refresh will be in loop waiting for operations to disappear
>> 3. roughly every second, it will call expect_maintenance_type
>> 4. if somebody else is holding the datastore config lock for more than
>> ten seconds (let's say you started a re-use existing datastore
>> creation with S3 ;)), expect_maintenance_type will error out!
>
> Yeah, point 4 is broken in any case... :/ But will fix according to the
> suggestions above.
it could also happen cause of lock contention, or something else
blocking for more than a few seconds, or a combination of those factors.
the datastore creation I just stumbled upon and seemed like an obvious
way to trigger it :)
next prev parent reply other threads:[~2026-05-06 6:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-05 8:11 [PATCH proxmox-backup 0/4] don't hold datastore config lock during s3-refresh Christian Ebner
2026-05-05 8:11 ` [PATCH proxmox-backup 1/4] pbs-config: reorganize crate use statements Christian Ebner
2026-05-05 12:15 ` applied: " Fabian Grünbichler
2026-05-05 8:11 ` [PATCH proxmox-backup 2/4] datastore: move lock files base path constant to central location Christian Ebner
2026-05-05 8:11 ` [PATCH proxmox-backup 3/4] datastore: move file lock helper to centralized place Christian Ebner
2026-05-05 12:15 ` Fabian Grünbichler
2026-05-05 8:11 ` [PATCH proxmox-backup 4/4] api/datastore: use maintenance-mode lock to protect against changes Christian Ebner
2026-05-05 12:14 ` Fabian Grünbichler
2026-05-05 13:02 ` Christian Ebner
2026-05-06 6:30 ` Fabian Grünbichler [this message]
2026-05-06 16:58 ` superseded: [PATCH proxmox-backup 0/4] don't hold datastore config lock during s3-refresh Christian Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1778048819.s4fbqpt0x2.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=c.ebner@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.