From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: "Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v5 3/4] partial fix #6049: datastore: use config fast-path in Drop
Date: Fri, 28 Nov 2025 10:03:27 +0100
Message-ID: <a5867e40-676c-4c72-8942-05d8576128e1@proxmox.com>
In-Reply-To: <1764167584.qinchnekm5.astroid@yuna.none>
On 11/26/25 4:15 PM, Fabian Grünbichler wrote:
> On November 24, 2025 6:04 pm, Samuel Rufinatscha wrote:
>> The Drop impl of DataStore re-read datastore.cfg to decide whether
>> the entry should be evicted from the in-process cache (based on
>> maintenance mode’s clear_from_cache). During the investigation of
>> issue #6049 [1], a flamegraph [2] showed that the config reload in Drop
>> accounted for a measurable share of CPU time under load.
>>
>> This patch wires the datastore config fast path to the Drop
>> impl to eventually avoid an expensive config reload from disk to capture
>> the maintenance mandate. Also, to ensure the Drop handlers will detect
>> that a newer config exists / to mitigate usage of an eventually stale
>> cached entry, generation will not only be bumped on config save, but also
>> on re-read of the config file (slow path), if `update_cache = true`.
>>
>> Links
>>
>> [1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
>> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>>
>> Fixes: #6049
>> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
>> ---
>> Changes:
>>
>> From v1 → v2
>> - Replace caching logic with the datastore_section_config_cached()
>> helper.
>>
>> From v2 → v3
>> No changes
>>
>> From v3 → v4, thanks @Fabian
>> - Pass datastore_section_config_cached(false) in Drop to avoid
>> concurrent cache updates.
>>
>> From v4 → v5
>> - Rebased only, no changes
>>
>> pbs-datastore/src/datastore.rs | 60 ++++++++++++++++++++++++++--------
>> 1 file changed, 47 insertions(+), 13 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index c9cb5d65..7638a899 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -225,15 +225,40 @@ impl Drop for DataStore {
>> // remove datastore from cache iff
>> // - last task finished, and
>> // - datastore is in a maintenance mode that mandates it
>> - let remove_from_cache = last_task
>> - && pbs_config::datastore::config()
>> - .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", self.name()))
>> - .is_ok_and(|c| {
>> - c.get_maintenance_mode()
>> - .is_some_and(|m| m.clear_from_cache())
>> - });
>
> old code here ignored parsing/locking/.. issues and just assumed if no
> config can be obtained nothing should be done..
>
>> -
>> - if remove_from_cache {
>> +
>> + // first check: check if last task finished
>> + if !last_task {
>> + return;
>> + }
>> +
>> + let (section_config, _gen) = match datastore_section_config_cached(false) {
>> + Ok(v) => v,
>> + Err(err) => {
>> + log::error!(
>> + "failed to load datastore config in Drop for {} - {err}",
>> + self.name()
>> + );
>> + return;
>> + }
>> + };
>> +
>> + let datastore_cfg: DataStoreConfig =
>> + match section_config.lookup("datastore", self.name()) {
>> + Ok(cfg) => cfg,
>> + Err(err) => {
>> + log::error!(
>> + "failed to look up datastore '{}' in Drop - {err}",
>> + self.name()
>> + );
>> + return;
>
> here we now have fancy error logging ;) which can be fine, but if we go
> from silently ignoring errors to logging them at error level that should
> be mentioned to make it clear that it is intentional.
>
Makes sense, I will mention in the commit message that these errors are
now logged at error level instead of being silently ignored.
> besides that, the second error here means that the datastore was removed
> from the config in the meantime.. in which case we should probably
> remove it from the map as well, if it is still there, even though we can't
> check the maintenance mode..
>
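The eviction rule suggested above can be sketched as a small, self-contained
decision function. This is only an illustration under stated assumptions:
`MaintenanceMode`, `DatastoreEntry` and `should_evict` are hypothetical
stand-ins, the config is modelled as a plain `HashMap`, and which modes
require clearing the cache is assumed here, not taken from the real types.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified maintenance mode (not the real PBS type).
#[derive(Clone, Copy)]
pub enum MaintenanceMode {
    ReadOnly,
    Offline,
}

impl MaintenanceMode {
    /// Assumption for this sketch: only `Offline` requires closing FDs.
    pub fn clear_from_cache(self) -> bool {
        matches!(self, MaintenanceMode::Offline)
    }
}

/// Hypothetical stand-in for one datastore section of the config.
pub struct DatastoreEntry {
    pub maintenance_mode: Option<MaintenanceMode>,
}

/// Decide whether the Drop handler should evict `name` from the map.
pub fn should_evict(
    last_task: bool,
    config: &HashMap<String, DatastoreEntry>,
    name: &str,
) -> bool {
    // Never evict while other tasks still use the datastore.
    if !last_task {
        return false;
    }
    match config.get(name) {
        // The datastore was removed from the config in the meantime:
        // evict it, even though no maintenance mode can be checked.
        None => true,
        // Otherwise evict only if the maintenance mode requires it.
        Some(entry) => entry
            .maintenance_mode
            .is_some_and(|m| m.clear_from_cache()),
    }
}
```

In this sketch a failed lookup is treated like a maintenance mode that
requires eviction, while a failure to load the config at all would still
be a separate case that returns early.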
>> + }
>> + };
>> +
>> + // second check: check maintenance mode mandate
>
> what is a "maintenance mode mandate"? ;)
>
> keeping it simple, why not just
>
> // check if maintenance mode requires closing FDs
>
I see, will rephrase this, thanks!
>> + if datastore_cfg
>> + .get_maintenance_mode()
>> + .is_some_and(|m| m.clear_from_cache())
>> + {
>> DATASTORE_MAP.lock().unwrap().remove(self.name());
>> }
>> }
>> @@ -307,12 +332,12 @@ impl DatastoreThreadSettings {
>> /// - If the cached generation matches the current generation, the
>> /// cached config is returned.
>> /// - Otherwise the config is re-read from disk. If `update_cache` is
>> -/// `true`, the new config and current generation are stored in the
>> +/// `true`, the new config and bumped generation are stored in the
>> /// cache. Callers that set `update_cache = true` must hold the
>> /// datastore config lock to avoid racing with concurrent config
>> /// changes.
>> /// - If `update_cache` is `false`, the freshly read config is returned
>> -/// but the cache is left unchanged.
>> +/// but the cache and generation are left unchanged.
>> ///
>> /// If `ConfigVersionCache` is not available, the config is always read
>> /// from disk and `None` is returned as the generation.
>> @@ -333,14 +358,23 @@ fn datastore_section_config_cached(
>
> does this part here make any sense in this patch?
>
> we don't check the generation in the Drop handler anyway, so it will get
> the latest cached version, no matter what?
>
We don't check the generation in the Drop handler, but doesn't the Drop
handler depend on this bump to get the freshest cached version?
> we'd only end up in this part of the code via lookup_datastore, and only
> if:
> - the previous cached entry and the current one have a different
> generation -> no need to bump again, the cache is already invalidated
> - there is no previous cached entry -> nothing to invalidate
>
> I think this part should move to the next patch..
Shouldn't it rather go into PATCH 2 then, instead of being part of the
TTL feature?
Also, I would then adjust the comment below so that it describes the
benefit not only for the Drop handler that calls
datastore_section_config_cached(false), but for any future caller of
datastore_section_config_cached(false).
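
To make the generation discussion concrete, here is a minimal,
self-contained sketch of the lookup logic as this patch describes it.
`ConfigCache`, `bump()` and the string-based `Config` are hypothetical
stand-ins for illustration only; they are not the real ConfigVersionCache
API, and the "disk read" is just a string passed in by the caller.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Stand-in for the parsed datastore.cfg (hypothetical).
#[derive(Debug, PartialEq)]
pub struct Config(pub String);

struct CachedConfig {
    config: Arc<Config>,
    last_generation: usize,
}

/// Hypothetical type combining the generation counter and the cache slot.
pub struct ConfigCache {
    generation: AtomicUsize,
    cached: Mutex<Option<CachedConfig>>,
}

impl ConfigCache {
    pub fn new() -> Self {
        Self {
            generation: AtomicUsize::new(0),
            cached: Mutex::new(None),
        }
    }

    /// Models a config save: bump the generation and return the new value.
    pub fn bump(&self) -> usize {
        self.generation.fetch_add(1, Ordering::AcqRel) + 1
    }

    /// `on_disk` plays the role of the current file content.
    pub fn config_cached(&self, on_disk: &str, update_cache: bool) -> (Arc<Config>, usize) {
        let mut cached = self.cached.lock().unwrap();
        let current_gen = self.generation.load(Ordering::Acquire);

        // Fast path: the cached entry matches the current generation.
        if let Some(entry) = cached.as_ref() {
            if entry.last_generation == current_gen {
                return (entry.config.clone(), current_gen);
            }
        }

        // Slow path: "re-read" the config.
        let config = Arc::new(Config(on_disk.to_string()));
        let mut effective_gen = current_gen;
        if update_cache {
            // Behaviour under discussion: bump again on re-read and
            // store the entry under the bumped generation.
            effective_gen = self.bump();
            *cached = Some(CachedConfig {
                config: config.clone(),
                last_generation: effective_gen,
            });
        }
        // With update_cache == false, cache and generation stay untouched.
        (config, effective_gen)
    }
}
```

In this model, a call with `update_cache = false` after a save always takes
the slow path and returns the fresh content, because the cached generation
no longer matches. The extra bump only changes the number under which the
next `update_cache = true` call stores its entry.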
>
>> let (config_raw, _digest) = pbs_config::datastore::config()?;
>> let config = Arc::new(config_raw);
>>
>> + let mut effective_gen = current_gen;
>> if update_cache {
>> + // Bump the generation. This ensures that Drop
>> + // handlers will detect that a newer config exists
>> + // and will not rely on a stale cached entry for
>> + // maintenance mandate.
>> + let prev_gen = version_cache.increase_datastore_generation();
>> + effective_gen = prev_gen + 1;
>> +
>> + // Persist
>> *config_cache = Some(DatastoreConfigCache {
>> config: config.clone(),
>> - last_generation: current_gen,
>> + last_generation: effective_gen,
>> });
>> }
>>
>> - Ok((config, Some(current_gen)))
>> + Ok((config, Some(effective_gen)))
>> } else {
>> // Fallback path, no config version cache: read datastore.cfg and return None as generation
>> *config_cache = None;
>> --
>> 2.47.3
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>
>