From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v5 3/4] partial fix #6049: datastore: use config fast-path in Drop
Date: Wed, 26 Nov 2025 16:15:50 +0100 [thread overview]
Message-ID: <1764167584.qinchnekm5.astroid@yuna.none> (raw)
In-Reply-To: <20251124170423.303300-4-s.rufinatscha@proxmox.com>
On November 24, 2025 6:04 pm, Samuel Rufinatscha wrote:
> The Drop impl of DataStore re-read datastore.cfg to decide whether
> the entry should be evicted from the in-process cache (based on
> maintenance mode’s clear_from_cache). During the investigation of
> issue #6049 [1], a flamegraph [2] showed that the config reload in Drop
> accounted for a measurable share of CPU time under load.
>
> This patch wires the datastore config fast path to the Drop
> impl to eventually avoid an expensive config reload from disk to capture
> the maintenance mandate. Also, to ensure the Drop handlers will detect
> that a newer config exists / to mitigate usage of an eventually stale
> cached entry, generation will not only be bumped on config save, but also
> on re-read of the config file (slow path), if `update_cache = true`.
>
> Links
>
> [1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>
> Fixes: #6049
> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
> ---
> Changes:
>
> From v1 → v2
> - Replace caching logic with the datastore_section_config_cached()
> helper.
>
> From v2 → v3
> No changes
>
> From v3 → v4, thanks @Fabian
> - Pass datastore_section_config_cached(false) in Drop to avoid
> concurrent cache updates.
>
> From v4 → v5
> - Rebased only, no changes
>
> pbs-datastore/src/datastore.rs | 60 ++++++++++++++++++++++++++--------
> 1 file changed, 47 insertions(+), 13 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index c9cb5d65..7638a899 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -225,15 +225,40 @@ impl Drop for DataStore {
> // remove datastore from cache iff
> // - last task finished, and
> // - datastore is in a maintenance mode that mandates it
> - let remove_from_cache = last_task
> - && pbs_config::datastore::config()
> - .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", self.name()))
> - .is_ok_and(|c| {
> - c.get_maintenance_mode()
> - .is_some_and(|m| m.clear_from_cache())
> - });
old code here ignored parsing/locking/.. issues and just assumed if no
config can be obtained nothing should be done..
> -
> - if remove_from_cache {
> +
> + // first check: check if last task finished
> + if !last_task {
> + return;
> + }
> +
> + let (section_config, _gen) = match datastore_section_config_cached(false) {
> + Ok(v) => v,
> + Err(err) => {
> + log::error!(
> + "failed to load datastore config in Drop for {} - {err}",
> + self.name()
> + );
> + return;
> + }
> + };
> +
> + let datastore_cfg: DataStoreConfig =
> + match section_config.lookup("datastore", self.name()) {
> + Ok(cfg) => cfg,
> + Err(err) => {
> + log::error!(
> + "failed to look up datastore '{}' in Drop - {err}",
> + self.name()
> + );
> + return;
here we now have fancy error logging ;) which can be fine, but if we go
from silently ignoring errors to logging them at error level that should
be mentioned to make it clear that it is intentional.
besides that, the second error here means that the datastore was removed
from the config in the meantime.. in which case we should probably
remove it from the map as well, if is still there, even though we can't
check the maintenance mode..
> + }
> + };
> +
> + // second check: check maintenance mode mandate
what is a "maintenance mode mandate"? ;)
keeping it simple, why not just
// check if maintenance mode requires closing FDs
> + if datastore_cfg
> + .get_maintenance_mode()
> + .is_some_and(|m| m.clear_from_cache())
> + {
> DATASTORE_MAP.lock().unwrap().remove(self.name());
> }
> }
> @@ -307,12 +332,12 @@ impl DatastoreThreadSettings {
> /// - If the cached generation matches the current generation, the
> /// cached config is returned.
> /// - Otherwise the config is re-read from disk. If `update_cache` is
> -/// `true`, the new config and current generation are stored in the
> +/// `true`, the new config and bumped generation are stored in the
> /// cache. Callers that set `update_cache = true` must hold the
> /// datastore config lock to avoid racing with concurrent config
> /// changes.
> /// - If `update_cache` is `false`, the freshly read config is returned
> -/// but the cache is left unchanged.
> +/// but the cache and generation are left unchanged.
> ///
> /// If `ConfigVersionCache` is not available, the config is always read
> /// from disk and `None` is returned as the generation.
> @@ -333,14 +358,23 @@ fn datastore_section_config_cached(
does this part here make any sense in this patch?
we don't check the generation in the Drop handler anyway, so it will get
the latest cached version, no matter what?
we'd only end up in this part of the code via lookup_datastore, and only
if:
- the previous cached entry and the current one have a different
generation -> no need to bump again, the cache is already invalidated
- there is no previous cached entry -> nothing to invalidate
I think this part should move to the next patch..
> let (config_raw, _digest) = pbs_config::datastore::config()?;
> let config = Arc::new(config_raw);
>
> + let mut effective_gen = current_gen;
> if update_cache {
> + // Bump the generation. This ensures that Drop
> + // handlers will detect that a newer config exists
> + // and will not rely on a stale cached entry for
> + // maintenance mandate.
> + let prev_gen = version_cache.increase_datastore_generation();
> + effective_gen = prev_gen + 1;
> +
> + // Persist
> *config_cache = Some(DatastoreConfigCache {
> config: config.clone(),
> - last_generation: current_gen,
> + last_generation: effective_gen,
> });
> }
>
> - Ok((config, Some(current_gen)))
> + Ok((config, Some(effective_gen)))
> } else {
> // Fallback path, no config version cache: read datastore.cfg and return None as generation
> *config_cache = None;
> --
> 2.47.3
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
next prev parent reply other threads:[~2025-11-26 15:16 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-24 17:04 [pbs-devel] [PATCH proxmox-backup v5 0/4] datastore: remove config reload on hot path Samuel Rufinatscha
2025-11-24 17:04 ` [pbs-devel] [PATCH proxmox-backup v5 1/4] partial fix #6049: config: enable config version cache for datastore Samuel Rufinatscha
2025-11-26 15:15 ` Fabian Grünbichler
2025-11-24 17:04 ` [pbs-devel] [PATCH proxmox-backup v5 2/4] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups Samuel Rufinatscha
2025-11-26 15:15 ` Fabian Grünbichler
2025-11-26 17:21 ` Samuel Rufinatscha
2025-11-24 17:04 ` [pbs-devel] [PATCH proxmox-backup v5 3/4] partial fix #6049: datastore: use config fast-path in Drop Samuel Rufinatscha
2025-11-26 15:15 ` Fabian Grünbichler [this message]
2025-11-28 9:03 ` Samuel Rufinatscha
2025-11-28 10:46 ` Fabian Grünbichler
2025-11-28 11:10 ` Samuel Rufinatscha
2025-11-24 17:04 ` [pbs-devel] [PATCH proxmox-backup v5 4/4] partial fix #6049: datastore: add TTL fallback to catch manual config edits Samuel Rufinatscha
2025-11-26 15:15 ` Fabian Grünbichler
2025-11-26 15:16 ` [pbs-devel] [PATCH proxmox-backup v5 0/4] datastore: remove config reload on hot path Fabian Grünbichler
2025-11-26 16:10 ` Samuel Rufinatscha
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1764167584.qinchnekm5.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox