From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
Date: Wed, 12 Nov 2025 14:24:50 +0100
Message-ID: <1762936034.bqz00j2ard.astroid@yuna.none>
In-Reply-To: <20251111122941.110412-2-s.rufinatscha@proxmox.com>
On November 11, 2025 1:29 pm, Samuel Rufinatscha wrote:
> Repeated /status requests caused lookup_datastore() to re-read and
> parse datastore.cfg on every call. The issue was mentioned in report
> #6049 [1]. cargo-flamegraph [2] confirmed that the hot path is
> dominated by pbs_config::datastore::config() (config parsing).
>
> This patch adds a fast path to lookup_datastore() using the shared-
> memory ConfigVersionCache generation counter for datastore.cfg. It
> stores the last seen generation number alongside the cached
> DataStoreImpl and, when a subsequent lookup sees the same generation,
> we reuse the cached instance without re-reading the on-disk config. If
> the generation differs (or the cache is unavailable), it falls back to
> the existing slow path with no behavioral changes.
>
> Behavioral notes
>
> - The generation is bumped via the existing save_config() path, so
> API-driven config changes are detected immediately.
> - Manual edits to datastore.cfg are not detected by this commit; a TTL
> guard is introduced in a later patch to cover that case.
> - DataStore::drop still performs a config read on the common path; this
> is addressed in the next patch.
>
> Testing
>
> cargo-flamegraph confirms the config-parse hotspot in
> lookup_datastore() disappears from the hot path.
>
> Additionally, an end-to-end benchmark was performed using the
> `/status?verbose=0` API with 1000 datastores, 5 requests per store,
> and 16-way parallelism:
>
> | Metric | Before | After | Δ (abs) | Δ (%) |
> |--------------------------|:------:|:------:|:-------:|:-------:|
> | Total time | 13s | 11s | −2s | −15.38% |
> | Throughput (all rounds) | 384.62 | 454.55 | +69.93 | +18.18% |
> | Cold RPS (round #1) | 76.92 | 90.91 | +13.99 | +18.19% |
> | Warm RPS (rounds 2..N) | 307.69 | 363.64 | +55.95 | +18.18% |
>
> Throughput improved by ~18% overall, with total time reduced by ~15%.
> Warm lookups now reuse cached datastore instances and skip redundant
> config parsing entirely. The first-access round also shows a similar
> improvement, likely due to reduced contention and I/O when many stores
> are accessed in parallel.
>
> Note: A second hotspot remains in Drop where a config read occurs; that
> is addressed in the next commit.
It would be interesting to also have numbers for just lookup+Drop,
without all the HTTP and TLS overhead, to really isolate it. That should
also make it easier to reliably benchmark with something like hyperfine
;)
For my simple config (5 datastores), single-threaded lookup+drop of a
single datastore 100k times gives around a 1.31x speedup for the whole
series. The slightly modified version below (which basically runs the
most expensive part of Drop only once) still gives a 1.17x speedup for
the same test setup.
Re-running the same benchmark with a config containing 1k datastores,
querying M datastores N times each gives the following speedups:
M = 1, N = 1000: 15.6x faster
M = 10, N = 100: 14.5x
M = 100, N = 10: 8.8x
M = 1000, N = 1: 1.8x (so this is basically showing the speedup of the
improved Drop handling alone!)
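For reference, the unmodified variant simply drops each looked-up
instance right away, so the Drop handler runs on every iteration. A
minimal sketch (hardcoding M and N instead of the argument parsing used
in the program below):

use anyhow::Error;
use pbs_api_types::Operation;
use pbs_datastore::DataStore;

fn main() -> Result<(), Error> {
    // M and N hardcoded for brevity; adjust to match the rows above
    let (datastores, iterations) = (1000usize, 1usize);
    for d in 1..=datastores {
        let name = format!("pbs_bench_{d}");
        for _ in 0..iterations {
            // each instance is dropped again at the end of this loop body
            let _store = DataStore::lookup_datastore(&name, Some(Operation::Write))?;
        }
    }
    Ok(())
}
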
And then a slightly modified version that keeps the DataStore instances
around until all N lookups for that datastore are done and only then
drops them in bulk (which mimics having lots of lookups while a task is
running, so the Drop handler only does the expensive part every once in
a while, when the last task for a given datastore exits):
use anyhow::Error;
use pbs_api_types::Operation;
use pbs_datastore::DataStore;

fn main() -> Result<(), Error> {
    let mut args = std::env::args();
    args.next();

    // number of datastores (M), default 1000
    let datastores = if let Some(n) = args.next() {
        n.parse::<usize>()?
    } else {
        1000
    };
    // lookups per datastore (N), default 1000
    let iterations = if let Some(n) = args.next() {
        n.parse::<usize>()?
    } else {
        1000
    };

    for d in 1..=datastores {
        let name = format!("pbs_bench_{d}");
        // keep all instances of this datastore alive until the end of the
        // outer iteration ...
        let mut stores = Vec::with_capacity(iterations);
        for _ in 1..=iterations {
            stores.push(DataStore::lookup_datastore(&name, Some(Operation::Write))?);
        }
        // ... so that they are only dropped here, in bulk
    }
    Ok(())
}

M = 1, N = 1000: 8.5x faster
M = 10, N = 100: 7.9x
M = 100, N = 10: 5.2x
M = 1000, N = 1: 1.9x
Looking at the flamegraph of this isolated benchmark, it's now obvious
that the remaining overhead is lockfiles and operations tracking (both
in lookup and when dropping).
Side note: while benchmarking this series I actually found a bug in our
operations tracking that gave me wildly different numbers, because the
"drop last task" part never got executed as a result:
https://lore.proxmox.com/pbs-devel/20251112131525.645971-1-f.gruenbichler@proxmox.com/T/#u
>
> Links
>
> [1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>
> Fixes: #6049
> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
> ---
> pbs-config/src/config_version_cache.rs | 10 +++-
> pbs-datastore/src/datastore.rs | 77 +++++++++++++++++---------
> 2 files changed, 59 insertions(+), 28 deletions(-)
>
> diff --git a/pbs-config/src/config_version_cache.rs b/pbs-config/src/config_version_cache.rs
> index e8fb994f..b875f7e0 100644
> --- a/pbs-config/src/config_version_cache.rs
> +++ b/pbs-config/src/config_version_cache.rs
> @@ -26,7 +26,6 @@ struct ConfigVersionCacheDataInner {
> // Traffic control (traffic-control.cfg) generation/version.
> traffic_control_generation: AtomicUsize,
> // datastore (datastore.cfg) generation/version
> - // FIXME: remove with PBS 3.0
> datastore_generation: AtomicUsize,
> // Add further atomics here
> }
> @@ -145,8 +144,15 @@ impl ConfigVersionCache {
> .fetch_add(1, Ordering::AcqRel);
> }
>
> + /// Returns the datastore generation number.
> + pub fn datastore_generation(&self) -> usize {
> + self.shmem
> + .data()
> + .datastore_generation
> + .load(Ordering::Acquire)
> + }
> +
> /// Increase the datastore generation number.
> - // FIXME: remove with PBS 3.0 or make actually useful again in datastore lookup
> pub fn increase_datastore_generation(&self) -> usize {
> self.shmem
> .data()
This part could be split out into its own patch.
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 70af94d8..18eebb58 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -32,7 +32,7 @@ use pbs_api_types::{
> MaintenanceType, Operation, UPID,
> };
> use pbs_config::s3::S3_CFG_TYPE_ID;
> -use pbs_config::BackupLockGuard;
> +use pbs_config::{BackupLockGuard, ConfigVersionCache};
>
> use crate::backup_info::{
> BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
> @@ -140,10 +140,10 @@ pub struct DataStoreImpl {
> last_gc_status: Mutex<GarbageCollectionStatus>,
> verify_new: bool,
> chunk_order: ChunkOrder,
> - last_digest: Option<[u8; 32]>,
> sync_level: DatastoreFSyncLevel,
> backend_config: DatastoreBackendConfig,
> lru_store_caching: Option<LocalDatastoreLruCache>,
> + cached_config_tag: Option<CachedDatastoreConfigTag>,
> }
>
> impl DataStoreImpl {
> @@ -156,10 +156,10 @@ impl DataStoreImpl {
> last_gc_status: Mutex::new(GarbageCollectionStatus::default()),
> verify_new: false,
> chunk_order: Default::default(),
> - last_digest: None,
> sync_level: Default::default(),
> backend_config: Default::default(),
> lru_store_caching: None,
> + cached_config_tag: None,
> })
> }
> }
> @@ -224,6 +224,15 @@ pub enum DatastoreBackend {
> S3(Arc<S3Client>),
> }
>
> +/// Used to determine whether a cached datastore instance is still valid
> +/// or needs to be reloaded after a config change.
> +struct CachedDatastoreConfigTag {
> + /// Maintenance mode at the time the lookup hint was captured, if any.
> + last_maintenance_mode: Option<MaintenanceMode>,
> + /// Datastore generation number from `ConfigVersionCache`; `None` when the cache wasn't available.
> + last_generation: Option<usize>,
If the whole tag is an Option, do we really need to make the generation
an Option as well?
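I.e., something like this (sketch only):

    /// Only constructed at all when a generation number was available.
    struct CachedDatastoreConfigTag {
        /// Maintenance mode at the time the lookup hint was captured, if any.
        last_maintenance_mode: Option<MaintenanceMode>,
        /// Datastore generation number from `ConfigVersionCache`.
        last_generation: usize,
    }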
> +}
> +
> impl DataStore {
> // This one just panics on everything
> #[doc(hidden)]
> @@ -299,13 +308,40 @@ impl DataStore {
> // we use it to decide whether it is okay to delete the datastore.
> let _config_lock = pbs_config::datastore::lock_config()?;
>
> - // we could use the ConfigVersionCache's generation for staleness detection, but we load
> - // the config anyway -> just use digest, additional benefit: manual changes get detected
> - let (config, digest) = pbs_config::datastore::config()?;
> + // Get the current datastore.cfg generation number
> + let gen_num = ConfigVersionCache::new()
> + .ok()
> + .map(|c| c.datastore_generation());
> +
> + // Fast-path: if we have a cached entry created under the same datastore.cfg generation number, reuse it.
> + if let (Some(gen_num), Some(ds)) =
> + (gen_num, DATASTORE_MAP.lock().unwrap().get(name).cloned())
> + {
> + if let Some(cached_tag) = &ds.cached_config_tag {
> + if cached_tag.last_generation == Some(gen_num) {
> + if let Some(mm) = &cached_tag.last_maintenance_mode {
> + if let Err(error) = mm.check(operation) {
> + bail!("datastore '{name}' is unavailable: {error}");
> + }
> + }
> + if let Some(operation) = operation {
> + update_active_operations(name, operation, 1)?;
> + }
> + return Ok(Arc::new(Self {
> + inner: ds,
> + operation,
> + }));
> + }
> + }
> + }
> +
> + // Slow path: (re)load config
> + let (config, _digest) = pbs_config::datastore::config()?;
> let config: DataStoreConfig = config.lookup("datastore", name)?;
>
> - if let Some(maintenance_mode) = config.get_maintenance_mode() {
> - if let Err(error) = maintenance_mode.check(operation) {
> + let maintenance_mode = config.get_maintenance_mode();
> + if let Some(mm) = &maintenance_mode {
> + if let Err(error) = mm.check(operation) {
> bail!("datastore '{name}' is unavailable: {error}");
> }
> }
Right after this, the old hot path had a check for the mount status that
is missing from the new hot path. The mount status can change even if
the config doesn't, so we should probably add that check back to the hot
path and re-run the numbers?

That check again needs more parts of the config, so maybe we could
explore caching the full config here? E.g., add a new static
Mutex<Option<(DataStoreConfig, usize)>> (extended by the timestamp in
the last patch) and adapt the patches here to use it, roughly like the
sketch below. Depending on whether we make the cached config available
outside of lookup_datastore, we could then even drop the maintenance
mode from the cache tag, just store the generation number there, and
retrieve the maintenance mode from the cached config in the Drop
implementation.
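A very rough sketch of the idea (caching the whole parsed config rather
than a single section; names, paths and exact types here are assumptions
for illustration, not existing API):

use std::sync::{Arc, Mutex};

use anyhow::Error;
use pbs_api_types::DataStoreConfig;
use proxmox_section_config::SectionConfigData;

// cache the parsed datastore.cfg together with the generation it was read under
static CACHED_DATASTORE_CONFIG: Mutex<Option<(Arc<SectionConfigData>, usize)>> =
    Mutex::new(None);

fn cached_datastore_config(name: &str, gen_num: usize) -> Result<DataStoreConfig, Error> {
    let mut cache = CACHED_DATASTORE_CONFIG.lock().unwrap();
    if let Some((config, cached_gen)) = cache.as_ref() {
        if *cached_gen == gen_num {
            // generation unchanged -> reuse the already parsed config
            return Ok(config.lookup("datastore", name)?);
        }
    }
    // generation changed (or nothing cached yet) -> (re)load and remember it
    let (config, _digest) = pbs_config::datastore::config()?;
    let config = Arc::new(config);
    let result = config.lookup("datastore", name)?;
    *cache = Some((config, gen_num));
    Ok(result)
}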
There's also a little more duplication here: we lock the map and check
for an entry in both the fast and slow paths, which could be done once
up front (nobody can change the map while we have it locked anyway), and
the checks are written twice and could probably be extracted into a
helper (see the sketch below) so that future similar checks get added to
both paths and not to only one by accident.
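For example, a hypothetical helper along these lines (assuming the usual
imports and types already in scope in datastore.rs) could be called from
both paths:

// hypothetical helper, shared by the fast and the slow path
fn check_datastore_usable(
    name: &str,
    maintenance_mode: &Option<MaintenanceMode>,
    operation: Option<Operation>,
) -> Result<(), Error> {
    if let Some(mm) = maintenance_mode {
        if let Err(error) = mm.check(operation) {
            bail!("datastore '{name}' is unavailable: {error}");
        }
    }
    Ok(())
}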
> @@ -321,16 +357,6 @@ impl DataStore {
>
> // reuse chunk store so that we keep using the same process locker instance!
> let chunk_store = if let Some(datastore) = &entry {
> - let last_digest = datastore.last_digest.as_ref();
> - if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
> - if let Some(operation) = operation {
> - update_active_operations(name, operation, 1)?;
> - }
> - return Ok(Arc::new(Self {
> - inner: Arc::clone(datastore),
> - operation,
> - }));
> - }
> Arc::clone(&datastore.chunk_store)
> } else {
> let tuning: DatastoreTuning = serde_json::from_value(
> @@ -344,7 +370,11 @@ impl DataStore {
> )?)
> };
>
> - let datastore = DataStore::with_store_and_config(chunk_store, config, Some(digest))?;
> + let mut datastore = DataStore::with_store_and_config(chunk_store, config)?;
> + datastore.cached_config_tag = Some(CachedDatastoreConfigTag {
> + last_maintenance_mode: maintenance_mode,
> + last_generation: gen_num,
> + });
This part should be in with_store_and_config, which should get the
last_generation (and later last_update) as parameter(s), just like it
got the digest before this patch, roughly like the sketch below.
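I.e., something like this (sketch only, unchanged parts elided):

fn with_store_and_config(
    chunk_store: Arc<ChunkStore>,
    config: DataStoreConfig,
    last_generation: Option<usize>,
) -> Result<DataStoreImpl, Error> {
    // ... existing setup unchanged ...
    Ok(DataStoreImpl {
        // ... existing fields unchanged ...
        cached_config_tag: Some(CachedDatastoreConfigTag {
            last_maintenance_mode: config.get_maintenance_mode(),
            last_generation,
        }),
    })
}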
>
> let datastore = Arc::new(datastore);
> datastore_cache.insert(name.to_string(), datastore.clone());
> @@ -430,11 +460,7 @@ impl DataStore {
> config.absolute_path(),
> tuning.sync_level.unwrap_or_default(),
> )?;
> - let inner = Arc::new(Self::with_store_and_config(
> - Arc::new(chunk_store),
> - config,
> - None,
> - )?);
> + let inner = Arc::new(Self::with_store_and_config(Arc::new(chunk_store), config)?);
>
> if let Some(operation) = operation {
> update_active_operations(&name, operation, 1)?;
> @@ -446,7 +472,6 @@ impl DataStore {
> fn with_store_and_config(
> chunk_store: Arc<ChunkStore>,
> config: DataStoreConfig,
> - last_digest: Option<[u8; 32]>,
> ) -> Result<DataStoreImpl, Error> {
> let mut gc_status_path = chunk_store.base_path();
> gc_status_path.push(".gc-status");
> @@ -506,10 +531,10 @@ impl DataStore {
> last_gc_status: Mutex::new(gc_status),
> verify_new: config.verify_new.unwrap_or(false),
> chunk_order: tuning.chunk_order.unwrap_or_default(),
> - last_digest,
> sync_level: tuning.sync_level.unwrap_or_default(),
> backend_config,
> lru_store_caching,
> + cached_config_tag: None,
> })
> }
>
> --
> 2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel