From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
Date: Tue, 11 Nov 2025 13:29:39 +0100
Message-ID: <20251111122941.110412-2-s.rufinatscha@proxmox.com>
In-Reply-To: <20251111122941.110412-1-s.rufinatscha@proxmox.com>

Repeated /status requests cause lookup_datastore() to re-read and
parse datastore.cfg on every call; the issue was reported in #6049 [1].
cargo-flamegraph [2] confirmed that the hot path is dominated by
pbs_config::datastore::config(), i.e. by config parsing.

This patch adds a fast path to lookup_datastore() based on the
shared-memory ConfigVersionCache generation counter for datastore.cfg.
The last seen generation number is stored alongside the cached
DataStoreImpl; when a subsequent lookup sees the same generation, the
cached instance is reused without re-reading the on-disk config. If the
generation differs, or the version cache is unavailable, the lookup
falls back to the existing slow path with no behavioral change.
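
Condensed, the fast path looks roughly like the sketch below
(simplified from the diff; the hypothetical reuse_cached() helper
stands in for the maintenance-mode re-check and active-operation
accounting done in the actual patch):

    // Sketch of the fast path, simplified from the diff below.
    // `gen_num` is None whenever the shared-memory cache is unavailable.
    let gen_num = ConfigVersionCache::new()
        .ok()
        .map(|c| c.datastore_generation());

    if let (Some(gen_num), Some(ds)) =
        (gen_num, DATASTORE_MAP.lock().unwrap().get(name).cloned())
    {
        if let Some(tag) = &ds.cached_config_tag {
            if tag.last_generation == Some(gen_num) {
                // Same generation: datastore.cfg is unchanged, so skip
                // the re-read/re-parse and reuse the cached instance.
                return reuse_cached(ds, operation);
            }
        }
    }
    // Generation changed or cache unavailable: fall through to the
    // existing slow path, which re-reads and re-parses datastore.cfg.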

Behavioral notes

- The generation is bumped via the existing save_config() path, so
  API-driven config changes are detected immediately (see the sketch
  after this list).
- Manual edits to datastore.cfg are not detected by this commit; a TTL
  guard is introduced in a later patch to cover that case.
- DataStore::drop still performs a config read on the common path; this
  is addressed in the next patch.
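
For completeness, the invalidation side needs no new code: the existing
save_config() path already bumps the shared generation counter after
writing datastore.cfg, roughly as sketched here (simplified; not the
literal save_config() body):

    // After writing datastore.cfg, bump the shared-memory generation
    // counter; cached lookup tags no longer match, so the next
    // lookup_datastore() call takes the slow path once and re-caches.
    if let Ok(cache) = ConfigVersionCache::new() {
        cache.increase_datastore_generation();
    }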

Testing

cargo-flamegraph confirms that the config-parse hotspot in
lookup_datastore() disappears from the hot path.

Additionally, an end-to-end benchmark was performed against the
`/status?verbose=0` API with 1000 datastores, 5 requests per store
(5000 requests in total), and 16-way parallelism (RPS = requests per
second):

| Metric                       | Before | After  | Δ (abs) | Δ (%)   |
|------------------------------|:------:|:------:|:-------:|:-------:|
| Total time                   | 13s    | 11s    | −2s     | −15.38% |
| Throughput (all rounds, RPS) | 384.62 | 454.55 | +69.93  | +18.18% |
| Cold RPS (round #1)          | 76.92  | 90.91  | +13.99  | +18.19% |
| Warm RPS (rounds 2..N)       | 307.69 | 363.64 | +55.95  | +18.18% |

Throughput improved by ~18% overall, with total time reduced by ~15%.
Warm lookups now reuse cached datastore instances and skip redundant
config parsing entirely. The first-access round also shows a similar
improvement, likely due to reduced contention and I/O when many stores
are accessed in parallel.

Note: a second hotspot remains in Drop, where a config read still
occurs; that is addressed in the next commit.

Links

[1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
[2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph

Fixes: #6049
Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
---
 pbs-config/src/config_version_cache.rs | 10 +++-
 pbs-datastore/src/datastore.rs         | 77 +++++++++++++++++---------
 2 files changed, 59 insertions(+), 28 deletions(-)

diff --git a/pbs-config/src/config_version_cache.rs b/pbs-config/src/config_version_cache.rs
index e8fb994f..b875f7e0 100644
--- a/pbs-config/src/config_version_cache.rs
+++ b/pbs-config/src/config_version_cache.rs
@@ -26,7 +26,6 @@ struct ConfigVersionCacheDataInner {
     // Traffic control (traffic-control.cfg) generation/version.
     traffic_control_generation: AtomicUsize,
     // datastore (datastore.cfg) generation/version
-    // FIXME: remove with PBS 3.0
     datastore_generation: AtomicUsize,
     // Add further atomics here
 }
@@ -145,8 +144,15 @@ impl ConfigVersionCache {
             .fetch_add(1, Ordering::AcqRel);
     }
 
+    /// Returns the datastore generation number.
+    pub fn datastore_generation(&self) -> usize {
+        self.shmem
+            .data()
+            .datastore_generation
+            .load(Ordering::Acquire)
+    }
+
     /// Increase the datastore generation number.
-    // FIXME: remove with PBS 3.0 or make actually useful again in datastore lookup
     pub fn increase_datastore_generation(&self) -> usize {
         self.shmem
             .data()
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 70af94d8..18eebb58 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -32,7 +32,7 @@ use pbs_api_types::{
     MaintenanceType, Operation, UPID,
 };
 use pbs_config::s3::S3_CFG_TYPE_ID;
-use pbs_config::BackupLockGuard;
+use pbs_config::{BackupLockGuard, ConfigVersionCache};
 
 use crate::backup_info::{
     BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
@@ -140,10 +140,10 @@ pub struct DataStoreImpl {
     last_gc_status: Mutex<GarbageCollectionStatus>,
     verify_new: bool,
     chunk_order: ChunkOrder,
-    last_digest: Option<[u8; 32]>,
     sync_level: DatastoreFSyncLevel,
     backend_config: DatastoreBackendConfig,
     lru_store_caching: Option<LocalDatastoreLruCache>,
+    cached_config_tag: Option<CachedDatastoreConfigTag>,
 }
 
 impl DataStoreImpl {
@@ -156,10 +156,10 @@ impl DataStoreImpl {
             last_gc_status: Mutex::new(GarbageCollectionStatus::default()),
             verify_new: false,
             chunk_order: Default::default(),
-            last_digest: None,
             sync_level: Default::default(),
             backend_config: Default::default(),
             lru_store_caching: None,
+            cached_config_tag: None,
         })
     }
 }
@@ -224,6 +224,15 @@ pub enum DatastoreBackend {
     S3(Arc<S3Client>),
 }
 
+/// Used to determine whether a cached datastore instance is still valid
+/// or needs to be reloaded after a config change.
+struct CachedDatastoreConfigTag {
+    /// Maintenance mode at the time the lookup hint was captured, if any.
+    last_maintenance_mode: Option<MaintenanceMode>,
+    /// Datastore generation number from `ConfigVersionCache`; `None` when the cache wasn't available.
+    last_generation: Option<usize>,
+}
+
 impl DataStore {
     // This one just panics on everything
     #[doc(hidden)]
@@ -299,13 +308,40 @@ impl DataStore {
         // we use it to decide whether it is okay to delete the datastore.
         let _config_lock = pbs_config::datastore::lock_config()?;
 
-        // we could use the ConfigVersionCache's generation for staleness detection, but we load
-        // the config anyway -> just use digest, additional benefit: manual changes get detected
-        let (config, digest) = pbs_config::datastore::config()?;
+        // Get the current datastore.cfg generation number
+        let gen_num = ConfigVersionCache::new()
+            .ok()
+            .map(|c| c.datastore_generation());
+
+        // Fast-path: if we have a cached entry created under the same datastore.cfg generation number, reuse it.
+        if let (Some(gen_num), Some(ds)) =
+            (gen_num, DATASTORE_MAP.lock().unwrap().get(name).cloned())
+        {
+            if let Some(cached_tag) = &ds.cached_config_tag {
+                if cached_tag.last_generation == Some(gen_num) {
+                    if let Some(mm) = &cached_tag.last_maintenance_mode {
+                        if let Err(error) = mm.check(operation) {
+                            bail!("datastore '{name}' is unavailable: {error}");
+                        }
+                    }
+                    if let Some(operation) = operation {
+                        update_active_operations(name, operation, 1)?;
+                    }
+                    return Ok(Arc::new(Self {
+                        inner: ds,
+                        operation,
+                    }));
+                }
+            }
+        }
+
+        // Slow path: (re)load config
+        let (config, _digest) = pbs_config::datastore::config()?;
         let config: DataStoreConfig = config.lookup("datastore", name)?;
 
-        if let Some(maintenance_mode) = config.get_maintenance_mode() {
-            if let Err(error) = maintenance_mode.check(operation) {
+        let maintenance_mode = config.get_maintenance_mode();
+        if let Some(mm) = &maintenance_mode {
+            if let Err(error) = mm.check(operation) {
                 bail!("datastore '{name}' is unavailable: {error}");
             }
         }
@@ -321,16 +357,6 @@ impl DataStore {
 
         // reuse chunk store so that we keep using the same process locker instance!
         let chunk_store = if let Some(datastore) = &entry {
-            let last_digest = datastore.last_digest.as_ref();
-            if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
-                if let Some(operation) = operation {
-                    update_active_operations(name, operation, 1)?;
-                }
-                return Ok(Arc::new(Self {
-                    inner: Arc::clone(datastore),
-                    operation,
-                }));
-            }
             Arc::clone(&datastore.chunk_store)
         } else {
            let tuning: DatastoreTuning = serde_json::from_value(
@@ -344,7 +370,11 @@
             )?)
         };
 
-        let datastore = DataStore::with_store_and_config(chunk_store, config, Some(digest))?;
+        let mut datastore = DataStore::with_store_and_config(chunk_store, config)?;
+        datastore.cached_config_tag = Some(CachedDatastoreConfigTag {
+            last_maintenance_mode: maintenance_mode,
+            last_generation: gen_num,
+        });
         let datastore = Arc::new(datastore);
 
         datastore_cache.insert(name.to_string(), datastore.clone());
@@ -430,11 +460,7 @@
             config.absolute_path(),
             tuning.sync_level.unwrap_or_default(),
         )?;
-        let inner = Arc::new(Self::with_store_and_config(
-            Arc::new(chunk_store),
-            config,
-            None,
-        )?);
+        let inner = Arc::new(Self::with_store_and_config(Arc::new(chunk_store), config)?);
 
         if let Some(operation) = operation {
             update_active_operations(&name, operation, 1)?;
@@ -446,7 +472,6 @@
     fn with_store_and_config(
         chunk_store: Arc<ChunkStore>,
         config: DataStoreConfig,
-        last_digest: Option<[u8; 32]>,
     ) -> Result<DataStoreImpl, Error> {
         let mut gc_status_path = chunk_store.base_path();
         gc_status_path.push(".gc-status");
@@ -506,10 +531,10 @@
             last_gc_status: Mutex::new(gc_status),
             verify_new: config.verify_new.unwrap_or(false),
             chunk_order: tuning.chunk_order.unwrap_or_default(),
-            last_digest,
             sync_level: tuning.sync_level.unwrap_or_default(),
             backend_config,
             lru_store_caching,
+            cached_config_tag: None,
         })
     }
 
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel