From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
Date: Tue, 11 Nov 2025 13:29:39 +0100
Message-ID: <20251111122941.110412-2-s.rufinatscha@proxmox.com>
In-Reply-To: <20251111122941.110412-1-s.rufinatscha@proxmox.com>

Repeated /status requests caused lookup_datastore() to re-read and
parse datastore.cfg on every call. The issue was reported in
bug #6049 [1]. cargo-flamegraph [2] confirmed that the hot path is
dominated by pbs_config::datastore::config(), i.e. config parsing.

This patch adds a fast path to lookup_datastore() using the
shared-memory ConfigVersionCache generation counter for datastore.cfg.
The last seen generation number is stored alongside the cached
DataStoreImpl; when a subsequent lookup sees the same generation, the
cached instance is reused without re-reading the on-disk config. If
the generation differs, or the cache is unavailable, the lookup falls
back to the existing slow path with no behavioral changes.

Behavioral notes:

- The generation is bumped via the existing save_config() path, so
  API-driven config changes are detected immediately (a sketch of that
  write path follows these notes).
- Manual edits to datastore.cfg are not detected by this commit; a TTL
  guard is introduced in a later patch to cover that case.
- DataStore::drop still performs a config read on the common path;
  this is addressed in the next patch.
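
For illustration, the write side roughly works as follows (a
simplified sketch of the existing pbs-config save path; the exact
signature and on-disk write details are elided):

    // Saving datastore.cfg bumps the shared-memory generation
    // counter, so the next lookup_datastore() sees a mismatch and
    // takes the slow path, reloading the config from disk.
    pub fn save_config(config: &SectionConfigData) -> Result<(), Error> {
        // ... write and replace datastore.cfg on disk ...
        if let Ok(cache) = ConfigVersionCache::new() {
            cache.increase_datastore_generation();
        }
        Ok(())
    }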

Testing

cargo-flamegraph confirms that the config-parse hotspot in
lookup_datastore() disappears from the hot path.

Additionally, an end-to-end benchmark was performed using the
`/status?verbose=0` API with 1000 datastores, 5 requests per store,
and 16-way parallelism:

| Metric                         | Before | After  | Δ (abs) | Δ (%)   |
|--------------------------------|:------:|:------:|:-------:|:-------:|
| Total time                     | 13 s   | 11 s   | −2 s    | −15.38% |
| Throughput, all rounds (req/s) | 384.62 | 454.55 | +69.93  | +18.18% |
| Cold RPS, round #1             | 76.92  | 90.91  | +13.99  | +18.19% |
| Warm RPS, rounds 2..N          | 307.69 | 363.64 | +55.95  | +18.18% |

Throughput improved by ~18% overall, with total time reduced by ~15%.
Warm lookups now reuse cached datastore instances and skip redundant
config parsing entirely. The first-access round shows a similar
improvement, likely due to reduced contention and I/O when many stores
are accessed in parallel.

Note: a second hotspot remains in Drop, where a config read still
occurs; that is addressed in the next commit.

Links:

[1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
[2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph

Fixes: #6049
Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
---
pbs-config/src/config_version_cache.rs | 10 +++-
pbs-datastore/src/datastore.rs | 77 +++++++++++++++++---------
2 files changed, 59 insertions(+), 28 deletions(-)
diff --git a/pbs-config/src/config_version_cache.rs b/pbs-config/src/config_version_cache.rs
index e8fb994f..b875f7e0 100644
--- a/pbs-config/src/config_version_cache.rs
+++ b/pbs-config/src/config_version_cache.rs
@@ -26,7 +26,6 @@ struct ConfigVersionCacheDataInner {
// Traffic control (traffic-control.cfg) generation/version.
traffic_control_generation: AtomicUsize,
// datastore (datastore.cfg) generation/version
- // FIXME: remove with PBS 3.0
datastore_generation: AtomicUsize,
// Add further atomics here
}
@@ -145,8 +144,15 @@ impl ConfigVersionCache {
.fetch_add(1, Ordering::AcqRel);
}
+ /// Returns the datastore generation number.
+ pub fn datastore_generation(&self) -> usize {
+ self.shmem
+ .data()
+ .datastore_generation
+ .load(Ordering::Acquire)
+ }
+
/// Increase the datastore generation number.
- // FIXME: remove with PBS 3.0 or make actually useful again in datastore lookup
pub fn increase_datastore_generation(&self) -> usize {
self.shmem
.data()
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 70af94d8..18eebb58 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -32,7 +32,7 @@ use pbs_api_types::{
MaintenanceType, Operation, UPID,
};
use pbs_config::s3::S3_CFG_TYPE_ID;
-use pbs_config::BackupLockGuard;
+use pbs_config::{BackupLockGuard, ConfigVersionCache};
use crate::backup_info::{
BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
@@ -140,10 +140,10 @@ pub struct DataStoreImpl {
last_gc_status: Mutex<GarbageCollectionStatus>,
verify_new: bool,
chunk_order: ChunkOrder,
- last_digest: Option<[u8; 32]>,
sync_level: DatastoreFSyncLevel,
backend_config: DatastoreBackendConfig,
lru_store_caching: Option<LocalDatastoreLruCache>,
+ cached_config_tag: Option<CachedDatastoreConfigTag>,
}
impl DataStoreImpl {
@@ -156,10 +156,10 @@ impl DataStoreImpl {
last_gc_status: Mutex::new(GarbageCollectionStatus::default()),
verify_new: false,
chunk_order: Default::default(),
- last_digest: None,
sync_level: Default::default(),
backend_config: Default::default(),
lru_store_caching: None,
+ cached_config_tag: None,
})
}
}
@@ -224,6 +224,15 @@ pub enum DatastoreBackend {
S3(Arc<S3Client>),
}
+/// Used to determine whether a cached datastore instance is still valid
+/// or needs to be reloaded after a config change.
+struct CachedDatastoreConfigTag {
+ /// Maintenance mode at the time the lookup hint was captured, if any.
+ last_maintenance_mode: Option<MaintenanceMode>,
+ /// Datastore generation number from `ConfigVersionCache`; `None` when the cache wasn't available.
+ last_generation: Option<usize>,
+}
+
impl DataStore {
// This one just panics on everything
#[doc(hidden)]
@@ -299,13 +308,40 @@ impl DataStore {
// we use it to decide whether it is okay to delete the datastore.
let _config_lock = pbs_config::datastore::lock_config()?;
- // we could use the ConfigVersionCache's generation for staleness detection, but we load
- // the config anyway -> just use digest, additional benefit: manual changes get detected
- let (config, digest) = pbs_config::datastore::config()?;
+ // Get the current datastore.cfg generation number
+ let gen_num = ConfigVersionCache::new()
+ .ok()
+ .map(|c| c.datastore_generation());
+
+ // Fast-path: if we have a cached entry created under the same datastore.cfg generation number, reuse it.
+ if let (Some(gen_num), Some(ds)) =
+ (gen_num, DATASTORE_MAP.lock().unwrap().get(name).cloned())
+ {
+ if let Some(cached_tag) = &ds.cached_config_tag {
+ if cached_tag.last_generation == Some(gen_num) {
+ if let Some(mm) = &cached_tag.last_maintenance_mode {
+ if let Err(error) = mm.check(operation) {
+ bail!("datastore '{name}' is unavailable: {error}");
+ }
+ }
+ if let Some(operation) = operation {
+ update_active_operations(name, operation, 1)?;
+ }
+ return Ok(Arc::new(Self {
+ inner: ds,
+ operation,
+ }));
+ }
+ }
+ }
+
+ // Slow path: (re)load config
+ let (config, _digest) = pbs_config::datastore::config()?;
let config: DataStoreConfig = config.lookup("datastore", name)?;
- if let Some(maintenance_mode) = config.get_maintenance_mode() {
- if let Err(error) = maintenance_mode.check(operation) {
+ let maintenance_mode = config.get_maintenance_mode();
+ if let Some(mm) = &maintenance_mode {
+ if let Err(error) = mm.check(operation) {
bail!("datastore '{name}' is unavailable: {error}");
}
}
@@ -321,16 +357,6 @@ impl DataStore {
// reuse chunk store so that we keep using the same process locker instance!
let chunk_store = if let Some(datastore) = &entry {
- let last_digest = datastore.last_digest.as_ref();
- if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
- if let Some(operation) = operation {
- update_active_operations(name, operation, 1)?;
- }
- return Ok(Arc::new(Self {
- inner: Arc::clone(datastore),
- operation,
- }));
- }
Arc::clone(&datastore.chunk_store)
} else {
let tuning: DatastoreTuning = serde_json::from_value(
@@ -344,7 +370,11 @@ impl DataStore {
)?)
};
- let datastore = DataStore::with_store_and_config(chunk_store, config, Some(digest))?;
+ let mut datastore = DataStore::with_store_and_config(chunk_store, config)?;
+ datastore.cached_config_tag = Some(CachedDatastoreConfigTag {
+ last_maintenance_mode: maintenance_mode,
+ last_generation: gen_num,
+ });
let datastore = Arc::new(datastore);
datastore_cache.insert(name.to_string(), datastore.clone());
@@ -430,11 +460,7 @@ impl DataStore {
config.absolute_path(),
tuning.sync_level.unwrap_or_default(),
)?;
- let inner = Arc::new(Self::with_store_and_config(
- Arc::new(chunk_store),
- config,
- None,
- )?);
+ let inner = Arc::new(Self::with_store_and_config(Arc::new(chunk_store), config)?);
if let Some(operation) = operation {
update_active_operations(&name, operation, 1)?;
@@ -446,7 +472,6 @@ impl DataStore {
fn with_store_and_config(
chunk_store: Arc<ChunkStore>,
config: DataStoreConfig,
- last_digest: Option<[u8; 32]>,
) -> Result<DataStoreImpl, Error> {
let mut gc_status_path = chunk_store.base_path();
gc_status_path.push(".gc-status");
@@ -506,10 +531,10 @@ impl DataStore {
last_gc_status: Mutex::new(gc_status),
verify_new: config.verify_new.unwrap_or(false),
chunk_order: tuning.chunk_order.unwrap_or_default(),
- last_digest,
sync_level: tuning.sync_level.unwrap_or_default(),
backend_config,
lru_store_caching,
+ cached_config_tag: None,
})
}
--
2.47.3