From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: "Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
Date: Thu, 13 Nov 2025 13:59:34 +0100
Message-ID: <1c91221e-6979-4b38-9011-0b53921d4093@proxmox.com>
In-Reply-To: <1762936034.bqz00j2ard.astroid@yuna.none>
On 11/12/25 2:24 PM, Fabian Grünbichler wrote:
> On November 11, 2025 1:29 pm, Samuel Rufinatscha wrote:
>> Repeated /status requests caused lookup_datastore() to re-read and
>> parse datastore.cfg on every call. The issue was mentioned in report
>> #6049 [1]. cargo-flamegraph [2] confirmed that the hot path is
>> dominated by pbs_config::datastore::config() (config parsing).
>>
>> This patch adds a fast path to lookup_datastore() using the shared-
>> memory ConfigVersionCache generation counter for datastore.cfg. It
>> stores the last seen generation number alongside the cached
>> DataStoreImpl and, when a subsequent lookup sees the same generation,
>> we reuse the cached instance without re-reading the on-disk config. If
>> the generation differs (or the cache is unavailable), it falls back to
>> the existing slow path with no behavioral changes.
>>
>> Behavioral notes
>>
>> - The generation is bumped via the existing save_config() path, so
>> API-driven config changes are detected immediately.
>> - Manual edits to datastore.cfg are not detected by this commit; a TTL
>> guard is introduced in a later patch to cover that case.
>> - DataStore::drop still performs a config read on the common path; this
>> is addressed in the next patch.
>>
>> Testing
>>
>> cargo-flamegraph confirms the config-parse hotspot in
>> lookup_datastore() disappears from the hot path.
>>
>> Additionally, an end-to-end benchmark was performed using the
>> `/status?verbose=0` API with 1000 datastores, 5 requests per store,
>> and 16-way parallelism:
>>
>> | Metric | Before | After | Δ (abs) | Δ (%) |
>> |--------------------------|:------:|:------:|:-------:|:-------:|
>> | Total time | 13s | 11s | −2s | −15.38% |
>> | Throughput (all rounds) | 384.62 | 454.55 | +69.93 | +18.18% |
>> | Cold RPS (round #1) | 76.92 | 90.91 | +13.99 | +18.19% |
>> | Warm RPS (rounds 2..N) | 307.69 | 363.64 | +55.95 | +18.18% |
>>
>> Throughput improved by ~18% overall, with total time reduced by ~15%.
>> Warm lookups now reuse cached datastore instances and skip redundant
>> config parsing entirely. The first-access round also shows a similar
>> improvement, likely due to reduced contention and I/O when many stores
>> are accessed in parallel.
>>
>> Note: A second hotspot remains in Drop where a config read occurs; that
>> is addressed in the next commit.
>
> it would be interesting to also have numbers for just lookup+Drop,
> without all the HTTP and TLS overhead to really isolate it. that should
> also make it easier to reliably benchmark with something like hyperfine
> ;)
>
Good point, I will isolate those numbers - the HTTP/TLS overhead seems to
be quite significant. Thanks for the hyperfine reference!
> for my simple config (5 datastores) single-threaded lookup+drop of a
> single datastore 100k times gives around 1.31 speedup for the whole
> series. the slightly modified version from below (which basically runs
> the most expensive part of Drop only once) for the same test setup still
> gives a speedup of 1.17
>
So the lookup optimization dominates the speedup once we hold on to the
datastore instances for longer - great to see.
> re-running the same benchmark with a config with 1k datastores, querying
> M datastores N times gives the following speedups:
>
> M = 1, N = 1000: 15.6x faster
> M = 10, N = 100: 14.5x
> M = 100, N = 10: 8.8x
> M = 1000, N = 1: 1.8x (so this is basically showing the speedup of the
> improved Drop handling alone!)
>
> and then a slightly modified version, that keeps the DataStore instance
> around until all N lookups for that datastore are done, then dropping
> them in bulk (which mimics having lots of lookups while a task is
> running, making the Drop handler only do the expensive part every once
> in a while when the last task for a given datastore exits):
>
> use anyhow::Error;
>
> use pbs_api_types::Operation;
> use pbs_datastore::DataStore;
>
> fn main() -> Result<(), Error> {
>     let mut args = std::env::args();
>     args.next();
>
>     let datastores = if let Some(n) = args.next() {
>         n.parse::<usize>()?
>     } else {
>         1000
>     };
>
>     let iterations = if let Some(n) = args.next() {
>         n.parse::<usize>()?
>     } else {
>         1000
>     };
>
>     for d in 1..=datastores {
>         let name = format!("pbs_bench_{d}");
>         let mut stores = Vec::with_capacity(iterations);
>         for i in 1..=iterations {
>             stores.push(DataStore::lookup_datastore(&name, Some(Operation::Write))?);
>         }
>     }
>
>     Ok(())
> }
>
> M = 1, N = 1000: 8.5x faster
> M = 10, N = 100: 7.9x
> M = 100, N = 10: 5.2x
> M = 1000, N = 1: 1.9x
>
> looking at the flamegraph of this isolated benchmark it's now obvious
> that the remaining overhead is lockfiles and tracking the operations
> (both in lookup and when dropping)..
>
Thanks a lot for setting up the test environment and providing your numbers
as well - they are very helpful to compare against!

Regarding the remaining lockfile and operations-tracking overhead: it is a
good confirmation that everything else on the hot path is now optimized. I
think the locking and operations tracking could be revisited in a future
effort.
> side-note: I actually found a bug in our operations tracking while
> benchmarking this series that gave me wildly different numbers because
> the "drop last task" part never got executed as a result:
>
> https://lore.proxmox.com/pbs-devel/20251112131525.645971-1-f.gruenbichler@proxmox.com/T/#u
>
Nice catch!
>>
>> Links
>>
>> [1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
>> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>>
>> Fixes: #6049
>> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
>> ---
>> pbs-config/src/config_version_cache.rs | 10 +++-
>> pbs-datastore/src/datastore.rs | 77 +++++++++++++++++---------
>> 2 files changed, 59 insertions(+), 28 deletions(-)
>>
>> diff --git a/pbs-config/src/config_version_cache.rs b/pbs-config/src/config_version_cache.rs
>> index e8fb994f..b875f7e0 100644
>> --- a/pbs-config/src/config_version_cache.rs
>> +++ b/pbs-config/src/config_version_cache.rs
>> @@ -26,7 +26,6 @@ struct ConfigVersionCacheDataInner {
>> // Traffic control (traffic-control.cfg) generation/version.
>> traffic_control_generation: AtomicUsize,
>> // datastore (datastore.cfg) generation/version
>> - // FIXME: remove with PBS 3.0
>> datastore_generation: AtomicUsize,
>> // Add further atomics here
>> }
>> @@ -145,8 +144,15 @@ impl ConfigVersionCache {
>> .fetch_add(1, Ordering::AcqRel);
>> }
>>
>> + /// Returns the datastore generation number.
>> + pub fn datastore_generation(&self) -> usize {
>> + self.shmem
>> + .data()
>> + .datastore_generation
>> + .load(Ordering::Acquire)
>> + }
>> +
>> /// Increase the datastore generation number.
>> - // FIXME: remove with PBS 3.0 or make actually useful again in datastore lookup
>> pub fn increase_datastore_generation(&self) -> usize {
>> self.shmem
>> .data()
>
> this part could be split out into its own patch
>
Will factor this out into a separate patch.
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 70af94d8..18eebb58 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -32,7 +32,7 @@ use pbs_api_types::{
>> MaintenanceType, Operation, UPID,
>> };
>> use pbs_config::s3::S3_CFG_TYPE_ID;
>> -use pbs_config::BackupLockGuard;
>> +use pbs_config::{BackupLockGuard, ConfigVersionCache};
>>
>> use crate::backup_info::{
>> BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
>> @@ -140,10 +140,10 @@ pub struct DataStoreImpl {
>> last_gc_status: Mutex<GarbageCollectionStatus>,
>> verify_new: bool,
>> chunk_order: ChunkOrder,
>> - last_digest: Option<[u8; 32]>,
>> sync_level: DatastoreFSyncLevel,
>> backend_config: DatastoreBackendConfig,
>> lru_store_caching: Option<LocalDatastoreLruCache>,
>> + cached_config_tag: Option<CachedDatastoreConfigTag>,
>> }
>>
>> impl DataStoreImpl {
>> @@ -156,10 +156,10 @@ impl DataStoreImpl {
>> last_gc_status: Mutex::new(GarbageCollectionStatus::default()),
>> verify_new: false,
>> chunk_order: Default::default(),
>> - last_digest: None,
>> sync_level: Default::default(),
>> backend_config: Default::default(),
>> lru_store_caching: None,
>> + cached_config_tag: None,
>> })
>> }
>> }
>> @@ -224,6 +224,15 @@ pub enum DatastoreBackend {
>> S3(Arc<S3Client>),
>> }
>>
>> +/// Used to determine whether a cached datastore instance is still valid
>> +/// or needs to be reloaded after a config change.
>> +struct CachedDatastoreConfigTag {
>> + /// Maintenance mode at the time the lookup hint was captured, if any.
>> + last_maintenance_mode: Option<MaintenanceMode>,
>> + /// Datastore generation number from `ConfigVersionCache`; `None` when the cache wasn't available.
>> + last_generation: Option<usize>,
>
> if the whole tag is an option, do we really need to make the generation
> an option as well?
>
Good point, keeping only the tag itself optional is enough and will
simplify the generation check.
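Roughly what I have in mind (untested sketch):

struct CachedDatastoreConfigTag {
    /// Maintenance mode at the time the entry was cached, if any.
    last_maintenance_mode: Option<MaintenanceMode>,
    /// datastore.cfg generation number the entry was created under.
    last_generation: usize,
}

The fast path then only has to compare a plain usize once the generation
from the ConfigVersionCache is known.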
>> +}
>> +
>> impl DataStore {
>> // This one just panics on everything
>> #[doc(hidden)]
>> @@ -299,13 +308,40 @@ impl DataStore {
>> // we use it to decide whether it is okay to delete the datastore.
>> let _config_lock = pbs_config::datastore::lock_config()?;
>>
>> - // we could use the ConfigVersionCache's generation for staleness detection, but we load
>> - // the config anyway -> just use digest, additional benefit: manual changes get detected
>> - let (config, digest) = pbs_config::datastore::config()?;
>> + // Get the current datastore.cfg generation number
>> + let gen_num = ConfigVersionCache::new()
>> + .ok()
>> + .map(|c| c.datastore_generation());
>> +
>> + // Fast-path: if we have a cached entry created under the same datastore.cfg generation number, reuse it.
>> + if let (Some(gen_num), Some(ds)) =
>> + (gen_num, DATASTORE_MAP.lock().unwrap().get(name).cloned())
>> + {
>> + if let Some(cached_tag) = &ds.cached_config_tag {
>> + if cached_tag.last_generation == Some(gen_num) {
>> + if let Some(mm) = &cached_tag.last_maintenance_mode {
>> + if let Err(error) = mm.check(operation) {
>> + bail!("datastore '{name}' is unavailable: {error}");
>> + }
>> + }
>> + if let Some(operation) = operation {
>> + update_active_operations(name, operation, 1)?;
>> + }
>> + return Ok(Arc::new(Self {
>> + inner: ds,
>> + operation,
>> + }));
>> + }
>> + }
>> + }
>> +
>> + // Slow path: (re)load config
>> + let (config, _digest) = pbs_config::datastore::config()?;
>> let config: DataStoreConfig = config.lookup("datastore", name)?;
>>
>> - if let Some(maintenance_mode) = config.get_maintenance_mode() {
>> - if let Err(error) = maintenance_mode.check(operation) {
>> + let maintenance_mode = config.get_maintenance_mode();
>> + if let Some(mm) = &maintenance_mode {
>> + if let Err(error) = mm.check(operation) {
>> bail!("datastore '{name}' is unavailable: {error}");
>> }
>> }
>
> after this here we have a check for the mount status in the old hot
> path, that is missing in the new hot path. the mount status can change
> even if the config doesn't, so we should probably add this back to the
> hot path and re-run the numbers?
>
> that check again needs more parts of the config, so maybe we could
> explore caching the full config here? e.g., add a new static
> Mutex<Option<(DataStoreConfig, usize)>> (extended by the timestamp in
> the last patch) and adapt the patches here to use it? depending on
> whether we make the cached config available outside of lookup_datastore,
> we could then even not add the maintenance mode to the cache tag, and
> just store the generation number there, and retrieve the maintenance
> mode from the cached config in the Drop implementation..
>
For the mount check we would need the device_uuid and the datastore mount
path, which we could add to the cached entry. However, I would also rather
explore caching the full parsed config here, as you suggest. That should
yield further performance gains and would allow fast lookups for any
datastore once the config has been read, not just for the one that triggered
the reload. I will try to integrate it in v2.
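As a rough idea of what that could look like (untested sketch; the static
and helper names are placeholders, and the TTL patch would later add a
timestamp next to the generation):

use std::sync::{Arc, LazyLock, Mutex};

use anyhow::Error;
use proxmox_section_config::SectionConfigData;

// Parsed datastore.cfg plus the generation number it was read under.
static CACHED_DATASTORE_CONFIG: LazyLock<Mutex<Option<(Arc<SectionConfigData>, usize)>>> =
    LazyLock::new(|| Mutex::new(None));

/// Return the cached parsed config if it matches the current generation,
/// otherwise re-read datastore.cfg and refresh the cache.
fn cached_datastore_config(current_generation: usize) -> Result<Arc<SectionConfigData>, Error> {
    let mut cached = CACHED_DATASTORE_CONFIG.lock().unwrap();
    if let Some((config, generation)) = cached.as_ref() {
        if *generation == current_generation {
            return Ok(Arc::clone(config));
        }
    }
    let (config, _digest) = pbs_config::datastore::config()?;
    let config = Arc::new(config);
    *cached = Some((Arc::clone(&config), current_generation));
    Ok(config)
}

With the full parsed config cached like this, the maintenance mode and the
mount-check fields could be read from it on both the lookup and Drop paths,
as you suggest.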
> there's a little more duplication here, e.g. we lock the map and check
> for an entry in both the fast and slow paths, we could do it once up
> front (nobody can change the map while we have it locked anyway), the
> checks are written twice and could probably be extracted into a helper
> so that future similar checks are added to both paths and not to only
> one by accident, ..
>
Good point, I will factor it out (e.g. into a small closure or shared
helper).
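Roughly along these lines, assuming it lives in datastore.rs where
MaintenanceMode, Operation, bail! and update_active_operations are already
in scope (untested sketch, helper name is a placeholder):

fn check_and_track_operation(
    name: &str,
    maintenance_mode: Option<&MaintenanceMode>,
    operation: Option<Operation>,
) -> Result<(), Error> {
    if let Some(mm) = maintenance_mode {
        if let Err(error) = mm.check(operation) {
            bail!("datastore '{name}' is unavailable: {error}");
        }
    }
    if let Some(operation) = operation {
        update_active_operations(name, operation, 1)?;
    }
    Ok(())
}

Both the fast and the slow path would then call this before constructing
the returned DataStore, so future checks only need to be added in one place.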
>> @@ -321,16 +357,6 @@ impl DataStore {
>>
>> // reuse chunk store so that we keep using the same process locker instance!
>> let chunk_store = if let Some(datastore) = &entry {
>> - let last_digest = datastore.last_digest.as_ref();
>> - if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
>> - if let Some(operation) = operation {
>> - update_active_operations(name, operation, 1)?;
>> - }
>> - return Ok(Arc::new(Self {
>> - inner: Arc::clone(datastore),
>> - operation,
>> - }));
>> - }
>> Arc::clone(&datastore.chunk_store)
>> } else {
>> let tuning: DatastoreTuning = serde_json::from_value(
>> @@ -344,7 +370,11 @@ impl DataStore {
>> )?)
>> };
>>
>> - let datastore = DataStore::with_store_and_config(chunk_store, config, Some(digest))?;
>> + let mut datastore = DataStore::with_store_and_config(chunk_store, config)?;
>> + datastore.cached_config_tag = Some(CachedDatastoreConfigTag {
>> + last_maintenance_mode: maintenance_mode,
>> + last_generation: gen_num,
>> + });
>
> this part should be in with_store_and_config, which should get the
> last_generation (and later last_update) as parameter(s), just like it
> had the digest before this patch..
>
Agree, I will move this part.
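For reference, roughly what the adjusted signature could look like
(sketch, elided parts unchanged; parameter name is a placeholder):

fn with_store_and_config(
    chunk_store: Arc<ChunkStore>,
    config: DataStoreConfig,
    last_generation: Option<usize>, // None when the version cache is unavailable
) -> Result<DataStoreImpl, Error> {
    // ... existing body ...
    let cached_config_tag = last_generation.map(|last_generation| CachedDatastoreConfigTag {
        last_maintenance_mode: config.get_maintenance_mode(),
        last_generation,
    });
    // ... construct DataStoreImpl with cached_config_tag instead of
    // patching it in afterwards in lookup_datastore() ...
}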
>>
>> let datastore = Arc::new(datastore);
>> datastore_cache.insert(name.to_string(), datastore.clone());
>> @@ -430,11 +460,7 @@ impl DataStore {
>> config.absolute_path(),
>> tuning.sync_level.unwrap_or_default(),
>> )?;
>> - let inner = Arc::new(Self::with_store_and_config(
>> - Arc::new(chunk_store),
>> - config,
>> - None,
>> - )?);
>> + let inner = Arc::new(Self::with_store_and_config(Arc::new(chunk_store), config)?);
>>
>> if let Some(operation) = operation {
>> update_active_operations(&name, operation, 1)?;
>> @@ -446,7 +472,6 @@ impl DataStore {
>> fn with_store_and_config(
>> chunk_store: Arc<ChunkStore>,
>> config: DataStoreConfig,
>> - last_digest: Option<[u8; 32]>,
>> ) -> Result<DataStoreImpl, Error> {
>> let mut gc_status_path = chunk_store.base_path();
>> gc_status_path.push(".gc-status");
>> @@ -506,10 +531,10 @@ impl DataStore {
>> last_gc_status: Mutex::new(gc_status),
>> verify_new: config.verify_new.unwrap_or(false),
>> chunk_order: tuning.chunk_order.unwrap_or_default(),
>> - last_digest,
>> sync_level: tuning.sync_level.unwrap_or_default(),
>> backend_config,
>> lru_store_caching,
>> + cached_config_tag: None,
>> })
>> }
>>
>> --
>> 2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel