* [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses
@ 2026-05-13 13:54 Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation Lukas Wagner
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-13 13:54 UTC (permalink / raw)
To: pdm-devel
The main intention is to avoid a sprawl of different caching approaches by
establishing a simple, easy-to-use cache implementation that can be used to
persistently cache API responses from remotes (and derived aggregations).
The `namespaced_cache` module is pretty generic and can be moved to proxmox.git
(maybe in proxmox-shared-cache) once it has sufficiently stabilized.
Changes since the RFC:
- change storage location to /run/proxmox-datacenter-manager/api-cache
- change name of pdm_cache to api_cache
- use freestanding functions in api_cache
- add async interface
- minor code style improvements
- improve test coverage and documentation
- add basic sanity checks for namespaces and keys (e.g. prohibiting ../)
- add cleanup code for the old remote-updates cachefile
Thanks to Dominik C. and Wolfgang for their reviews of the RFC, highly appreciated!
proxmox-datacenter-manager:
Lukas Wagner (4):
add persistent, generic, namespaced key-value cache implementation
add api_cache as a specialized wrapper around the namespaced cache
api: resources: subscriptions: switch over to api_cache
remote-updates: switch over to new api_cache
Cargo.toml | 1 +
debian/control | 1 +
server/Cargo.toml | 4 +
server/src/api/pve/mod.rs | 4 +-
server/src/api/remotes/updates.rs | 4 +-
server/src/api/resources.rs | 81 +-
server/src/api_cache.rs | 126 +++
.../bin/proxmox-datacenter-privileged-api.rs | 9 +-
server/src/lib.rs | 2 +
server/src/namespaced_cache.rs | 742 ++++++++++++++++++
server/src/remote_updates.rs | 87 +-
11 files changed, 962 insertions(+), 99 deletions(-)
create mode 100644 server/src/api_cache.rs
create mode 100644 server/src/namespaced_cache.rs
Summary over all repositories:
11 files changed, 962 insertions(+), 99 deletions(-)
--
* [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
@ 2026-05-13 13:54 ` Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-13 13:54 ` [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache Lukas Wagner
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Lukas Wagner @ 2026-05-13 13:54 UTC (permalink / raw)
To: pdm-devel
Namespaces map to directories inside a base directory. Cache contents
are stored as a single JSON-file per key inside the namespace directory.
Value expiry is implemented via a max_age parameter when retrieving
values. Callers might have different requirements for the freshness of
entries, so this seemed a better fit than statically setting an expiry
time when *setting* the value.
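For illustration, two callers with different freshness requirements can
read the same entry through the interface added below (just a sketch,
the key and the `Status` type are made up):

    let guard = cache.read_blocking("remote-a", timeout)?;
    // an interactive caller only accepts entries younger than 60 seconds ...
    let fresh: Option<Status> = guard.get_with_max_age("status", 60)?;
    // ... while a background job is fine with anything from the last hour
    let any: Option<Status> = guard.get_with_max_age("status", 3600)?;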
The cache offers both a blocking as well as an async interface. Once we
move this implementation to a shared crate, we should probably use
feature flags to gate these.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---
Notes:
This module might be well suited to be moved to proxmox.git when it has
stabilized enough.
Changes since the RFC:
- In NamespacedCache::new, use Into<PathBuf> instead of AsRef<Path>
(thx @ Wolfgang)
- Improve test coverage
- Basic sanity checks for namespaces and keys, for instance
prohibiting "../"
- Fix path generation for cases where there is already a
file extension in the key (before, it would have been replaced by
.json)
- add async interface and rename the old read/write calls to
read_blocking and write_blocking.
read/write now return a distinct wrapper type that wraps all
blocking operations in a `spawn_blocking`
Cargo.toml | 1 +
debian/control | 1 +
server/Cargo.toml | 4 +
server/src/lib.rs | 1 +
server/src/namespaced_cache.rs | 742 +++++++++++++++++++++++++++++++++
5 files changed, 749 insertions(+)
create mode 100644 server/src/namespaced_cache.rs
diff --git a/Cargo.toml b/Cargo.toml
index 9806a4f0..5cf05b41 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -125,6 +125,7 @@ serde = { version = "1.0", features = ["derive"] }
serde_cbor = "0.11.1"
serde_json = "1.0"
serde_plain = "1"
+tempfile = "3.15"
thiserror = "1.0"
tokio = "1.6"
tracing = "0.1"
diff --git a/debian/control b/debian/control
index 61b9cb36..5955faa1 100644
--- a/debian/control
+++ b/debian/control
@@ -104,6 +104,7 @@ Build-Depends: debhelper-compat (= 13),
librust-proxmox-syslog-api-1+default-dev,
librust-proxmox-syslog-api-1+impl-dev,
librust-proxmox-systemd-1+default-dev,
+ librust-tempfile-3+default-dev (>= 3.15.0),
librust-proxmox-tfa-6+api-dev,
librust-proxmox-tfa-6+api-types-dev,
librust-proxmox-tfa-6+types-dev,
diff --git a/server/Cargo.toml b/server/Cargo.toml
index 3f185bbc..663f21cb 100644
--- a/server/Cargo.toml
+++ b/server/Cargo.toml
@@ -29,6 +29,7 @@ openssl.workspace = true
percent-encoding.workspace = true
serde.workspace = true
serde_json.workspace = true
+thiserror.workspace = true
tokio = { workspace = true, features = [ "fs", "io-util", "io-std", "macros", "net", "parking_lot", "process", "rt", "rt-multi-thread", "signal", "time" ] }
tracing.workspace = true
url.workspace = true
@@ -87,6 +88,9 @@ pdm-search.workspace = true
pve-api-types = { workspace = true, features = [ "client" ] }
pbs-api-types.workspace = true
+[dev-dependencies]
+tempfile.workspace = true
+
[lints.rust.unexpected_cfgs]
level = "warn"
check-cfg = ['cfg(remote_config, values("faked"))']
diff --git a/server/src/lib.rs b/server/src/lib.rs
index 5ed10d69..0b7642ab 100644
--- a/server/src/lib.rs
+++ b/server/src/lib.rs
@@ -7,6 +7,7 @@ pub mod context;
pub mod env;
pub mod jobstate;
pub mod metric_collection;
+pub mod namespaced_cache;
pub mod parallel_fetcher;
pub mod remote_cache;
pub mod remote_tasks;
diff --git a/server/src/namespaced_cache.rs b/server/src/namespaced_cache.rs
new file mode 100644
index 00000000..9807fc54
--- /dev/null
+++ b/server/src/namespaced_cache.rs
@@ -0,0 +1,742 @@
+//! Generic namespaced cache implementation with optional value expiry.
+//!
+//! ```
+//! use std::time::Duration;
+//!
+//! use proxmox_sys::fs::CreateOptions;
+//! use server::namespaced_cache::NamespacedCache;
+//!
+//! let dir = tempfile::tempdir().unwrap();
+//! let cache = NamespacedCache::new(dir.as_ref(), CreateOptions::new(), CreateOptions::new());
+//! let write_guard = cache.write_blocking("remote-a", Duration::from_secs(1)).unwrap();
+//! write_guard.set("val1", 1).unwrap();
+//! assert_eq!(write_guard.get::<i32>("val1").unwrap().unwrap(), 1);
+//!
+//! ```
+
+use std::fs::File;
+use std::io::ErrorKind;
+use std::path::{Path, PathBuf};
+use std::sync::Arc;
+use std::time::Duration;
+
+use serde::{de::DeserializeOwned, Deserialize, Serialize};
+use tokio::task::JoinError;
+
+use proxmox_sys::fs::CreateOptions;
+
+/// Error type for [`NamespacedCache`].
+#[derive(thiserror::Error, Debug)]
+pub enum CacheError {
+ #[error("IO error: {0}")]
+ Io(#[from] std::io::Error),
+
+ #[error("serialization error: {0}")]
+ Serde(#[from] serde_json::Error),
+
+ #[error("error: {0}")]
+ Other(#[from] anyhow::Error),
+
+ #[error("invalid key: '{0}'")]
+ InvalidKey(String),
+
+ #[error("invalid namespace: '{0}'")]
+ InvalidNamespace(String),
+
+ #[error("join error: {0}")]
+ JoinError(#[from] JoinError),
+}
+
+/// A generic, namespaced cache with optional value expiration.
+pub struct NamespacedCache {
+ base_directory: PathBuf,
+ file_options: CreateOptions,
+ dir_options: CreateOptions,
+}
+
+impl NamespacedCache {
+ /// Create a new cache instance.
+ ///
+ /// Cache entries will be persisted in the provided `base_directory`.
+ /// `dir_options` are the [`CreateOptions`] used the namespace directories, while `file_options`
+ /// are the ones for persisted cache entries.
+ pub fn new<P: Into<PathBuf>>(
+ base_directory: P,
+ dir_options: CreateOptions,
+ file_options: CreateOptions,
+ ) -> Self {
+ Self {
+ base_directory: base_directory.into(),
+ dir_options,
+ file_options,
+ }
+ }
+
+ /// Lock a namespace for writing (blocking interface).
+ ///
+ /// This should *not* be called from async code. Use [`NamespacedCache::write`] instead.
+ pub fn write_blocking(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ ) -> Result<BlockingWritableCacheNamespace, CacheError> {
+ ensure_valid_namespace(namespace)?;
+ let lock = self.lock_namespace_impl_blocking(namespace, timeout, true)?;
+
+ Ok(BlockingWritableCacheNamespace {
+ inner: WriteableInner {
+ _lock: lock,
+ namespace: namespace.to_string(),
+ base_path: self.base_directory.clone(),
+ dir_options: self.dir_options,
+ file_options: self.file_options,
+ },
+ })
+ }
+
+ /// Lock a namespace for writing (async interface).
+ pub async fn write(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ ) -> Result<WritableCacheNamespace, CacheError> {
+ ensure_valid_namespace(namespace)?;
+ let lock = self.lock_namespace_impl(namespace, timeout, true).await?;
+
+ Ok(WritableCacheNamespace {
+ inner: Arc::new(WriteableInner {
+ _lock: lock,
+ namespace: namespace.to_string(),
+ base_path: self.base_directory.clone(),
+ dir_options: self.dir_options,
+ file_options: self.file_options,
+ }),
+ })
+ }
+
+ /// Lock a namespace for reading (blocking interface).
+ ///
+ /// This should *not* be called from async code. Use [`NamespacedCache::read`] instead.
+ pub fn read_blocking(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ ) -> Result<BlockingReadableCacheNamespace, CacheError> {
+ ensure_valid_namespace(namespace)?;
+ let lock = self.lock_namespace_impl_blocking(namespace, timeout, false)?;
+
+ Ok(BlockingReadableCacheNamespace {
+ inner: ReadableInner {
+ _lock: lock,
+ namespace: namespace.to_string(),
+ base_path: self.base_directory.clone(),
+ },
+ })
+ }
+
+ /// Lock a namespace for reading (async interface).
+ pub async fn read(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ ) -> Result<ReadableCacheNamespace, CacheError> {
+ ensure_valid_namespace(namespace)?;
+ let lock = self.lock_namespace_impl(namespace, timeout, false).await?;
+
+ Ok(ReadableCacheNamespace {
+ inner: Arc::new(ReadableInner {
+ _lock: lock,
+ namespace: namespace.to_string(),
+ base_path: self.base_directory.clone(),
+ }),
+ })
+ }
+
+ async fn lock_namespace_impl(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ exclusive: bool,
+ ) -> Result<File, CacheError> {
+ let path = get_lockfile(&self.base_directory, namespace);
+ let file_options = self.file_options;
+ let lock = tokio::task::spawn_blocking(move || {
+ proxmox_sys::fs::open_file_locked(&path, timeout, exclusive, file_options)
+ })
+ .await??;
+
+ Ok(lock)
+ }
+
+ fn lock_namespace_impl_blocking(
+ &self,
+ namespace: &str,
+ timeout: Duration,
+ exclusive: bool,
+ ) -> Result<File, CacheError> {
+ let path = get_lockfile(&self.base_directory, namespace);
+ let lock = proxmox_sys::fs::open_file_locked(&path, timeout, exclusive, self.file_options)?;
+
+ Ok(lock)
+ }
+}
+
+/// A readable cache namespace (blocking interface).
+pub struct BlockingReadableCacheNamespace {
+ inner: ReadableInner,
+}
+
+/// A readable cache namespace (async interface).
+pub struct ReadableCacheNamespace {
+ inner: Arc<ReadableInner>,
+}
+
+struct ReadableInner {
+ _lock: File,
+ namespace: String,
+ base_path: PathBuf,
+}
+
+impl BlockingReadableCacheNamespace {
+ /// Read a value from the cache.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub fn get<T: Serialize + DeserializeOwned>(&self, key: &str) -> Result<Option<T>, CacheError> {
+ get_impl(&self.inner.base_path, &self.inner.namespace, key, None)
+ }
+
+ /// Read a value from the cache, given a maximum age of the cache entry.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub fn get_with_max_age<T: Serialize + DeserializeOwned>(
+ &self,
+ key: &str,
+ max_age: i64,
+ ) -> Result<Option<T>, CacheError> {
+ get_impl(
+ &self.inner.base_path,
+ &self.inner.namespace,
+ key,
+ Some(max_age),
+ )
+ }
+}
+
+impl ReadableCacheNamespace {
+ /// Read a value from the cache.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub async fn get<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ ) -> Result<Option<T>, CacheError> {
+ let key = key.to_string();
+ let inner = Arc::clone(&self.inner);
+
+ tokio::task::spawn_blocking(move || {
+ get_impl(&inner.base_path, &inner.namespace, &key, None)
+ })
+ .await?
+ }
+
+ /// Read a value from the cache, given a maximum age of the cache entry.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub async fn get_with_max_age<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ max_age: i64,
+ ) -> Result<Option<T>, CacheError> {
+ let key = key.to_string();
+ let inner = Arc::clone(&self.inner);
+
+ tokio::task::spawn_blocking(move || {
+ get_impl(&inner.base_path, &inner.namespace, &key, Some(max_age))
+ })
+ .await?
+ }
+}
+
+/// A writable cache namespace (blocking interface).
+pub struct BlockingWritableCacheNamespace {
+ inner: WriteableInner,
+}
+
+/// A writable cache namespace (async interface).
+pub struct WritableCacheNamespace {
+ inner: Arc<WriteableInner>,
+}
+
+struct WriteableInner {
+ _lock: File,
+ namespace: String,
+ base_path: PathBuf,
+ dir_options: CreateOptions,
+ file_options: CreateOptions,
+}
+
+impl BlockingWritableCacheNamespace {
+ /// Remote a cache entry.
+ ///
+ /// This returns `Ok(())` if the key does not exist.
+ ///
+ /// # Errors:
+ /// - The file could not be deleted due to insufficient privileges.
+ pub fn remove(&self, key: &str) -> Result<(), CacheError> {
+ remove_impl(&self.inner, key)
+ }
+
+ /// Set a cache entry.
+ ///
+ /// # Errors
+ /// - `value` could not be serialized
+ /// - The namespace directory could not be created
+ /// - The cache file could not be written to or atomically replaced
+ pub fn set<T: Serialize + DeserializeOwned>(
+ &self,
+ key: &str,
+ value: T,
+ ) -> Result<(), CacheError> {
+ set_impl(&self.inner, key, value, proxmox_time::epoch_i64())
+ }
+
+ /// Set a cache entry with an explicitly provided timestamp.
+ ///
+ /// # Errors
+ /// - `value` could not be serialized
+ /// - The namespace directory could not be created
+ /// - The cache file could not be written to or atomically replaced
+ pub fn set_with_timestamp<T: Serialize + DeserializeOwned>(
+ &self,
+ key: &str,
+ value: T,
+ timestamp: i64,
+ ) -> Result<(), CacheError> {
+ set_impl(&self.inner, key, value, timestamp)
+ }
+
+ /// Read a value from the cache.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub fn get<T: Serialize + DeserializeOwned>(&self, key: &str) -> Result<Option<T>, CacheError> {
+ get_impl(&self.inner.base_path, &self.inner.namespace, key, None)
+ }
+
+ /// Read a value from the cache, given a maximum age of the cache entry.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub fn get_with_max_age<T: Serialize + DeserializeOwned>(
+ &self,
+ key: &str,
+ max_age: i64,
+ ) -> Result<Option<T>, CacheError> {
+ get_impl(
+ &self.inner.base_path,
+ &self.inner.namespace,
+ key,
+ Some(max_age),
+ )
+ }
+}
+
+impl WritableCacheNamespace {
+ /// Remote a cache entry.
+ ///
+ /// This returns `Ok(())` if the key does not exist.
+ ///
+ /// # Errors:
+ /// - The file could not be deleted due to insufficient privileges.
+ pub async fn remove(&self, key: &str) -> Result<(), CacheError> {
+ let inner = Arc::clone(&self.inner);
+ let key = key.to_string();
+
+ tokio::task::spawn_blocking(move || remove_impl(&inner, &key)).await?
+ }
+
+ /// Set a cache entry.
+ ///
+ /// # Errors
+ /// - `value` could not be serialized
+ /// - The namespace directory could not be created
+ /// - The cache file could not be written to or atomically replaced
+ pub async fn set<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ value: T,
+ ) -> Result<(), CacheError> {
+ let inner = Arc::clone(&self.inner);
+ let key = key.to_string();
+
+ tokio::task::spawn_blocking(move || {
+ set_impl(&inner, &key, value, proxmox_time::epoch_i64())
+ })
+ .await?
+ }
+
+ /// Set a cache entry with an explicitly provided timestamp.
+ ///
+ /// # Errors
+ /// - `value` could not be serialized
+ /// - The namespace directory could not be created
+ /// - The cache file could not be written to or atomically replaced
+ pub async fn set_with_timestamp<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ value: T,
+ timestamp: i64,
+ ) -> Result<(), CacheError> {
+ let inner = Arc::clone(&self.inner);
+ let key = key.to_string();
+ tokio::task::spawn_blocking(move || set_impl(&inner, &key, value, timestamp)).await?
+ }
+
+ /// Read a value from the cache.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub async fn get<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ ) -> Result<Option<T>, CacheError> {
+ let key = key.to_string();
+ let inner = Arc::clone(&self.inner);
+
+ tokio::task::spawn_blocking(move || {
+ get_impl(&inner.base_path, &inner.namespace, &key, None)
+ })
+ .await?
+ }
+
+ /// Read a value from the cache, given a maximum age of the cache entry.
+ ///
+ /// # Errors:
+ /// - The file associated with this key could not be read
+ /// - The file could not be deserialized (e.g. invalid format)
+ pub async fn get_with_max_age<T: Serialize + DeserializeOwned + Send + 'static>(
+ &self,
+ key: &str,
+ max_age: i64,
+ ) -> Result<Option<T>, CacheError> {
+ let key = key.to_string();
+ let inner = Arc::clone(&self.inner);
+
+ tokio::task::spawn_blocking(move || {
+ get_impl(&inner.base_path, &inner.namespace, &key, Some(max_age))
+ })
+ .await?
+ }
+}
+
+#[derive(Serialize, Deserialize, Debug)]
+struct CacheEntry<T> {
+ timestamp: i64,
+ value: T,
+}
+
+impl<T> CacheEntry<T> {
+ fn is_expired(&self, now: i64, max_age: i64) -> bool {
+ if max_age == 0 {
+ return true;
+ }
+
+ let diff = now - self.timestamp;
+ diff >= max_age || diff < 0
+ }
+}
+
+fn get_impl<T: Serialize + DeserializeOwned>(
+ base: &Path,
+ namespace: &str,
+ key: &str,
+ max_age: Option<i64>,
+) -> Result<Option<T>, CacheError> {
+ // Namespace should already be verified at this point, no point in checking it again.
+ ensure_valid_key(key)?;
+
+ let path = get_path(base, namespace, key);
+ let content = proxmox_sys::fs::file_read_optional_string(path)?;
+
+ if let Some(content) = content {
+ let val = serde_json::from_str::<CacheEntry<T>>(&content)?;
+
+ if let Some(max_age) = max_age {
+ if val.is_expired(proxmox_time::epoch_i64(), max_age) {
+ return Ok(None);
+ }
+ }
+ return Ok(Some(val.value));
+ }
+
+ Ok(None)
+}
+
+fn set_impl<T: Serialize + DeserializeOwned>(
+ inner: &WriteableInner,
+ key: &str,
+ value: T,
+ timestamp: i64,
+) -> Result<(), CacheError> {
+ ensure_valid_key(key)?;
+ let path = get_path(&inner.base_path, &inner.namespace, key);
+
+ proxmox_sys::fs::create_path(
+ path.parent().unwrap(),
+ Some(inner.dir_options),
+ Some(inner.dir_options),
+ )?;
+
+ let entry = CacheEntry { timestamp, value };
+
+ let data = serde_json::to_vec(&entry)?;
+ proxmox_sys::fs::replace_file(path, &data, inner.file_options, true)?;
+
+ Ok(())
+}
+
+fn remove_impl(inner: &WriteableInner, key: &str) -> Result<(), CacheError> {
+ ensure_valid_key(key)?;
+ let path = get_path(&inner.base_path, &inner.namespace, key);
+
+ if let Err(err) = std::fs::remove_file(path) {
+ if err.kind() == ErrorKind::NotFound {
+ return Ok(());
+ }
+
+ return Err(err.into());
+ }
+
+ Ok(())
+}
+
+fn get_path(base: &Path, namespace: &str, key: &str) -> PathBuf {
+ let path = base.join(namespace).join(format!("{key}.json"));
+ path
+}
+
+fn get_lockfile(base: &Path, namespace: &str) -> PathBuf {
+ let path = base.join(format!(".{namespace}.lock"));
+ path
+}
+
+// Make sure that an identifier is safe to use as a cache key.
+//
+// At the moment, this only checks for '../', as to ensure that the base directory
+// is not escaped.
+fn ensure_valid_key(key: &str) -> Result<(), CacheError> {
+ if key.contains("../") {
+ return Err(CacheError::InvalidKey(key.into()));
+ }
+
+ Ok(())
+}
+
+// Make sure that an identifier is safe to use as a namespace.
+//
+// At the moment, this ensures that there is no '../' in the namespace, as well as that the
+// identifier does not start with '/'.
+fn ensure_valid_namespace(namespace: &str) -> Result<(), CacheError> {
+ if namespace.contains("../") || namespace.starts_with("/") {
+ return Err(CacheError::InvalidNamespace(namespace.into()));
+ }
+
+ Ok(())
+}
+
+#[cfg(test)]
+mod tests {
+ use tempfile::TempDir;
+
+ use super::*;
+
+ const TIMEOUT: Duration = Duration::from_secs(1);
+
+ fn make_cache() -> (TempDir, NamespacedCache) {
+ let dir = tempfile::tempdir().unwrap();
+
+ let cache = NamespacedCache::new(dir.as_ref(), CreateOptions::new(), CreateOptions::new());
+
+ (dir, cache)
+ }
+
+ #[test]
+ fn test_cache() {
+ let (_dir, cache) = make_cache();
+
+ let write_guard = cache.write_blocking("remote-a", TIMEOUT).unwrap();
+ write_guard.set("val1", 1).unwrap();
+ write_guard.set("val2", 1).unwrap();
+
+ assert_eq!(write_guard.get::<i32>("val1").unwrap().unwrap(), 1);
+
+ write_guard.remove("val1").unwrap();
+ assert!(write_guard.get::<String>("val1").unwrap().is_none());
+
+ drop(write_guard);
+
+ let read_guard = cache.read_blocking("remote-a", TIMEOUT).unwrap();
+
+ assert_eq!(read_guard.get::<i32>("val2").unwrap().unwrap(), 1);
+ }
+
+ #[test]
+ fn test_remove_nonexisting() {
+ let (_dir, cache) = make_cache();
+
+ let a = cache.write_blocking("remote-a", TIMEOUT).unwrap();
+
+ // Deleting a key that does not exist is okay and should not error.
+ assert!(a.remove("val").is_ok());
+ }
+
+ #[test]
+ fn test_remove_failure() {
+ let (dir, cache) = make_cache();
+
+ let a = cache.write_blocking("remote-a", TIMEOUT).unwrap();
+ // Triggering a general failure by generating a directory that conflicts with the cache key
+ std::fs::create_dir_all(dir.path().join("remote-a").join("val.json")).unwrap();
+ assert!(a.remove("val").is_err());
+ }
+
+ #[test]
+ fn test_get_with_max_age() {
+ let (_dir, cache) = make_cache();
+
+ let write_guard = cache.write_blocking("remote-a", TIMEOUT).unwrap();
+
+ let now = proxmox_time::epoch_i64();
+
+ write_guard
+ .set_with_timestamp("somekey", 1, now - 1000)
+ .unwrap();
+
+ assert!(write_guard
+ .get_with_max_age::<i32>("somekey", 999)
+ .unwrap()
+ .is_none());
+
+ drop(write_guard);
+ let read_guard = cache.read_blocking("remote-a", TIMEOUT).unwrap();
+ assert!(read_guard
+ .get_with_max_age::<i32>("somekey", 999)
+ .unwrap()
+ .is_none());
+ }
+
+ #[test]
+ fn test_expiration() {
+ let entry = CacheEntry {
+ value: (),
+ timestamp: 1000,
+ };
+
+ assert!(!entry.is_expired(1000, 100));
+ assert!(!entry.is_expired(1099, 100));
+ assert!(entry.is_expired(1100, 100));
+ assert!(entry.is_expired(1101, 100));
+
+ // if max-age is 0, the entry is never fresh
+ assert!(entry.is_expired(1000, 0));
+ }
+
+ #[test]
+ fn test_invalid_namespaces() {
+ let (_dir, cache) = make_cache();
+
+ for id in [
+ "../remote-a",
+ "remote-a/../something",
+ "remote-a/../",
+ "../",
+ ] {
+ assert!(matches!(
+ cache.write_blocking(id, TIMEOUT),
+ Err(CacheError::InvalidNamespace(_))
+ ));
+ assert!(matches!(
+ cache.read_blocking(id, TIMEOUT),
+ Err(CacheError::InvalidNamespace(_))
+ ));
+ }
+ }
+
+ #[test]
+ fn test_invalid_keys() {
+ let (_dir, cache) = make_cache();
+
+ let write_guard = cache.write_blocking("remote-a", TIMEOUT).unwrap();
+ let read_guard = cache.write_blocking("remote-b", TIMEOUT).unwrap();
+
+ for id in ["../somekey", "somekey/../something", "somekey/../", "../"] {
+ assert!(matches!(
+ write_guard.set(id, ()),
+ Err(CacheError::InvalidKey(_))
+ ));
+ assert!(matches!(
+ write_guard.get::<()>(id),
+ Err(CacheError::InvalidKey(_))
+ ));
+ assert!(matches!(
+ write_guard.get_with_max_age::<()>(id, 1000),
+ Err(CacheError::InvalidKey(_))
+ ));
+ assert!(matches!(
+ write_guard.remove(id),
+ Err(CacheError::InvalidKey(_))
+ ));
+ assert!(matches!(
+ read_guard.get::<()>(id),
+ Err(CacheError::InvalidKey(_))
+ ));
+ assert!(matches!(
+ read_guard.get_with_max_age::<()>(id, 1000),
+ Err(CacheError::InvalidKey(_))
+ ));
+ }
+ }
+
+ #[tokio::test]
+ async fn test_async() {
+ let (_dir, cache) = make_cache();
+
+ let lock = cache.write("some-remote", TIMEOUT).await.unwrap();
+
+ lock.set("somekey", 1234).await.unwrap();
+ assert_eq!(lock.get::<i32>("somekey").await.unwrap(), Some(1234));
+ lock.remove("somekey").await.unwrap();
+ assert!(lock.get::<i32>("somekey").await.unwrap().is_none());
+
+ lock.set_with_timestamp("somekey", 1234, proxmox_time::epoch_i64() - 1000)
+ .await
+ .unwrap();
+
+ assert!(lock
+ .get_with_max_age::<i32>("somekey", 900)
+ .await
+ .unwrap()
+ .is_none());
+
+ drop(lock);
+
+ let lock = cache.read("some-remote", TIMEOUT).await.unwrap();
+ assert_eq!(lock.get::<i32>("somekey").await.unwrap(), Some(1234));
+ assert!(lock
+ .get_with_max_age::<i32>("somekey", 900)
+ .await
+ .unwrap()
+ .is_none());
+ }
+}
--
2.47.3
* [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation Lukas Wagner
@ 2026-05-13 13:54 ` Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-13 13:54 ` [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache Lukas Wagner
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Lukas Wagner @ 2026-05-13 13:54 UTC (permalink / raw)
To: pdm-devel
This is a thin wrapper around the previously introduced namespaced
key-value cache, adding PDM-specific concepts on top.
Instead of the generic read/write methods for locking a namespace, this
wrapper provides {read,write}_remote and {read,write}_global for
accessing remote-specific and globally cached values. The cache
namespaces are 'global' and 'remote-<remote-name>'.
The base directory for the cache is
/run/proxmox-datacenter-manager/api-cache
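For illustration, with the path helpers from the previous patch, a value
cached under a (hypothetical) key 'some-key' for a (hypothetical) remote
'pve-one' ends up at

    /run/proxmox-datacenter-manager/api-cache/remote-pve-one/some-key.json

with the corresponding per-namespace lock file at

    /run/proxmox-datacenter-manager/api-cache/.remote-pve-one.lock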
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---
Notes:
Changes since the RFC:
- Changed the cache location to
/run/proxmox-datacenter-manager/api-cache
- Renamed the module from pdm_cache to api_cache
- removed the instance() accessor and use freestanding functions
instead. While the instance-based approach might come in useful at a
later stage when dependency injection is more widely used in our
stack, it's a premature step that can be reintroduced later without
big troubles
- Some minor code cleanup
server/src/api_cache.rs | 126 ++++++++++++++++++
.../bin/proxmox-datacenter-privileged-api.rs | 9 +-
server/src/lib.rs | 1 +
3 files changed, 135 insertions(+), 1 deletion(-)
create mode 100644 server/src/api_cache.rs
diff --git a/server/src/api_cache.rs b/server/src/api_cache.rs
new file mode 100644
index 00000000..b20f0535
--- /dev/null
+++ b/server/src/api_cache.rs
@@ -0,0 +1,126 @@
+//! Cache for API responses from remotes.
+//!
+//! This cache is namespaced by remote and also offers a 'global' namespace
+//! for values that are valid across remotes (e.g. aggregations).
+//!
+//! The cache offers both a blocking and an async interface that can be used to get or set
+//! cache entries.
+//!
+//! A namespace (either a remote-specific one or the global one) must be locked before it can be
+//! accessed. All locking functions use a 10 second timeout while waiting for the lock.
+//!
+//! ## Blocking interface
+//! - [`read_remote_blocking`]
+//! - [`write_remote_blocking`]
+//! - [`read_global_blocking`]
+//! - [`write_global_blocking`]
+//!
+//! These functions return [`BlockingReadableCacheNamespace`] and [`BlockingWritableCacheNamespace`], respectively.
+//! Both only offer blocking operations for interacting with the entries of the locked namespace.
+//!
+//! ## `async` interface
+//! - [`read_remote`]
+//! - [`write_remote`]
+//! - [`read_global`]
+//! - [`write_global`]
+//!
+//! These functions return [`ReadableCacheNamespace`] and [`WritableCacheNamespace`], respectively.
+//! Both offer an async wrapper for interacting with the entries of the locked namespace.
+//!
+//! ```no_run
+//! use server::api_cache;
+//!
+//! #[derive(serde::Serialize, serde::Deserialize)]
+//! struct CacheableData {
+//! id: String,
+//! }
+//!
+//! let data = CacheableData {
+//! id: "some-id".to_string(),
+//! };
+//!
+//! // Lock the cache namespace for 'some-remote' for write access
+//! let lock = api_cache::write_remote_blocking("some-remote").unwrap();
+//!
+//! // Set some value (must be Serialize + Deserialize)
+//! lock.set("some-key", data).unwrap();
+//!
+//! // Retrieve the cached entry
+//! let data: Option<CacheableData> = lock.get("some-key").unwrap();
+//!
+//! // Remove the cached entry
+//! lock.remove("some-key").unwrap();
+//!
+//! ```
+
+use std::path::PathBuf;
+use std::sync::LazyLock;
+use std::time::Duration;
+
+use nix::sys::stat::Mode;
+
+use crate::namespaced_cache::{
+ BlockingReadableCacheNamespace, BlockingWritableCacheNamespace, CacheError, NamespacedCache,
+ ReadableCacheNamespace, WritableCacheNamespace,
+};
+
+/// Path at which API responses are cached.
+pub const PDM_API_CACHE_PATH: &str = concat!(pdm_buildcfg::PDM_RUN_DIR_M!(), "/api-cache");
+
+const GLOBAL_NAMESPACE: &str = "global";
+const LOCK_TIMEOUT: Duration = Duration::from_secs(10);
+
+static CACHE: LazyLock<NamespacedCache> = LazyLock::new(|| {
+ let file_options = proxmox_product_config::default_create_options();
+ let dir_options = file_options.perm(Mode::from_bits_truncate(0o750));
+
+ NamespacedCache::new(PathBuf::from(PDM_API_CACHE_PATH), dir_options, file_options)
+});
+
+fn format_remote_namespace(remote: &str) -> String {
+ format!("remote-{remote}")
+}
+
+/// Lock the cache for reading remote-specific data (blocking interface).
+pub fn read_remote_blocking(remote: &str) -> Result<BlockingReadableCacheNamespace, CacheError> {
+ CACHE.read_blocking(&format_remote_namespace(remote), LOCK_TIMEOUT)
+}
+
+/// Lock the cache for writing remote-specific data (blocking interface).
+pub fn write_remote_blocking(remote: &str) -> Result<BlockingWritableCacheNamespace, CacheError> {
+ CACHE.write_blocking(&format_remote_namespace(remote), LOCK_TIMEOUT)
+}
+
+/// Lock the cache for reading global data (blocking interface).
+pub fn read_global_blocking() -> Result<BlockingReadableCacheNamespace, CacheError> {
+ CACHE.read_blocking(GLOBAL_NAMESPACE, LOCK_TIMEOUT)
+}
+
+/// Lock the cache for writing global data (blocking interface).
+pub fn write_global_blocking() -> Result<BlockingWritableCacheNamespace, CacheError> {
+ CACHE.write_blocking(GLOBAL_NAMESPACE, LOCK_TIMEOUT)
+}
+
+/// Lock the cache for reading remote-specific data (async interface).
+pub async fn read_remote(remote: &str) -> Result<ReadableCacheNamespace, CacheError> {
+ CACHE
+ .read(&format_remote_namespace(remote), LOCK_TIMEOUT)
+ .await
+}
+
+/// Lock the cache for writing remote-specific data (async interface).
+pub async fn write_remote(remote: &str) -> Result<WritableCacheNamespace, CacheError> {
+ CACHE
+ .write(&format_remote_namespace(remote), LOCK_TIMEOUT)
+ .await
+}
+
+/// Lock the cache for reading global data (async interface).
+pub async fn read_global() -> Result<ReadableCacheNamespace, CacheError> {
+ CACHE.read(GLOBAL_NAMESPACE, LOCK_TIMEOUT).await
+}
+
+/// Lock the cache for writing global data (async interface).
+pub async fn write_global() -> Result<WritableCacheNamespace, CacheError> {
+ CACHE.write(GLOBAL_NAMESPACE, LOCK_TIMEOUT).await
+}
diff --git a/server/src/bin/proxmox-datacenter-privileged-api.rs b/server/src/bin/proxmox-datacenter-privileged-api.rs
index 6b490f2b..a3c448cf 100644
--- a/server/src/bin/proxmox-datacenter-privileged-api.rs
+++ b/server/src/bin/proxmox-datacenter-privileged-api.rs
@@ -14,7 +14,7 @@ use proxmox_rest_server::{ApiConfig, RestServer};
use proxmox_router::RpcEnvironmentType;
use proxmox_sys::fs::CreateOptions;
-use server::auth;
+use server::{api_cache, auth};
use pdm_buildcfg::configdir;
@@ -102,6 +102,13 @@ fn create_directories() -> Result<(), Error> {
0o755,
)?;
+ pdm_config::setup::mkdir_perms(
+ api_cache::PDM_API_CACHE_PATH,
+ api_user.uid,
+ api_user.gid,
+ 0o755,
+ )?;
+
server::jobstate::create_jobstate_dir()?;
Ok(())
diff --git a/server/src/lib.rs b/server/src/lib.rs
index 0b7642ab..89ab3035 100644
--- a/server/src/lib.rs
+++ b/server/src/lib.rs
@@ -2,6 +2,7 @@
pub mod acl;
pub mod api;
+pub mod api_cache;
pub mod auth;
pub mod context;
pub mod env;
--
2.47.3
* [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache Lukas Wagner
@ 2026-05-13 13:54 ` Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-13 13:54 ` [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache Lukas Wagner
2026-05-15 8:30 ` superseded: [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
4 siblings, 1 reply; 14+ messages in thread
From: Lukas Wagner @ 2026-05-13 13:54 UTC (permalink / raw)
To: pdm-devel
Instead of storing the subscription state in a module-level HashMap, use
the newly introduced api_cache module to cache the status.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---
Notes:
Changes since the RFC:
- Use new async cache interface
server/src/api/resources.rs | 81 +++++++++++++------------------------
1 file changed, 29 insertions(+), 52 deletions(-)
diff --git a/server/src/api/resources.rs b/server/src/api/resources.rs
index 50315b11..f549c250 100644
--- a/server/src/api/resources.rs
+++ b/server/src/api/resources.rs
@@ -30,9 +30,10 @@ use proxmox_schema::{api, parse_boolean};
use proxmox_sortable_macro::sortable;
use proxmox_subscription::SubscriptionStatus;
use pve_api_types::{ClusterResource, ClusterResourceNetworkType, ClusterResourceType};
+use serde::{Deserialize, Serialize};
use crate::metric_collection::top_entities;
-use crate::{connection, views};
+use crate::{api_cache, connection, views};
pub const ROUTER: Router = Router::new()
.get(&list_subdirs_api_method!(SUBDIRS))
@@ -798,15 +799,11 @@ async fn get_top_entities(
Ok(res)
}
-#[derive(Clone)]
+#[derive(Clone, Serialize, Deserialize)]
struct CachedSubscriptionState {
node_info: HashMap<String, Option<NodeSubscriptionInfo>>,
- timestamp: i64,
}
-static SUBSCRIPTION_CACHE: LazyLock<RwLock<HashMap<String, CachedSubscriptionState>>> =
- LazyLock::new(|| RwLock::new(HashMap::new()));
-
/// Get the subscription state for a given remote.
///
/// If recent enough cached data is available, it is returned
@@ -815,66 +812,46 @@ pub async fn get_subscription_info_for_remote(
remote: &Remote,
max_age: u64,
) -> Result<HashMap<String, Option<NodeSubscriptionInfo>>, Error> {
- if let Some(cached_subscription) = get_cached_subscription_info(&remote.id, max_age) {
+ if let Some(cached_subscription) =
+ get_cached_subscription_info(remote.id.clone(), max_age).await?
+ {
Ok(cached_subscription.node_info)
} else {
let node_info = fetch_remote_subscription_info(remote).await?;
- let now = proxmox_time::epoch_i64();
- update_cached_subscription_info(&remote.id, &node_info, now);
+ update_cached_subscription_info(remote.id.clone(), node_info.clone()).await?;
Ok(node_info)
}
}
-fn get_cached_subscription_info(remote: &str, max_age: u64) -> Option<CachedSubscriptionState> {
- let cache = SUBSCRIPTION_CACHE
- .read()
- .expect("subscription mutex poisoned");
+const SUBSCRIPTION_STATE_CACHE_KEY: &str = "subscription-state";
- if max_age == 0 {
- return None;
- }
- if let Some(cached_subscription) = cache.get(remote) {
- let now = proxmox_time::epoch_i64();
- let diff = now - cached_subscription.timestamp;
-
- if diff >= max_age as i64 || diff < 0 {
- // value is too old or from the future
- None
- } else {
- Some(cached_subscription.clone())
- }
- } else {
- None
- }
+async fn get_cached_subscription_info(
+ remote: String,
+ max_age: u64,
+) -> Result<Option<CachedSubscriptionState>, Error> {
+ let cache = api_cache::read_remote(&remote).await?;
+ Ok(cache
+ .get_with_max_age(SUBSCRIPTION_STATE_CACHE_KEY, max_age as i64)
+ .await?)
}
/// Update cached subscription data.
///
/// If the cache already contains more recent data we don't insert the passed resources.
-fn update_cached_subscription_info(
- remote: &str,
- node_info: &HashMap<String, Option<NodeSubscriptionInfo>>,
- now: i64,
-) {
- // there is no good way to recover from this, so panicking should be fine
- let mut cache = SUBSCRIPTION_CACHE
- .write()
- .expect("subscription mutex poisoned");
+async fn update_cached_subscription_info(
+ remote: String,
+ node_info: HashMap<String, Option<NodeSubscriptionInfo>>,
+) -> Result<(), Error> {
+ let cache = api_cache::write_remote(&remote).await?;
- if let Some(cached_resource) = cache.get(remote) {
- // skip updating if the data is new enough
- if cached_resource.timestamp >= now {
- return;
- }
- }
-
- cache.insert(
- remote.into(),
- CachedSubscriptionState {
- node_info: node_info.clone(),
- timestamp: now,
- },
- );
+ Ok(cache
+ .set(
+ SUBSCRIPTION_STATE_CACHE_KEY,
+ CachedSubscriptionState {
+ node_info: node_info,
+ },
+ )
+ .await?)
}
/// Maps a list of node subscription infos into a single [`RemoteSubscriptionState`]
--
2.47.3
* [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
` (2 preceding siblings ...)
2026-05-13 13:54 ` [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache Lukas Wagner
@ 2026-05-13 13:54 ` Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 8:30 ` superseded: [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
4 siblings, 1 reply; 14+ messages in thread
From: Lukas Wagner @ 2026-05-13 13:54 UTC (permalink / raw)
To: pdm-devel
Use the new, centralized API cache for caching remote update summaries.
Add some cleanup logic to remove the old cachefile; the cleanup code
itself can be dropped again at some point in the future.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---
Notes:
Changes since the RFC:
- use new async cache interface
- clean up old cachefile automatically
server/src/api/pve/mod.rs | 4 +-
server/src/api/remotes/updates.rs | 4 +-
server/src/remote_updates.rs | 87 ++++++++++++++++---------------
3 files changed, 49 insertions(+), 46 deletions(-)
diff --git a/server/src/api/pve/mod.rs b/server/src/api/pve/mod.rs
index 20892f38..649ab624 100644
--- a/server/src/api/pve/mod.rs
+++ b/server/src/api/pve/mod.rs
@@ -588,8 +588,8 @@ pub async fn get_options(remote: String) -> Result<serde_json::Value, Error> {
},
)]
/// Return the cached update information about a remote.
-pub fn get_updates(remote: String) -> Result<RemoteUpdateSummary, Error> {
- let update_summary = get_available_updates_for_remote(&remote)?;
+pub async fn get_updates(remote: String) -> Result<RemoteUpdateSummary, Error> {
+ let update_summary = get_available_updates_for_remote(&remote).await?;
Ok(update_summary)
}
diff --git a/server/src/api/remotes/updates.rs b/server/src/api/remotes/updates.rs
index 365ffc19..0ca78214 100644
--- a/server/src/api/remotes/updates.rs
+++ b/server/src/api/remotes/updates.rs
@@ -42,7 +42,7 @@ const SUBDIRS: SubdirMap = &sorted!([
returns: { type: UpdateSummary }
)]
/// Return available update summary for managed remote nodes.
-pub fn update_summary(rpcenv: &mut dyn RpcEnvironment) -> Result<UpdateSummary, Error> {
+pub async fn update_summary(rpcenv: &mut dyn RpcEnvironment) -> Result<UpdateSummary, Error> {
let auth_id = rpcenv
.get_auth_id()
.context("no authid available")?
@@ -53,7 +53,7 @@ pub fn update_summary(rpcenv: &mut dyn RpcEnvironment) -> Result<UpdateSummary,
http_bail!(FORBIDDEN, "user has no access to resources");
}
- let mut update_summary = remote_updates::get_available_updates_summary()?;
+ let mut update_summary = remote_updates::get_available_updates_summary().await?;
update_summary.remotes.retain(|remote_name, _| {
user_info
diff --git a/server/src/remote_updates.rs b/server/src/remote_updates.rs
index 7aaacc46..0129a563 100644
--- a/server/src/remote_updates.rs
+++ b/server/src/remote_updates.rs
@@ -1,6 +1,3 @@
-use std::fs::File;
-use std::io::ErrorKind;
-
use anyhow::{bail, Error};
use serde::{Deserialize, Serialize};
@@ -12,12 +9,13 @@ use pdm_api_types::remote_updates::{
};
use pdm_api_types::remotes::{Remote, RemoteType};
use pdm_api_types::RemoteUpid;
-use pdm_buildcfg::PDM_CACHE_DIR_M;
-use crate::connection;
use crate::parallel_fetcher::ParallelFetcher;
+use crate::{api_cache, connection};
-pub const UPDATE_CACHE: &str = concat!(PDM_CACHE_DIR_M!(), "/remote-updates.json");
+const OLD_CACHEFILE: &str = concat!(pdm_buildcfg::PDM_CACHE_DIR_M!(), "/remote-updates.json");
+
+const UPDATE_SUMMARY_CACHE_KEY: &str = "remote-updates";
#[derive(Clone, Default, Debug, Deserialize, Serialize)]
#[serde(rename_all = "kebab-case")]
@@ -106,10 +104,10 @@ pub async fn get_changelog(remote: &Remote, node: &str, package: String) -> Resu
}
/// Get update summary for all managed remotes.
-pub fn get_available_updates_summary() -> Result<UpdateSummary, Error> {
+pub async fn get_available_updates_summary() -> Result<UpdateSummary, Error> {
let (config, _digest) = pdm_config::remotes::config()?;
- let cache_content = get_cached_summary_or_default()?;
+ let cache_content = get_cached_summary_or_default().await?;
let mut summary = UpdateSummary::default();
@@ -137,11 +135,11 @@ pub fn get_available_updates_summary() -> Result<UpdateSummary, Error> {
}
/// Return cached update information from specific remote
-pub fn get_available_updates_for_remote(remote: &str) -> Result<RemoteUpdateSummary, Error> {
+pub async fn get_available_updates_for_remote(remote: &str) -> Result<RemoteUpdateSummary, Error> {
let (config, _digest) = pdm_config::remotes::config()?;
if let Some(remote) = config.get(remote) {
- let cache_content = get_cached_summary_or_default()?;
+ let cache_content = get_cached_summary_or_default().await?;
Ok(cache_content
.remotes
.get(&remote.id)
@@ -156,22 +154,12 @@ pub fn get_available_updates_for_remote(remote: &str) -> Result<RemoteUpdateSumm
}
}
-fn get_cached_summary_or_default() -> Result<UpdateSummary, Error> {
- match File::open(UPDATE_CACHE) {
- Ok(file) => {
- let content = match serde_json::from_reader(file) {
- Ok(cache_content) => cache_content,
- Err(err) => {
- log::error!("failed to deserialize remote update cache: {err:#}");
- Default::default()
- }
- };
-
- Ok(content)
- }
- Err(err) if err.kind() == ErrorKind::NotFound => Ok(Default::default()),
- Err(err) => Err(err.into()),
- }
+async fn get_cached_summary_or_default() -> Result<UpdateSummary, Error> {
+ Ok(api_cache::read_global()
+ .await?
+ .get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY)
+ .await?
+ .unwrap_or_default())
}
async fn update_cached_summary_for_node(
@@ -179,10 +167,11 @@ async fn update_cached_summary_for_node(
node: String,
node_data: NodeUpdateSummary,
) -> Result<(), Error> {
- let mut file = File::open(UPDATE_CACHE)?;
- let mut cache_content: UpdateSummary = serde_json::from_reader(&mut file)?;
- let remote_entry =
- cache_content
+ let cache = api_cache::write_global().await?;
+ let cache_content = cache.get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY).await?;
+
+ if let Some(mut entry) = cache_content {
+ let remote_entry = entry
.remotes
.entry(remote.id)
.or_insert_with(|| RemoteUpdateSummary {
@@ -191,15 +180,9 @@ async fn update_cached_summary_for_node(
status: RemoteUpdateStatus::Success,
});
- remote_entry.nodes.insert(node, node_data);
-
- let options = proxmox_product_config::default_create_options();
- proxmox_sys::fs::replace_file(
- UPDATE_CACHE,
- &serde_json::to_vec(&cache_content)?,
- options,
- true,
- )?;
+ remote_entry.nodes.insert(node, node_data);
+ cache.set(UPDATE_SUMMARY_CACHE_KEY, entry).await?;
+ }
Ok(())
}
@@ -212,7 +195,7 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
.do_for_all_remote_nodes(remotes.clone().into_iter(), fetch_available_updates)
.await;
- let mut content = get_cached_summary_or_default()?;
+ let mut content = get_cached_summary_or_default().await?;
// Clean out any remotes that might have been removed from the remote config in the meanwhile.
content
@@ -275,8 +258,28 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
}
}
- let options = proxmox_product_config::default_create_options();
- proxmox_sys::fs::replace_file(UPDATE_CACHE, &serde_json::to_vec(&content)?, options, true)?;
+ cleanup_old_cachefile().await?;
+
+ let cache = api_cache::write_global().await?;
+ cache.set(UPDATE_SUMMARY_CACHE_KEY, content).await?;
+
+ Ok(())
+}
+
+// FIXME: We can remove this pretty soon.
+async fn cleanup_old_cachefile() -> Result<(), Error> {
+ tokio::task::spawn_blocking(|| {
+ if let Err(err) = std::fs::remove_file(OLD_CACHEFILE) {
+ if err.kind() != std::io::ErrorKind::NotFound {
+ log::error!(
+ "could not clean up old remote update cache file {OLD_CACHEFILE}: {err}"
+ );
+ }
+ } else {
+ log::info!("removed obsolete remote update cachefile {OLD_CACHEFILE}")
+ }
+ })
+ .await?;
Ok(())
}
--
2.47.3
* superseded: [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
` (3 preceding siblings ...)
2026-05-13 13:54 ` [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache Lukas Wagner
@ 2026-05-15 8:30 ` Lukas Wagner
4 siblings, 0 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-15 8:30 UTC (permalink / raw)
To: Lukas Wagner, pdm-devel
On Wed May 13, 2026 at 3:54 PM CEST, Lukas Wagner wrote:
> The main intention is to avoid a sprawl of different caching approaches by
> establishing a simple, easy to use cache implementation that can be used to
> persistently cache API responses from remotes (and derived aggregations).
>
superseded by v2: https://lore.proxmox.com/all/20260515082855.85698-1-l.wagner@proxmox.com/T/#u
* Re: [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation
2026-05-13 13:54 ` [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation Lukas Wagner
@ 2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:19 ` Lukas Wagner
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lamprecht @ 2026-05-15 9:06 UTC (permalink / raw)
To: Lukas Wagner; +Cc: pdm-devel
On Wed, 13 May 2026 15:54:54 +0200, Lukas Wagner wrote:
> diff --git a/server/src/namespaced_cache.rs b/server/src/namespaced_cache.rs
> new file mode 100644
> --- /dev/null
> +++ b/server/src/namespaced_cache.rs
> @@ -0,0 +1,742 @@
[...]
> +/// A generic, namespaced cache with optional value expiration.
> +pub struct NamespacedCache {
> + base_directory: PathBuf,
> + file_options: CreateOptions,
> + dir_options: CreateOptions,
> +}
> +
> +impl NamespacedCache {
> + /// Create a new cache instance.
> + ///
> + /// Cache entries will be persisted in the provided `base_directory`.
> + /// `dir_options` are the [`CreateOptions`] used the namespace directories, while `file_options`
nit: s/used the/used for the/ in the doc.
> + /// are the ones for persisted cache entries.
> + pub fn new<P: Into<PathBuf>>(
> + base_directory: P,
> + dir_options: CreateOptions,
> + file_options: CreateOptions,
> + ) -> Self {
tiny nit: Struct and new constructor disagree on parameter order
(file_options/dir_options vs. dir_options/file_options), but both are
the same type.
> +/// A writable cache namespace (blocking interface).
> +pub struct BlockingWritableCacheNamespace {
> + inner: WriteableInner,
> +}
> +
> +/// A writable cache namespace (async interface).
> +pub struct WritableCacheNamespace {
> + inner: Arc<WriteableInner>,
> +}
> +
> +struct WriteableInner {
nit: public types are `Writable`, inner is `Writeable` - unify the
spelling here (e.g. rename to `WritableInner`).
[...]
> +impl BlockingWritableCacheNamespace {
> + /// Remote a cache entry.
nit: s/Remote/Remove/ (and on `WritableCacheNamespace::remove`).
[...]
> +// Make sure that an identifier is safe to use as a cache key.
> +//
> +// At the moment, this only checks for '../', as to ensure that the base directory
> +// is not escaped.
> +fn ensure_valid_key(key: &str) -> Result<(), CacheError> {
> + if key.contains("../") {
> + return Err(CacheError::InvalidKey(key.into()));
> + }
> +
> + Ok(())
> +}
> +
> +// Make sure that an identifier is safe to use as a namespace.
> +//
> +// At the moment, this ensures that there is no '../' in the namespace, as well as that the
> +// identifier does not start with '/'.
> +fn ensure_valid_namespace(namespace: &str) -> Result<(), CacheError> {
> + if namespace.contains("../") || namespace.starts_with("/") {
> + return Err(CacheError::InvalidNamespace(namespace.into()));
> + }
> +
> + Ok(())
> +}
Something I already missed for the RRD fix back then w.r.t. "ensure that
the base directory is not escaped": it's not enough to only reject the
`../` substring, as a couple of inputs still slip through:
- a key like `/etc/foo` passes; joining an absolute path replaces the
prefix, so the write ends up at `/etc/foo.json` outside the cache.
- a namespace of just `".."` passes too (no `../` substring, no leading
`/`), and then `base.join("..")` walks one level up.
No caller hits either today (keys are hardcoded and namespaces always go
through `format_remote_namespace`), but allowing only a known-good
character set (like the SAFE_ID regex) would match the stated intent
more directly than blocking specific substrings.
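A rough sketch of what I have in mind (untested, and the exact character
set is up for debate):

    // only allow [A-Za-z0-9_.-]: excluding '/' already rules out path
    // separators and absolute paths, and rejecting a leading dot rules
    // out "." and ".."
    fn is_valid_identifier(id: &str) -> bool {
        !id.is_empty()
            && !id.starts_with('.')
            && id
                .bytes()
                .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'_' | b'.' | b'-'))
    }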
* Re: [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache
2026-05-13 13:54 ` [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache Lukas Wagner
@ 2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:22 ` Lukas Wagner
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lamprecht @ 2026-05-15 9:06 UTC (permalink / raw)
To: Lukas Wagner; +Cc: pdm-devel
On Wed, 13 May 2026 15:54:55 +0200, Lukas Wagner wrote:
> diff --git a/server/src/api_cache.rs b/server/src/api_cache.rs
> new file mode 100644
> --- /dev/null
> +++ b/server/src/api_cache.rs
> @@ -0,0 +1,126 @@
[...]
> +const GLOBAL_NAMESPACE: &str = "global";
> +const LOCK_TIMEOUT: Duration = Duration::from_secs(10);
> +
> +static CACHE: LazyLock<NamespacedCache> = LazyLock::new(|| {
> + let file_options = proxmox_product_config::default_create_options();
> + let dir_options = file_options.perm(Mode::from_bits_truncate(0o750));
w.r.t. 0o750 mode for per-namespace subdirectories here and ...
> +
> + NamespacedCache::new(PathBuf::from(PDM_API_CACHE_PATH), dir_options, file_options)
> +});
> diff --git a/server/src/bin/proxmox-datacenter-privileged-api.rs b/server/src/bin/proxmox-datacenter-privileged-api.rs
> --- a/server/src/bin/proxmox-datacenter-privileged-api.rs
> +++ b/server/src/bin/proxmox-datacenter-privileged-api.rs
> @@ -102,6 +102,13 @@ fn create_directories() -> Result<(), Error> {
[...]
> + pdm_config::setup::mkdir_perms(
> + api_cache::PDM_API_CACHE_PATH,
> + api_user.uid,
> + api_user.gid,
> + 0o755,
> + )?;
.. the cache root is created with 0o755 here - is that on purpose? With
the current modes any local user can list the cache root and see which
remotes have cached data from the directory names, even though they
cannot read the cached contents. Either drop the root to 0o750 to match,
or add a short comment explaining why the two levels differ.
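If the former, that would just be (untested):

    pdm_config::setup::mkdir_perms(
        api_cache::PDM_API_CACHE_PATH,
        api_user.uid,
        api_user.gid,
        0o750,
    )?;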
* Re: [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache
2026-05-13 13:54 ` [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache Lukas Wagner
@ 2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:49 ` Lukas Wagner
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lamprecht @ 2026-05-15 9:06 UTC (permalink / raw)
To: Lukas Wagner; +Cc: pdm-devel
On Wed, 13 May 2026 15:54:56 +0200, Lukas Wagner wrote:
> diff --git a/server/src/api/resources.rs b/server/src/api/resources.rs
> --- a/server/src/api/resources.rs
> +++ b/server/src/api/resources.rs
> @@ -815,66 +812,46 @@ pub async fn get_subscription_info_for_remote(
> remote: &Remote,
> max_age: u64,
> ) -> Result<HashMap<String, Option<NodeSubscriptionInfo>>, Error> {
> - if let Some(cached_subscription) = get_cached_subscription_info(&remote.id, max_age) {
> + if let Some(cached_subscription) =
> + get_cached_subscription_info(remote.id.clone(), max_age).await?
> + {
> Ok(cached_subscription.node_info)
> } else {
> let node_info = fetch_remote_subscription_info(remote).await?;
> - let now = proxmox_time::epoch_i64();
> - update_cached_subscription_info(&remote.id, &node_info, now);
> + update_cached_subscription_info(remote.id.clone(), node_info.clone()).await?;
> Ok(node_info)
> }
> }
Both helpers below only borrow their `remote` parameter (they pass
`&remote` into `api_cache::read_remote` / `write_remote`), so changing
their parameter type from `String` to `&str` would let this call site
stop cloning `remote.id` twice FWICT.
The old `update_cached_subscription_info` used to compare timestamps and
skip the insert when the existing cache entry was already at least as
new:
if let Some(cached_resource) = cache.get(remote) {
if cached_resource.timestamp >= now {
return;
}
}
cache.insert(...)
The new code drops that check and just calls `set` unconditionally, so
under two concurrent misses for the same remote the slower fetch result
will overwrite the fresher one that arrived first. The fetch race itself
existed before too, but the compare-before-insert mitigated the worst
outcome (older data replacing newer). If you want to keep that property,
the new function would have to `get` the existing entry under the held
write lock and skip when its timestamp is already at least as new. See
also the doc-comment point below.
[...]
> /// Update cached subscription data.
> ///
> /// If the cache already contains more recent data we don't insert the passed resources.
[...]
> +async fn update_cached_subscription_info(
> + remote: String,
> + node_info: HashMap<String, Option<NodeSubscriptionInfo>>,
> +) -> Result<(), Error> {
> + let cache = api_cache::write_remote(&remote).await?;
>
> + Ok(cache
> + .set(
> + SUBSCRIPTION_STATE_CACHE_KEY,
> + CachedSubscriptionState {
> + node_info: node_info,
nit: `node_info: node_info,` -> `node_info,` (clippy redundant_field_names).
> + },
> + )
> + .await?)
> +}
The doc comment above is the one that used to describe the
compare-before-insert behaviour from the old code. Either drop the doc
line (IMO not ideal), or restore the behaviour as discussed above.
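A rough sketch of the latter, assuming `CachedSubscriptionState` keeps
its `timestamp` field (the wrapping cache entry stores a timestamp too,
but `get` does not expose it) - untested, and this also folds in the
`&str` point from above:

    async fn update_cached_subscription_info(
        remote: &str,
        node_info: HashMap<String, Option<NodeSubscriptionInfo>>,
    ) -> Result<(), Error> {
        // taking the write lock first keeps the get/set pair race-free
        let cache = api_cache::write_remote(remote).await?;
        let now = proxmox_time::epoch_i64();

        if let Some(existing) = cache
            .get::<CachedSubscriptionState>(SUBSCRIPTION_STATE_CACHE_KEY)
            .await?
        {
            // skip the insert if the cached data is already at least as new
            if existing.timestamp >= now {
                return Ok(());
            }
        }

        cache
            .set(
                SUBSCRIPTION_STATE_CACHE_KEY,
                CachedSubscriptionState {
                    node_info,
                    timestamp: now,
                },
            )
            .await?;

        Ok(())
    }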
* Re: [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache
2026-05-13 13:54 ` [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache Lukas Wagner
@ 2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 12:56 ` Lukas Wagner
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Lamprecht @ 2026-05-15 9:06 UTC (permalink / raw)
To: Lukas Wagner; +Cc: pdm-devel
On Wed, 13 May 2026 15:54:57 +0200, Lukas Wagner wrote:
> diff --git a/server/src/remote_updates.rs b/server/src/remote_updates.rs
> --- a/server/src/remote_updates.rs
> +++ b/server/src/remote_updates.rs
> @@ -179,10 +167,11 @@ async fn update_cached_summary_for_node(
>      node: String,
>      node_data: NodeUpdateSummary,
>  ) -> Result<(), Error> {
> -    let mut file = File::open(UPDATE_CACHE)?;
> -    let mut cache_content: UpdateSummary = serde_json::from_reader(&mut file)?;
> -    let remote_entry =
> -        cache_content
> +    let cache = api_cache::write_global().await?;
> +    let cache_content = cache.get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY).await?;
> +
> +    if let Some(mut entry) = cache_content {
> +        let remote_entry = entry
>              .remotes
>              .entry(remote.id)
>              .or_insert_with(|| RemoteUpdateSummary {
> @@ -191,15 +180,9 @@ async fn update_cached_summary_for_node(
>                  status: RemoteUpdateStatus::Success,
>              });
>
> -    remote_entry.nodes.insert(node, node_data);
> [...]
> +        remote_entry.nodes.insert(node, node_data);
> +        cache.set(UPDATE_SUMMARY_CACHE_KEY, entry).await?;
> +    }
>
>      Ok(())
>  }
Small behaviour change worth a second look: the old code did
`File::open(UPDATE_CACHE)?`, which returned an error if the cache file
did not exist.
The new code uses `cache.get(..)`, which returns `Ok(None)` for that
case, and the `if let Some(..)` then silently skips the write.
So, if `list_available_updates` is called before any refresh has
populated the cache, its result is now thrown away silently instead of
surfaced as an error.
If you want this code path to be able to create the initial cache entry
as well, replacing the `if let Some(..)` with `let mut entry =
cache_content.unwrap_or_default();` would do it.
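I.e. something like (sketch; assumes `UpdateSummary` implements
`Default`, which the suggestion presupposes):

    let cache = api_cache::write_global().await?;
    // a missing cache entry now yields an empty summary instead of
    // silently dropping the node update
    let mut entry = cache
        .get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY)
        .await?
        .unwrap_or_default();
    let remote_entry = entry
        .remotes
        .entry(remote.id)
        .or_insert_with(|| RemoteUpdateSummary {
            // ... as before ...
            status: RemoteUpdateStatus::Success,
        });
    remote_entry.nodes.insert(node, node_data);
    cache.set(UPDATE_SUMMARY_CACHE_KEY, entry).await?;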
> @@ -212,7 +195,7 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
>          .do_for_all_remote_nodes(remotes.clone().into_iter(), fetch_available_updates)
>          .await;
>
> -    let mut content = get_cached_summary_or_default()?;
> +    let mut content = get_cached_summary_or_default().await?;
>
>      // Clean out any remotes that might have been removed from the remote config in the meanwhile.
>      content
> @@ -275,8 +258,28 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
>          }
>      }
>
> -    let options = proxmox_product_config::default_create_options();
> -    proxmox_sys::fs::replace_file(UPDATE_CACHE, &serde_json::to_vec(&content)?, options, true)?;
> +    cleanup_old_cachefile().await?;
> +
> +    let cache = api_cache::write_global().await?;
> +    cache.set(UPDATE_SUMMARY_CACHE_KEY, content).await?;
> +
> +    Ok(())
> +}
Two things on the final write:
- `get_cached_summary_or_default()` above takes a read lock and drops
it again before `write_global()` is called down here. If
`update_cached_summary_for_node` runs in that gap, the entry it just
wrote will be overwritten by the `cache.set` below. Holding a single
write lock for the whole function would prevent that. Not sure if we
strictly need that guarantee though.
- `cleanup_old_cachefile` runs before the new `cache.set`. If the `set`
ever fails, the old file is already gone, so the next refresh starts
from an empty cache. Either move the cleanup after the successful
write, or check whether the old file exists first so the cleanup only
runs once.
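To illustrate both points together (a sketch; it assumes the write
handle exposes `get` the same way `update_cached_summary_for_node`
already uses it):

    // take the write lock once and hold it across the whole
    // read-modify-write, so no concurrent writer can slip in between
    let cache = api_cache::write_global().await?;
    let mut content = cache
        .get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY)
        .await?
        .unwrap_or_default();
    // ... merge the fetched per-node results into `content` as before ...
    cache.set(UPDATE_SUMMARY_CACHE_KEY, content).await?;
    // only remove the legacy file once the new entry is safely written
    cleanup_old_cachefile().await?;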
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation
2026-05-15 9:06 ` Thomas Lamprecht
@ 2026-05-15 9:19 ` Lukas Wagner
0 siblings, 0 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-15 9:19 UTC (permalink / raw)
To: Thomas Lamprecht, Lukas Wagner; +Cc: pdm-devel
On Fri May 15, 2026 at 11:06 AM CEST, Thomas Lamprecht wrote:
> On Wed, 13 May 2026 15:54:54 +0200, Lukas Wagner wrote:
>> diff --git a/server/src/namespaced_cache.rs b/server/src/namespaced_cache.rs
>> new file mode 100644
>> --- /dev/null
>> +++ b/server/src/namespaced_cache.rs
>> @@ -0,0 +1,742 @@
> [...]
>> +/// A generic, namespaced cache with optional value expiration.
>> +pub struct NamespacedCache {
>> +    base_directory: PathBuf,
>> +    file_options: CreateOptions,
>> +    dir_options: CreateOptions,
>> +}
>> +
>> +impl NamespacedCache {
>> +    /// Create a new cache instance.
>> +    ///
>> +    /// Cache entries will be persisted in the provided `base_directory`.
>> +    /// `dir_options` are the [`CreateOptions`] used the namespace directories, while `file_options`
>
> nit: s/used the/used for the/ in the doc.
>
Fixed for the next revision, thanks!
>> +    /// are the ones for persisted cache entries.
>> +    pub fn new<P: Into<PathBuf>>(
>> +        base_directory: P,
>> +        dir_options: CreateOptions,
>> +        file_options: CreateOptions,
>> +    ) -> Self {
>
> tiny nit: Struct and `new` constructor disagree on parameter order
> (file_options/dir_options vs. dir_options/file_options), but both are
> the same type.
>
Fixed for the next revision, thanks!
>> +/// A writable cache namespace (blocking interface).
>> +pub struct BlockingWritableCacheNamespace {
>> +    inner: WriteableInner,
>> +}
>> +
>> +/// A writable cache namespace (async interface).
>> +pub struct WritableCacheNamespace {
>> +    inner: Arc<WriteableInner>,
>> +}
>> +
>> +struct WriteableInner {
>
> nit: public types are `Writable`, inner is `Writeable` - unify the
> spelling here (e.g. rename to `WritableInner`).
>
Fixed for the next revision, thanks!
> [...]
>> +impl BlockingWritableCacheNamespace {
>> +    /// Remote a cache entry.
>
> nit: s/Remote/Remove/ (and on `WritableCacheNamespace::remove`).
Fixed for the next revision, thanks!
>
> [...]
>> +// Make sure that an identifier is safe to use as a cache key.
>> +//
>> +// At the moment, this only checks for '../', as to ensure that the base directory
>> +// is not escaped.
>> +fn ensure_valid_key(key: &str) -> Result<(), CacheError> {
>> +    if key.contains("../") {
>> +        return Err(CacheError::InvalidKey(key.into()));
>> +    }
>> +
>> +    Ok(())
>> +}
>> +
>> +// Make sure that an identifier is safe to use as a namespace.
>> +//
>> +// At the moment, this ensures that there is no '../' in the namespace, as well as that the
>> +// identifier does not start with '/'.
>> +fn ensure_valid_namespace(namespace: &str) -> Result<(), CacheError> {
>> +    if namespace.contains("../") || namespace.starts_with("/") {
>> +        return Err(CacheError::InvalidNamespace(namespace.into()));
>> +    }
>> +
>> +    Ok(())
>> +}
>
> Something that I missed for the RRD fix back then w.r.t. "ensure that
> the base directory is not escaped": it's not enough to only reject the
> `../` substring, as a couple of inputs still slip through:
>
> - a key like `/etc/foo` passes; joining an absolute path replaces the
> prefix, so the write ends up at `/etc/foo.json` outside the cache.
> - a namespace of just `".."` passes too (no `../` substring, no leading
> `/`), and then `base.join("..")` walks one level up.
>
> No caller hits either today (keys are hardcoded and namespaces always go
> through `format_remote_namespace`), but allowing only a known-good
> character set (like the SAFE_ID regex) would match the stated intent
> more directly than blocking specific substrings.
Ack, will switch to a regex-based verification in the next revision!
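Probably something along these lines (rough sketch; the exact allowed
character set is not final, the pattern below just mirrors the SAFE_ID
idea):

    use std::sync::LazyLock;

    use regex::Regex;

    // allowlist instead of substring blocklist: the first character
    // must be alphanumeric or '_', so "..", "." and absolute paths
    // are rejected, and '/' is not part of the allowed set at all
    static VALID_IDENT: LazyLock<Regex> =
        LazyLock::new(|| Regex::new(r"^[A-Za-z0-9_][A-Za-z0-9._\-]*$").unwrap());

    fn ensure_valid_key(key: &str) -> Result<(), CacheError> {
        if !VALID_IDENT.is_match(key) {
            return Err(CacheError::InvalidKey(key.into()));
        }
        Ok(())
    }

Namespaces could then validate each '/'-separated segment with the
same pattern.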
Thanks for the feedback!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache
2026-05-15 9:06 ` Thomas Lamprecht
@ 2026-05-15 9:22 ` Lukas Wagner
0 siblings, 0 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-15 9:22 UTC (permalink / raw)
To: Thomas Lamprecht, Lukas Wagner; +Cc: pdm-devel
On Fri May 15, 2026 at 11:06 AM CEST, Thomas Lamprecht wrote:
> On Wed, 13 May 2026 15:54:55 +0200, Lukas Wagner wrote:
>> diff --git a/server/src/api_cache.rs b/server/src/api_cache.rs
>> new file mode 100644
>> --- /dev/null
>> +++ b/server/src/api_cache.rs
>> @@ -0,0 +1,126 @@
> [...]
>> +const GLOBAL_NAMESPACE: &str = "global";
>> +const LOCK_TIMEOUT: Duration = Duration::from_secs(10);
>> +
>> +static CACHE: LazyLock<NamespacedCache> = LazyLock::new(|| {
>> +    let file_options = proxmox_product_config::default_create_options();
>> +    let dir_options = file_options.perm(Mode::from_bits_truncate(0o750));
>
> w.r.t. 0o750 mode for per-namespace subdirectories here and ...
>
>> +
>> +    NamespacedCache::new(PathBuf::from(PDM_API_CACHE_PATH), dir_options, file_options)
>> +});
>
>> diff --git a/server/src/bin/proxmox-datacenter-privileged-api.rs b/server/src/bin/proxmox-datacenter-privileged-api.rs
>> --- a/server/src/bin/proxmox-datacenter-privileged-api.rs
>> +++ b/server/src/bin/proxmox-datacenter-privileged-api.rs
>> @@ -102,6 +102,13 @@ fn create_directories() -> Result<(), Error> {
> [...]
>> +    pdm_config::setup::mkdir_perms(
>> +        api_cache::PDM_API_CACHE_PATH,
>> +        api_user.uid,
>> +        api_user.gid,
>> +        0o755,
>> +    )?;
>
> .. the cache root is created with 0o755 here - is that on purpose? With
> the current modes any local user can list the cache root and see which
> remotes have cached data from the directory names, even though they
> cannot read the cached contents. Either drop the root to 0o750 to match,
> or add a short comment explaining why the two levels differ.
Good catch! This was a copy-paste mistake; the mkdir_perms call was
copied from a section above. Will be fixed for the next revision,
I'll change it to 0o750.
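So the quoted hunk becomes (same call, only the mode changes):

    pdm_config::setup::mkdir_perms(
        api_cache::PDM_API_CACHE_PATH,
        api_user.uid,
        api_user.gid,
        // 0o750 instead of 0o755, matching the per-namespace dir_options
        0o750,
    )?;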
Thanks!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache
2026-05-15 9:06 ` Thomas Lamprecht
@ 2026-05-15 9:49 ` Lukas Wagner
0 siblings, 0 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-15 9:49 UTC (permalink / raw)
To: Thomas Lamprecht, Lukas Wagner; +Cc: pdm-devel
On Fri May 15, 2026 at 11:06 AM CEST, Thomas Lamprecht wrote:
> On Wed, 13 May 2026 15:54:56 +0200, Lukas Wagner wrote:
>> diff --git a/server/src/api/resources.rs b/server/src/api/resources.rs
>> --- a/server/src/api/resources.rs
>> +++ b/server/src/api/resources.rs
>> @@ -815,66 +812,46 @@ pub async fn get_subscription_info_for_remote(
>>      remote: &Remote,
>>      max_age: u64,
>>  ) -> Result<HashMap<String, Option<NodeSubscriptionInfo>>, Error> {
>> -    if let Some(cached_subscription) = get_cached_subscription_info(&remote.id, max_age) {
>> +    if let Some(cached_subscription) =
>> +        get_cached_subscription_info(remote.id.clone(), max_age).await?
>> +    {
>>          Ok(cached_subscription.node_info)
>>      } else {
>>          let node_info = fetch_remote_subscription_info(remote).await?;
>> -        let now = proxmox_time::epoch_i64();
>> -        update_cached_subscription_info(&remote.id, &node_info, now);
>> +        update_cached_subscription_info(remote.id.clone(), node_info.clone()).await?;
>>          Ok(node_info)
>>      }
>>  }
>
> Both helpers below only borrow their `remote` parameter (they pass
> `&remote` into `api_cache::read_remote` / `write_remote`), so changing
> their parameter type from `String` to `&str` would let this call site
> stop cloning `remote.id` twice FWICT.
Argh, yeah, this was a leftover from the RFC, when I used spawn_blocking
at the callsite, which requires ownership of the values that are moved
into the closure.
I should have reviewed the final diff more carefully, then I would've
caught this myself.
>
> The old `update_cached_subscription_info` used to compare timestamps and
> skip the insert when the existing cache entry was already at least as
> new:
>
>     if let Some(cached_resource) = cache.get(remote) {
>         if cached_resource.timestamp >= now {
>             return;
>         }
>     }
>     cache.insert(...)
>
> The new code drops that check and just calls `set` unconditionally, so
> under two concurrent misses for the same remote the slower fetch result
> will overwrite the fresher one that arrived first. The fetch race itself
> existed before too, but the compare-before-insert mitigated the worst
> outcome (older data replacing newer). If you want to keep that property,
> the new function would have to `get` the existing entry under the held
> write lock and skip when its timestamp is already at least as new. See
> also the doc-comment point below.
I wonder if I should maybe fix this in the cache implementation itself,
e.g. by offering something like set_if_newer(val) and
set_if_newer_with_timestamp(val, ts)... I assume this is something that
we will need quite often.
FWIW, Redis has something similar: SET with the NX option, which only
sets a key if it does not exist (or is expired).
Our key expiry works differently, by specifying a max-age on get
instead of a TTL on set, but it roughly boils down to the same outcome
for scenarios like these.
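As a first sketch (nothing implemented yet; `entry_timestamp` is a
hypothetical accessor for the per-entry timestamp the cache already
stores for the max-age check):

    impl WritableCacheNamespace {
        /// Only replace the entry if the stored one is older than `ts`.
        /// The check-then-set cannot race, since the caller already
        /// holds the namespace write lock.
        pub async fn set_if_newer_with_timestamp<V: serde::Serialize>(
            &self,
            key: &str,
            value: V,
            ts: i64,
        ) -> Result<bool, CacheError> {
            if let Some(existing_ts) = self.entry_timestamp(key).await? {
                if existing_ts >= ts {
                    // kept the fresher existing entry
                    return Ok(false);
                }
            }
            self.set(key, value).await?;
            Ok(true)
        }
    }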
>
> [...]
>> /// Update cached subscription data.
>> ///
>> /// If the cache already contains more recent data we don't insert the passed resources.
> [...]
>> +async fn update_cached_subscription_info(
>> +    remote: String,
>> +    node_info: HashMap<String, Option<NodeSubscriptionInfo>>,
>> +) -> Result<(), Error> {
>> +    let cache = api_cache::write_remote(&remote).await?;
>>
>> +    Ok(cache
>> +        .set(
>> +            SUBSCRIPTION_STATE_CACHE_KEY,
>> +            CachedSubscriptionState {
>> +                node_info: node_info,
>
> nit: `node_info: node_info,` -> `node_info,` (clippy redundant_field_names).
>
Fixed, thanks!
>> +            },
>> +        )
>> +        .await?)
>> +}
>
> The doc comment above is the one that used to describe the
> compare-before-insert behaviour from the old code. Either drop the doc
> line (IMO not ideal), or restore the behaviour as discussed above.
Will attempt to restore the previous behavior.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache
2026-05-15 9:06 ` Thomas Lamprecht
@ 2026-05-15 12:56 ` Lukas Wagner
0 siblings, 0 replies; 14+ messages in thread
From: Lukas Wagner @ 2026-05-15 12:56 UTC (permalink / raw)
To: Thomas Lamprecht, Lukas Wagner; +Cc: pdm-devel
On Fri May 15, 2026 at 11:06 AM CEST, Thomas Lamprecht wrote:
> On Wed, 13 May 2026 15:54:57 +0200, Lukas Wagner wrote:
>> diff --git a/server/src/remote_updates.rs b/server/src/remote_updates.rs
>> --- a/server/src/remote_updates.rs
>> +++ b/server/src/remote_updates.rs
>> @@ -179,10 +167,11 @@ async fn update_cached_summary_for_node(
>>      node: String,
>>      node_data: NodeUpdateSummary,
>>  ) -> Result<(), Error> {
>> -    let mut file = File::open(UPDATE_CACHE)?;
>> -    let mut cache_content: UpdateSummary = serde_json::from_reader(&mut file)?;
>> -    let remote_entry =
>> -        cache_content
>> +    let cache = api_cache::write_global().await?;
>> +    let cache_content = cache.get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY).await?;
>> +
>> +    if let Some(mut entry) = cache_content {
>> +        let remote_entry = entry
>>              .remotes
>>              .entry(remote.id)
>>              .or_insert_with(|| RemoteUpdateSummary {
>> @@ -191,15 +180,9 @@ async fn update_cached_summary_for_node(
>>                  status: RemoteUpdateStatus::Success,
>>              });
>>
>> -    remote_entry.nodes.insert(node, node_data);
> [...]
>> +        remote_entry.nodes.insert(node, node_data);
>> +        cache.set(UPDATE_SUMMARY_CACHE_KEY, entry).await?;
>> +    }
>>
>>      Ok(())
>>  }
>
> Small behaviour change worth a second look: the old code did
> `File::open(UPDATE_CACHE)?`, which returned an error if the cache file
> did not exist.
Yeah, that actually was problematic before. It did not really surface,
since at the moment we trigger the remote update fetching for all
remotes very soon after daemon startup.
> The new code uses `cache.get(..)`, which returns `Ok(None)` for that
> case, and the `if let Some(..)` then silently skips the write.
> So, if `list_available_updates` is called before any refresh has
> populated the cache, its result is now thrown away silently instead of
> surfaced as an error.
> If you want this code path to be able to create the initial cache entry
> as well, replacing the `if let Some(..)` with `let mut entry =
> cache_content.unwrap_or_default();` would do it.
>
Did exactly that, thanks!
>> @@ -212,7 +195,7 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
>>          .do_for_all_remote_nodes(remotes.clone().into_iter(), fetch_available_updates)
>>          .await;
>>
>> -    let mut content = get_cached_summary_or_default()?;
>> +    let mut content = get_cached_summary_or_default().await?;
>>
>>      // Clean out any remotes that might have been removed from the remote config in the meanwhile.
>>      content
>> @@ -275,8 +258,28 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Er
>>          }
>>      }
>>
>> -    let options = proxmox_product_config::default_create_options();
>> -    proxmox_sys::fs::replace_file(UPDATE_CACHE, &serde_json::to_vec(&content)?, options, true)?;
>> +    cleanup_old_cachefile().await?;
>> +
>> +    let cache = api_cache::write_global().await?;
>> +    cache.set(UPDATE_SUMMARY_CACHE_KEY, content).await?;
>> +
>> +    Ok(())
>> +}
>
> Two things on the final write:
>
> - `get_cached_summary_or_default()` above takes a read lock and drops
> it again before `write_global()` is called down here. If
> `update_cached_summary_for_node` runs in that gap, the entry it just
> wrote will be overwritten by the `cache.set` below. Holding a single
> write lock for the whole function would prevent that. Not sure if we
> strictly need that guarantee though.
Moved the write_global() up top to make sure that the lock is held while
updating the values in the UpdateSummary data structure.
Thanks!
>
> - `cleanup_old_cachefile` runs before the new `cache.set`. If the `set`
> ever fails, the old file is already gone, so the next refresh starts
> from an empty cache.
We don't really migrate contents over from the old cachefile, but start
from a clean slate anyway with the new cache, so I don't think this
makes any difference?
> Either move the cleanup after the successful
> write, or check whether the old file exists first so the cleanup only
> runs once.
The cleanup literally just makes sure that the old, unused cachefile does
not litter /var/cache/... - it's also fine to run
'cleanup_old_cachefile' multiple times. It would re-attempt to delete
the file, but it explicitly catches the 'file does not exist' case and
does not log or throw an error then.
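For reference, the pattern boils down to (sketch; the real helper is
async and goes through our fs wrappers):

    fn remove_ignore_missing(path: &std::path::Path) -> std::io::Result<()> {
        match std::fs::remove_file(path) {
            // treat an already-removed file as success, so repeated
            // cleanup runs stay silent
            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(()),
            other => other,
        }
    }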
I changed the order of operations anyway (so, set before cleanup), but
I don't really think it makes a difference. (But maybe I'm missing
something?)
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads: [~2026-05-15 12:57 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-05-13 13:54 [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 1/4] add persistent, generic, namespaced key-value cache implementation Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:19 ` Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 2/4] add api_cache as a specialized wrapper around the namespaced cache Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:22 ` Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 3/4] api: resources: subscriptions: switch over to api_cache Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 9:49 ` Lukas Wagner
2026-05-13 13:54 ` [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache Lukas Wagner
2026-05-15 9:06 ` Thomas Lamprecht
2026-05-15 12:56 ` Lukas Wagner
2026-05-15 8:30 ` superseded: [PATCH datacenter-manager 0/4] add generic, per-remote (and global) cache for remote API responses Lukas Wagner