From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup v7 30/38] datastore: add local datastore cache for network attached storages
Date: Thu, 10 Jul 2025 19:07:20 +0200 [thread overview]
Message-ID: <20250710170728.102829-40-c.ebner@proxmox.com> (raw)
In-Reply-To: <20250710170728.102829-1-c.ebner@proxmox.com>
Use a local datastore as cache, with an LRU cache replacement policy,
for operations on a datastore backed by a network, e.g. by an S3 object
store backend. The goal is to reduce the number of requests to the
backend and thereby save costs (monetary as well as time).

Cached chunks are stored on the local datastore cache, which already
contains the datastore's content metadata (namespace, group, snapshot,
owner, index files, etc.) used to perform fast lookups. The cache
itself only stores chunk digests, not the raw data. When payload data
is required, the contents are looked up and read from the local
datastore cache filesystem, with a fallback to fetch from the backend
if the presumably cached entry is not found.

The cacher allows fetching cache items on cache misses via its access
method.
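The digest-only LRU idea can be sketched as follows (an illustrative
assumption, not the actual pbs-tools AsyncLruCache implementation): the
cache tracks only 32-byte digests in recency order, and eviction hands
the evicted digest back to the caller, which in this patch truncates the
corresponding chunk file instead of freeing in-memory data.

```rust
use std::collections::VecDeque;

// Minimal sketch of a digest-only LRU (illustrative, names hypothetical).
struct DigestLru {
    capacity: usize,
    order: VecDeque<[u8; 32]>, // front = most recently used
}

impl DigestLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new() }
    }

    // Mark the digest as most recently used; returns the evicted digest
    // (if capacity was exceeded) so the caller can reclaim its space.
    fn insert(&mut self, digest: [u8; 32]) -> Option<[u8; 32]> {
        self.order.retain(|d| d != &digest);
        self.order.push_front(digest);
        if self.order.len() > self.capacity {
            self.order.pop_back()
        } else {
            None
        }
    }

    fn contains(&self, digest: &[u8; 32]) -> bool {
        self.order.contains(digest)
    }
}

fn main() {
    let mut lru = DigestLru::new(2);
    let (a, b, c) = ([1u8; 32], [2u8; 32], [3u8; 32]);
    assert_eq!(lru.insert(a), None);
    assert_eq!(lru.insert(b), None);
    // A third insert exceeds capacity and evicts the least recently used.
    assert_eq!(lru.insert(c), Some(a));
    assert!(!lru.contains(&a));
    assert!(lru.contains(&b) && lru.contains(&c));
}
```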
The capacity of the cache is derived from the available space on the
local datastore cache filesystem, or from the user-configured value,
whichever is smaller. The capacity is only set on instantiation of the
store, and the current value is kept as long as the datastore remains
cached in the datastore cache. To change the value, the store either
has to be set to offline mode and back, or the services have to be
restarted.
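The capacity derivation can be sketched like this (a hedged sketch with
hypothetical names, mirroring the `16 * 1024 * 1024` divisor used in the
patch hunk below): both the available filesystem space and the optional
user-configured limit are divided by the 16 MiB chunk file size, and the
smaller result wins.

```rust
// Illustrative sketch of the cache capacity computation (names are
// hypothetical, not the patch's API).
fn cache_capacity(available_bytes: u64, max_cache_size: Option<u64>) -> usize {
    const CHUNK_SIZE: u64 = 16 * 1024 * 1024;
    let mut capacity = available_bytes / CHUNK_SIZE;
    if let Some(limit) = max_cache_size {
        // The user-configured limit only ever shrinks the capacity.
        capacity = capacity.min(limit / CHUNK_SIZE);
    }
    usize::try_from(capacity).unwrap_or_default()
}

fn main() {
    // 1 GiB free and no configured limit: 64 chunk slots.
    assert_eq!(cache_capacity(1 << 30, None), 64);
    // A 512 MiB configured limit caps it at 32 slots.
    assert_eq!(cache_capacity(1 << 30, Some(512 * 1024 * 1024)), 32);
}
```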
Basic performance tests:

Backup and upload of the contents of a linux git repository to AWS S3,
with snapshots removed in between backup runs to avoid PBS's other
chunk-reuse optimizations.
no-cache:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.76 s (average 102.258 MiB/s)
empty-cache:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.42 s (average 102.945 MiB/s)
all-cached:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 43.78 s (average 118.554 MiB/s)
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 6:
- make local datastore cache capacity configurable, autodetect based on
available storage space
- add basic upload performance tests
NOTE:
I'm not really happy with the current approach for detecting the cache
capacity.
Using the available storage space unfortunately has the following
downsides:
- It does not account for chunks which already use space on restart of
  the service, or when the datastore gets dropped from the store cache
  and re-added (e.g. maintenance mode).
- The available storage might be shared with other datasets on ZFS,
  making this value rather volatile.
Given these concerns, maybe it is better to opt for a fixed value,
chosen by the user on datastore creation, and only adapt it when
reloaded/reinserted into the datastore cache?
pbs-datastore/src/datastore.rs | 70 ++++++-
pbs-datastore/src/lib.rs | 3 +
.../src/local_datastore_lru_cache.rs | 172 ++++++++++++++++++
3 files changed, 244 insertions(+), 1 deletion(-)
create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 89f45e7f8..20e8e3d41 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -40,9 +40,10 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
use crate::hierarchy::{ListGroups, ListGroupsType, ListNamespaces, ListNamespacesRecursive};
use crate::index::IndexFile;
+use crate::local_datastore_lru_cache::S3Cacher;
use crate::s3::S3_CONTENT_PREFIX;
use crate::task_tracking::{self, update_active_operations};
-use crate::DataBlob;
+use crate::{DataBlob, LocalDatastoreLruCache};
static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
LazyLock::new(|| Mutex::new(HashMap::new()));
@@ -136,6 +137,7 @@ pub struct DataStoreImpl {
last_digest: Option<[u8; 32]>,
sync_level: DatastoreFSyncLevel,
backend_config: DatastoreBackendConfig,
+ lru_store_caching: Option<LocalDatastoreLruCache>,
}
impl DataStoreImpl {
@@ -151,6 +153,7 @@ impl DataStoreImpl {
last_digest: None,
sync_level: Default::default(),
backend_config: Default::default(),
+ lru_store_caching: None,
})
}
}
@@ -255,6 +258,37 @@ impl DataStore {
Ok(backend_type)
}
+ pub fn cache(&self) -> Option<&LocalDatastoreLruCache> {
+ self.inner.lru_store_caching.as_ref()
+ }
+
+ /// Check if the digest is present in the local datastore cache.
+ /// Always returns false if there is no cache configured for this datastore.
+ pub fn cache_contains(&self, digest: &[u8; 32]) -> bool {
+ if let Some(cache) = self.inner.lru_store_caching.as_ref() {
+ return cache.contains(digest);
+ }
+ false
+ }
+
+ /// Insert the digest as most recently used into the cache.
+ /// Returns successfully even if there is no cache configured for this datastore.
+ pub fn cache_insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
+ if let Some(cache) = self.inner.lru_store_caching.as_ref() {
+ return cache.insert(digest, chunk);
+ }
+ Ok(())
+ }
+
+ pub fn cacher(&self) -> Result<Option<S3Cacher>, Error> {
+ self.backend().map(|backend| match backend {
+ DatastoreBackend::S3(s3_client) => {
+ Some(S3Cacher::new(s3_client, self.inner.chunk_store.clone()))
+ }
+ DatastoreBackend::Filesystem => None,
+ })
+ }
+
pub fn lookup_datastore(
name: &str,
operation: Option<Operation>,
@@ -437,6 +471,33 @@ impl DataStore {
.parse_property_string(config.backend.as_deref().unwrap_or(""))?,
)?;
+ let lru_store_caching = if DatastoreBackendType::S3 == backend_config.ty.unwrap_or_default()
+ {
+ let mut cache_capacity = 0;
+ if let Ok(fs_info) = proxmox_sys::fs::fs_info(&chunk_store.base_path()) {
+ cache_capacity = fs_info.available / (16 * 1024 * 1024);
+ }
+ if let Some(max_cache_size) = backend_config.max_cache_size {
+ warn!(
+ "Got requested max cache size {max_cache_size} for store {}",
+ config.name
+ );
+ let max_cache_capacity = max_cache_size.as_u64() / (16 * 1024 * 1024);
+ cache_capacity = cache_capacity.min(max_cache_capacity);
+ }
+ let cache_capacity = usize::try_from(cache_capacity).unwrap_or_default();
+
+ warn!(
+ "Using datastore cache with capacity {cache_capacity} for store {}",
+ config.name
+ );
+
+ let cache = LocalDatastoreLruCache::new(cache_capacity, chunk_store.clone());
+ Some(cache)
+ } else {
+ None
+ };
+
Ok(DataStoreImpl {
chunk_store,
gc_mutex: Mutex::new(()),
@@ -446,6 +507,7 @@ impl DataStore {
last_digest,
sync_level: tuning.sync_level.unwrap_or_default(),
backend_config,
+ lru_store_caching,
})
}
@@ -1580,6 +1642,12 @@ impl DataStore {
chunk_count += 1;
if atime < min_atime {
+ if let Some(cache) = self.cache() {
+ let mut digest_bytes = [0u8; 32];
+ hex::decode_to_slice(digest.as_bytes(), &mut digest_bytes)?;
+ // ignore errors, phase 3 will retry cleanup anyway
+ let _ = cache.remove(&digest_bytes);
+ }
delete_list.push(content.key);
if bad {
gc_status.removed_bad += 1;
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index ca6fdb7d8..b9eb035c2 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -217,3 +217,6 @@ pub use snapshot_reader::SnapshotReader;
mod local_chunk_reader;
pub use local_chunk_reader::LocalChunkReader;
+
+mod local_datastore_lru_cache;
+pub use local_datastore_lru_cache::LocalDatastoreLruCache;
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
new file mode 100644
index 000000000..bb64c52f3
--- /dev/null
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -0,0 +1,172 @@
+//! Use a local datastore as cache for operations on a datastore attached via
+//! a network layer (e.g. via the S3 backend).
+
+use std::future::Future;
+use std::sync::Arc;
+
+use anyhow::{bail, Error};
+use http_body_util::BodyExt;
+
+use pbs_tools::async_lru_cache::{AsyncCacher, AsyncLruCache};
+use proxmox_s3_client::S3Client;
+
+use crate::ChunkStore;
+use crate::DataBlob;
+
+#[derive(Clone)]
+pub struct S3Cacher {
+ client: Arc<S3Client>,
+ store: Arc<ChunkStore>,
+}
+
+impl AsyncCacher<[u8; 32], ()> for S3Cacher {
+ fn fetch(
+ &self,
+ key: [u8; 32],
+ ) -> Box<dyn Future<Output = Result<Option<()>, Error>> + Send + 'static> {
+ let client = self.client.clone();
+ let store = self.store.clone();
+ Box::new(async move {
+ let object_key = crate::s3::object_key_from_digest(&key)?;
+ match client.get_object(object_key).await? {
+ None => bail!("could not fetch object with key {}", hex::encode(key)),
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ store.insert_chunk(&chunk, &key)?;
+ Ok(Some(()))
+ }
+ }
+ })
+ }
+}
+
+impl S3Cacher {
+ pub fn new(client: Arc<S3Client>, store: Arc<ChunkStore>) -> Self {
+ Self { client, store }
+ }
+}
+
+/// LRU cache using local datastore for caching chunks
+///
+/// Uses an LRU cache, but stores the values on the filesystem rather
+/// than in memory
+pub struct LocalDatastoreLruCache {
+ cache: AsyncLruCache<[u8; 32], ()>,
+ store: Arc<ChunkStore>,
+}
+
+impl LocalDatastoreLruCache {
+ pub fn new(capacity: usize, store: Arc<ChunkStore>) -> Self {
+ Self {
+ cache: AsyncLruCache::new(capacity),
+ store,
+ }
+ }
+
+ /// Insert a new chunk into the local datastore cache.
+ ///
+ /// Fails if the chunk cannot be inserted successfully.
+ pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
+ self.store.insert_chunk(chunk, digest)?;
+ self.cache.insert(*digest, (), |digest| {
+ let (path, _digest_str) = self.store.chunk_path(&digest);
+ // Truncate to free up space but keep the inode around, since that
+ // is used as marker for chunks in use by garbage collection.
+ if let Err(err) = nix::unistd::truncate(&path, 0) {
+ if err != nix::errno::Errno::ENOENT {
+ return Err(Error::from(err));
+ }
+ }
+ Ok(())
+ })
+ }
+
+ /// Remove a chunk from the local datastore cache.
+ ///
+ /// Fails if the chunk cannot be deleted successfully.
+ pub fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
+ self.cache.remove(*digest);
+ let (path, _digest_str) = self.store.chunk_path(digest);
+ std::fs::remove_file(path).map_err(Error::from)
+ }
+
+ pub async fn access(
+ &self,
+ digest: &[u8; 32],
+ cacher: &mut S3Cacher,
+ ) -> Result<Option<DataBlob>, Error> {
+ if self
+ .cache
+ .access(*digest, cacher, |digest| {
+ let (path, _digest_str) = self.store.chunk_path(&digest);
+ // Truncate to free up space but keep the inode around, since that
+ // is used as marker for chunks in use by garbage collection.
+ if let Err(err) = nix::unistd::truncate(&path, 0) {
+ if err != nix::errno::Errno::ENOENT {
+ return Err(Error::from(err));
+ }
+ }
+ Ok(())
+ })
+ .await?
+ .is_some()
+ {
+ let (path, _digest_str) = self.store.chunk_path(digest);
+ let mut file = match std::fs::File::open(&path) {
+ Ok(file) => file,
+ Err(err) => {
+ // Expected chunk to be present since LRU cache has it, but it is missing
+ // locally, try to fetch again
+ if err.kind() == std::io::ErrorKind::NotFound {
+ let object_key = crate::s3::object_key_from_digest(digest)?;
+ match cacher.client.get_object(object_key).await? {
+ None => {
+ bail!("could not fetch object with key {}", hex::encode(digest))
+ }
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ self.store.insert_chunk(&chunk, digest)?;
+ std::fs::File::open(&path)?
+ }
+ }
+ } else {
+ return Err(Error::from(err));
+ }
+ }
+ };
+ let chunk = match DataBlob::load_from_reader(&mut file) {
+ Ok(chunk) => chunk,
+ Err(err) => {
+ use std::io::Seek;
+ // Check if the file is an empty marker file; if so, try fetching the content
+ if file.seek(std::io::SeekFrom::End(0))? == 0 {
+ let object_key = crate::s3::object_key_from_digest(digest)?;
+ match cacher.client.get_object(object_key).await? {
+ None => {
+ bail!("could not fetch object with key {}", hex::encode(digest))
+ }
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ self.store.insert_chunk(&chunk, digest)?;
+ let mut file = std::fs::File::open(&path)?;
+ DataBlob::load_from_reader(&mut file)?
+ }
+ }
+ } else {
+ return Err(err);
+ }
+ }
+ };
+ Ok(Some(chunk))
+ } else {
+ Ok(None)
+ }
+ }
+
+ pub fn contains(&self, digest: &[u8; 32]) -> bool {
+ self.cache.contains(*digest)
+ }
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel