* [pbs-devel] [PATCH proxmox-backup 0/2] rewarm local datastore chunk cache on datastore instantiation
@ 2025-08-01 14:10 Christian Ebner
2025-08-01 14:10 ` [pbs-devel] [PATCH proxmox-backup 1/2] tools: lru cache: allow to dynamically increase the cache capacity Christian Ebner
2025-08-01 14:10 ` [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation Christian Ebner
0 siblings, 2 replies; 5+ messages in thread
From: Christian Ebner @ 2025-08-01 14:10 UTC (permalink / raw)
To: pbs-devel
These patches implement the logic to reclaim previously cached chunks which
have been lost from the in-memory chunk digest list because:
- the datastore was removed from the lookup cache when a corresponding
maintenance mode was set.
- the services were restarted.
- the system was rebooted.
The chunks are re-inserted by iterating over the local datastore cache
chunk store, detecting chunks which have an atime older than the start
of the reclaim task and a size > 0 (i.e. have not been previously evicted).
Since they now offer usable storage space, increment the cache capacity
for each of the found chunks.
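The selection rule described above can be sketched as a small predicate. This is only an illustration: the helper name `is_rewarm_candidate` and its parameters are hypothetical, and the patch itself performs the check inline on the `stat` result.

```rust
/// Decide whether an on-disk cache chunk should be re-registered in the
/// in-memory cache (hypothetical helper, not part of the patch).
///
/// `atime` and `min_atime` are seconds since the Unix epoch, `size` is the
/// file size in bytes.
pub fn is_rewarm_candidate(atime: i64, size: i64, min_atime: i64) -> bool {
    // Only chunks last accessed before the task started are considered.
    // A size of zero marks a chunk whose contents were truncated on eviction.
    atime < min_atime && size > 0
}
```

A chunk with an atime equal to or newer than the task start, or with an empty file, is skipped.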
proxmox-backup:
Christian Ebner (2):
tools: lru cache: allow to dynamically increase the cache capacity
datastore: reinsert unused chunks into cache during instantiation
pbs-datastore/src/chunk_store.rs | 65 +++++++++++++++++++
pbs-datastore/src/datastore.rs | 18 ++++-
.../src/local_datastore_lru_cache.rs | 19 ++++++
pbs-tools/src/async_lru_cache.rs | 8 +++
pbs-tools/src/lru_cache.rs | 6 ++
5 files changed, 114 insertions(+), 2 deletions(-)
Summary over all repositories:
5 files changed, 114 insertions(+), 2 deletions(-)
--
Generated by git-murpp 0.8.1
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* [pbs-devel] [PATCH proxmox-backup 1/2] tools: lru cache: allow to dynamically increase the cache capacity
2025-08-01 14:10 [pbs-devel] [PATCH proxmox-backup 0/2] rewarm local datastore chunk cache on datastore instantiation Christian Ebner
@ 2025-08-01 14:10 ` Christian Ebner
2025-08-01 14:10 ` [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation Christian Ebner
1 sibling, 0 replies; 5+ messages in thread
From: Christian Ebner @ 2025-08-01 14:10 UTC (permalink / raw)
To: pbs-devel
Currently, the capacity of the LRU cache is set at instantiation and
cannot be changed afterwards. In some situations, e.g. when more
space/memory becomes available, it needs to be increased dynamically.
Therefore, add methods which allow increasing the capacity by a
given increment to the LRU cache, the async LRU cache and the local
datastore LRU cache, respectively.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
pbs-datastore/src/local_datastore_lru_cache.rs | 7 +++++++
pbs-tools/src/async_lru_cache.rs | 8 ++++++++
pbs-tools/src/lru_cache.rs | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index c0edd3619..00cce94d6 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -177,4 +177,11 @@ impl LocalDatastoreLruCache {
pub fn contains(&self, digest: &[u8; 32]) -> bool {
self.cache.contains(*digest)
}
+
+ /// Increases the capacity of the cache by given increment.
+ ///
+ /// Returns the new cache capacity.
+ pub fn increase_capacity(&self, increment: usize) -> usize {
+ self.cache.increase_capacity(increment)
+ }
}
diff --git a/pbs-tools/src/async_lru_cache.rs b/pbs-tools/src/async_lru_cache.rs
index 3a975de32..dbfa1e21d 100644
--- a/pbs-tools/src/async_lru_cache.rs
+++ b/pbs-tools/src/async_lru_cache.rs
@@ -39,6 +39,14 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V: Clone + Send + 'static> AsyncL
}
}
+ /// Increment the LRU cache capacity by given increment, saturating at MAX value for usize
+ ///
+ /// Returns the new capacity.
+ pub fn increase_capacity(&self, increment: usize) -> usize {
+ let mut maps = self.maps.lock().unwrap();
+ maps.0.increase_capacity(increment)
+ }
+
/// Access an item either via the cache or by calling cacher.fetch. A return value of Ok(None)
/// means the item requested has no representation, Err(_) means a call to fetch() failed,
/// regardless of whether it was initiated by this call or a previous one.
diff --git a/pbs-tools/src/lru_cache.rs b/pbs-tools/src/lru_cache.rs
index a7aea6528..d12d0675e 100644
--- a/pbs-tools/src/lru_cache.rs
+++ b/pbs-tools/src/lru_cache.rs
@@ -133,6 +133,12 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
}
}
+ /// Increments the cache capacity by given `increment`, saturating at MAX value for usize
+ pub fn increase_capacity(&mut self, increment: usize) -> usize {
+ self.capacity = self.capacity.saturating_add(increment);
+ self.capacity
+ }
+
/// Insert or update an entry identified by `key` with the given `value`.
/// This entry is placed as the most recently used node at the head.
pub fn insert<F>(&mut self, key: K, value: V, removed: F) -> Result<bool, anyhow::Error>
--
2.47.2
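The saturating behaviour of the capacity increase in the patch above can be illustrated in isolation. The `CapacityOnly` type below is a toy stand-in invented for this sketch; it tracks only the capacity and is not the real `LruCache`.

```rust
/// Toy stand-in that only tracks a capacity (hypothetical, for illustration).
pub struct CapacityOnly {
    capacity: usize,
}

impl CapacityOnly {
    pub fn new(capacity: usize) -> Self {
        Self { capacity }
    }

    /// Increase the capacity by `increment`, saturating at `usize::MAX`
    /// instead of overflowing. Returns the new capacity.
    pub fn increase_capacity(&mut self, increment: usize) -> usize {
        self.capacity = self.capacity.saturating_add(increment);
        self.capacity
    }
}
```

With a starting capacity of 10, an increment of 5 yields 15, and a further increment of `usize::MAX` leaves the capacity pinned at `usize::MAX` rather than wrapping around.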
* [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation
2025-08-01 14:10 [pbs-devel] [PATCH proxmox-backup 0/2] rewarm local datastore chunk cache on datastore instantiation Christian Ebner
2025-08-01 14:10 ` [pbs-devel] [PATCH proxmox-backup 1/2] tools: lru cache: allow to dynamically increase the cache capacity Christian Ebner
@ 2025-08-01 14:10 ` Christian Ebner
2025-08-04 20:42 ` Thomas Lamprecht
1 sibling, 1 reply; 5+ messages in thread
From: Christian Ebner @ 2025-08-01 14:10 UTC (permalink / raw)
To: pbs-devel
The local datastore chunk cache stores the currently cached chunk
digests in-memory, while the chunk data is stored on the
filesystem. The in-memory cache might however be lost when:
- the datastore is removed from the lookup cache when a corresponding
maintenance mode is set.
- the services are restarted.
- the system is rebooted.
After the above actions, the cache is reinstantiated together with
the datastore on the next datastore lookup, calculating a cache
capacity based on the currently available storage space. This however
leaves the previously cached chunks out.
Therefore, reinsert them in an asynchronous task, by iterating over
them and inserting the chunk digests again. For these previously used
chunks, also increase the cache size, as this is now usable storage
for the cache as well.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
pbs-datastore/src/chunk_store.rs | 65 +++++++++++++++++++
pbs-datastore/src/datastore.rs | 18 ++++-
.../src/local_datastore_lru_cache.rs | 12 ++++
3 files changed, 93 insertions(+), 2 deletions(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 3c59612bb..fcc0db3c6 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -8,6 +8,8 @@ use anyhow::{bail, format_err, Context, Error};
use tracing::{info, warn};
use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
+use pbs_tools::async_lru_cache::AsyncLruCache;
+use proxmox_human_byte::HumanByte;
use proxmox_io::ReadExt;
use proxmox_s3_client::S3Client;
use proxmox_sys::fs::{create_dir, create_path, file_type_from_file_stat, CreateOptions};
@@ -704,6 +706,69 @@ impl ChunkStore {
ChunkStore::check_permissions(lockfile_path, 0o644)?;
Ok(())
}
+
+ /// Reinsert all cache chunks currently present in the chunk store, but not in the in-memory
+ /// LRU cache. Ignores chunks whose atime is newer than the start time of the reinsert call.
+ pub fn reinsert_unused_cache_chunks(
+ &self,
+ cache: &AsyncLruCache<[u8; 32], ()>,
+ ) -> Result<(), Error> {
+ let min_atime = proxmox_time::epoch_i64();
+
+ let mut reclaimed = 0;
+ for (entry, _progress, _bad) in self.get_chunk_iterator()? {
+ let entry = entry
+ .with_context(|| format!("chunk iterator on chunk store '{}' failed", self.name))?;
+ let filename = entry.file_name();
+
+ if let Ok(stat) = nix::sys::stat::fstatat(
+ Some(entry.parent_fd()),
+ filename,
+ nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
+ ) {
+ let file_type = file_type_from_file_stat(&stat);
+ if file_type != Some(nix::dir::Type::File) {
+ continue;
+ }
+
+ if stat.st_atime < min_atime && stat.st_size > 0 {
+ let filename_bytes = filename.to_bytes();
+ if filename_bytes.len() != 64
+ || !filename_bytes.iter().all(u8::is_ascii_hexdigit)
+ {
+ continue;
+ }
+ let mut digest = [0u8; 32];
+ // safe to unwrap as already checked above
+ hex::decode_to_slice(&filename_bytes[..64], &mut digest).unwrap();
+ let (path, _digest_str) = self.chunk_path(&digest);
+
+ cache.increase_capacity(1);
+ if let Err(err) = cache.insert(digest, (), |_| {
+ if let Err(err) = nix::unistd::truncate(&path, 0) {
+ if err != nix::errno::Errno::ENOENT {
+ return Err(Error::from(err));
+ }
+ }
+ Ok(())
+ }) {
+ tracing::error!(
+ "Failed to rewarm cache with chunk {filename:?} on store '{}' - {err}",
+ self.name,
+ );
+ }
+ reclaimed += stat.st_size as u64;
+ }
+ }
+ }
+ tracing::info!(
+ "Reclaimed {} from chunk cache for store {}",
+ HumanByte::from(reclaimed),
+ self.name,
+ );
+
+ Ok(())
+ }
}
#[test]
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 5a22ffbcc..4a4f8b33a 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -364,10 +364,24 @@ impl DataStore {
update_active_operations(name, operation, 1)?;
}
- Ok(Arc::new(Self {
+ let datastore = Arc::new(Self {
inner: datastore,
operation,
- }))
+ });
+
+ if datastore.cache().is_some() {
+ let datastore2 = datastore.clone();
+ let name = name.to_string();
+ tokio::task::spawn_blocking(move || {
+ tracing::info!("Started cache refresh for datastore {name}");
+ let _ = datastore2
+ .cache()
+ .unwrap()
+ .refresh_cache_and_resize_capacity();
+ });
+ }
+
+ Ok(datastore)
}
/// removes all datastores that are not configured anymore
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index 00cce94d6..9d585fa7a 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -184,4 +184,16 @@ impl LocalDatastoreLruCache {
pub fn increase_capacity(&self, increment: usize) -> usize {
self.cache.increase_capacity(increment)
}
+
+ /// Reinsert non-zero chunks currently found on the local datastore cache filesystem
+ /// into the list of digests stored in-memory, so they are reused. Also increases the
+ /// cache capacity for each inserted chunk, as the previous capacity is calculated based
+ /// on available storage, but the chunk was already present, thereby decreasing the
+ /// available on-disk storage space.
+ ///
+ /// Returns an error if iterating over the chunk store fails.
+ pub fn refresh_cache_and_resize_capacity(&self) -> Result<(), Error> {
+ let (store, cache) = (&self.store, &self.cache);
+ store.reinsert_unused_cache_chunks(cache)
+ }
}
--
2.47.2
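The filename check and digest decoding in the patch above (a 64-character hex name mapped to a 32-byte digest) can be sketched without the `hex` crate. The function name `digest_from_filename` is hypothetical; the patch itself uses `hex::decode_to_slice` after the same length and hex-digit checks.

```rust
/// Parse a chunk filename consisting of exactly 64 ASCII hex digits into a
/// 32-byte digest. Returns `None` for any other name (hypothetical helper).
pub fn digest_from_filename(name: &[u8]) -> Option<[u8; 32]> {
    if name.len() != 64 || !name.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    let mut digest = [0u8; 32];
    // Each pair of hex digits encodes one byte of the digest.
    for (byte, pair) in digest.iter_mut().zip(name.chunks_exact(2)) {
        let hi = (pair[0] as char).to_digit(16)? as u8;
        let lo = (pair[1] as char).to_digit(16)? as u8;
        *byte = (hi << 4) | lo;
    }
    Some(digest)
}
```

Returning `Option` instead of unwrapping keeps the validation and the decoding in one place, so names of the wrong length or with non-hex characters are simply skipped by the caller.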
* Re: [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation
2025-08-01 14:10 ` [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation Christian Ebner
@ 2025-08-04 20:42 ` Thomas Lamprecht
2025-08-05 6:15 ` Christian Ebner
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2025-08-04 20:42 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Am 01.08.25 um 16:10 schrieb Christian Ebner:
> The local datastore chunk cache stores the currently cached chunk
> digests in-memory, while the chunk data is stored on the
> filesystem. The in-memory cache might however be lost when:
> - the datastore is removed from the lookup cache when a corresponding
> maintenance mode is set.
> - the services are restarted.
> - the system is rebooted.
>
> After the above actions, the cache is reinstantiated together with
> the datastore on the next datastore lookup, calculating a cache
> capacity based on the currently available storage space. This however
> leaves the previously cached chunks out.
> Therefore, reinsert them in an asynchronous task, by iterating over
> them and inserting the chunk digests again. For these previously used
> chunks, also increase the cache size, as this is now usable storage
> for the cache as well.
I really would like some basic numbers for patches doing things with
caches, especially if they iterate over all chunks present on disk, IIUC.
AFAICT it at least happens in the background, so it doesn't delay the
one instantiating the new datastore struct directly, but without some
pacing, going through all chunks as fast as possible might introduce
significant IO pressure, I think.
Also, what happens if the datastore instance is already dropped again
during this cache re-warming? AFAICT that can only realistically happen
with maintenance mode, as with restarts/reboots it naturally should not
matter. Also, just to be sure, this might also block any shutdown
future from resolving, just like any other spawn_blocking, or am I
mistaken?
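One possible shape for the pacing and early-exit concerns raised above is sketched here. Everything in it is an assumption for illustration: the helper `paced_for_each`, the batch size, the pause duration and the abort flag are not part of the patch, and how such a flag would be wired to datastore drop or service shutdown is left open.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Process `items` one by one, sleeping for `pause` after every `batch_size`
/// items and stopping early once `abort` is set (hypothetical helper).
/// Returns the number of items processed.
pub fn paced_for_each<T>(
    items: impl IntoIterator<Item = T>,
    batch_size: usize,
    pause: Duration,
    abort: &AtomicBool,
    mut handle: impl FnMut(T),
) -> usize {
    let mut processed = 0;
    for item in items {
        // Check the flag before each item so a drop or shutdown request is
        // noticed without waiting for the whole iteration to finish.
        if abort.load(Ordering::Relaxed) {
            break;
        }
        handle(item);
        processed += 1;
        // Yield the disk to other users after each full batch.
        if batch_size > 0 && processed % batch_size == 0 {
            thread::sleep(pause);
        }
    }
    processed
}
```

Since the sketch sleeps on the calling thread, it would only be suitable inside a blocking task, not directly on an async executor.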
* Re: [pbs-devel] [PATCH proxmox-backup 2/2] datastore: reinsert unused chunks into cache during instantiation
2025-08-04 20:42 ` Thomas Lamprecht
@ 2025-08-05 6:15 ` Christian Ebner
0 siblings, 0 replies; 5+ messages in thread
From: Christian Ebner @ 2025-08-05 6:15 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox Backup Server development discussion
On 8/4/25 10:41 PM, Thomas Lamprecht wrote:
> Am 01.08.25 um 16:10 schrieb Christian Ebner:
>> The local datastore chunk cache stores the currently cached chunk
>> digests in-memory, while the chunk data is stored on the
>> filesystem. The in-memory cache might however be lost when:
>> - the datastore is removed from the lookup cache when a corresponding
>> maintenance mode is set.
>> - the services are restarted.
>> - the system is rebooted.
>>
>> After the above actions, the cache is reinstantiated together with
>> the datastore on the next datastore lookup, calculating a cache
>> capacity based on the currently available storage space. This however
>> leaves the previously cached chunks out.
>> Therefore, reinsert them in an asynchronous task, by iterating over
>> them and inserting the chunk digests again. For these previously used
>> chunks, also increase the cache size, as this is now usable storage
>> for the cache as well.
>
> I really would like some basic numbers for patches doing things with
> caches, especially if they iterate over all chunks present on disk, IIUC.
> AFAICT it at least happens in the background, so it doesn't delay the
> one instantiating the new datastore struct directly, but without some
> pacing, going through all chunks as fast as possible might introduce
> significant IO pressure, I think.
Yes, although given that the cache should be limited in most cases, the
actual chunk iteration should not be that bad. It took a couple of
seconds with a cache store containing 8190 chunks, although on an NVMe
SSD. But I will do a more in-depth runtime and I/O pressure analysis in
the coming days.
> Also, what happens if the datastore instance is already dropped again
> during this cache re-warming? AFAICT that can only realistically happen
> with maintenance mode, as with restarts/reboots it naturally should not
> matter. Also, just to be sure, this might also block any shutdown
> future from resolving, just like any other spawn_blocking, or am I
> mistaken?
Right, these are edge cases I did not consider, given that these patches
were created rather in a hurry with the primary intention to have a
stopgap. After all, users will run out of storage space sooner or later,
given that nothing will ever reclaim the space occupied by the chunks
no longer in the cache.
For now a
```
find /<datastore-path>/.chunks/ -type f -exec truncate --size 0 {} \;
```
while the datastore is in maintenance mode offline is a valid
workaround to recover from these cases, although it clears the contents
instead of reclaiming them.
But all in all, given your concerns I will invest more time into these
patches so we can (hopefully) roll them out and fix this issue for the
users.
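Before applying the truncate workaround mentioned above, it may help to estimate how much data it would clear. The following is a minimal, read-only sketch under the assumption that summing the apparent sizes of regular files below the cache's `.chunks` directory is a good enough estimate; the function name `regular_file_bytes` is hypothetical and the result does not account for filesystem block allocation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively sum the apparent sizes of all regular files below `dir`
/// (hypothetical helper). Already truncated chunks contribute zero bytes.
pub fn regular_file_bytes(dir: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `DirEntry::metadata` does not follow symlinks, so symlinked
        // files and directories are ignored here.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += regular_file_bytes(&entry.path())?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}
```

The function only reads metadata and never modifies any file, so it can be run while deciding whether the workaround is needed.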