* [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client
@ 2026-01-09 17:35 Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size Robert Obkircher
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Add support for commands like:
ssh host cmd | proxmox-backup-client backup data.img:/dev/stdin
proxmox-backup-client backup a.img:<(mysqldump) b.img:<(pgdump)
Changes since v2:
client:
- Renamed ChunkSize to IndexType as suggested.
datastore:
- Introduced explicit `index_capacity` field.
- Previously I (ab)used `index_length` for the capacity and
computed the actual index length from `size`.
- This removes the assert mentioned in the review.
- Renamed INITIAL_CHUNKS_IF_UNKNOWN to INITIAL_CAPACITY and increased
it to 128 blocks which is enough for 512 MiB of content.
- Only remap when capacity increases, not every time size increases.
- Increase capacity to next power of 2 instead of by 1.5x.
- This is less code, and file systems probably don't do in-place
updates, where 1.5x would theoretically be better.
- Ensure that add_digest and close fail after remap errors.
- Keep seek+write_all instead of write_all_at to minimize changes.
- Improved error messages in clone_data_from.
- Refuse creation of empty files because the original also did that.
- Added tests for FixedIndexWriter:
- Is it okay to write to a directory in the cwd?
- This was inspired by `chunk_store::test_chunk_store1`
- Removed test script.
Changes since v1:
- use mremap+ftruncate instead of write_all_at
- make the size API parameter optional instead of using 0
- use an enum to represent fixed/dynamic chunk size in UploadOptions
- alias "-" to "/dev/stdin"
- split changes into separate commits
Robert Obkircher (5):
fix #3847: datastore: support writing fidx files of unknown size
fix #3847: datastore: test FixedIndexWriter
fix #3847: api: backup: make fixed index file size optional
fix #3847: client: support fifo pipe inputs for images
fix #3847: client: treat minus sign as stdin
pbs-client/src/backup_writer.rs | 38 ++--
pbs-datastore/src/datastore.rs | 2 +-
pbs-datastore/src/fixed_index.rs | 304 +++++++++++++++++++++++++++++-
proxmox-backup-client/src/main.rs | 37 ++--
src/api2/backup/environment.rs | 8 +-
src/api2/backup/mod.rs | 4 +-
src/server/push.rs | 11 +-
7 files changed, 366 insertions(+), 38 deletions(-)
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
@ 2026-01-09 17:35 ` Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter Robert Obkircher
` (4 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Use mremap and ftruncate to support growable FixedIndexWriters. Grow
exponentially from a small initial index size for efficiency. Truncate
excessive capacity after encountering a non-full block or on close.
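The growth policy described above can be sketched as follows. This is a simplified model: the actual writer performs the growth with ftruncate and mremap on the backing file, and `required_capacity` is an illustrative name, not a function from the patch.

```rust
/// Simplified model of the index growth policy: start from a small fixed
/// capacity and, whenever the required number of 32-byte index entries
/// exceeds the current capacity, grow to the next power of two.
const INITIAL_CAPACITY: usize = 4096 / 32; // 128 entries, same size as the header

fn required_capacity(current: usize, needed_entries: usize) -> usize {
    if needed_entries <= current {
        current // no remap while the capacity still suffices
    } else {
        needed_entries.next_power_of_two()
    }
}

fn main() {
    // growing within the initial capacity never remaps
    assert_eq!(required_capacity(INITIAL_CAPACITY, 100), 128);
    // crossing the capacity rounds up to the next power of two
    assert_eq!(required_capacity(128, 129), 256);
    assert_eq!(required_capacity(256, 700), 1024);
}
```

With 4 MiB blocks, the 128-entry initial capacity covers 512 MiB of content before the first remap, and the power-of-two steps keep the number of remaps logarithmic in the final size.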
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
pbs-datastore/src/datastore.rs | 2 +-
pbs-datastore/src/fixed_index.rs | 120 +++++++++++++++++++++++++++++--
2 files changed, 117 insertions(+), 5 deletions(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 9c57aaac..af712726 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -591,7 +591,7 @@ impl DataStore {
pub fn create_fixed_writer<P: AsRef<Path>>(
&self,
filename: P,
- size: usize,
+ size: Option<usize>,
chunk_size: usize,
) -> Result<FixedIndexWriter, Error> {
let index = FixedIndexWriter::create(
diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
index 6c3be2d4..8036a519 100644
--- a/pbs-datastore/src/fixed_index.rs
+++ b/pbs-datastore/src/fixed_index.rs
@@ -219,9 +219,12 @@ pub struct FixedIndexWriter {
chunk_size: usize,
size: usize,
index_length: usize,
+ index_capacity: usize,
index: *mut u8,
pub uuid: [u8; 16],
pub ctime: i64,
+ growable_size: bool,
+ write_size_on_close: bool,
}
// `index` is mmap()ed which cannot be thread-local so should be sendable
@@ -237,12 +240,18 @@ impl Drop for FixedIndexWriter {
}
impl FixedIndexWriter {
+ /// The initial capacity, if the total size is unknown.
+ ///
+ /// This capacity takes up the same amount of space as the header
+ /// and can refer to 128 Blocks * 4 MiB/Block = 512 MiB of content.
+ const INITIAL_CAPACITY: usize = 4096 / 32;
+
#[allow(clippy::cast_ptr_alignment)]
// Requires obtaining a shared chunk store lock beforehand
pub fn create(
store: Arc<ChunkStore>,
path: &Path,
- size: usize,
+ known_size: Option<usize>,
chunk_size: usize,
) -> Result<Self, Error> {
let full_path = store.relative_path(path);
@@ -264,6 +273,7 @@ impl FixedIndexWriter {
}
let ctime = proxmox_time::epoch_i64();
+ let size = known_size.unwrap_or(0);
let uuid = Uuid::generate();
@@ -280,8 +290,12 @@ impl FixedIndexWriter {
file.write_all(&buffer)?;
- let index_length = size.div_ceil(chunk_size);
- let index_size = index_length * 32;
+ let (index_length, index_capacity) = known_size
+ .map(|s| s.div_ceil(chunk_size))
+ .map(|len| (len, len))
+ .unwrap_or((0, Self::INITIAL_CAPACITY));
+
+ let index_size = index_capacity * 32;
nix::unistd::ftruncate(&file, (header_size + index_size) as i64)?;
let data = unsafe {
@@ -305,12 +319,87 @@ impl FixedIndexWriter {
chunk_size,
size,
index_length,
+ index_capacity,
index: data,
ctime,
uuid: *uuid.as_bytes(),
+ growable_size: known_size.is_none(),
+ write_size_on_close: known_size.is_none(),
+ })
+ }
+
+ /// If this returns an error, the sizes may be out of sync,
+ /// which is especially bad if the capacity was reduced.
+ fn set_index_capacity(&mut self, new_capacity: usize) -> Result<(), Error> {
+ if new_capacity == self.index_capacity {
+ return Ok(());
+ }
+ let old_index_size = self.index_capacity * 32;
+ let new_index_size = new_capacity * 32;
+ let new_file_size = (size_of::<FixedIndexHeader>() + new_index_size) as i64;
+
+ let index_addr = NonNull::new(self.index as *mut std::ffi::c_void).ok_or_else(|| {
+ format_err!("Can't resize FixedIndexWriter index because the index pointer is null.")
+ })?;
+
+ nix::unistd::ftruncate(&self.file, new_file_size)?;
+
+ let new_index = unsafe {
+ nix::sys::mman::mremap(
+ index_addr,
+ old_index_size,
+ new_index_size,
+ nix::sys::mman::MRemapFlags::MREMAP_MAYMOVE,
+ None,
+ )
+ }?;
+
+ self.index = new_index.as_ptr().cast::<u8>();
+ self.index_capacity = new_capacity;
+ Ok(())
+ }
+
+ /// Unmapping ensures future add and close operations fail.
+ fn set_index_capacity_or_unmap(&mut self, new_capacity: usize) -> Result<(), Error> {
+ self.set_index_capacity(new_capacity).map_err(|e| {
+ let unmap_result = self.unmap();
+ let message = format!(
+ "failed to resize index capacity from {} to {new_capacity} with backing file: {:?}",
+ self.index_capacity, self.tmp_filename
+ );
+ assert!(self.index.is_null(), "{message} {unmap_result:?}");
+ e.context(message)
})
}
+ /// Increase the content size to be at least `requested_size` and
+ /// ensure there is enough capacity.
+ ///
+ /// Only writers that were created without a known size can grow.
+ /// The size also becomes fixed as soon as it is no longer divisible
+ /// by the block size, to ensure that only the last block can be
+ /// smaller.
+ pub fn grow_to_size(&mut self, requested_size: usize) -> Result<(), Error> {
+ if self.size < requested_size {
+ if !self.growable_size {
+ bail!("refusing to resize from {} to {requested_size}", self.size);
+ }
+ let new_len = requested_size.div_ceil(self.chunk_size);
+ if new_len * self.chunk_size != requested_size {
+ // not a full chunk, so this must be the last one
+ self.growable_size = false;
+ self.set_index_capacity_or_unmap(new_len)?;
+ } else if new_len > self.index_capacity {
+ self.set_index_capacity_or_unmap(new_len.next_power_of_two())?;
+ };
+ assert!(new_len <= self.index_capacity);
+ self.index_length = new_len;
+ self.size = requested_size;
+ }
+ Ok(())
+ }
+
+ /// The current length of the index. This may be increased with [`grow_to_size`].
pub fn index_length(&self) -> usize {
self.index_length
}
@@ -320,7 +409,7 @@ impl FixedIndexWriter {
return Ok(());
};
- let index_size = self.index_length * 32;
+ let index_size = self.index_capacity * 32;
if let Err(err) = unsafe { nix::sys::mman::munmap(index, index_size) } {
bail!("unmap file {:?} failed - {}", self.tmp_filename, err);
@@ -342,9 +431,24 @@ impl FixedIndexWriter {
self.unmap()?;
+ if self.index_length == 0 {
+ bail!("refusing to close empty fidx file {:?}", self.tmp_filename);
+ } else if self.index_length < self.index_capacity {
+ let file_size = size_of::<FixedIndexHeader>() + index_size;
+ nix::unistd::ftruncate(&self.file, file_size as i64)?;
+ self.index_capacity = self.index_length;
+ }
+
let csum_offset = std::mem::offset_of!(FixedIndexHeader, index_csum);
self.file.seek(SeekFrom::Start(csum_offset as u64))?;
self.file.write_all(&index_csum)?;
+
+ if self.write_size_on_close {
+ let size_offset = std::mem::offset_of!(FixedIndexHeader, size);
+ self.file.seek(SeekFrom::Start(size_offset as u64))?;
+ self.file.write_all(&(self.size as u64).to_le_bytes())?;
+ }
+
self.file.flush()?;
if let Err(err) = std::fs::rename(&self.tmp_filename, &self.filename) {
@@ -407,6 +511,14 @@ impl FixedIndexWriter {
}
pub fn clone_data_from(&mut self, reader: &FixedIndexReader) -> Result<(), Error> {
+ if self.growable_size {
+ bail!("reusing the index is only supported with known input size");
+ }
+
+ if self.chunk_size != reader.chunk_size {
+ bail!("can't reuse file with different chunk size");
+ }
+
if self.index_length != reader.index_count() {
bail!("clone_data_from failed - index sizes not equal");
}
--
2.47.3
* [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size Robert Obkircher
@ 2026-01-09 17:35 ` Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional Robert Obkircher
` (3 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Create a dummy chunk store and write fidx files with fixed and
dynamically sized writers. Compare the resulting binary files directly
(ignoring uuid and ctime) and also read them back using the reader.
The chunk hashes are made up and don't actually exist in the chunk
store.
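The made-up digests follow a simple deterministic pattern, roughly as sketched below. The function name is illustrative; the real code builds these inline in `test_data`.

```rust
// Sketch of the deterministic dummy-digest scheme used by the tests:
// each 32-byte digest is the chunk index, encoded as a little-endian
// u64, repeated four times. No chunk with this hash needs to exist in
// the chunk store, since only the index file itself is checked.
fn dummy_digest(index: u64) -> [u8; 32] {
    let mut digest = [0u8; 32];
    let bytes = index.to_le_bytes();
    for chunk in digest.chunks_mut(bytes.len()) {
        chunk.copy_from_slice(&bytes);
    }
    digest
}

fn main() {
    let d = dummy_digest(1);
    assert_eq!(d[0], 1);
    assert_eq!(d[8], 1); // the 8-byte pattern repeats
    assert_eq!(&d[0..8], &d[8..16]);
    assert_eq!(&d[0..8], &d[24..32]);
}
```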
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
pbs-datastore/src/fixed_index.rs | 184 +++++++++++++++++++++++++++++++
1 file changed, 184 insertions(+)
diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
index 8036a519..a20edc94 100644
--- a/pbs-datastore/src/fixed_index.rs
+++ b/pbs-datastore/src/fixed_index.rs
@@ -530,3 +530,187 @@ impl FixedIndexWriter {
Ok(())
}
}
+
+#[cfg(test)]
+mod tests {
+ use super::*;
+ use crate::chunk_store::ChunkStore;
+ use pbs_api_types::DatastoreFSyncLevel;
+ use std::fs;
+ use std::sync::Arc;
+
+ const CS: usize = 4096;
+
+ #[test]
+ fn test_fixed_index_writer() {
+ let mut testdir = fs::canonicalize(".").unwrap();
+ testdir.push(".testdir_fixed_index");
+
+ if let Err(_e) = std::fs::remove_dir_all(&testdir) { /* ignore */ }
+
+ // this is a lot faster than ChunkStore::create, which takes almost 3 seconds
+ let store = {
+ let mut chunks = PathBuf::from(&testdir);
+ chunks.push(".chunks");
+ fs::create_dir_all(chunks).unwrap();
+ Arc::new(ChunkStore::open("test", &testdir, DatastoreFSyncLevel::None).unwrap())
+ };
+
+ test_empty(store.clone());
+ test_smaller_than_initial_capacity(store.clone());
+ test_grow_to_multiples_of_chunk_size(store.clone());
+ test_grow_to_fixed_size(store);
+
+ std::fs::remove_dir_all(&testdir).unwrap();
+ }
+
+ fn test_empty(store: Arc<ChunkStore>) {
+ let path = Path::new("test_empty");
+ let mut w = FixedIndexWriter::create(store, path, None, CS).unwrap();
+
+ assert_eq!(0, w.index_length(), "returns length, not capacity");
+ assert!(w.add_digest(0, &[1u8; 32]).is_err(), "out of bounds");
+ assert!(w.close().is_err(), "should refuse to create empty file");
+ }
+
+ fn test_smaller_than_initial_capacity(store: Arc<ChunkStore>) {
+ let path = Path::new("test_smaller_than_initial_capacity");
+ let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+ let size = CS - 1;
+ let expected = test_data(size);
+ w.grow_to_size(size).unwrap();
+ expected[0].add_to(&mut w);
+
+ w.close().unwrap();
+ drop(w);
+
+ check_with_reader(&store.relative_path(path), size, &expected);
+ compare_to_known_size_writer(store, path, size, &expected);
+ }
+
+ fn test_grow_to_multiples_of_chunk_size(store: Arc<ChunkStore>) {
+ let path = Path::new("test_grow_to_multiples_of_chunk_size");
+ let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+ let initial = FixedIndexWriter::INITIAL_CAPACITY;
+ let steps = [1, 2, initial, initial + 1, 5 * initial, 10 * initial + 1];
+ let expected = test_data(steps.last().unwrap() * CS);
+
+ let mut begin = 0;
+ for chunk_count in steps {
+ let last = &expected[chunk_count - 1];
+ w.grow_to_size(last.end).unwrap();
+ assert_eq!(last.index + 1, w.index_length());
+ assert!(w.add_digest(last.index + 1, &[1u8; 32]).is_err());
+
+ for c in expected[begin..chunk_count].iter().rev() {
+ c.add_to(&mut w);
+ }
+ begin = chunk_count;
+ }
+ w.close().unwrap();
+ drop(w);
+
+ let size = expected.len() * CS;
+ check_with_reader(&store.relative_path(path), size, &expected);
+ compare_to_known_size_writer(store, path, size, &expected);
+ }
+
+ fn test_grow_to_fixed_size(store: Arc<ChunkStore>) {
+ let path = Path::new("test_grow_to_fixed_size");
+ let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+ let size = (FixedIndexWriter::INITIAL_CAPACITY + 42) * CS - 1; // last is not full
+ let expected = test_data(size);
+
+ w.grow_to_size(size).unwrap();
+ assert!(w.grow_to_size(size + 1).is_err(), "size must be fixed now");
+ assert_eq!(expected.len(), w.index_length());
+ assert!(w.add_digest(expected.len(), &[1u8; 32]).is_err());
+
+ for c in expected.iter().rev() {
+ c.add_to(&mut w);
+ }
+
+ w.close().unwrap();
+ drop(w);
+
+ check_with_reader(&store.relative_path(path), size, &expected);
+ compare_to_known_size_writer(store, path, size, &expected);
+ }
+
+ struct TestChunk {
+ digest: [u8; 32],
+ index: usize,
+ size: usize,
+ end: usize,
+ }
+
+ impl TestChunk {
+ fn add_to(&self, w: &mut FixedIndexWriter) {
+ assert_eq!(
+ self.index,
+ w.check_chunk_alignment(self.end, self.size).unwrap()
+ );
+ w.add_digest(self.index, &self.digest).unwrap();
+ }
+ }
+
+ fn test_data(size: usize) -> Vec<TestChunk> {
+ (0..size.div_ceil(CS))
+ .map(|index| {
+ let mut digest = [0u8; 32];
+ let i = &(index as u64).to_le_bytes();
+ for c in digest.chunks_mut(i.len()) {
+ c.copy_from_slice(i);
+ }
+ let size = if ((index + 1) * CS) <= size {
+ CS
+ } else {
+ size % CS
+ };
+ TestChunk {
+ digest,
+ index,
+ size,
+ end: index * CS + size,
+ }
+ })
+ .collect()
+ }
+
+ fn check_with_reader(path: &Path, size: usize, chunks: &[TestChunk]) {
+ let reader = FixedIndexReader::open(path).unwrap();
+ assert_eq!(size as u64, reader.index_bytes());
+ assert_eq!(chunks.len(), reader.index_count());
+ for c in chunks {
+ assert_eq!(&c.digest, reader.index_digest(c.index).unwrap());
+ }
+ }
+
+ fn compare_to_known_size_writer(
+ store: Arc<ChunkStore>,
+ name: &Path,
+ size: usize,
+ chunks: &[TestChunk],
+ ) {
+ let mut path = PathBuf::from(name);
+ path.set_extension("reference");
+ let mut w = FixedIndexWriter::create(store.clone(), &path, Some(size), CS).unwrap();
+ for c in chunks {
+ c.add_to(&mut w);
+ }
+ w.close().unwrap();
+ drop(w);
+
+ let mut reference = fs::read(store.relative_path(&path)).unwrap();
+ let mut tested = fs::read(store.relative_path(name)).unwrap();
+
+ // ignore uuid and ctime
+ reference[8..32].fill(0);
+ tested[8..32].fill(0);
+
+ assert_eq!(reference, tested);
+ }
+}
--
2.47.3
* [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter Robert Obkircher
@ 2026-01-09 17:35 ` Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 4/5] fix #3847: client: support fifo pipe inputs for images Robert Obkircher
` (2 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Grow the FixedIndexWriter as necessary and update the duplicate size
in BackupEnvironment.
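The size bookkeeping can be sketched as follows. This is a simplified model of the chunk-registration logic in environment.rs; `register_chunk` is an illustrative name, not part of the patch.

```rust
/// Sketch of the server-side bookkeeping for fixed-index uploads with an
/// unknown total size: each registered chunk may extend the image, so the
/// environment's duplicate of the size is bumped to the chunk's end offset.
fn register_chunk(size: &mut usize, offset: usize, chunk_len: usize) {
    let end = offset + chunk_len;
    if end > *size {
        *size = end; // keep the duplicate size in sync with the index
    }
}

fn main() {
    let mut size = 0usize;
    register_chunk(&mut size, 0, 4096);
    register_chunk(&mut size, 4096, 4096);
    assert_eq!(size, 8192);
    // out-of-order uploads must not shrink the recorded size
    register_chunk(&mut size, 0, 4096);
    assert_eq!(size, 8192);
}
```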
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
src/api2/backup/environment.rs | 8 ++++++--
src/api2/backup/mod.rs | 4 ++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index bd9c5211..77361724 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -349,7 +349,7 @@ impl BackupEnvironment {
&self,
index: FixedIndexWriter,
name: String,
- size: usize,
+ size: Option<usize>,
chunk_size: u32,
incremental: bool,
) -> Result<usize, Error> {
@@ -365,7 +365,7 @@ impl BackupEnvironment {
index,
name,
chunk_count: 0,
- size,
+ size: size.unwrap_or(0),
chunk_size,
small_chunk_count: 0,
upload_stat: UploadStatistic::new(),
@@ -443,7 +443,11 @@ impl BackupEnvironment {
}
let end = (offset as usize) + (size as usize);
+ data.index.grow_to_size(end)?;
let idx = data.index.check_chunk_alignment(end, size as usize)?;
+ if end > data.size {
+ data.size = end;
+ }
data.chunk_count += 1;
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index 3e6b7a95..c2822c18 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -456,7 +456,7 @@ pub const API_METHOD_CREATE_FIXED_INDEX: ApiMethod = ApiMethod::new(
("archive-name", false, &BACKUP_ARCHIVE_NAME_SCHEMA),
(
"size",
- false,
+ true,
&IntegerSchema::new("File size.").minimum(1).schema()
),
(
@@ -480,7 +480,7 @@ fn create_fixed_index(
let env: &BackupEnvironment = rpcenv.as_ref();
let name = required_string_param(¶m, "archive-name")?.to_owned();
- let size = required_integer_param(¶m, "size")? as usize;
+ let size = param["size"].as_u64().map(usize::try_from).transpose()?;
let reuse_csum = param["reuse-csum"].as_str();
let archive_name = name.clone();
--
2.47.3
* [pbs-devel] [PATCH v3 proxmox-backup 4/5] fix #3847: client: support fifo pipe inputs for images
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
` (2 preceding siblings ...)
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional Robert Obkircher
@ 2026-01-09 17:35 ` Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 5/5] fix #3847: client: treat minus sign as stdin Robert Obkircher
2026-01-14 13:13 ` [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Fabian Grünbichler
5 siblings, 0 replies; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Accept fifo files as inputs for image backups. The unknown size is
represented in the UploadOptions using a new IndexType enum that
now stores an optional total size for the Fixed variant.
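The prefix and size-parameter selection that the enum drives can be sketched as follows. This is simplified: `index_params` is an illustrative helper, and the real code builds a JSON parameter map inline in backup_writer.rs.

```rust
/// Sketch of the IndexType introduced for UploadOptions: fixed-size
/// chunking carries an optional total image size (None for pipes).
#[derive(Default, Clone)]
enum IndexType {
    #[default]
    Dynamic,
    Fixed(Option<u64>),
}

/// Returns the index prefix for the API path and the optional value
/// that would be sent as the "size" parameter.
fn index_params(index_type: &IndexType) -> (&'static str, Option<u64>) {
    match index_type {
        IndexType::Fixed(size) => ("fixed", *size),
        IndexType::Dynamic => ("dynamic", None),
    }
}

fn main() {
    assert_eq!(index_params(&IndexType::Dynamic), ("dynamic", None));
    // a fifo input: fixed chunking, but no size parameter
    assert_eq!(index_params(&IndexType::Fixed(None)), ("fixed", None));
    assert_eq!(index_params(&IndexType::Fixed(Some(512))), ("fixed", Some(512)));
}
```

Keeping the optional size inside the `Fixed` variant makes the invalid combination "dynamic chunking with a fixed size" unrepresentable, which is what the earlier `fixed_size: Option<u64>` field could not guarantee.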
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
pbs-client/src/backup_writer.rs | 38 ++++++++++++++++++++++---------
proxmox-backup-client/src/main.rs | 30 +++++++++++++-----------
src/server/push.rs | 11 +++++----
3 files changed, 50 insertions(+), 29 deletions(-)
diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index dbd177d8..f33f1063 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -52,7 +52,17 @@ pub struct UploadOptions {
pub previous_manifest: Option<Arc<BackupManifest>>,
pub compress: bool,
pub encrypt: bool,
- pub fixed_size: Option<u64>,
+ pub index_type: IndexType,
+}
+
+/// Index type for upload options.
+#[derive(Default, Clone)]
+pub enum IndexType {
+ /// Dynamic chunking.
+ #[default]
+ Dynamic,
+ /// Fixed size chunking with optional image file size.
+ Fixed(Option<u64>),
}
struct ChunkUploadResponse {
@@ -292,11 +302,14 @@ impl BackupWriter {
options: UploadOptions,
) -> Result<BackupStats, Error> {
let mut param = json!({ "archive-name": archive_name });
- let prefix = if let Some(size) = options.fixed_size {
- param["size"] = size.into();
- "fixed"
- } else {
- "dynamic"
+ let prefix = match options.index_type {
+ IndexType::Fixed(image_file_size) => {
+ if let Some(size) = image_file_size {
+ param["size"] = size.into();
+ }
+ "fixed"
+ }
+ IndexType::Dynamic => "dynamic",
};
if options.encrypt && self.crypt_config.is_none() {
@@ -387,11 +400,14 @@ impl BackupWriter {
let known_chunks = Arc::new(Mutex::new(HashSet::new()));
let mut param = json!({ "archive-name": archive_name });
- let prefix = if let Some(size) = options.fixed_size {
- param["size"] = size.into();
- "fixed"
- } else {
- "dynamic"
+ let prefix = match options.index_type {
+ IndexType::Fixed(image_file_size) => {
+ if let Some(size) = image_file_size {
+ param["size"] = size.into();
+ }
+ "fixed"
+ }
+ IndexType::Dynamic => "dynamic",
};
if options.encrypt && self.crypt_config.is_none() {
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 999e5020..7fc711fd 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -46,7 +46,7 @@ use pbs_client::tools::{
use pbs_client::{
delete_ticket_info, parse_backup_specification, view_task_result, BackupDetectionMode,
BackupReader, BackupRepository, BackupSpecificationType, BackupStats, BackupWriter,
- BackupWriterOptions, ChunkStream, FixedChunkStream, HttpClient, InjectionData,
+ BackupWriterOptions, ChunkStream, FixedChunkStream, HttpClient, IndexType, InjectionData,
PxarBackupStream, RemoteChunkReader, UploadOptions, BACKUP_SOURCE_SCHEMA,
};
use pbs_datastore::catalog::{BackupCatalogWriter, CatalogReader, CatalogWriter};
@@ -205,7 +205,7 @@ async fn backup_directory<P: AsRef<Path>>(
pxar_create_options: pbs_client::pxar::PxarCreateOptions,
upload_options: UploadOptions,
) -> Result<(BackupStats, Option<BackupStats>), Error> {
- if upload_options.fixed_size.is_some() {
+ if let IndexType::Fixed(_) = upload_options.index_type {
bail!("cannot backup directory with fixed chunk size!");
}
@@ -295,7 +295,7 @@ async fn backup_image<P: AsRef<Path>>(
let stream = FixedChunkStream::new(stream, chunk_size.unwrap_or(4 * 1024 * 1024));
- if upload_options.fixed_size.is_none() {
+ if let IndexType::Dynamic = upload_options.index_type {
bail!("cannot backup image with dynamic chunk size!");
}
@@ -859,15 +859,17 @@ async fn create_backup(
upload_list.push((BackupSpecificationType::PXAR, filename, target, "didx", 0));
}
BackupSpecificationType::IMAGE => {
- if !(file_type.is_file() || file_type.is_block_device()) {
- bail!("got unexpected file type (expected file or block device)");
- }
-
- let size = image_size(&PathBuf::from(&filename))?;
-
- if size == 0 {
- bail!("got zero-sized file '{}'", filename);
- }
+ let size = if file_type.is_file() || file_type.is_block_device() {
+ let size = image_size(&PathBuf::from(&filename))?;
+ if size == 0 {
+ bail!("got zero-sized file '{}'", filename);
+ }
+ size
+ } else if file_type.is_fifo() {
+ 0
+ } else {
bail!("got unexpected file type (expected file, block device, or fifo)");
+ };
upload_list.push((
BackupSpecificationType::IMAGE,
@@ -1191,9 +1193,11 @@ async fn create_backup(
(BackupSpecificationType::IMAGE, false) => {
log_file("image", &filename, target.as_ref());
+ // 0 means fifo pipe with unknown size
+ let image_file_size = (size != 0).then_some(size);
let upload_options = UploadOptions {
previous_manifest: previous_manifest.clone(),
- fixed_size: Some(size),
+ index_type: IndexType::Fixed(image_file_size),
compress: true,
encrypt: crypto.mode == CryptMode::Encrypt,
};
diff --git a/src/server/push.rs b/src/server/push.rs
index d7884fce..b1b41297 100644
--- a/src/server/push.rs
+++ b/src/server/push.rs
@@ -17,7 +17,8 @@ use pbs_api_types::{
PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
};
use pbs_client::{
- BackupRepository, BackupWriter, BackupWriterOptions, HttpClient, MergedChunkInfo, UploadOptions,
+ BackupRepository, BackupWriter, BackupWriterOptions, HttpClient, IndexType, MergedChunkInfo,
+ UploadOptions,
};
use pbs_config::CachedUserInfo;
use pbs_datastore::data_blob::ChunkInfo;
@@ -917,7 +918,7 @@ pub(crate) async fn push_snapshot(
index,
chunk_reader,
&backup_writer,
- None,
+ IndexType::Dynamic,
known_chunks.clone(),
)
.await?;
@@ -944,7 +945,7 @@ pub(crate) async fn push_snapshot(
index,
chunk_reader,
&backup_writer,
- Some(size),
+ IndexType::Fixed(Some(size)),
known_chunks.clone(),
)
.await?;
@@ -1002,7 +1003,7 @@ async fn push_index(
index: impl IndexFile + Send + 'static,
chunk_reader: Arc<dyn AsyncReadChunk>,
backup_writer: &BackupWriter,
- size: Option<u64>,
+ index_type: IndexType,
known_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
) -> Result<SyncStats, Error> {
let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
@@ -1048,7 +1049,7 @@ async fn push_index(
let upload_options = UploadOptions {
compress: true,
encrypt: false,
- fixed_size: size,
+ index_type,
..UploadOptions::default()
};
--
2.47.3
* [pbs-devel] [PATCH v3 proxmox-backup 5/5] fix #3847: client: treat minus sign as stdin
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
` (3 preceding siblings ...)
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 4/5] fix #3847: client: support fifo pipe inputs for images Robert Obkircher
@ 2026-01-09 17:35 ` Robert Obkircher
2026-01-14 13:13 ` [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Fabian Grünbichler
5 siblings, 0 replies; 14+ messages in thread
From: Robert Obkircher @ 2026-01-09 17:35 UTC (permalink / raw)
To: pbs-devel
Treat "-" as an alias for "/dev/stdin". If there is an actual file
with that name, it can still be read via "./-".
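The aliasing amounts to the following sketch; `resolve_input` is an illustrative name for what the patch does inline in `create_backup`.

```rust
/// Sketch of the stdin aliasing: a bare "-" is rewritten to /dev/stdin,
/// while "./-" still refers to a real file named "-" in the cwd.
fn resolve_input(filename: &str) -> String {
    if filename == "-" {
        String::from("/dev/stdin")
    } else {
        filename.to_string()
    }
}

fn main() {
    assert_eq!(resolve_input("-"), "/dev/stdin");
    // a relative path bypasses the alias
    assert_eq!(resolve_input("./-"), "./-");
    assert_eq!(resolve_input("data.img"), "data.img");
}
```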
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
proxmox-backup-client/src/main.rs | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 7fc711fd..37878b01 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -845,6 +845,13 @@ async fn create_backup(
}
target_set.insert(target.clone());
+ // one can still use ./- to refer to an actual file with that name
+ let filename = if filename == "-" {
+ String::from("/dev/stdin")
+ } else {
+ filename
+ };
+
use std::os::unix::fs::FileTypeExt;
let metadata = std::fs::metadata(&filename)
--
2.47.3
* Re: [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
` (4 preceding siblings ...)
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 5/5] fix #3847: client: treat minus sign as stdin Robert Obkircher
@ 2026-01-14 13:13 ` Fabian Grünbichler
5 siblings, 0 replies; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-14 13:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On January 9, 2026 6:35 pm, Robert Obkircher wrote:
> Add support for commands like:
> ssh host cmd | proxmox-backup-client backup data.img:/dev/stdin
> proxmox-backup-client backup a.img:<(mysqldump) b.img:<(pgdump)
this is already shaping up nicely, some comments on individual patches,
and some feedback for the organization of the series:
the actual fix for the referenced bug is the last two patches - the rest
is just preparation work but has no effect yet until it is wired up in
the client (AFAICT?). that means that the "fix .." prefix should only go
into those two patches (or technically even just #4, since #5 is just
syntactic sugar on top).
please ensure each commit builds when applied in series. the first patch
breaks the backup API and should include changes making it compatible
(i.e., wrapping things in Some(..) as needed). the third patch would
then replace those compat changes with making it optional in the API and
handling that correctly.
thanks!
>
> Changes since v2:
>
> client:
> - Renamed ChunkSize to IndexType as suggested.
>
> datastore:
> - Introduced explicit `index_capacity` field.
> - Previously I (ab)used `index_length` for the capacity and
> computed the actual index length from `size`.
> - This removes the assert mentioned in the review.
> - Renamed INITIAL_CHUNKS_IF_UNKNOWN to INITIAL_CAPACITY and increased
> it to 128 blocks which is enough for 512 MiB of content.
> - Only remap when capacity increases, not every time size increases.
> - Increase capacity to next power of 2 instead of by 1.5x.
> - This is less code and file systems probably don't do in place
> updates where 1.5x is theoretically better.
> - Ensure that add_digest and close fail after remap errors.
> - Keep seek+write_all instead of write_all_at to minimize changes.
> - Improved error messages in clone_data_from.
> - Refuse creation of empty files because the original also did that.
> - Added tests for FixedIndexWriter:
> - Is it okay to write to a directory in the cwd?
> - This was inspired by `chunk_store::test_chunk_store1`
> - Removed test script.
>
> Changes since v1:
> - use mremap+ftruncate instead of write_all_at
> - make the size API parameter optional instead of using 0
> - use an enum to represent fixed/dynamic chunk size in UploadOptions
> - alias "-" to "/dev/stdin"
> - split changes into separate commits
>
> Robert Obkircher (5):
> fix #3847: datastore: support writing fidx files of unknown size
> fix #3847: datastore: test FixedIndexWriter
> fix #3847: api: backup: make fixed index file size optional
> fix #3847: client: support fifo pipe inputs for images
> fix #3847: client: treat minus sign as stdin
>
> pbs-client/src/backup_writer.rs | 38 ++--
> pbs-datastore/src/datastore.rs | 2 +-
> pbs-datastore/src/fixed_index.rs | 304 +++++++++++++++++++++++++++++-
> proxmox-backup-client/src/main.rs | 37 ++--
> src/api2/backup/environment.rs | 8 +-
> src/api2/backup/mod.rs | 4 +-
> src/server/push.rs | 11 +-
> 7 files changed, 366 insertions(+), 38 deletions(-)
>
> --
> 2.47.3
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional Robert Obkircher
@ 2026-01-14 13:13 ` Fabian Grünbichler
0 siblings, 0 replies; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-14 13:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On January 9, 2026 6:35 pm, Robert Obkircher wrote:
> Grow the FixedIndexWriter as necessary and update the duplicate size
> in BackupEnvironment.
>
> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> ---
> src/api2/backup/environment.rs | 8 ++++++--
> src/api2/backup/mod.rs | 4 ++--
> 2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index bd9c5211..77361724 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -349,7 +349,7 @@ impl BackupEnvironment {
> &self,
> index: FixedIndexWriter,
> name: String,
> - size: usize,
> + size: Option<usize>,
> chunk_size: u32,
> incremental: bool,
> ) -> Result<usize, Error> {
> @@ -365,7 +365,7 @@ impl BackupEnvironment {
> index,
> name,
> chunk_count: 0,
> - size,
> + size: size.unwrap_or(0),
would make more sense IMHO to keep this as an option, to signify this is
a known_size/not-growable index
> chunk_size,
> small_chunk_count: 0,
> upload_stat: UploadStatistic::new(),
> @@ -443,7 +443,11 @@ impl BackupEnvironment {
> }
>
> let end = (offset as usize) + (size as usize);
> + data.index.grow_to_size(end)?;
> let idx = data.index.check_chunk_alignment(end, size as usize)?;
> + if end > data.size {
> + data.size = end;
> + }
and drop this, but make index.size accessible via a getter, and then,
when closing the writer here, check both:
- index.size is as submitted by the client (ensures the growing happened
  correctly on both sides)
- data.size is as submitted by the client, if set (this is the existing
  check, just accounting for it now being optional)
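As a rough illustration of the suggested close-time check (a sketch only;
the field and getter names are assumed, not the actual PBS API):

```rust
// `announced` is the optional size the client submitted when creating the
// index, `index_size` the value from the suggested getter on the writer,
// and `uploaded` the size actually covered by the uploaded chunks.
fn check_sizes_on_close(
    announced: Option<u64>,
    index_size: u64,
    uploaded: u64,
) -> Result<(), String> {
    // the grown index must match what was actually uploaded, which ensures
    // the growing happened correctly on both sides
    if index_size != uploaded {
        return Err(format!("index size {index_size} != uploaded size {uploaded}"));
    }
    // the existing check, now accounting for the size being optional
    if let Some(size) = announced {
        if size != uploaded {
            return Err(format!("announced size {size} != uploaded size {uploaded}"));
        }
    }
    Ok(())
}
```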
>
> data.chunk_count += 1;
>
> diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
> index 3e6b7a95..c2822c18 100644
> --- a/src/api2/backup/mod.rs
> +++ b/src/api2/backup/mod.rs
> @@ -456,7 +456,7 @@ pub const API_METHOD_CREATE_FIXED_INDEX: ApiMethod = ApiMethod::new(
> ("archive-name", false, &BACKUP_ARCHIVE_NAME_SCHEMA),
> (
> "size",
> - false,
> + true,
> &IntegerSchema::new("File size.").minimum(1).schema()
> ),
> (
> @@ -480,7 +480,7 @@ fn create_fixed_index(
> let env: &BackupEnvironment = rpcenv.as_ref();
>
> let name = required_string_param(¶m, "archive-name")?.to_owned();
> - let size = required_integer_param(¶m, "size")? as usize;
> + let size = param["size"].as_u64().map(usize::try_from).transpose()?;
> let reuse_csum = param["reuse-csum"].as_str();
>
> let archive_name = name.clone();
> --
> 2.47.3
>
>
>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter Robert Obkircher
@ 2026-01-14 13:13 ` Fabian Grünbichler
2026-01-16 12:48 ` Robert Obkircher
0 siblings, 1 reply; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-14 13:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On January 9, 2026 6:35 pm, Robert Obkircher wrote:
> Create a dummy chunk store and write fidx files with fixed and
> dynamically sized writers. Compare the resulting binary files directly
> (ignoring uuid and ctime) and also read them back using the reader.
>
> The chunk hashes are made up and don't actually exist in the chunk
> store.
>
> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> ---
> pbs-datastore/src/fixed_index.rs | 184 +++++++++++++++++++++++++++++++
> 1 file changed, 184 insertions(+)
>
> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
> index 8036a519..a20edc94 100644
> --- a/pbs-datastore/src/fixed_index.rs
> +++ b/pbs-datastore/src/fixed_index.rs
> @@ -530,3 +530,187 @@ impl FixedIndexWriter {
> Ok(())
> }
> }
> +
> +#[cfg(test)]
> +mod tests {
> + use super::*;
> + use crate::chunk_store::ChunkStore;
> + use pbs_api_types::DatastoreFSyncLevel;
> + use std::fs;
> + use std::sync::Arc;
> +
> + const CS: usize = 4096;
> +
> + #[test]
> + fn test_fixed_index_writer() {
this should be a setup method, and not annotated as #[test]
> + let mut testdir = fs::canonicalize(".").unwrap();
NACK, please use a tmpdir instead of writing to the current dir.. cargo
only provides a mechanism for this for integration tests, so for unit
tests you need to take care of it yourself, or move these to an
integration test..
we have discussed dropping the chunk store entirely in the past:
https://lore.proxmox.com/pbs-devel/1761822027.sm41v71zmk.astroid@yuna.none/
maybe we should do that and make unit testing here more
straight-forward? would still require a writable path though..
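for the unit-test route, a cleanup-on-drop guard would be enough; a
minimal sketch using only std (the `tempfile` crate does this more
robustly, but avoiding the extra dependency looks like this):

```rust
use std::path::{Path, PathBuf};

// Minimal per-test temporary directory that removes itself on drop, as an
// alternative to writing into the cwd.
struct TestDir(PathBuf);

impl TestDir {
    fn new(name: &str) -> std::io::Result<Self> {
        let path = std::env::temp_dir().join(format!("{name}-{}", std::process::id()));
        std::fs::create_dir_all(&path)?;
        Ok(Self(path))
    }

    fn path(&self) -> &Path {
        &self.0
    }
}

impl Drop for TestDir {
    fn drop(&mut self) {
        // best-effort cleanup; also runs when a test assertion panics
        // and unwinds
        let _ = std::fs::remove_dir_all(&self.0);
    }
}
```

each test could then create its own TestDir and hand `dir.path()` to
ChunkStore::open, and the directory is removed even on test failure.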
> + testdir.push(".testdir_fixed_index");
> +
> + if let Err(_e) = std::fs::remove_dir_all(&testdir) { /* ignore */ }
> +
> + // this is a lot faster than ChunkStore::create, which takes almost 3 seconds
> + let store = {
> + let mut chunks = PathBuf::from(&testdir);
> + chunks.push(".chunks");
> + fs::create_dir_all(chunks).unwrap();
> + Arc::new(ChunkStore::open("test", &testdir, DatastoreFSyncLevel::None).unwrap())
> + };
> +
> + test_empty(store.clone());
> + test_smaller_than_initial_capacity(store.clone());
> + test_grow_to_multiples_of_chunk_size(store.clone());
> + test_grow_to_fixed_size(store);
these should not be called here, but rather call the setup method
> +
> + std::fs::remove_dir_all(&testdir).unwrap();
and this should be handled with some kind of auto-cleanup-on-drop
mechanism, unless the whole thing is moved to an integration test, in
which case it will be somewhere under `target/` anyway..
> + }
> +
> + fn test_empty(store: Arc<ChunkStore>) {
> + let path = Path::new("test_empty");
> + let mut w = FixedIndexWriter::create(store, path, None, CS).unwrap();
> +
> + assert_eq!(0, w.index_length(), "returns length, not capacity");
should also check the capacity and size?
> + assert!(w.add_digest(0, &[1u8; 32]).is_err(), "out of bounds");
> + assert!(w.close().is_err(), "should refuse to create empty file");
> + }
> +
> + fn test_smaller_than_initial_capacity(store: Arc<ChunkStore>) {
the test_grow_to_multiples_of_chunk_size also tests sizes smaller than
the initial capacity.. this tests a very special edge case - an index
made up of a single, partial chunk..
> + let path = Path::new("test_smaller_than_initial_capacity");
> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
> +
> + let size = CS - 1;
> + let expected = test_data(size);
> + w.grow_to_size(size).unwrap();
> + expected[0].add_to(&mut w);
> +
> + w.close().unwrap();
> + drop(w);
> +
> + check_with_reader(&store.relative_path(path), size, &expected);
> + compare_to_known_size_writer(store, path, size, &expected);
> + }
> +
> + fn test_grow_to_multiples_of_chunk_size(store: Arc<ChunkStore>) {
> + let path = Path::new("test_grow_to_multiples_of_chunk_size");
> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
> +
> + let initial = FixedIndexWriter::INITIAL_CAPACITY;
> + let steps = [1, 2, initial, initial + 1, 5 * initial, 10 * initial + 1];
> + let expected = test_data(steps.last().unwrap() * CS);
> +
> + let mut begin = 0;
> + for chunk_count in steps {
> + let last = &expected[chunk_count - 1];
> + w.grow_to_size(last.end).unwrap();
> + assert_eq!(last.index + 1, w.index_length());
if index_length is changed to capacity, then this needs to be adapted,
unless index_length() continues to return the length based on size and
chunk size
> + assert!(w.add_digest(last.index + 1, &[1u8; 32]).is_err());
> +
> + for c in expected[begin..chunk_count].iter().rev() {
> + c.add_to(&mut w);
> + }
> + begin = chunk_count;
> + }
> + w.close().unwrap();
> + drop(w);
> +
> + let size = expected.len() * CS;
> + check_with_reader(&store.relative_path(path), size, &expected);
> + compare_to_known_size_writer(store, path, size, &expected);
> + }
> +
> + fn test_grow_to_fixed_size(store: Arc<ChunkStore>) {
s/fixed/misaligned/ ?
> + let path = Path::new("test_grow_to_fixed_size");
> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
> +
> + let size = (FixedIndexWriter::INITIAL_CAPACITY + 42) * CS - 1; // last is not full
> + let expected = test_data(size);
> +
> + w.grow_to_size(size).unwrap();
> + assert!(w.grow_to_size(size + 1).is_err(), "size must be fixed now");
> + assert_eq!(expected.len(), w.index_length());
> + assert!(w.add_digest(expected.len(), &[1u8; 32]).is_err());
> +
> + for c in expected.iter().rev() {
> + c.add_to(&mut w);
> + }
> +
> + w.close().unwrap();
> + drop(w);
> +
> + check_with_reader(&store.relative_path(path), size, &expected);
> + compare_to_known_size_writer(store, path, size, &expected);
> + }
> +
> + struct TestChunk {
> + digest: [u8; 32],
> + index: usize,
> + size: usize,
> + end: usize,
> + }
> +
> + impl TestChunk {
> + fn add_to(&self, w: &mut FixedIndexWriter) {
> + assert_eq!(
> + self.index,
> + w.check_chunk_alignment(self.end, self.size).unwrap()
> + );
> + w.add_digest(self.index, &self.digest).unwrap();
> + }
> + }
> +
> + fn test_data(size: usize) -> Vec<TestChunk> {
> + (0..size.div_ceil(CS))
> + .map(|index| {
> + let mut digest = [0u8; 32];
> + let i = &(index as u64).to_le_bytes();
> + for c in digest.chunks_mut(i.len()) {
> + c.copy_from_slice(i);
> + }
> + let size = if ((index + 1) * CS) <= size {
> + CS
> + } else {
> + size % CS
> + };
> + TestChunk {
> + digest,
> + index,
> + size,
> + end: index * CS + size,
> + }
> + })
> + .collect()
> + }
> +
> + fn check_with_reader(path: &Path, size: usize, chunks: &[TestChunk]) {
> + let reader = FixedIndexReader::open(path).unwrap();
> + assert_eq!(size as u64, reader.index_bytes());
> + assert_eq!(chunks.len(), reader.index_count());
> + for c in chunks {
> + assert_eq!(&c.digest, reader.index_digest(c.index).unwrap());
> + }
> + }
> +
> + fn compare_to_known_size_writer(
> + store: Arc<ChunkStore>,
> + name: &Path,
> + size: usize,
> + chunks: &[TestChunk],
> + ) {
> + let mut path = PathBuf::from(name);
> + path.set_extension("reference");
> + let mut w = FixedIndexWriter::create(store.clone(), &path, Some(size), CS).unwrap();
> + for c in chunks {
> + c.add_to(&mut w);
> + }
> + w.close().unwrap();
> + drop(w);
> +
> + let mut reference = fs::read(store.relative_path(&path)).unwrap();
> + let mut tested = fs::read(store.relative_path(name)).unwrap();
> +
> + // ignore uuid and ctime
> + reference[8..32].fill(0);
> + tested[8..32].fill(0);
> +
> + assert_eq!(reference, tested);
> + }
> +}
> --
> 2.47.3
>
>
>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size Robert Obkircher
@ 2026-01-14 13:13 ` Fabian Grünbichler
2026-01-16 12:42 ` Robert Obkircher
0 siblings, 1 reply; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-14 13:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On January 9, 2026 6:35 pm, Robert Obkircher wrote:
> Use mremap and ftruncate to support growable FixedIndexWriters. Grow
> exponentially from a small initial index size for efficiency. Truncate
> excessive capacity after encountering a non-full block or on close.
>
> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> ---
> pbs-datastore/src/datastore.rs | 2 +-
> pbs-datastore/src/fixed_index.rs | 120 +++++++++++++++++++++++++++++--
> 2 files changed, 117 insertions(+), 5 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 9c57aaac..af712726 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -591,7 +591,7 @@ impl DataStore {
> pub fn create_fixed_writer<P: AsRef<Path>>(
> &self,
> filename: P,
> - size: usize,
> + size: Option<usize>,
> chunk_size: usize,
> ) -> Result<FixedIndexWriter, Error> {
> let index = FixedIndexWriter::create(
> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
> index 6c3be2d4..8036a519 100644
> --- a/pbs-datastore/src/fixed_index.rs
> +++ b/pbs-datastore/src/fixed_index.rs
> @@ -219,9 +219,12 @@ pub struct FixedIndexWriter {
> chunk_size: usize,
> size: usize,
> index_length: usize,
> + index_capacity: usize,
chunk_size and size effectively already give you index_length (as in,
how many "slots" are allowed to be used), so index length and index
capacity can be a single field. if you want to rename it to signify that
it might now be bigger than ceil(size/chunk_size), that's fine, but I
don't think we need to duplicate it and complicate the code below..
> index: *mut u8,
> pub uuid: [u8; 16],
> pub ctime: i64,
> + growable_size: bool,
> + write_size_on_close: bool,
> }
>
> // `index` is mmap()ed which cannot be thread-local so should be sendable
> @@ -237,12 +240,18 @@ impl Drop for FixedIndexWriter {
> }
>
> impl FixedIndexWriter {
> + /// The initial capacity, if the total size is unknown.
> + ///
> + /// This capacity takes up the same amount of space as the header
> + /// and can refer to 128 Blocks * 4 MiB/Block = 512 MiB of content.
> + const INITIAL_CAPACITY: usize = 4096 / 32;
might make sense to make this more explicit - we can only map using page
granularity, the header is one page, so the first part needs to be a
page as well, and if we then always double it when resizing we stay
aligned to page boundaries.
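the arithmetic behind that, sketched with the constants used in the patch
(4 KiB header/page, 32-byte digest slots):

```rust
const PAGE_SIZE: usize = 4096;
const DIGEST_SIZE: usize = 32;

// one page holds 4096 / 32 = 128 index slots, so starting at one page and
// doubling keeps the mapped index area a whole number of pages
const INITIAL_CAPACITY: usize = PAGE_SIZE / DIGEST_SIZE;

// capacity (in slots) after growing by doubling until `needed` slots fit
fn grown_capacity(needed: usize) -> usize {
    let mut cap = INITIAL_CAPACITY;
    while cap < needed {
        cap *= 2;
    }
    cap
}
```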
> +
> #[allow(clippy::cast_ptr_alignment)]
> // Requires obtaining a shared chunk store lock beforehand
> pub fn create(
> store: Arc<ChunkStore>,
> path: &Path,
> - size: usize,
> + known_size: Option<usize>,
> chunk_size: usize,
> ) -> Result<Self, Error> {
> let full_path = store.relative_path(path);
> @@ -264,6 +273,7 @@ impl FixedIndexWriter {
> }
>
> let ctime = proxmox_time::epoch_i64();
> + let size = known_size.unwrap_or(0);
>
> let uuid = Uuid::generate();
>
> @@ -280,8 +290,12 @@ impl FixedIndexWriter {
>
> file.write_all(&buffer)?;
>
> - let index_length = size.div_ceil(chunk_size);
> - let index_size = index_length * 32;
> + let (index_length, index_capacity) = known_size
> + .map(|s| s.div_ceil(chunk_size))
> + .map(|len| (len, len))
> + .unwrap_or((0, Self::INITIAL_CAPACITY));
> +
> + let index_size = index_capacity * 32;
> nix::unistd::ftruncate(&file, (header_size + index_size) as i64)?;
>
> let data = unsafe {
> @@ -305,12 +319,87 @@ impl FixedIndexWriter {
> chunk_size,
> size,
> index_length,
> + index_capacity,
> index: data,
> ctime,
> uuid: *uuid.as_bytes(),
> + growable_size: known_size.is_none(),
> + write_size_on_close: known_size.is_none(),
> + })
> + }
> +
> + /// If this returns an error, the sizes may be out of sync,
> + /// which is especially bad if the capacity was reduced.
> + fn set_index_capacity(&mut self, new_capacity: usize) -> Result<(), Error> {
> + if new_capacity == self.index_capacity {
> + return Ok(());
> + }
> + let old_index_size = self.index_capacity * 32;
> + let new_index_size = new_capacity * 32;
> + let new_file_size = (size_of::<FixedIndexHeader>() + new_index_size) as i64;
> +
> + let index_addr = NonNull::new(self.index as *mut std::ffi::c_void).ok_or_else(|| {
> + format_err!("Can't resize FixedIndexWriter index because the index pointer is null.")
> + })?;
> +
> + nix::unistd::ftruncate(&self.file, new_file_size)?;
> +
> + let new_index = unsafe {
> + nix::sys::mman::mremap(
> + index_addr,
> + old_index_size,
> + new_index_size,
> + nix::sys::mman::MRemapFlags::MREMAP_MAYMOVE,
> + None,
> + )
> + }?;
> +
> + self.index = new_index.as_ptr().cast::<u8>();
> + self.index_capacity = new_capacity;
> + Ok(())
> + }
> +
> + /// Unmapping ensures future add and close operations fail.
> + fn set_index_capacity_or_unmap(&mut self, new_capacity: usize) -> Result<(), Error> {
> + self.set_index_capacity(new_capacity).map_err(|e| {
> + let unmap_result = self.unmap();
> + let message = format!(
> + "failed to resize index capacity from {} to {new_capacity} with backing file: {:?}",
> + self.index_capacity, self.tmp_filename
> + );
> + assert!(self.index.is_null(), "{message} {unmap_result:?}");
> + e.context(message)
> })
> }
>
> + /// Increase the content size to be at least `requested_size` and
> + /// ensure there is enough capacity.
> + ///
> + /// Only writers that were created without a known size can grow.
> + /// The size also becomes fixed as soon as it is no longer divisible
> + /// by the block size, to ensure that only the last block can be
> + /// smaller.
> + pub fn grow_to_size(&mut self, requested_size: usize) -> Result<(), Error> {
> + if self.size < requested_size {
> + if !self.growable_size {
> + bail!("refusing to resize from {} to {requested_size}", self.size);
> + }
> + let new_len = requested_size.div_ceil(self.chunk_size);
> + if new_len * self.chunk_size != requested_size {
> + // not a full chunk, so this must be the last one
> + self.growable_size = false;
> + self.set_index_capacity_or_unmap(new_len)?;
> + } else if new_len > self.index_capacity {
> + self.set_index_capacity_or_unmap(new_len.next_power_of_two())?;
> + };
> + assert!(new_len <= self.index_capacity);
> + self.index_length = new_len;
> + self.size = requested_size;
> + }
should we handle the else part here? i.e., error out if shrinking is
requested?
> + Ok(())
> + }
> +
> + /// The current length of the index. This may be increased with [`grow_to_size`].
> pub fn index_length(&self) -> usize {
> self.index_length
> }
> @@ -320,7 +409,7 @@ impl FixedIndexWriter {
> return Ok(());
> };
>
> - let index_size = self.index_length * 32;
> + let index_size = self.index_capacity * 32;
>
> if let Err(err) = unsafe { nix::sys::mman::munmap(index, index_size) } {
> bail!("unmap file {:?} failed - {}", self.tmp_filename, err);
> @@ -342,9 +431,24 @@ impl FixedIndexWriter {
>
> self.unmap()?;
>
> + if self.index_length == 0 {
> + bail!("refusing to close empty fidx file {:?}", self.tmp_filename);
> + } else if self.index_length < self.index_capacity {
> + let file_size = size_of::<FixedIndexHeader>() + index_size;
> + nix::unistd::ftruncate(&self.file, file_size as i64)?;
> + self.index_capacity = self.index_length;
> + }
> +
> let csum_offset = std::mem::offset_of!(FixedIndexHeader, index_csum);
> self.file.seek(SeekFrom::Start(csum_offset as u64))?;
> self.file.write_all(&index_csum)?;
> +
> + if self.write_size_on_close {
> + let size_offset = std::mem::offset_of!(FixedIndexHeader, size);
> + self.file.seek(SeekFrom::Start(size_offset as u64))?;
> + self.file.write_all(&(self.size as u64).to_le_bytes())?;
> + }
> +
> self.file.flush()?;
>
> if let Err(err) = std::fs::rename(&self.tmp_filename, &self.filename) {
> @@ -407,6 +511,14 @@ impl FixedIndexWriter {
> }
>
> pub fn clone_data_from(&mut self, reader: &FixedIndexReader) -> Result<(), Error> {
> + if self.growable_size {
> + bail!("reusing the index is only supported with known input size");
> + }
> +
> + if self.chunk_size != reader.chunk_size {
> + bail!("can't reuse file with different chunk size");
> + }
> +
> if self.index_length != reader.index_count() {
> bail!("clone_data_from failed - index sizes not equal");
> }
> --
> 2.47.3
>
>
>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size
2026-01-14 13:13 ` Fabian Grünbichler
@ 2026-01-16 12:42 ` Robert Obkircher
2026-01-16 14:28 ` Fabian Grünbichler
0 siblings, 1 reply; 14+ messages in thread
From: Robert Obkircher @ 2026-01-16 12:42 UTC (permalink / raw)
To: pbs-devel
On 1/14/26 14:13, Fabian Grünbichler wrote:
> On January 9, 2026 6:35 pm, Robert Obkircher wrote:
>> Use mremap and ftruncate to support growable FixedIndexWriters. Grow
>> exponentially from a small initial index size for efficiency. Truncate
>> excessive capacity after encountering a non-full block or on close.
>>
>> Signed-off-by: Robert Obkircher<r.obkircher@proxmox.com>
>> ---
>> pbs-datastore/src/datastore.rs | 2 +-
>> pbs-datastore/src/fixed_index.rs | 120 +++++++++++++++++++++++++++++--
>> 2 files changed, 117 insertions(+), 5 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 9c57aaac..af712726 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -591,7 +591,7 @@ impl DataStore {
>> pub fn create_fixed_writer<P: AsRef<Path>>(
>> &self,
>> filename: P,
>> - size: usize,
>> + size: Option<usize>,
>> chunk_size: usize,
>> ) -> Result<FixedIndexWriter, Error> {
>> let index = FixedIndexWriter::create(
>> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
>> index 6c3be2d4..8036a519 100644
>> --- a/pbs-datastore/src/fixed_index.rs
>> +++ b/pbs-datastore/src/fixed_index.rs
>> @@ -219,9 +219,12 @@ pub struct FixedIndexWriter {
>> chunk_size: usize,
>> size: usize,
>> index_length: usize,
>> + index_capacity: usize,
> chunk_size and size effectively already give you index_length (as in,
> how many "slots" are allowed to be used), so index length and index
> capacity can be a single field. if you want to rename it to signify that
> it might now be bigger than ceil(size/chunk_size), that's fine, but I
> don't think we need to duplicate it and complicate the code below..
By that logic (chunk_size, size, growable_size) also give me the capacity,
so both fields could be removed.
I kept them as separate fields because that is what the original code and
the Reader do, supposedly to avoid the division for the bounds check.
This also makes it much more explicit what is going on, and since the
index_length has to be computed anyway while resizing, I don't see how
storing that value for later is more complicated than recomputing it.
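for illustration, both values are indeed derivable from (size,
chunk_size, growable_size); a sketch with the growth policy from this
patch (capacity rounded up to the next power of two, 128-slot initial
capacity):

```rust
// index length in slots: one entry per (possibly partial) chunk
fn index_length(size: usize, chunk_size: usize) -> usize {
    size.div_ceil(chunk_size)
}

// capacity in slots: equals the length for fixed-size writers; growable
// writers round up to the next power of two, never below the initial 128
fn index_capacity(size: usize, chunk_size: usize, growable: bool) -> usize {
    let len = index_length(size, chunk_size);
    if growable {
        len.next_power_of_two().max(128)
    } else {
        len
    }
}
```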
>> index: *mut u8,
>> pub uuid: [u8; 16],
>> pub ctime: i64,
>> + growable_size: bool,
>> + write_size_on_close: bool,
>> }
>>
>> // `index` is mmap()ed which cannot be thread-local so should be sendable
>> @@ -237,12 +240,18 @@ impl Drop for FixedIndexWriter {
>> }
>>
>> impl FixedIndexWriter {
>> + /// The initial capacity, if the total size is unknown.
>> + ///
>> + /// This capacity takes up the same amount of space as the header
>> + /// and can refer to 128 Blocks * 4 MiB/Block = 512 MiB of content.
>> + const INITIAL_CAPACITY: usize = 4096 / 32;
> might make sense to make this more explicit - we can only map using page
> granularity, the header is one page, so the first part needs to be a
> page as well, and if we then always double it when resizing we stay
> aligned to page boundaries.
There is actually no strict requirement for the mapped length to be
a multiple of the page size, so I didn't want to guarantee too much.
However, the offset into the file must be aligned, so a page size
larger than the header would break anyway.
I think it would make sense to just map the file from the start, because
that would simplify my other fix [1] and the size/checksum update in
close().
[1]
https://lore.proxmox.com/pbs-devel/20260109175945.309913-3-r.obkircher@proxmox.com/
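a concrete way to see the offset constraint (sketch; 4096-byte header as
in the patch):

```rust
// mmap file offsets must be multiples of the page size, so mapping the
// index area at the end of the 4096-byte fidx header only works while the
// page size is <= 4096 (i.e. it breaks on 16K/64K-page kernels). Mapping
// the file from offset 0 sidesteps the constraint entirely.
fn can_map_index_at_header_end(page_size: usize) -> bool {
    const HEADER_SIZE: usize = 4096;
    HEADER_SIZE % page_size == 0
}
```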
>> +
>> #[allow(clippy::cast_ptr_alignment)]
>> // Requires obtaining a shared chunk store lock beforehand
>> pub fn create(
>> store: Arc<ChunkStore>,
>> path: &Path,
>> - size: usize,
>> + known_size: Option<usize>,
>> chunk_size: usize,
>> ) -> Result<Self, Error> {
>> let full_path = store.relative_path(path);
>> @@ -264,6 +273,7 @@ impl FixedIndexWriter {
>> }
>>
>> let ctime = proxmox_time::epoch_i64();
>> + let size = known_size.unwrap_or(0);
>>
>> let uuid = Uuid::generate();
>>
>> @@ -280,8 +290,12 @@ impl FixedIndexWriter {
>>
>> file.write_all(&buffer)?;
>>
>> - let index_length = size.div_ceil(chunk_size);
>> - let index_size = index_length * 32;
>> + let (index_length, index_capacity) = known_size
>> + .map(|s| s.div_ceil(chunk_size))
>> + .map(|len| (len, len))
>> + .unwrap_or((0, Self::INITIAL_CAPACITY));
>> +
>> + let index_size = index_capacity * 32;
>> nix::unistd::ftruncate(&file, (header_size + index_size) as i64)?;
>>
>> let data = unsafe {
>> @@ -305,12 +319,87 @@ impl FixedIndexWriter {
>> chunk_size,
>> size,
>> index_length,
>> + index_capacity,
>> index: data,
>> ctime,
>> uuid: *uuid.as_bytes(),
>> + growable_size: known_size.is_none(),
>> + write_size_on_close: known_size.is_none(),
>> + })
>> + }
>> +
>> + /// If this returns an error, the sizes may be out of sync,
>> + /// which is especially bad if the capacity was reduced.
>> + fn set_index_capacity(&mut self, new_capacity: usize) -> Result<(), Error> {
>> + if new_capacity == self.index_capacity {
>> + return Ok(());
>> + }
>> + let old_index_size = self.index_capacity * 32;
>> + let new_index_size = new_capacity * 32;
>> + let new_file_size = (size_of::<FixedIndexHeader>() + new_index_size) as i64;
>> +
>> + let index_addr = NonNull::new(self.index as *mut std::ffi::c_void).ok_or_else(|| {
>> + format_err!("Can't resize FixedIndexWriter index because the index pointer is null.")
>> + })?;
>> +
>> + nix::unistd::ftruncate(&self.file, new_file_size)?;
>> +
>> + let new_index = unsafe {
>> + nix::sys::mman::mremap(
>> + index_addr,
>> + old_index_size,
>> + new_index_size,
>> + nix::sys::mman::MRemapFlags::MREMAP_MAYMOVE,
>> + None,
>> + )
>> + }?;
>> +
>> + self.index = new_index.as_ptr().cast::<u8>();
>> + self.index_capacity = new_capacity;
>> + Ok(())
>> + }
>> +
>> + /// Unmapping ensures future add and close operations fail.
>> + fn set_index_capacity_or_unmap(&mut self, new_capacity: usize) -> Result<(), Error> {
>> + self.set_index_capacity(new_capacity).map_err(|e| {
>> + let unmap_result = self.unmap();
>> + let message = format!(
>> + "failed to resize index capacity from {} to {new_capacity} with backing file: {:?}",
>> + self.index_capacity, self.tmp_filename
>> + );
>> + assert!(self.index.is_null(), "{message} {unmap_result:?}");
>> + e.context(message)
>> })
>> }
>>
>> + /// Increase the content size to be at least `requested_size` and
>> + /// ensure there is enough capacity.
>> + ///
>> + /// Only writers that were created without a known size can grow.
>> + /// The size also becomes fixed as soon as it is no longer divisible
>> + /// by the block size, to ensure that only the last block can be
>> + /// smaller.
>> + pub fn grow_to_size(&mut self, requested_size: usize) -> Result<(), Error> {
>> + if self.size < requested_size {
>> + if !self.growable_size {
>> + bail!("refusing to resize from {} to {requested_size}", self.size);
>> + }
>> + let new_len = requested_size.div_ceil(self.chunk_size);
>> + if new_len * self.chunk_size != requested_size {
>> + // not a full chunk, so this must be the last one
>> + self.growable_size = false;
>> + self.set_index_capacity_or_unmap(new_len)?;
>> + } else if new_len > self.index_capacity {
>> + self.set_index_capacity_or_unmap(new_len.next_power_of_two())?;
>> + };
>> + assert!(new_len <= self.index_capacity);
>> + self.index_length = new_len;
>> + self.size = requested_size;
>> + }
> should we handle the else part here? i.e., error out if shrinking is
> requested?
The else part is expected to succeed. I considered renaming the method to
something like grow_if_smaller or ensure_capacity to make this clear, but
I'm not quite happy with those names either.
>> + Ok(())
>> + }
>> +
>> + /// The current length of the index. This may be increased with [`grow_to_size`].
>> pub fn index_length(&self) -> usize {
>> self.index_length
>> }
>> @@ -320,7 +409,7 @@ impl FixedIndexWriter {
>> return Ok(());
>> };
>>
>> - let index_size = self.index_length * 32;
>> + let index_size = self.index_capacity * 32;
>>
>> if let Err(err) = unsafe { nix::sys::mman::munmap(index, index_size) } {
>> bail!("unmap file {:?} failed - {}", self.tmp_filename, err);
>> @@ -342,9 +431,24 @@ impl FixedIndexWriter {
>>
>> self.unmap()?;
>>
>> + if self.index_length == 0 {
>> + bail!("refusing to close empty fidx file {:?}", self.tmp_filename);
>> + } else if self.index_length < self.index_capacity {
>> + let file_size = size_of::<FixedIndexHeader>() + index_size;
>> + nix::unistd::ftruncate(&self.file, file_size as i64)?;
>> + self.index_capacity = self.index_length;
>> + }
>> +
>> let csum_offset = std::mem::offset_of!(FixedIndexHeader, index_csum);
>> self.file.seek(SeekFrom::Start(csum_offset as u64))?;
>> self.file.write_all(&index_csum)?;
>> +
>> + if self.write_size_on_close {
>> + let size_offset = std::mem::offset_of!(FixedIndexHeader, size);
>> + self.file.seek(SeekFrom::Start(size_offset as u64))?;
>> + self.file.write_all(&(self.size as u64).to_le_bytes())?;
>> + }
>> +
>> self.file.flush()?;
>>
>> if let Err(err) = std::fs::rename(&self.tmp_filename, &self.filename) {
>> @@ -407,6 +511,14 @@ impl FixedIndexWriter {
>> }
>>
>> pub fn clone_data_from(&mut self, reader: &FixedIndexReader) -> Result<(), Error> {
>> + if self.growable_size {
>> + bail!("reusing the index is only supported with known input size");
>> + }
>> +
>> + if self.chunk_size != reader.chunk_size {
>> + bail!("can't reuse file with different chunk size");
>> + }
>> +
>> if self.index_length != reader.index_count() {
>> bail!("clone_data_from failed - index sizes not equal");
>> }
>> --
>> 2.47.3
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter
2026-01-14 13:13 ` Fabian Grünbichler
@ 2026-01-16 12:48 ` Robert Obkircher
2026-01-16 14:30 ` Fabian Grünbichler
0 siblings, 1 reply; 14+ messages in thread
From: Robert Obkircher @ 2026-01-16 12:48 UTC (permalink / raw)
To: pbs-devel
On 1/14/26 14:13, Fabian Grünbichler wrote:
> On January 9, 2026 6:35 pm, Robert Obkircher wrote:
>> Create a dummy chunk store and write fidx files with fixed and
>> dynamically sized writers. Compare the resulting binary files directly
>> (ignoring uuid and ctime) and also read them back using the reader.
>>
>> The chunk hashes are made up and don't actually exist in the chunk
>> store.
>>
>> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
>> ---
>> pbs-datastore/src/fixed_index.rs | 184 +++++++++++++++++++++++++++++++
>> 1 file changed, 184 insertions(+)
>>
>> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
>> index 8036a519..a20edc94 100644
>> --- a/pbs-datastore/src/fixed_index.rs
>> +++ b/pbs-datastore/src/fixed_index.rs
>> @@ -530,3 +530,187 @@ impl FixedIndexWriter {
>> Ok(())
>> }
>> }
>> +
>> +#[cfg(test)]
>> +mod tests {
>> + use super::*;
>> + use crate::chunk_store::ChunkStore;
>> + use pbs_api_types::DatastoreFSyncLevel;
>> + use std::fs;
>> + use std::sync::Arc;
>> +
>> + const CS: usize = 4096;
>> +
>> + #[test]
>> + fn test_fixed_index_writer() {
> this should be a setup method, and not annotated as #[test]
>
>> + let mut testdir = fs::canonicalize(".").unwrap();
> NACK, please use a tmpdir instead of writing to the current dir.. cargo
> only provides a mechanism for this for integration tests, so for unit
> tests you need to take care of it yourself, or move these to an
> integration test..
>
> we have discussed dropping the chunk store entirely in the past:
>
> https://lore.proxmox.com/pbs-devel/1761822027.sm41v71zmk.astroid@yuna.none/
>
> maybe we should do that and make unit testing here more
> straight-forward? would still require a writable path though..
Ok, I'll also send a separate patch for `chunk_store::test_chunk_store1`.
>> + testdir.push(".testdir_fixed_index");
>> +
>> + if let Err(_e) = std::fs::remove_dir_all(&testdir) { /* ignore */ }
>> +
>> + // this is a lot faster than ChunkStore::create, which takes almost 3 seconds
>> + let store = {
>> + let mut chunks = PathBuf::from(&testdir);
>> + chunks.push(".chunks");
>> + fs::create_dir_all(chunks).unwrap();
>> + Arc::new(ChunkStore::open("test", &testdir, DatastoreFSyncLevel::None).unwrap())
>> + };
>> +
>> + test_empty(store.clone());
>> + test_smaller_than_initial_capacity(store.clone());
>> + test_grow_to_multiples_of_chunk_size(store.clone());
>> + test_grow_to_fixed_size(store);
> these should not be called here, but rather call the setup method
The only reason I didn't do that was because burning 3 seconds per test to
create 65k directories for each chunk store seemed a bit excessive. But if
my workaround is ok, or if we remove it entirely, then separate tests are
definitely better.
>> +
>> + std::fs::remove_dir_all(&testdir).unwrap();
> and this should be handled with some kind of auto-cleanup-on-drop
> mechanism, unless the whole thing is moved to an integration test, in
> which case it will be somewhere under `target/` anyway..
The idea was to preserve files for debugging in case of failure, but I guess
in that case one can just patch the code and re-run the test.
>> + }
>> +
>> + fn test_empty(store: Arc<ChunkStore>) {
>> + let path = Path::new("test_empty");
>> + let mut w = FixedIndexWriter::create(store, path, None, CS).unwrap();
>> +
>> + assert_eq!(0, w.index_length(), "returns length, not capacity");
> should also check the capacity and size?
>
>> + assert!(w.add_digest(0, &[1u8; 32]).is_err(), "out of bounds");
>> + assert!(w.close().is_err(), "should refuse to create empty file");
>> + }
>> +
>> + fn test_smaller_than_initial_capacity(store: Arc<ChunkStore>) {
> the test_grow_to_multiples_of_chunk_size also tests sizes smaller than
> the initial capacity.. this tests a very special edge case - an index
> made up of a single, partial chunk..
>
>> + let path = Path::new("test_smaller_than_initial_capacity");
>> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
>> +
>> + let size = CS - 1;
>> + let expected = test_data(size);
>> + w.grow_to_size(size).unwrap();
>> + expected[0].add_to(&mut w);
>> +
>> + w.close().unwrap();
>> + drop(w);
>> +
>> + check_with_reader(&store.relative_path(path), size, &expected);
>> + compare_to_known_size_writer(store, path, size, &expected);
>> + }
>> +
>> + fn test_grow_to_multiples_of_chunk_size(store: Arc<ChunkStore>) {
>> + let path = Path::new("test_grow_to_multiples_of_chunk_size");
>> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
>> +
>> + let initial = FixedIndexWriter::INITIAL_CAPACITY;
>> + let steps = [1, 2, initial, initial + 1, 5 * initial, 10 * initial + 1];
>> + let expected = test_data(steps.last().unwrap() * CS);
>> +
>> + let mut begin = 0;
>> + for chunk_count in steps {
>> + let last = &expected[chunk_count - 1];
>> + w.grow_to_size(last.end).unwrap();
>> + assert_eq!(last.index + 1, w.index_length());
> if index_length is changed to capacity, then this needs to be adapted,
> unless index_length() continues to return the length based on size and
> chunk size
The function index_length is public and is also used by the backup
environment.
It must obviously return the length and not the internal capacity.
>
>> + assert!(w.add_digest(last.index + 1, &[1u8; 32]).is_err());
>> +
>> + for c in expected[begin..chunk_count].iter().rev() {
>> + c.add_to(&mut w);
>> + }
>> + begin = chunk_count;
>> + }
>> + w.close().unwrap();
>> + drop(w);
>> +
>> + let size = expected.len() * CS;
>> + check_with_reader(&store.relative_path(path), size, &expected);
>> + compare_to_known_size_writer(store, path, size, &expected);
>> + }
>> +
>> + fn test_grow_to_fixed_size(store: Arc<ChunkStore>) {
> s/fixed/misaligned/ ?
>
>> + let path = Path::new("test_grow_to_fixed_size");
>> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
>> +
>> + let size = (FixedIndexWriter::INITIAL_CAPACITY + 42) * CS - 1; // last is not full
>> + let expected = test_data(size);
>> +
>> + w.grow_to_size(size).unwrap();
>> + assert!(w.grow_to_size(size + 1).is_err(), "size must be fixed now");
>> + assert_eq!(expected.len(), w.index_length());
>> + assert!(w.add_digest(expected.len(), &[1u8; 32]).is_err());
>> +
>> + for c in expected.iter().rev() {
>> + c.add_to(&mut w);
>> + }
>> +
>> + w.close().unwrap();
>> + drop(w);
>> +
>> + check_with_reader(&store.relative_path(path), size, &expected);
>> + compare_to_known_size_writer(store, path, size, &expected);
>> + }
>> +
>> + struct TestChunk {
>> + digest: [u8; 32],
>> + index: usize,
>> + size: usize,
>> + end: usize,
>> + }
>> +
>> + impl TestChunk {
>> + fn add_to(&self, w: &mut FixedIndexWriter) {
>> + assert_eq!(
>> + self.index,
>> + w.check_chunk_alignment(self.end, self.size).unwrap()
>> + );
>> + w.add_digest(self.index, &self.digest).unwrap();
>> + }
>> + }
>> +
>> + fn test_data(size: usize) -> Vec<TestChunk> {
>> + (0..size.div_ceil(CS))
>> + .map(|index| {
>> + let mut digest = [0u8; 32];
>> + let i = &(index as u64).to_le_bytes();
>> + for c in digest.chunks_mut(i.len()) {
>> + c.copy_from_slice(i);
>> + }
>> + let size = if ((index + 1) * CS) <= size {
>> + CS
>> + } else {
>> + size % CS
>> + };
>> + TestChunk {
>> + digest,
>> + index,
>> + size,
>> + end: index * CS + size,
>> + }
>> + })
>> + .collect()
>> + }
>> +
>> + fn check_with_reader(path: &Path, size: usize, chunks: &[TestChunk]) {
>> + let reader = FixedIndexReader::open(path).unwrap();
>> + assert_eq!(size as u64, reader.index_bytes());
>> + assert_eq!(chunks.len(), reader.index_count());
>> + for c in chunks {
>> + assert_eq!(&c.digest, reader.index_digest(c.index).unwrap());
>> + }
>> + }
>> +
>> + fn compare_to_known_size_writer(
>> + store: Arc<ChunkStore>,
>> + name: &Path,
>> + size: usize,
>> + chunks: &[TestChunk],
>> + ) {
>> + let mut path = PathBuf::from(name);
>> + path.set_extension("reference");
>> + let mut w = FixedIndexWriter::create(store.clone(), &path, Some(size), CS).unwrap();
>> + for c in chunks {
>> + c.add_to(&mut w);
>> + }
>> + w.close().unwrap();
>> + drop(w);
>> +
>> + let mut reference = fs::read(store.relative_path(&path)).unwrap();
>> + let mut tested = fs::read(store.relative_path(name)).unwrap();
>> +
>> + // ignore uuid and ctime
>> + reference[8..32].fill(0);
>> + tested[8..32].fill(0);
>> +
>> + assert_eq!(reference, tested);
>> + }
>> +}
>> --
>> 2.47.3
>>
* Re: [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size
2026-01-16 12:42 ` Robert Obkircher
@ 2026-01-16 14:28 ` Fabian Grünbichler
0 siblings, 0 replies; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-16 14:28 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
Quoting Robert Obkircher (2026-01-16 13:42:36)
>
> On 1/14/26 14:13, Fabian Grünbichler wrote:
> > On January 9, 2026 6:35 pm, Robert Obkircher wrote:
> >> Use mremap and ftruncate to support growable FixedIndexWriters. Grow
> >> exponentially from a small initial index size for efficiency. Truncate
> >> excessive capacity after encountering a non-full block or on close.
> >>
> >> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> >> ---
> >> pbs-datastore/src/datastore.rs | 2 +-
> >> pbs-datastore/src/fixed_index.rs | 120 +++++++++++++++++++++++++++++--
> >> 2 files changed, 117 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> >> index 9c57aaac..af712726 100644
> >> --- a/pbs-datastore/src/datastore.rs
> >> +++ b/pbs-datastore/src/datastore.rs
> >> @@ -591,7 +591,7 @@ impl DataStore {
> >> pub fn create_fixed_writer<P: AsRef<Path>>(
> >> &self,
> >> filename: P,
> >> - size: usize,
> >> + size: Option<usize>,
> >> chunk_size: usize,
> >> ) -> Result<FixedIndexWriter, Error> {
> >> let index = FixedIndexWriter::create(
> >> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
> >> index 6c3be2d4..8036a519 100644
> >> --- a/pbs-datastore/src/fixed_index.rs
> >> +++ b/pbs-datastore/src/fixed_index.rs
> >> @@ -219,9 +219,12 @@ pub struct FixedIndexWriter {
> >> chunk_size: usize,
> >> size: usize,
> >> index_length: usize,
> >> + index_capacity: usize,
> > chunk_size and size effectively already give you index_length (as in,
> > how many "slots" are allowed to be used), so index length and index
> > capacity can be a single field. if you want to rename it to signify that
> > it might now be bigger than ceil(size/chunk_size), that's fine, but I
> > don't think we need to duplicate it and complicate the code below..
> By that logic (chunk_size, size, growable_size) also give me the capacity,
> so both fields could be removed.
that's true :)
> I kept them as separate fields because that is what the original code and
> the Reader do, supposedly to avoid the division for the bounds check.
we effectively do that bounds check twice though - once for size (in
check_chunk_alignment) and again for index_length (which is derived 1:1 from
size and chunk_size) in add_digest.. and the backup env calls
check_chunk_alignment right before calling add_digest. when copying from a
reader directly to the writer that does not happen, but in that case we already
checked that the index and chunk sizes match before we start adding the first
digest, so the check in add_digest is redundant there as well..
I now wonder if we could combine check_chunk_alignment and grow_to_size, since
they handle quite similar aspects and both only have a single call site outside
of tests, right next to each other.. or (see below)
>
> This also makes it much more explicit what is going on and since the
> index_length has to be computed anyway while resizing, I don't see how
> storing that value for later is more complicated than recomputing it.
>
> >> index: *mut u8,
> >> pub uuid: [u8; 16],
> >> pub ctime: i64,
> >> + growable_size: bool,
> >> + write_size_on_close: bool,
> >> }
> >>
> >> // `index` is mmap()ed which cannot be thread-local so should be sendable
> >> @@ -237,12 +240,18 @@ impl Drop for FixedIndexWriter {
> >> }
> >>
> >> impl FixedIndexWriter {
> >> + /// The initial capacity, if the total size is unknown.
> >> + ///
> >> + /// This capacity takes up the same amount of space as the header
> >> + /// and can refer to 128 Blocks * 4 MiB/Block = 512 MiB of content.
> >> + const INITIAL_CAPACITY: usize = 4096 / 32;
> > might make sense to make this more explicit - we can only map using page
> > granularity, the header is one page, so the first part needs to be a
> > page as well, and if then always double it when resizing we stay aligned
> > to page boundaries.
>
> There is actually no strict requirement for the mapped length to be
> a multiple of the page size, so I didn't want to guarantee too much.
> However, the offset into the file must be aligned, so a page size
> larger than the header would break anyway.
you are right, sorry!
> I think it would make sense to just map the file from the start, because
> that would simplify my other fix [1] and the size/checksum update in
> close().
given that we already had a report about PBS not working on 16k page kernels,
and there are other platforms with 64k pages, it might be nice to fix this up
if it doesn't cause too much annoyance.
IIRC besides the index code, only our shared memory helper (for config caching
and rate limiting) uses mmap. in those cases the mapped data is padded to a
single 4k page atm, and when mapping we round up to a page as well, although
the helper for that has a FIXME calling for it to actually query the page size
instead of assuming it's 4k ;)
so yeah, those all sound like they could be fixed up without too much effort
and without much additional cost (for the index access, it boils down to one
extra pointer addition for the header-to-payload-offset?).
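For illustration, the page-alignment arithmetic involved in fixing those call
sites can be sketched standalone (the function name is hypothetical, and in the
real helper the page size would be queried via sysconf(_SC_PAGESIZE) rather
than passed in; 4096 is only the common case):

```rust
// Sketch: round a mapping length up to the next page boundary. Page sizes
// are powers of two, so the round-up can be done with a mask.
fn round_up_to_page(len: usize, page_size: usize) -> usize {
    assert!(page_size.is_power_of_two(), "page sizes are powers of two");
    (len + page_size - 1) & !(page_size - 1)
}

fn main() {
    // a small header padded to one page, on 4k and 16k page kernels
    assert_eq!(round_up_to_page(56, 4096), 4096);
    assert_eq!(round_up_to_page(4096, 4096), 4096);
    assert_eq!(round_up_to_page(4097, 4096), 8192);
    assert_eq!(round_up_to_page(56, 16384), 16384);
    println!("ok");
}
```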
>
> [1]
> https://lore.proxmox.com/pbs-devel/20260109175945.309913-3-r.obkircher@proxmox.com/
>
> >> +
> >> #[allow(clippy::cast_ptr_alignment)]
> >> // Requires obtaining a shared chunk store lock beforehand
> >> pub fn create(
> >> store: Arc<ChunkStore>,
> >> path: &Path,
> >> - size: usize,
> >> + known_size: Option<usize>,
> >> chunk_size: usize,
> >> ) -> Result<Self, Error> {
> >> let full_path = store.relative_path(path);
> >> @@ -264,6 +273,7 @@ impl FixedIndexWriter {
> >> }
> >>
> >> let ctime = proxmox_time::epoch_i64();
> >> + let size = known_size.unwrap_or(0);
> >>
> >> let uuid = Uuid::generate();
> >>
> >> @@ -280,8 +290,12 @@ impl FixedIndexWriter {
> >>
> >> file.write_all(&buffer)?;
> >>
> >> - let index_length = size.div_ceil(chunk_size);
> >> - let index_size = index_length * 32;
> >> + let (index_length, index_capacity) = known_size
> >> + .map(|s| s.div_ceil(chunk_size))
> >> + .map(|len| (len, len))
> >> + .unwrap_or((0, Self::INITIAL_CAPACITY));
> >> +
> >> + let index_size = index_capacity * 32;
> >> nix::unistd::ftruncate(&file, (header_size + index_size) as i64)?;
> >>
> >> let data = unsafe {
> >> @@ -305,12 +319,87 @@ impl FixedIndexWriter {
> >> chunk_size,
> >> size,
> >> index_length,
> >> + index_capacity,
> >> index: data,
> >> ctime,
> >> uuid: *uuid.as_bytes(),
> >> + growable_size: known_size.is_none(),
> >> + write_size_on_close: known_size.is_none(),
> >> + })
> >> + }
> >> +
> >> + /// If this returns an error, the sizes may be out of sync,
> >> + /// which is especially bad if the capacity was reduced.
> >> + fn set_index_capacity(&mut self, new_capacity: usize) -> Result<(), Error> {
> >> + if new_capacity == self.index_capacity {
> >> + return Ok(());
> >> + }
> >> + let old_index_size = self.index_capacity * 32;
> >> + let new_index_size = new_capacity * 32;
> >> + let new_file_size = (size_of::<FixedIndexHeader>() + new_index_size) as i64;
> >> +
> >> + let index_addr = NonNull::new(self.index as *mut std::ffi::c_void).ok_or_else(|| {
> >> + format_err!("Can't resize FixedIndexWriter index because the index pointer is null.")
> >> + })?;
> >> +
> >> + nix::unistd::ftruncate(&self.file, new_file_size)?;
> >> +
> >> + let new_index = unsafe {
> >> + nix::sys::mman::mremap(
> >> + index_addr,
> >> + old_index_size,
> >> + new_index_size,
> >> + nix::sys::mman::MRemapFlags::MREMAP_MAYMOVE,
> >> + None,
> >> + )
> >> + }?;
> >> +
> >> + self.index = new_index.as_ptr().cast::<u8>();
> >> + self.index_capacity = new_capacity;
> >> + Ok(())
> >> + }
> >> +
> >> + /// Unmapping ensures future add and close operations fail.
> >> + fn set_index_capacity_or_unmap(&mut self, new_capacity: usize) -> Result<(), Error> {
> >> + self.set_index_capacity(new_capacity).map_err(|e| {
> >> + let unmap_result = self.unmap();
> >> + let message = format!(
> >> + "failed to resize index capacity from {} to {new_capacity} with backing file: {:?}",
> >> + self.index_capacity, self.tmp_filename
> >> + );
> >> + assert!(self.index.is_null(), "{message} {unmap_result:?}");
> >> + e.context(message)
> >> })
> >> }
> >>
> >> + /// Increase the content size to be at least `requested_size` and
> >> + /// ensure there is enough capacity.
> >> + ///
> >> + /// Only writers that were created without a known size can grow.
> >> + /// The size also becomes fixed as soon as it is no longer divisible
> >> + /// by the block size, to ensure that only the last block can be
> >> + /// smaller.
> >> + pub fn grow_to_size(&mut self, requested_size: usize) -> Result<(), Error> {
> >> + if self.size < requested_size {
> >> + if !self.growable_size {
> >> + bail!("refusing to resize from {} to {requested_size}", self.size);
> >> + }
> >> + let new_len = requested_size.div_ceil(self.chunk_size);
> >> + if new_len * self.chunk_size != requested_size {
> >> + // not a full chunk, so this must be the last one
> >> + self.growable_size = false;
> >> + self.set_index_capacity_or_unmap(new_len)?;
> >> + } else if new_len > self.index_capacity {
> >> + self.set_index_capacity_or_unmap(new_len.next_power_of_two())?;
> >> + };
> >> + assert!(new_len <= self.index_capacity);
> >> + self.index_length = new_len;
> >> + self.size = requested_size;
> >> + }
> > should we handle the else part here? i.e., error out if shrinking is
> > requested?
>
> The else part is expected to succeed. I considered renaming the method to
> something like grow_if_smaller or ensure_capacity to make this clear, but
> I'm not quite happy with those names either.
we could
- call it resize, move the size check to the (only) call site, and allow
shrinking
- call it reserve, make it only handle capacity, and move the size update to a
(renamed) check_chunk_alignment
- combine grow_to_size and check_chunk_alignment and find a fitting name for
what it does :-P
- keep it as is, but find a better name
* Re: [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter
2026-01-16 12:48 ` Robert Obkircher
@ 2026-01-16 14:30 ` Fabian Grünbichler
0 siblings, 0 replies; 14+ messages in thread
From: Fabian Grünbichler @ 2026-01-16 14:30 UTC (permalink / raw)
To: Robert Obkircher, pbs-devel
Quoting Robert Obkircher (2026-01-16 13:48:04)
>
> On 1/14/26 14:13, Fabian Grünbichler wrote:
>
> On January 9, 2026 6:35 pm, Robert Obkircher wrote:
>
> Create a dummy chunk store and write fidx files with fixed and
> dynamically sized writers. Compare the resulting binary files directly
> (ignoring uuid and ctime) and also read them back using the reader.
>
> The chunk hashes are made up and don't actually exist in the chunk
> store.
>
> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> ---
> pbs-datastore/src/fixed_index.rs | 184 +++++++++++++++++++++++++++++++
> 1 file changed, 184 insertions(+)
>
> diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
> index 8036a519..a20edc94 100644
> --- a/pbs-datastore/src/fixed_index.rs
> +++ b/pbs-datastore/src/fixed_index.rs
> @@ -530,3 +530,187 @@ impl FixedIndexWriter {
> Ok(())
> }
> }
> +
> +#[cfg(test)]
> +mod tests {
> + use super::*;
> + use crate::chunk_store::ChunkStore;
> + use pbs_api_types::DatastoreFSyncLevel;
> + use std::fs;
> + use std::sync::Arc;
> +
> + const CS: usize = 4096;
> +
> + #[test]
> + fn test_fixed_index_writer() {
>
> this should be a setup method, and not annotated as #[test]
>
>
> + let mut testdir = fs::canonicalize(".").unwrap();
>
> NACK, please use a tmpdir instead of writing to the current dir.. cargo
> only provides a mechanism for this for integration tests, so for unit
> tests you need to take care of it yourself, or move these to an
> integration test..
>
> we have discussed dropping the chunk store entirely in the past:
>
> https://lore.proxmox.com/pbs-devel/1761822027.sm41v71zmk.astroid@yuna.none/
>
> maybe we should do that and make unit testing here more
> straight-forward? would still require a writable path though..
>
> Ok, I'll also send a separate patch for `chunk_store::test_chunk_store1`.
>
>
> + testdir.push(".testdir_fixed_index");
> +
> + if let Err(_e) = std::fs::remove_dir_all(&testdir) { /* ignore */ }
> +
> + // this is a lot faster than ChunkStore::create, which takes almost 3 seconds
> + let store = {
> + let mut chunks = PathBuf::from(&testdir);
> + chunks.push(".chunks");
> + fs::create_dir_all(chunks).unwrap();
> + Arc::new(ChunkStore::open("test", &testdir, DatastoreFSyncLevel::None).unwrap())
> + };
> +
> + test_empty(store.clone());
> + test_smaller_than_initial_capacity(store.clone());
> + test_grow_to_multiples_of_chunk_size(store.clone());
> + test_grow_to_fixed_size(store);
>
> these should not be called here, but rather call the setup method
>
> The only reason I didn't do that was because burning 3 seconds per test to
> create 65k directories for each chunk store seemed a bit excessive. But if
> my workaround is ok, or if we remove it entirely, then separate tests are
> definitely better.
yes, I think the solution for that is getting rid of the ChunkStore reference
there, and just pass in the index path (base).
>
> +
> + std::fs::remove_dir_all(&testdir).unwrap();
>
> and this should be handled with some kind of auto-cleanup-on-drop
> mechanism, unless the whole thing is moved to an integration test, in
> which case it will be somewhere under `target/` anyway..
>
> The idea was to preserve files for debugging in case of failure, but I guess
> in that case one can just patch the code and re-run the test.
leaving the files around in case of failure is less of an issue if they are in
some tmpdir somewhere, but it's bad to pollute the source dir/CWD (and in some
build environments, that dir might not even be writable).
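A minimal sketch of that auto-cleanup-on-drop idea, using only std (the
`tempfile` crate provides the same pattern ready-made; the type and directory
name here are illustrative, not part of the patch):

```rust
use std::path::PathBuf;

// RAII guard: the directory lives under the system tmpdir and is removed
// when the guard goes out of scope, including during a panicking test.
struct TestDir(PathBuf);

impl TestDir {
    fn new(name: &str) -> std::io::Result<Self> {
        let mut path = std::env::temp_dir();
        path.push(format!("{}-{}", name, std::process::id()));
        std::fs::create_dir_all(&path)?;
        Ok(TestDir(path))
    }
}

impl Drop for TestDir {
    fn drop(&mut self) {
        // best-effort cleanup; ignore errors so unwinding is not aborted
        let _ = std::fs::remove_dir_all(&self.0);
    }
}

fn main() -> std::io::Result<()> {
    let kept;
    {
        let dir = TestDir::new("fixed-index-test")?;
        kept = dir.0.clone();
        std::fs::write(dir.0.join("data.fidx"), b"dummy")?;
        assert!(kept.exists());
    } // dropped here: directory and contents are removed
    assert!(!kept.exists());
    println!("cleaned up");
    Ok(())
}
```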
>
> + }
> +
> + fn test_empty(store: Arc<ChunkStore>) {
> + let path = Path::new("test_empty");
> + let mut w = FixedIndexWriter::create(store, path, None, CS).unwrap();
> +
> + assert_eq!(0, w.index_length(), "returns length, not capacity");
>
> should also check the capacity and size?
>
>
> + assert!(w.add_digest(0, &[1u8; 32]).is_err(), "out of bounds");
> + assert!(w.close().is_err(), "should refuse to create empty file");
> + }
> +
> + fn test_smaller_than_initial_capacity(store: Arc<ChunkStore>) {
>
> the test_grow_to_multiples_of_chunk_size also tests sizes smaller than
> the initial capacity.. this tests a very special edge case - an index
> made up of a single, partial chunk..
>
>
> + let path = Path::new("test_smaller_than_initial_capacity");
> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
> +
> + let size = CS - 1;
> + let expected = test_data(size);
> + w.grow_to_size(size).unwrap();
> + expected[0].add_to(&mut w);
> +
> + w.close().unwrap();
> + drop(w);
> +
> + check_with_reader(&store.relative_path(path), size, &expected);
> + compare_to_known_size_writer(store, path, size, &expected);
> + }
> +
> + fn test_grow_to_multiples_of_chunk_size(store: Arc<ChunkStore>) {
> + let path = Path::new("test_grow_to_multiples_of_chunk_size");
> + let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
> +
> + let initial = FixedIndexWriter::INITIAL_CAPACITY;
> + let steps = [1, 2, initial, initial + 1, 5 * initial, 10 * initial + 1];
> + let expected = test_data(steps.last().unwrap() * CS);
> +
> + let mut begin = 0;
> + for chunk_count in steps {
> + let last = &expected[chunk_count - 1];
> + w.grow_to_size(last.end).unwrap();
> + assert_eq!(last.index + 1, w.index_length());
>
> if index_length is changed to capacity, then this needs to be adapted,
> unless index_length() continues to return the length based on size and
> chunk size
>
> The function index_length is public and is also used by the backup environment.
> It must obviously return the length and not the internal capacity.
see comments on the first patch :)
end of thread, other threads:[~2026-01-16 14:31 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-09 17:35 [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 1/5] fix #3847: datastore: support writing fidx files of unknown size Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-16 12:42 ` Robert Obkircher
2026-01-16 14:28 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-16 12:48 ` Robert Obkircher
2026-01-16 14:30 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 3/5] fix #3847: api: backup: make fixed index file size optional Robert Obkircher
2026-01-14 13:13 ` Fabian Grünbichler
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 4/5] fix #3847: client: support fifo pipe inputs for images Robert Obkircher
2026-01-09 17:35 ` [pbs-devel] [PATCH v3 proxmox-backup 5/5] fix #3847: client: treat minus sign as stdin Robert Obkircher
2026-01-14 13:13 ` [pbs-devel] [PATCH v3 proxmox-backup 0/5] fix: #3847 pipe from STDIN to proxmox-backup-client Fabian Grünbichler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox