From: Robert Obkircher
To: pbs-devel@lists.proxmox.com
Date: Fri, 16 Jan 2026 13:48:04 +0100
Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 2/5] fix #3847: datastore: test FixedIndexWriter
References: <20260109173548.301653-1-r.obkircher@proxmox.com> <20260109173548.301653-3-r.obkircher@proxmox.com> <1768390946.z2jyt2cjwb.astroid@yuna.none>
In-Reply-To: <1768390946.z2jyt2cjwb.astroid@yuna.none>
List-Id: Proxmox Backup Server development discussion

On 1/14/26 14:13, Fabian Grünbichler wrote:
> On January 9, 2026 6:35 pm, Robert Obkircher wrote:
Create a dummy chunk store and write fidx files with fixed and
dynamically sized writers. Compare the resulting binary files directly
(ignoring uuid and ctime) and also read them back using the reader.

The chunk hashes are made up and don't actually exist in the chunk
store.

Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
 pbs-datastore/src/fixed_index.rs | 184 +++++++++++++++++++++++++++++++
 1 file changed, 184 insertions(+)

diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
index 8036a519..a20edc94 100644
--- a/pbs-datastore/src/fixed_index.rs
+++ b/pbs-datastore/src/fixed_index.rs
@@ -530,3 +530,187 @@ impl FixedIndexWriter {
         Ok(())
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::chunk_store::ChunkStore;
+    use pbs_api_types::DatastoreFSyncLevel;
+    use std::fs;
+    use std::sync::Arc;
+
+    const CS: usize = 4096;
+
+    #[test]
+    fn test_fixed_index_writer() {
> this should be a setup method, and not annotated as #[test]

+        let mut testdir = fs::canonicalize(".").unwrap();
> NACK, please use a tmpdir instead of writing to the current dir.. cargo
> only provides a mechanism for this for integration tests, so for unit
> tests you need to take care of it yourself, or move these to an
> integration test..
>
> we have discussed dropping the chunk store entirely in the past:
>
> https://lore.proxmox.com/pbs-devel/1761822027.sm41v71zmk.astroid@yuna.none/
>
> maybe we should do that and make unit testing here more
> straight-forward? would still require a writable path though..
Ok, I'll also send a separate patch for `chunk_store::test_chunk_store1`.
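For reference, a unique per-test scratch directory is doable with just the standard library; this is only a sketch (the `unique_test_dir` name and prefix scheme are made up here, not part of the patch):

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};

// Build a unique scratch directory under the system temp dir. Unit
// tests don't get a cargo-managed tmpdir, so uniqueness comes from
// the process id plus a per-process counter.
fn unique_test_dir(prefix: &str) -> PathBuf {
    static COUNTER: AtomicUsize = AtomicUsize::new(0);
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    let mut dir = std::env::temp_dir();
    dir.push(format!("{}-{}-{}", prefix, std::process::id(), n));
    std::fs::create_dir_all(&dir).expect("failed to create test dir");
    dir
}

fn main() {
    let dir = unique_test_dir("fixed-index-test");
    println!("created {}", dir.display());
    std::fs::remove_dir_all(&dir).unwrap();
}
```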

+        testdir.push(".testdir_fixed_index");
+
+        if let Err(_e) = std::fs::remove_dir_all(&testdir) { /* ignore */ }
+
+        // this is a lot faster than ChunkStore::create, which takes almost 3 seconds
+        let store = {
+            let mut chunks = PathBuf::from(&testdir);
+            chunks.push(".chunks");
+            fs::create_dir_all(chunks).unwrap();
+            Arc::new(ChunkStore::open("test", &testdir, DatastoreFSyncLevel::None).unwrap())
+        };
+
+        test_empty(store.clone());
+        test_smaller_than_initial_capacity(store.clone());
+        test_grow_to_multiples_of_chunk_size(store.clone());
+        test_grow_to_fixed_size(store);
> these should not be called here, but rather call the setup method
The only reason I didn't do that was that burning 3 seconds per test to
create 65k directories for each chunk store seemed a bit excessive. But if
my workaround is ok, or if we remove the chunk store entirely, then
separate tests are definitely better.
+
+        std::fs::remove_dir_all(&testdir).unwrap();
> and this should be handled with some kind of auto-cleanup-on-drop
> mechanism, unless the whole thing is moved to an integration test, in
> which case it will be somewhere under `target/` anyway..
The idea was to preserve files for debugging in case of failure, but I guess
in that case one can just patch the code and re-run the test.
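Both goals can actually be combined, since `Drop` can ask whether the thread is panicking; a std-only sketch (the `TestDirGuard` name is invented here):

```rust
use std::path::PathBuf;

/// Removes the wrapped directory on a normal drop, but keeps it
/// around when the thread is panicking, so a failing test leaves
/// its files behind for debugging.
struct TestDirGuard(PathBuf);

impl Drop for TestDirGuard {
    fn drop(&mut self) {
        if std::thread::panicking() {
            eprintln!("keeping {} for inspection", self.0.display());
        } else {
            let _ = std::fs::remove_dir_all(&self.0);
        }
    }
}

fn main() {
    let dir = std::env::temp_dir().join(format!("guard-demo-{}", std::process::id()));
    std::fs::create_dir_all(&dir).unwrap();
    {
        let _guard = TestDirGuard(dir.clone());
        // ... run assertions while the directory exists ...
    }
    assert!(!dir.exists(), "guard removed the directory on success");
}
```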
+    }
+
+    fn test_empty(store: Arc<ChunkStore>) {
+        let path = Path::new("test_empty");
+        let mut w = FixedIndexWriter::create(store, path, None, CS).unwrap();
+
+        assert_eq!(0, w.index_length(), "returns length, not capacity");
> should also check the capacity and size?

+        assert!(w.add_digest(0, &[1u8; 32]).is_err(), "out of bounds");
+        assert!(w.close().is_err(), "should refuse to create empty file");
+    }
+
+    fn test_smaller_than_initial_capacity(store: Arc<ChunkStore>) {
> the test_grow_to_multiples_of_chunk_size also tests sizes smaller than
> the initial capacity.. this tests a very special edge case - an index
> made up of a single, partial chunk..

+        let path = Path::new("test_smaller_than_initial_capacity");
+        let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+        let size = CS - 1;
+        let expected = test_data(size);
+        w.grow_to_size(size).unwrap();
+        expected[0].add_to(&mut w);
+
+        w.close().unwrap();
+        drop(w);
+
+        check_with_reader(&store.relative_path(path), size, &expected);
+        compare_to_known_size_writer(store, path, size, &expected);
+    }
+
+    fn test_grow_to_multiples_of_chunk_size(store: Arc<ChunkStore>) {
+        let path = Path::new("test_grow_to_multiples_of_chunk_size");
+        let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+        let initial = FixedIndexWriter::INITIAL_CAPACITY;
+        let steps = [1, 2, initial, initial + 1, 5 * initial, 10 * initial + 1];
+        let expected = test_data(steps.last().unwrap() * CS);
+
+        let mut begin = 0;
+        for chunk_count in steps {
+            let last = &expected[chunk_count - 1];
+            w.grow_to_size(last.end).unwrap();
+            assert_eq!(last.index + 1, w.index_length());
> if index_length is changed to capacity, then this needs to be adapted,
> unless index_length() continues to return the length based on size and
> chunk size
The function `index_length` is public and is also used by the backup
environment. It must return the logical length, not the internal capacity.
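As the assertions above imply, the length is presumably derived from the grown size and the chunk size; a free-standing sketch of that relation (not the actual implementation):

```rust
// Number of index entries needed to cover `size` bytes with chunks of
// `chunk_size` bytes: a trailing partial chunk still needs an entry.
fn expected_index_length(size: usize, chunk_size: usize) -> usize {
    size.div_ceil(chunk_size)
}

fn main() {
    assert_eq!(expected_index_length(4095, 4096), 1);
    assert_eq!(expected_index_length(4096, 4096), 1);
    assert_eq!(expected_index_length(4097, 4096), 2);
}
```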

+            assert!(w.add_digest(last.index + 1, &[1u8; 32]).is_err());
+
+            for c in expected[begin..chunk_count].iter().rev() {
+                c.add_to(&mut w);
+            }
+            begin = chunk_count;
+        }
+        w.close().unwrap();
+        drop(w);
+
+        let size = expected.len() * CS;
+        check_with_reader(&store.relative_path(path), size, &expected);
+        compare_to_known_size_writer(store, path, size, &expected);
+    }
+
+    fn test_grow_to_fixed_size(store: Arc<ChunkStore>) {
> s/fixed/misaligned/ ?

+        let path = Path::new("test_grow_to_fixed_size");
+        let mut w = FixedIndexWriter::create(store.clone(), path, None, CS).unwrap();
+
+        let size = (FixedIndexWriter::INITIAL_CAPACITY + 42) * CS - 1; // last is not full
+        let expected = test_data(size);
+
+        w.grow_to_size(size).unwrap();
+        assert!(w.grow_to_size(size + 1).is_err(), "size must be fixed now");
+        assert_eq!(expected.len(), w.index_length());
+        assert!(w.add_digest(expected.len(), &[1u8; 32]).is_err());
+
+        for c in expected.iter().rev() {
+            c.add_to(&mut w);
+        }
+
+        w.close().unwrap();
+        drop(w);
+
+        check_with_reader(&store.relative_path(path), size, &expected);
+        compare_to_known_size_writer(store, path, size, &expected);
+    }
+
+    struct TestChunk {
+        digest: [u8; 32],
+        index: usize,
+        size: usize,
+        end: usize,
+    }
+
+    impl TestChunk {
+        fn add_to(&self, w: &mut FixedIndexWriter) {
+            assert_eq!(
+                self.index,
+                w.check_chunk_alignment(self.end, self.size).unwrap()
+            );
+            w.add_digest(self.index, &self.digest).unwrap();
+        }
+    }
+
+    fn test_data(size: usize) -> Vec<TestChunk> {
+        (0..size.div_ceil(CS))
+            .map(|index| {
+                let mut digest = [0u8; 32];
+                let i = &(index as u64).to_le_bytes();
+                for c in digest.chunks_mut(i.len()) {
+                    c.copy_from_slice(i);
+                }
+                let size = if ((index + 1) * CS) <= size {
+                    CS
+                } else {
+                    size % CS
+                };
+                TestChunk {
+                    digest,
+                    index,
+                    size,
+                    end: index * CS + size,
+                }
+            })
+            .collect()
+    }
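The digest construction in `test_data` above repeats the little-endian chunk index across the 32 bytes; pulled out on its own it looks roughly like this (the `dummy_digest` helper name is invented here):

```rust
// Fill a 32-byte dummy digest by repeating the chunk index as a
// little-endian u64, i.e. four 8-byte repetitions.
fn dummy_digest(index: u64) -> [u8; 32] {
    let mut digest = [0u8; 32];
    let bytes = index.to_le_bytes();
    for chunk in digest.chunks_mut(bytes.len()) {
        chunk.copy_from_slice(&bytes);
    }
    digest
}

fn main() {
    assert_eq!(dummy_digest(0), [0u8; 32]);
    // index 1 -> first byte of every 8-byte group is 1
    assert_eq!(&dummy_digest(1)[..8], &[1, 0, 0, 0, 0, 0, 0, 0]);
    assert_eq!(dummy_digest(1)[8], 1);
}
```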
+
+    fn check_with_reader(path: &Path, size: usize, chunks: &[TestChunk]) {
+        let reader = FixedIndexReader::open(path).unwrap();
+        assert_eq!(size as u64, reader.index_bytes());
+        assert_eq!(chunks.len(), reader.index_count());
+        for c in chunks {
+            assert_eq!(&c.digest, reader.index_digest(c.index).unwrap());
+        }
+    }
+
+    fn compare_to_known_size_writer(
+        store: Arc<ChunkStore>,
+        name: &Path,
+        size: usize,
+        chunks: &[TestChunk],
+    ) {
+        let mut path = PathBuf::from(name);
+        path.set_extension("reference");
+        let mut w = FixedIndexWriter::create(store.clone(), &path, Some(size), CS).unwrap();
+        for c in chunks {
+            c.add_to(&mut w);
+        }
+        w.close().unwrap();
+        drop(w);
+
+        let mut reference = fs::read(store.relative_path(&path)).unwrap();
+        let mut tested = fs::read(store.relative_path(name)).unwrap();
+
+        // ignore uuid and ctime
+        reference[8..32].fill(0);
+        tested[8..32].fill(0);
+
+        assert_eq!(reference, tested);
+    }
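The masking before the comparison can also be done without mutating the buffers; a small sketch (`eq_ignoring` is a made-up helper, and the 8..32 uuid/ctime offsets are taken from the test above, not re-derived):

```rust
use std::ops::Range;

// Compare two byte images while treating the given byte range as
// don't-care (here it would hold the uuid and ctime).
fn eq_ignoring(a: &[u8], b: &[u8], skip: Range<usize>) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b.iter())
            .enumerate()
            .all(|(i, (x, y))| skip.contains(&i) || x == y)
}

fn main() {
    let reference = vec![0u8; 40];
    let mut tested = vec![0u8; 40];
    tested[10] = 7; // difference inside the ignored header range
    assert!(eq_ignoring(&reference, &tested, 8..32));
    tested[35] = 1; // difference in the payload is not ignored
    assert!(!eq_ignoring(&reference, &tested, 8..32));
}
```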
+}
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


