From: Robert Obkircher <r.obkircher@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v4 proxmox-backup 09/11] datastore: use u64 instead of usize for fidx writer content size
Date: Mon, 26 Jan 2026 14:20:30 +0100
Message-ID: <5aea0de2-2cc1-4c30-a609-2cdb1a2497a6@proxmox.com>
In-Reply-To: <767de04e-3f23-4cf3-8ad8-45d4c2d4013f@proxmox.com>


On 1/26/26 12:38, Christian Ebner wrote:
> Not sure about these changes, maybe other devs have a stronger
> opinion on this one.
>
> If we do want to adapt this, then IMHO this should however be done
> throughout the whole codebase, for the dynamic index as well.
The dynamic reader/writer and the FixedIndexReader already use u64 for
content size and offsets. The API only supports 4 MiB chunks anyway, so
it shouldn't matter if we keep using u32 there.
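
To illustrate the difference between the two widths, here is a minimal
standalone sketch (not code from the patch; `index_len` is a hypothetical
helper name). The content size needs u64 because an image can exceed
4 GiB, while the resulting chunk count is converted to usize with an
explicit check, similar to the `try_into()` calls in the patch:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: number of index entries needed for `content_size`
/// bytes, or an error if that count does not fit into `usize` on the
/// current target (relevant on 32-bit systems).
fn index_len(content_size: u64, chunk_size: u64) -> Result<usize, String> {
    if chunk_size == 0 {
        return Err("chunk size must not be zero".into());
    }
    // Round up so that a trailing partial chunk gets its own entry.
    let chunks = content_size.div_ceil(chunk_size);
    usize::try_from(chunks).map_err(|e| format!("index too large: {e}"))
}

fn main() {
    let chunk_size = 4 * 1024 * 1024; // 4 MiB, would also fit into u32
    // A 5 GiB image is larger than u32::MAX bytes, so the size needs u64.
    let size: u64 = 5 * 1024 * 1024 * 1024;
    assert!(size > u64::from(u32::MAX));
    assert_eq!(index_len(size, chunk_size).unwrap(), 1280);
    // One extra byte adds a final, smaller chunk.
    assert_eq!(index_len(size + 1, chunk_size).unwrap(), 1281);
}
```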

FixedIndexReader still casts the chunk_size from u64 to usize and back.
I planned to change this while adding 16 KiB page size support to the
readers (#7244), and I also wanted to add an is_power_of_two check,
because the modulo computation in chunk_from_offset relies on that.
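
A rough sketch of why such a check matters, assuming the modulo is
computed with a bit mask (this is an illustration with a hypothetical
free function, not the actual FixedIndexReader code):

```rust
/// Hypothetical stand-in for a chunk lookup: returns the chunk index and
/// the offset inside that chunk, or None if `chunk_size` is not a power
/// of two (this also rejects zero).
fn chunk_from_offset(offset: u64, chunk_size: u64) -> Option<(u64, u64)> {
    if !chunk_size.is_power_of_two() {
        return None;
    }
    let index = offset / chunk_size;
    // The mask equals `offset % chunk_size` only for powers of two.
    let within = offset & (chunk_size - 1);
    Some((index, within))
}

fn main() {
    let chunk_size = 16 * 1024;
    assert_eq!(chunk_from_offset(3 * chunk_size + 5, chunk_size), Some((3, 5)));
    // A chunk size of 12 is rejected instead of silently giving bad results.
    assert_eq!(chunk_from_offset(13, 12), None);
    // Without the check, mask and modulo disagree: 13 & 11 == 9, 13 % 12 == 1.
    assert_ne!(13u64 & (12 - 1), 13 % 12);
}
```
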
>  
>
>
> On 1/23/26 4:43 PM, Robert Obkircher wrote:
>> This is closer to what the file format supports.
>>
>> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
>> ---
>>   pbs-datastore/src/datastore.rs   |  6 +--
>>   pbs-datastore/src/fixed_index.rs | 69
>> ++++++++++++++++----------------
>>   src/api2/backup/environment.rs   |  6 +--
>>   src/api2/backup/mod.rs           |  2 +-
>>   4 files changed, 42 insertions(+), 41 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs
>> b/pbs-datastore/src/datastore.rs
>> index 56dfce6e..8770d942 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -695,11 +695,11 @@ impl DataStore {
>>       pub fn create_fixed_writer<P: AsRef<Path>>(
>>           &self,
>>           filename: P,
>> -        size: Option<usize>,
>> -        chunk_size: usize,
>> +        size: Option<u64>,
>> +        chunk_size: u32,
>
> question: is this intentionally set to u32 instead of u64 like for
> other chunk sizes in this patch? Should be consistent ... 
I'll change it to u64.
>
>>       ) -> Result<FixedIndexWriter, Error> {
>>           let full_path =
>> self.inner.chunk_store.relative_path(filename.as_ref());
>> -        FixedIndexWriter::create(full_path, size, chunk_size)
>> +        FixedIndexWriter::create(full_path, size, chunk_size.into())
>>       }
>>         pub fn open_fixed_reader<P: AsRef<Path>>(
>> diff --git a/pbs-datastore/src/fixed_index.rs
>> b/pbs-datastore/src/fixed_index.rs
>> index 056ae07b..c2888372 100644
>> --- a/pbs-datastore/src/fixed_index.rs
>> +++ b/pbs-datastore/src/fixed_index.rs
>> @@ -214,8 +214,8 @@ pub struct FixedIndexWriter {
>>       file: File,
>>       filename: PathBuf,
>>       tmp_filename: PathBuf,
>> -    chunk_size: usize,
>> -    size: usize,
>> +    chunk_size: u64,
>> +    size: u64,
>>       index_length: usize,
>>       index_capacity: usize,
>>       index: *mut u8,
>> @@ -248,8 +248,8 @@ impl FixedIndexWriter {
>>       // Requires obtaining a shared chunk store lock beforehand
>>       pub fn create(
>>           full_path: impl Into<PathBuf>,
>> -        known_size: Option<usize>,
>> -        chunk_size: usize,
>> +        known_size: Option<u64>,
>> +        chunk_size: u64,
>>       ) -> Result<Self, Error> {
>>           let full_path = full_path.into();
>>           let mut tmp_path = full_path.clone();
>> @@ -287,10 +287,13 @@ impl FixedIndexWriter {
>>             file.write_all(&buffer)?;
>>   -        let (index_length, index_capacity) = known_size
>> -            .map(|s| s.div_ceil(chunk_size))
>> -            .map(|len| (len, len))
>> -            .unwrap_or((0, Self::INITIAL_CAPACITY));
>> +        let (index_length, index_capacity) = match known_size {
>> +            Some(s) => {
>> +                let len = s.div_ceil(chunk_size).try_into()?;
>> +                (len, len)
>> +            }
>> +            None => (0, Self::INITIAL_CAPACITY),
>> +        };
>>             let index_size = index_capacity * 32;
>>           nix::unistd::ftruncate(&file, (header_size + index_size)
>> as i64)?;
>> @@ -376,13 +379,13 @@ impl FixedIndexWriter {
>>       /// The size also becomes fixed as soon as it is no longer
>> divisible
>>       /// by the block size, to ensure that only the last block can be
>>       /// smaller.
>> -    fn grow_to_size(&mut self, requested_size: usize) ->
>> Result<(), Error> {
>> +    fn grow_to_size(&mut self, requested_size: u64) -> Result<(),
>> Error> {
>>           if self.size < requested_size {
>>               if !self.growable_size {
>>                   bail!("refusing to resize from {} to
>> {requested_size}", self.size);
>>               }
>> -            let new_len = requested_size.div_ceil(self.chunk_size);
>> -            if new_len * self.chunk_size != requested_size {
>> +            let new_len =
>> requested_size.div_ceil(self.chunk_size).try_into()?;
>> +            if new_len as u64 * self.chunk_size != requested_size {
>>                   // not a full chunk, so this must be the last one
>>                   self.growable_size = false;
>>                   self.set_index_capacity_or_unmap(new_len)?;
>> @@ -463,12 +466,10 @@ impl FixedIndexWriter {
>>           Ok(index_csum)
>>       }
>>   -    fn check_chunk_alignment(&self, offset: usize, chunk_len:
>> usize) -> Result<usize, Error> {
>> -        if offset < chunk_len {
>> +    fn check_chunk_alignment(&self, offset: u64, chunk_len: u64)
>> -> Result<usize, Error> {
>> +        let Some(pos) = offset.checked_sub(chunk_len) else {
>>               bail!("got chunk with small offset ({} < {}", offset,
>> chunk_len);
>> -        }
>> -
>> -        let pos = offset - chunk_len;
>> +        };
>>             if offset > self.size {
>>               bail!("chunk data exceeds size ({} >= {})", offset,
>> self.size);
>> @@ -490,7 +491,7 @@ impl FixedIndexWriter {
>>               bail!("got unaligned chunk (pos = {})", pos);
>>           }
>>   -        Ok(pos / self.chunk_size)
>> +        Ok((pos / self.chunk_size) as usize)
>>       }
>>         fn add_digest(&mut self, index: usize, digest: &[u8; 32])
>> -> Result<(), Error> {
>> @@ -524,12 +525,12 @@ impl FixedIndexWriter {
>>       /// If this writer has been created without a fixed size, the
>>       /// index capacity and content size are increased automatically
>>       /// until an incomplete chunk is encountered.
>> -    pub fn add_chunk(&mut self, start: u64, size: u32, digest:
>> &[u8; 32]) -> Result<(), Error> {
>> -        let Some(end) = start.checked_add(size.into()) else {
>> +    pub fn add_chunk(&mut self, start: u64, size: u64, digest:
>> &[u8; 32]) -> Result<(), Error> {
>> +        let Some(end) = start.checked_add(size) else {
>>               bail!("add_chunk: start and size are too large:
>> {start}+{size}");
>>           };
>> -        self.grow_to_size(end as usize)?;
>> -        let idx = self.check_chunk_alignment(end as usize, size as
>> usize)?;
>> +        self.grow_to_size(end)?;
>> +        let idx = self.check_chunk_alignment(end, size)?;
>>           self.add_digest(idx, digest)
>>       }
>>   @@ -538,7 +539,7 @@ impl FixedIndexWriter {
>>               bail!("reusing the index is only supported with known
>> input size");
>>           }
>>   -        if self.chunk_size != reader.chunk_size {
>> +        if Ok(self.chunk_size) != reader.chunk_size.try_into() {
>>               bail!("can't reuse file with different chunk size");
>>           }
>>   @@ -560,7 +561,7 @@ mod tests {
>>       use std::env;
>>       use std::fs;
>>   -    const CS: usize = 4096;
>> +    const CS: u64 = 4096;
>>         #[test]
>>       fn test_empty() {
>> @@ -606,7 +607,7 @@ mod tests {
>>             let initial = FixedIndexWriter::INITIAL_CAPACITY;
>>           let steps = [1, 2, initial, initial + 1, 5 * initial, 10
>> * initial + 1];
>> -        let expected = test_data(steps.last().unwrap() * CS);
>> +        let expected = test_data(*steps.last().unwrap() as u64 * CS);
>>             let mut begin = 0;
>>           for chunk_count in steps {
>> @@ -623,7 +624,7 @@ mod tests {
>>           w.close().unwrap();
>>           drop(w);
>>   -        let size = expected.len() * CS;
>> +        let size = expected.len() as u64 * CS;
>>           check_with_reader(&path, size, &expected);
>>           compare_to_known_size_writer(&path, size, &expected);
>>       }
>> @@ -634,7 +635,7 @@ mod tests {
>>           let path = dir.join("test_grow_to_misaligned_size");
>>           let mut w = FixedIndexWriter::create(&path, None,
>> CS).unwrap();
>>   -        let size = (FixedIndexWriter::INITIAL_CAPACITY + 42) *
>> CS - 1; // last is not full
>> +        let size = (FixedIndexWriter::INITIAL_CAPACITY as u64 +
>> 42) * CS - 1; // last is not full
>>           let expected = test_data(size);
>>             w.grow_to_size(size).unwrap();
>> @@ -677,8 +678,8 @@ mod tests {
>>       struct TestChunk {
>>           digest: [u8; 32],
>>           index: usize,
>> -        size: usize,
>> -        end: usize,
>> +        size: u64,
>> +        end: u64,
>>       }
>>         impl TestChunk {
>> @@ -691,7 +692,7 @@ mod tests {
>>           }
>>       }
>>   -    fn test_data(size: usize) -> Vec<TestChunk> {
>> +    fn test_data(size: u64) -> Vec<TestChunk> {
>>           (0..size.div_ceil(CS))
>>               .map(|index| {
>>                   let mut digest = [0u8; 32];
>> @@ -706,24 +707,24 @@ mod tests {
>>                   };
>>                   TestChunk {
>>                       digest,
>> -                    index,
>> +                    index: index as usize,
>>                       size,
>> -                    end: index * CS + size,
>> +                    end: index as u64 * CS + size,
>>                   }
>>               })
>>               .collect()
>>       }
>>   -    fn check_with_reader(path: &Path, size: usize, chunks:
>> &[TestChunk]) {
>> +    fn check_with_reader(path: &Path, size: u64, chunks:
>> &[TestChunk]) {
>>           let reader = FixedIndexReader::open(path).unwrap();
>> -        assert_eq!(size as u64, reader.index_bytes());
>> +        assert_eq!(size, reader.index_bytes());
>>           assert_eq!(chunks.len(), reader.index_count());
>>           for c in chunks {
>>               assert_eq!(&c.digest,
>> reader.index_digest(c.index).unwrap());
>>           }
>>       }
>>   -    fn compare_to_known_size_writer(file: &Path, size: usize,
>> chunks: &[TestChunk]) {
>> +    fn compare_to_known_size_writer(file: &Path, size: u64,
>> chunks: &[TestChunk]) {
>>           let mut path = file.to_path_buf();
>>           path.set_extension("reference");
>>           let mut w = FixedIndexWriter::create(&path, Some(size),
>> CS).unwrap();
>> diff --git a/src/api2/backup/environment.rs
>> b/src/api2/backup/environment.rs
>> index 04c5bf84..7d49d47c 100644
>> --- a/src/api2/backup/environment.rs
>> +++ b/src/api2/backup/environment.rs
>> @@ -67,7 +67,7 @@ struct DynamicWriterState {
>>   struct FixedWriterState {
>>       name: String,
>>       index: FixedIndexWriter,
>> -    size: Option<usize>,
>> +    size: Option<u64>,
>>       chunk_size: u32,
>>       chunk_count: u64,
>>       small_chunk_count: usize, // allow 0..1 small chunks (last
>> chunk may be smaller)
>> @@ -349,7 +349,7 @@ impl BackupEnvironment {
>>           &self,
>>           index: FixedIndexWriter,
>>           name: String,
>> -        size: Option<usize>,
>> +        size: Option<u64>,
>>           chunk_size: u32,
>>           incremental: bool,
>>       ) -> Result<usize, Error> {
>> @@ -442,7 +442,7 @@ impl BackupEnvironment {
>>               );
>>           }
>>   -        data.index.add_chunk(offset, size, digest)?;
>> +        data.index.add_chunk(offset, size.into(), digest)?;
>>             data.chunk_count += 1;
>>   diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
>> index c2822c18..54445efa 100644
>> --- a/src/api2/backup/mod.rs
>> +++ b/src/api2/backup/mod.rs
>> @@ -480,7 +480,7 @@ fn create_fixed_index(
>>       let env: &BackupEnvironment = rpcenv.as_ref();
>>         let name = required_string_param(&param,
>> "archive-name")?.to_owned();
>> -    let size =
>> param["size"].as_u64().map(usize::try_from).transpose()?;
>> +    let size = param["size"].as_u64();
>>       let reuse_csum = param["reuse-csum"].as_str();
>>         let archive_name = name.clone();
>
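
For reference, the overflow handling that the quoted patch applies in
check_chunk_alignment can be reduced to a standalone sketch. This is a
simplified illustration under my own assumptions: `chunk_index` is a
hypothetical free function, the error type is a plain String instead of
anyhow, and the parts of the real method that the diff does not show are
left out.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified variant of the alignment check: `end` is the
/// offset just past the chunk, `chunk_len` its length in bytes.
fn chunk_index(
    end: u64,
    chunk_len: u64,
    chunk_size: u64,
    total_size: u64,
) -> Result<usize, String> {
    // checked_sub replaces the separate "offset < chunk_len" comparison.
    let pos = end
        .checked_sub(chunk_len)
        .ok_or_else(|| format!("chunk end {end} is smaller than its length {chunk_len}"))?;
    if end > total_size {
        return Err(format!("chunk data exceeds size ({end} > {total_size})"));
    }
    if chunk_size == 0 || pos % chunk_size != 0 {
        return Err(format!("got unaligned chunk (pos = {pos})"));
    }
    // Checked conversion instead of `as usize`, so 32-bit targets fail loudly.
    usize::try_from(pos / chunk_size).map_err(|e| e.to_string())
}

fn main() {
    const CS: u64 = 4096;
    assert_eq!(chunk_index(2 * CS, CS, CS, 10 * CS), Ok(1));
    assert!(chunk_index(10, 20, CS, 10 * CS).is_err()); // end < length
    assert!(chunk_index(2 * CS + 1, CS, CS, 10 * CS).is_err()); // unaligned
    assert!(chunk_index(11 * CS, CS, CS, 10 * CS).is_err()); // past the end
}
```

The sketch uses a checked conversion for the final index where the patch
uses `as usize`; whether that is worth it depends on whether the index
can exceed usize at that point.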



