From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in chunk store iterator
Date: Wed, 14 Jan 2026 09:37:59 +0100
Message-ID: <11bf6e13-7658-409d-9105-2d05b1e31c96@proxmox.com>
In-Reply-To: <1768297033.j3cp7p57tw.astroid@yuna.none>
On 1/13/26 11:23 AM, Fabian Grünbichler wrote:
> On December 11, 2025 4:38 pm, Christian Ebner wrote:
>> Optimizes the chunk filename check towards regular chunk files by
>> explicitly checking for the correct length.
>>
>> While the check for ascii hexdigits needs to be stated twice, this
>> avoids checking for the `.bad` extension if the chunk filename
>> already matched the expected length.
>
> I don't get this part, we could still check first and only once that the
> first 64 bytes are valid hex?
>
> if bytes.len() < 64 {
>     continue;
> }
>
> if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>     continue;
> }
But with the code below I'm done after 2 checks in the regular chunk
digest case:
`bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)`
which is the most likely case and the one that should be optimized for?
What I tried to convey with the commit message is that
`bytes.iter().take(64).all(u8::is_ascii_hexdigit)` is now written out
twice, but at most one of the two occurrences is ever evaluated for a
given filename, since the length comparisons short-circuit.
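
To illustrate the fast-path argument outside of the iterator, here is a
minimal standalone sketch. The helper name `classify` and its
`Option<bool>` return type are made up for this example and are not part
of the patch; the conditions are copied from the diff as posted.

```rust
/// Hypothetical helper (not in the patch): returns `Some(bad)` for file
/// names the iterator would yield and `None` for names it would skip.
fn classify(bytes: &[u8]) -> Option<bool> {
    // Fast path: a regular chunk is accepted after one length comparison
    // and one hexdigit scan, the `.bad` suffix is never looked at.
    if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
        return Some(false);
    }

    // Less likely path: the length matches a digest plus a 6-byte suffix.
    // `&&` short-circuits, so the hexdigit scan below only runs when the
    // length comparison above it succeeded.
    if bytes.len() == 64 + ".0.bad".len()
        && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
    {
        // As in the posted patch, only the trailing `.bad` is inspected.
        return Some(bytes.ends_with(b".bad"));
    }

    // Everything else is skipped (`continue` in the iterator).
    None
}
```

The two length comparisons are mutually exclusive, so the duplicated
hexdigit scan costs nothing at runtime; it is only repeated in the
source.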
>
> // now start looking at the length + potential extension
>
>>
>> This will also help to better distinguish bad chunks and chunk
>> used markers for s3 datastores in subsequent changes.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> pbs-datastore/src/chunk_store.rs | 17 +++++++++++------
>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index a5e5f6261..7980938ad 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -315,15 +315,20 @@ impl ChunkStore {
>> Some(Ok(entry)) => {
>> // skip files if they're not a hash
>> let bytes = entry.file_name().to_bytes();
>> - if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
>> - continue;
>> +
>> + if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>> + {
>> + return Some((Ok(entry), percentage, false));
>> }
>> - if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>> - continue;
>> +
>> + if bytes.len() == 64 + ".0.bad".len()
>> + && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>> + {
>> + let bad = bytes.ends_with(b".bad");
>> + return Some((Ok(entry), percentage, bad));
>
> while this mimics the old code, it is still broken (a chunk digest +
> .fooba or any other 6-byte suffix that is not "??.bad" is returned as
> non-bad chunk, since the length matches a bad chunk, but the extension
> does not).
That was the intention here, to keep this close to the previous
behavior. But since we do this check only in the less likely case, I
agree that adding the check for exact extension might be the better
option here.
Will adapt this accordingly, thanks!
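
A rough sketch of what such an exact extension check could look like,
not necessarily what the next version will contain. The helper name
`is_bad_chunk_name` is hypothetical, and the sketch assumes the bad chunk
suffix is always `.<single ASCII digit>.bad`, which is inferred from the
`".0.bad"` length used in the patch:

```rust
/// Hypothetical helper: true only for `<64 hex digits>.<digit>.bad`.
/// Assumption: the counter in the suffix is a single ASCII digit.
fn is_bad_chunk_name(bytes: &[u8]) -> bool {
    if bytes.len() != 64 + ".0.bad".len() {
        return false;
    }

    // The length was checked above, so `split_at(64)` cannot panic.
    let (digest, ext) = bytes.split_at(64);

    digest.iter().all(u8::is_ascii_hexdigit)
        // Match the 6-byte suffix exactly instead of only its last 4 bytes.
        && matches!(ext, [b'.', n, b'.', b'b', b'a', b'd'] if n.is_ascii_digit())
}
```

With this, a name like `<digest>.fooba` is skipped instead of being
returned as a non-bad chunk.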
>
>> }
>>
>> - let bad = bytes.ends_with(b".bad");
>> - return Some((Ok(entry), percentage, bad));
>> + continue;
>> }
>> Some(Err(err)) => {
>> // stop after first error
>> --
>> 2.47.3
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
Thread overview: 23+ messages
2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
2026-01-13 10:23 ` Fabian Grünbichler
2026-01-14 8:22 ` Christian Ebner
2026-01-14 9:18 ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 2/8] chunk store: rename and limit scope for chunk store iterator Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in " Christian Ebner
2026-01-13 10:23 ` Fabian Grünbichler
2026-01-14 8:37 ` Christian Ebner [this message]
2026-01-14 9:41 ` Fabian Grünbichler
2026-01-14 9:53 ` Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker Christian Ebner
2026-01-13 10:24 ` Fabian Grünbichler
2026-01-14 8:41 ` Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper Christian Ebner
2026-01-13 10:24 ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store Christian Ebner
2026-01-13 10:24 ` Fabian Grünbichler
2026-01-14 8:58 ` Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 7/8] chunk store: move next bad chunk path generator into dedicated helper Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 8/8] chunk store: move bad chunk filename generation " Christian Ebner
2026-01-13 10:24 ` [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Fabian Grünbichler
2026-01-14 12:33 ` [pbs-devel] superseded: " Christian Ebner