* [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
@ 2025-11-25 14:00 Christian Ebner
2025-11-25 14:18 ` Fabian Grünbichler
2025-11-25 15:02 ` [pbs-devel] applied: " Thomas Lamprecht
0 siblings, 2 replies; 6+ messages in thread
From: Christian Ebner @ 2025-11-25 14:00 UTC (permalink / raw)
To: pbs-devel
Since commit 9510ef1a ("GC: assure chunk exists on s3 store when
creating missing chunk marker") chunks which are referenced by
an index file but do not have a local marker file are marked by a
file with the `using` extension, so they are not cleaned up during
phase 2 if the chunk is still present on the backend.

If, however, the chunk is not encountered during phase 2, phase 3 sees
the marker and tries to clean it up. This currently fails, because the
cleanup first tries to remove the chunk from the LRU cache, which
requires converting the filename to a chunk digest.

Therefore, remove any `using` marker file encountered during phase 3
before handling regular or bad chunks, independent of the atime.

Fixes: https://forum.proxmox.com/threads/176567/post-819437
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
Changes since version 1 (thanks a lot for the off-list discussion, Thomas):
- Clean up `using` marker files independent of the atime cutoff

pbs-datastore/src/chunk_store.rs | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index f53460664..7fe09b914 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -25,6 +25,8 @@ use crate::file_formats::{
 };
 use crate::{DataBlob, LocalDatastoreLruCache};

+const USING_MARKER_FILENAME_EXT: &str = "using";
+
 /// File system based chunk store
 pub struct ChunkStore {
     name: String, // used for error reporting
@@ -426,6 +428,16 @@ impl ChunkStore {
                 drop(lock);
                 continue;
             }
+            if filename
+                .to_bytes()
+                .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
+            {
+                unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
+                    format_err!("unlinking chunk using marker {filename:?} failed - {err}")
+                })?;
+                drop(lock);
+                continue;
+            }

             chunk_count += 1;

@@ -776,7 +788,7 @@ impl ChunkStore {
     /// Helper to generate marker file path for expected chunks
     fn chunk_expected_marker_path(&self, digest: &[u8; 32]) -> PathBuf {
         let (mut path, _digest_str) = self.chunk_path(digest);
-        path.set_extension("using");
+        path.set_extension(USING_MARKER_FILENAME_EXT);
         path
     }

--
2.47.3
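
For readers without the surrounding code at hand, the control flow added by
the second hunk can be approximated with std::fs alone. This is only a sketch
under assumptions: sweep_markers is a hypothetical helper, it does no locking,
uses no dirfd/unlinkat and ignores atime, and it is not the code in
chunk_store.rs. It only shows the ordering: a `using` marker is unlinked and
skipped before the entry is counted or treated as a chunk.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::Path;

const USING_MARKER_FILENAME_EXT: &str = "using";

/// Hypothetical helper: remove every `<name>.using` file in `dir`.
/// Returns (markers removed, remaining entries counted as chunks).
fn sweep_markers(dir: &Path) -> io::Result<(usize, usize)> {
    let mut removed = 0;
    let mut chunk_count = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Markers are handled first and never reach the chunk handling.
        if path.extension() == Some(OsStr::new(USING_MARKER_FILENAME_EXT)) {
            fs::remove_file(&path)?;
            removed += 1;
            continue;
        }
        chunk_count += 1;
    }
    Ok((removed, chunk_count))
}

fn main() -> io::Result<()> {
    // Scratch directory; the name is arbitrary.
    let dir = std::env::temp_dir().join(format!("using-marker-sketch-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let digest = "ab".repeat(32); // 64 hex characters, stands in for a chunk digest
    fs::write(dir.join(&digest), b"")?;
    fs::write(dir.join(format!("{digest}.using")), b"")?;

    let (removed, chunk_count) = sweep_markers(&dir)?;
    println!("removed {removed} marker(s), counted {chunk_count} chunk(s)");

    fs::remove_dir_all(&dir)
}
```

The actual patch matches on the raw filename bytes and unlinks relative to the
directory file descriptor while holding the chunk store lock, which this
sketch leaves out.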
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* Re: [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
2025-11-25 14:00 [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup Christian Ebner
@ 2025-11-25 14:18 ` Fabian Grünbichler
2025-11-25 14:23 ` Thomas Lamprecht
2025-11-25 14:27 ` Christian Ebner
2025-11-25 15:02 ` [pbs-devel] applied: " Thomas Lamprecht
1 sibling, 2 replies; 6+ messages in thread
From: Fabian Grünbichler @ 2025-11-25 14:18 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On November 25, 2025 3:00 pm, Christian Ebner wrote:
> Since commit 9510ef1a ("GC: assure chunk exists on s3 store when
> creating missing chunk marker") chunks which are referenced by
> an index file but do not have a local marker file are marked by a
> file with the `using` extension, so they are not cleaned up during
> phase 2 if the chunk is still present on the backend.
>
> If the chunk is however not encountered, phase 3 will see the marker
> and tries to clean it up, which currently however fails because
> it is first tried to be cleaned up from the LRU cache, the filename
> being converted to the chunk digest.
>
> Therefore, clean up any using marker file encountered during phase 3
> before any regular or bad chunk, independent from the atime.
>
> Fixes: https://forum.proxmox.com/threads/176567/post-819437
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> Changes since version 1 (thanks a lot for offlist discussion Thomas):
> - Cleanup using marker chunks independent from atime cutoff
>
> pbs-datastore/src/chunk_store.rs | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index f53460664..7fe09b914 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -25,6 +25,8 @@ use crate::file_formats::{
> };
> use crate::{DataBlob, LocalDatastoreLruCache};
>
> +const USING_MARKER_FILENAME_EXT: &str = "using";
> +
> /// File system based chunk store
> pub struct ChunkStore {
> name: String, // used for error reporting
> @@ -426,6 +428,16 @@ impl ChunkStore {
> drop(lock);
> continue;
> }
> + if filename
> + .to_bytes()
> + .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
> + {
> + unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
> + format_err!("unlinking chunk using marker {filename:?} failed - {err}")
> + })?;
> + drop(lock);
> + continue;
> + }

this looks okay as a stop-gap, but isn't the actual problem that

.using

and

.0.bad

have the same length, so we end up taking a code path using a weird "bad
but not bad" filename instead of skipping those markers in phase 3?

in get_chunk_iterator, we skip all files whose names are not 64 bytes or
64+len(.0.bad) bytes long, but then set the "bad" flag based on the
extension.

and then in cond_sweep_chunk in sweep_unused_chunk, we convert non-bad
chunk filenames to digests, which then fails for the "using" filenames
because they are too long (per the error from the forum thread).

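
The collision described above can be reproduced in isolation. The following
is a minimal sketch, not the real get_chunk_iterator or cond_sweep_chunk code:
passes_length_filter, is_bad and name_to_digest are hypothetical stand-ins
that only mirror the behaviour as described in this mail (length-only filter,
"bad" flag from the extension, digest conversion for non-bad names).

```rust
const DIGEST_HEX_LEN: usize = 64;

/// Length-only filter: accept plain digests and names that are as long
/// as a digest plus ".0.bad".
fn passes_length_filter(name: &str) -> bool {
    let bad_len = DIGEST_HEX_LEN + ".0.bad".len();
    name.len() == DIGEST_HEX_LEN || name.len() == bad_len
}

/// "bad" flag derived from the extension alone.
fn is_bad(name: &str) -> bool {
    name.ends_with(".bad")
}

/// Convert a filename to a 32-byte digest; only exactly 64 hex
/// characters are accepted.
fn name_to_digest(name: &str) -> Result<[u8; 32], String> {
    if name.len() != DIGEST_HEX_LEN || !name.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("not a valid digest (length {})", name.len()));
    }
    let mut digest = [0u8; 32];
    for (i, byte) in digest.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&name[2 * i..2 * i + 2], 16)
            .map_err(|err| format!("invalid hex in filename: {err}"))?;
    }
    Ok(digest)
}

fn main() {
    let digest = "ab".repeat(32);
    let marker = format!("{digest}.using");
    let bad = format!("{digest}.0.bad");

    // Both suffixes are 6 bytes long, so the marker slips through the filter,
    assert_eq!(marker.len(), bad.len());
    assert!(passes_length_filter(&marker));
    // is not flagged as bad,
    assert!(!is_bad(&marker));
    // and then fails the digest conversion meant for regular chunks.
    assert!(name_to_digest(&marker).is_err());
    assert!(name_to_digest(&digest).is_ok());
}
```
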
>
> chunk_count += 1;
>
> @@ -776,7 +788,7 @@ impl ChunkStore {
> /// Helper to generate marker file path for expected chunks
> fn chunk_expected_marker_path(&self, digest: &[u8; 32]) -> PathBuf {
> let (mut path, _digest_str) = self.chunk_path(digest);
> - path.set_extension("using");
> + path.set_extension(USING_MARKER_FILENAME_EXT);
> path
> }
>
> --
> 2.47.3
* Re: [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
2025-11-25 14:18 ` Fabian Grünbichler
@ 2025-11-25 14:23 ` Thomas Lamprecht
2025-11-25 14:27 ` Christian Ebner
1 sibling, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2025-11-25 14:23 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Fabian Grünbichler
On 25.11.25 at 15:18, Fabian Grünbichler wrote:
>> + if filename
>> + .to_bytes()
>> + .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
>> + {
>> + unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>> + format_err!("unlinking chunk using marker {filename:?} failed - {err}")
>> + })?;
>> + drop(lock);
>> + continue;
>> + }
> this looks okay as a stop-gap, but isn't the actual problem that
>
> .using
>
> and
>
> .0.bad
>
> have the same length, so we end up taking a codepath using a weird "bad
> but not bad" filename instead of skipping those markers in phase3?

I mean, yes, but these .using markers would need to be cleaned up (or updated
to ensure they match reality) sooner or later anyway, or does that already
happen elsewhere?

>
> in get_chunk_iterator, we skip all files that are not 64 bytes or
> 64+len(.0.bad) bytes long, but then set the "bad" flag based on the
> extension..
>
> and then in cond_sweep_chunk in sweep_unused_chunk, we convert non-bad
> chunk filenames to digests which then fails for the "using" filenames,
> because they are too long (per the error from the forum thread).

I mean, "using" is also not a very telling suffix; "referenced" or
"remote" or the like might be better, even just "used" is slightly easier
to grasp for me.
* Re: [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
2025-11-25 14:18 ` Fabian Grünbichler
2025-11-25 14:23 ` Thomas Lamprecht
@ 2025-11-25 14:27 ` Christian Ebner
2025-11-26 8:23 ` Fabian Grünbichler
1 sibling, 1 reply; 6+ messages in thread
From: Christian Ebner @ 2025-11-25 14:27 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Fabian Grünbichler
On 11/25/25 3:19 PM, Fabian Grünbichler wrote:
> On November 25, 2025 3:00 pm, Christian Ebner wrote:
>> [...]
>> + if filename
>> + .to_bytes()
>> + .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
>> + {
>> + unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>> + format_err!("unlinking chunk using marker {filename:?} failed - {err}")
>> + })?;
>> + drop(lock);
>> + continue;
>> + }
>
> this looks okay as a stop-gap, but isn't the actual problem that
>
> .using
>
> and
>
> .0.bad
>
> have the same length, so we end up taking a codepath using a weird "bad
> but not bad" filename instead of skipping those markers in phase3?

but we need to clean them up at some point, otherwise the following
might happen:
- chunk is in use by index file, phase 1 sets marker
- chunk is not present on s3 object store (bad chunk), therefore not
  seen in phase 2 and not replaced by regular marker file
- chunk is uploaded
- both index files are pruned
- chunk is never cleaned up because using marker file persists.

> in get_chunk_iterator, we skip all files that are not 64 bytes or
> 64+len(.0.bad) bytes long, but then set the "bad" flag based on the
> extension..

this could report whether the file is a using marker via an enum
variant instead of the boolean bad flag, so that the cases can be
clearly distinguished.
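
A rough sketch of what such an enum could look like follows. All names here
(ChunkFileKind, classify) are hypothetical and not part of pbs-datastore, and
the assumption that bad chunks carry a single-digit counter ("<digest>.0.bad")
is only taken from the length check mentioned above.

```rust
/// Hypothetical classification of a directory entry in the chunk store.
#[derive(Debug, PartialEq, Eq)]
enum ChunkFileKind {
    Chunk,
    Bad,
    UsingMarker,
}

const DIGEST_HEX_LEN: usize = 64;

/// Classify a filename; returns None for anything that is neither a
/// chunk, a bad chunk nor a using marker.
fn classify(name: &str) -> Option<ChunkFileKind> {
    let (stem, ext) = match name.split_once('.') {
        Some((stem, ext)) => (stem, Some(ext)),
        None => (name, None),
    };
    if stem.len() != DIGEST_HEX_LEN || !stem.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    match ext {
        None => Some(ChunkFileKind::Chunk),
        Some("using") => Some(ChunkFileKind::UsingMarker),
        Some(ext) => {
            // Assumed bad chunk naming: "<digit>.bad", e.g. "0.bad".
            let counter = ext.strip_suffix(".bad")?;
            if counter.len() == 1 && counter.bytes().all(|b| b.is_ascii_digit()) {
                Some(ChunkFileKind::Bad)
            } else {
                None
            }
        }
    }
}

fn main() {
    let digest = "0f".repeat(32);
    let names = [
        digest.clone(),
        format!("{digest}.0.bad"),
        format!("{digest}.using"),
        "lost+found".to_string(),
    ];
    for name in names {
        println!("{name:?} -> {:?}", classify(&name));
    }
}
```

With a return type like this, the sweep code could match on the variant
instead of inferring the kind from the filename length plus a boolean.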
>
> and then in cond_sweep_chunk in sweep_unused_chunk, we convert non-bad
> chunk filenames to digests which then fails for the "using" filenames,
> because they are too long (per the error from the forum thread).
>
>>
>> chunk_count += 1;
>>
>> @@ -776,7 +788,7 @@ impl ChunkStore {
>> /// Helper to generate marker file path for expected chunks
>> fn chunk_expected_marker_path(&self, digest: &[u8; 32]) -> PathBuf {
>> let (mut path, _digest_str) = self.chunk_path(digest);
>> - path.set_extension("using");
>> + path.set_extension(USING_MARKER_FILENAME_EXT);
>> path
>> }
>>
>> --
>> 2.47.3
* [pbs-devel] applied: [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
2025-11-25 14:00 [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup Christian Ebner
2025-11-25 14:18 ` Fabian Grünbichler
@ 2025-11-25 15:02 ` Thomas Lamprecht
1 sibling, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2025-11-25 15:02 UTC (permalink / raw)
To: pbs-devel, Christian Ebner
On Tue, 25 Nov 2025 15:00:13 +0100, Christian Ebner wrote:
> Since commit 9510ef1a ("GC: assure chunk exists on s3 store when
> creating missing chunk marker") chunks which are referenced by
> an index file but do not have a local marker file are marked by a
> file with the `using` extension, so they are not cleaned up during
> phase 2 if the chunk is still present on the backend.
>
> If the chunk is however not encountered, phase 3 will see the marker
> and tries to clean it up, which currently however fails because
> it is first tried to be cleaned up from the LRU cache, the filename
> being converted to the chunk digest.
>
> [...]
Applied, thanks!
[1/1] GC: chunk store: fix chunk using markers cleanup
commit: 57f9d167e9d274d2c3bb56fa5930dbdcbc060c9e
* Re: [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
2025-11-25 14:27 ` Christian Ebner
@ 2025-11-26 8:23 ` Fabian Grünbichler
0 siblings, 0 replies; 6+ messages in thread
From: Fabian Grünbichler @ 2025-11-26 8:23 UTC (permalink / raw)
To: Christian Ebner, Proxmox Backup Server development discussion
On November 25, 2025 3:27 pm, Christian Ebner wrote:
> On 11/25/25 3:19 PM, Fabian Grünbichler wrote:
>> On November 25, 2025 3:00 pm, Christian Ebner wrote:
>>> [...]
>>> + if filename
>>> + .to_bytes()
>>> + .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
>>> + {
>>> + unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>>> + format_err!("unlinking chunk using marker {filename:?} failed - {err}")
>>> + })?;
>>> + drop(lock);
>>> + continue;
>>> + }
>>
>> this looks okay as a stop-gap, but isn't the actual problem that
>>
>> .using
>>
>> and
>>
>> .0.bad
>>
>> have the same length, so we end up taking a codepath using a weird "bad
>> but not bad" filename instead of skipping those markers in phase3?
>
> but we need to clean them up at some point, otherwise the following
> might happen:
> - chunk is in use by index file, phase 1 sets marker
> - chunk is not present on s3 object store (bad chunk), therefore not
> seen in phase 2 and not replaced by regular marker file
> - chunk is uploaded
> - both index files are pruned
> - chunk is never cleaned up because using marker file persists.

yes, that's true. since their only purpose is to protect against cleanup
in phase 2, they don't need to outlive the GC run.

>> in get_chunk_iterator, we skip all files that are not 64 bytes or
>> 64+len(.0.bad) bytes long, but then set the "bad" flag based on the
>> extension..
>
> this might return the information if this was a using marker by some
> enum variant instead of the bad boolean flag, so that can be used to
> clearly distinguish these.
that seems cleaner, yes.