public inbox for pbs-devel@lists.proxmox.com
From: Christian Ebner <c.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v5 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored
Date: Wed, 19 Mar 2025 10:22:58 +0100
Message-ID: <9ab3a041-6abb-405f-b232-f687802e81ce@proxmox.com>
In-Reply-To: <07f3fa30-7892-419b-b5ae-47e9de57ab9b@proxmox.com>

On 3/19/25 09:45, Thomas Lamprecht wrote:
> On 06.03.25 at 15:52, Christian Ebner wrote:
>> Check if the filesystem backing the chunk store actually updates the
>> atime to avoid potential data loss in phase 2 of garbage collection,
>> in case the atime update is not honored.
>>
>> Perform the check before phase 1 of garbage collection, as well as
>> on datastore creation. The latter detects early, and disallows,
>> datastore creation on filesystem configurations which would otherwise
>> most likely lead to data loss.
>>
>> Enable the atime update check by default, but allow opting out by
>> setting a datastore tuning parameter flag for backwards compatibility.
>> This is honored by both garbage collection and datastore creation.
>>
>> The check uses a 4 MiB fixed-size, unencrypted and compressed chunk
>> as test marker, inserted if not present. This all-zero chunk is very
>> likely present anyway for unencrypted backup contents with large
>> all-zero regions using fixed-size chunking (e.g. VMs).
>>
>> To avoid cases where the timestamp will not be updated because of the
>> Linux kernel's timestamp granularity, sleep in-between stating and
>> utimensat for 1 second.
>>
>> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 4:
>> - Improve logging of values if atime update fails.
>> - fix incorrect comment
>>
>>   pbs-datastore/src/chunk_store.rs | 101 ++++++++++++++++++++++++++++---
>>   pbs-datastore/src/datastore.rs   |  13 ++++
>>   src/api2/config/datastore.rs     |   1 +
>>   3 files changed, 108 insertions(+), 7 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index 5e02909a1..a8c826353 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -1,9 +1,11 @@
>> +use std::os::unix::fs::MetadataExt;
>>   use std::os::unix::io::AsRawFd;
>>   use std::path::{Path, PathBuf};
>>   use std::sync::{Arc, Mutex};
>> +use std::time::{Duration, UNIX_EPOCH};
>>   
>> -use anyhow::{bail, format_err, Error};
>> -use tracing::info;
>> +use anyhow::{bail, format_err, Context, Error};
>> +use tracing::{info, warn};
>>   
>>   use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
>>   use proxmox_io::ReadExt;
>> @@ -13,6 +15,7 @@ use proxmox_sys::process_locker::{
>>   };
>>   use proxmox_worker_task::WorkerTaskContext;
>>   
>> +use crate::data_blob::DataChunkBuilder;
>>   use crate::file_formats::{
>>       COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
>>   };
>> @@ -93,6 +96,7 @@ impl ChunkStore {
>>           uid: nix::unistd::Uid,
>>           gid: nix::unistd::Gid,
>>           sync_level: DatastoreFSyncLevel,
>> +        atime_safety_check: bool,
> 
> Not a huge fan of another flat parameter here, albeit it's a bit borderline.
> I do not want to add scope creep, but if we really want this here, then moving
> those options into a struct might be nicer, could be done afterward too though.
> 
> But stepping a bit back, why is this a parameter for the fn creating a new
> chunkstore in the first place? Why cannot the call sites that are interested
> in doing this call check_fs_atime_updates themselves on the chunkstore instance
> after getting it instead of having to loop through a parameter?
> 
> We do not have many places that create datastores, so I do not think there's a
> high likelihood that we forget to call it.

Yes, I can move this to the call sites instead. The intention of having
it here was that it will always be considered when creating a chunk
store. But doing this in the datastore creation only might even be
better, as currently this is only checked on chunk store creation, not
when reusing a pre-existing datastore. It might however make sense to
run the check there too.
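
A rough sketch of the two suggestions above (options struct, check at the
call site), under stated assumptions: `ChunkStoreCreateOptions`,
`create_datastore` and the simplified `ChunkStore` are hypothetical
stand-ins, not the actual pbs-datastore API, and the check body is a
placeholder for the real atime probe.

```rust
use std::path::PathBuf;

/// Hypothetical options struct replacing the flat parameter list.
/// Field names and types are illustrative only.
#[derive(Debug, Clone)]
pub struct ChunkStoreCreateOptions {
    pub base: PathBuf,
    pub uid: u32,
    pub gid: u32,
    /// Stand-in for the real `DatastoreFSyncLevel` enum.
    pub fsync: bool,
}

/// Simplified stand-in for the real chunk store type.
pub struct ChunkStore {
    base: PathBuf,
}

impl ChunkStore {
    /// Creation no longer knows anything about the atime check.
    pub fn create(opts: &ChunkStoreCreateOptions) -> Result<Self, String> {
        Ok(ChunkStore { base: opts.base.clone() })
    }

    /// Placeholder check: only verifies that the base directory exists.
    /// The real check would touch and stat a marker chunk.
    pub fn check_fs_atime_updates(&self) -> Result<(), String> {
        if self.base.is_dir() {
            Ok(())
        } else {
            Err(format!("{:?} is not a directory", self.base))
        }
    }
}

/// The call site decides whether to run the check, based on the tuning flag.
pub fn create_datastore(
    opts: &ChunkStoreCreateOptions,
    atime_check: bool,
) -> Result<ChunkStore, String> {
    let store = ChunkStore::create(opts)?;
    if atime_check {
        store
            .check_fs_atime_updates()
            .map_err(|err| format!("access time update check failed - {err}"))?;
    }
    Ok(store)
}
```

With this shape, a code path that reuses a pre-existing datastore could
call `check_fs_atime_updates` on the opened instance in the same way.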

> 
>>       ) -> Result<Self, Error>
>>       where
>>           P: Into<PathBuf>,
>> @@ -147,7 +151,17 @@ impl ChunkStore {
>>               }
>>           }
>>   
>> -        Self::open(name, base, sync_level)
>> +        let chunk_store = Self::open(name, base, sync_level)?;
>> +        if atime_safety_check {
>> +            chunk_store
>> +                .check_fs_atime_updates(true)
>> +                .map_err(|err| format_err!("access time safety check failed - {err:#}"))?;
>> +            info!("Access time safety check successful.");
>> +        } else {
>> +            info!("Access time safety check skipped.");
>> +        }
>> +
>> +        Ok(chunk_store)
>>       }
>>   
>>       fn lockfile_path<P: Into<PathBuf>>(base: P) -> PathBuf {
>> @@ -442,6 +456,66 @@ impl ChunkStore {
>>           Ok(())
>>       }
>>   
>> +    /// Check if atime updates are honored by the filesystem backing the chunk store.
>> +    ///
>> +    /// Checks if the atime is always updated by utimensat taking into consideration the Linux
>> +    /// kernel timestamp granularity.
>> +    /// If `retry_on_file_changed` is set to true, the check is performed again on the changed file
>> +    /// if a file change while testing is detected by differences in birth time or inode number.
>> +    /// Uses a 4 MiB fixed size, compressed but unencrypted chunk to test. The chunk is inserted in
>> +    /// the chunk store if not yet present.
>> +    /// Returns with error if the check could not be performed.
>> +    pub fn check_fs_atime_updates(&self, retry_on_file_changed: bool) -> Result<(), Error> {
>> +        let (zero_chunk, digest) = DataChunkBuilder::build_zero_chunk(None, 4096 * 1024, true)?;
>> +        let (pre_existing, _) = self.insert_chunk(&zero_chunk, &digest)?;
>> +        let (path, _digest) = self.chunk_path(&digest);
>> +
>> +        // Take into account timestamp update granularity in the kernel
>> +        std::thread::sleep(Duration::from_secs(1));
> 
> If there's the need for a v6 I'd maybe add to the comment that this normally
> runs in a worker, so sleep blocking the whole thread is fine here.

Acked, will be included.

> Also, you write "sleep in-between stating and utimensat for 1 second", but
> this is sleeping before the stat + touch + stat calls, or am I overlooking
> something?

Yes, the wording is not correct anymore after iterating on the patch
series, as the order of execution changed a bit.
The first atime update happens in insert_chunk, then we read the
metadata after sleeping for 1 second. This is to reduce the chance of
another atime update happening from somewhere else in the meantime
(suggested by Fabian in his previous review of the series).
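
The order of operations described above can be sketched roughly as
follows. This is not the patch code: it assumes Rust 1.75 or newer and
uses `File::set_times` from std with an explicit timestamp in place of
the chunk store's `cond_touch_path` (utimensat), and the function names
are made up for illustration.

```rust
use std::fs::{self, File, FileTimes};
use std::io;
use std::path::Path;
use std::thread;
use std::time::{Duration, SystemTime};

/// Returns Ok(true) if an explicit atime update on `path` became visible
/// in the file metadata, Ok(false) if the atime did not advance.
pub fn atime_update_is_honored(path: &Path, settle: Duration) -> io::Result<bool> {
    // 1. First atime update (in the patch this happens inside insert_chunk).
    touch_atime(path)?;

    // 2. Wait, so that the second update yields a strictly later timestamp.
    thread::sleep(settle);

    // 3. Read the atime before the second update.
    let before = fs::metadata(path)?.accessed()?;

    // 4. Second atime update.
    touch_atime(path)?;

    // 5. Read the atime again and compare.
    let after = fs::metadata(path)?.accessed()?;
    Ok(after > before)
}

/// Set the access time of `path` to the current time, leaving mtime alone.
fn touch_atime(path: &Path) -> io::Result<()> {
    let file = File::options().write(true).open(path)?;
    file.set_times(FileTimes::new().set_accessed(SystemTime::now()))
}
```

Calling `atime_update_is_honored(path, Duration::from_secs(1))` mirrors
the 1 second sleep used in the patch.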

> 
>> +        let metadata_before =
>> +            std::fs::metadata(&path).context(format!("failed to get metadata for {path:?}"))?;
>> +
>> +        // Second atime update if chunk pre-existed, insert_chunk already updates pre-existing ones
>> +        self.cond_touch_path(&path, true)?;
>> +
>> +        let metadata_now =
>> +            std::fs::metadata(&path).context(format!("failed to get metadata for {path:?}"))?;
> 
> Another tiny nit: maybe encode now/after in the context to make it easier
> to distinguish which stat syscall failed – while it's very unlikely that
> one succeeds and the other doesn't, it would not hurt to be able to tell
> that case apart.

Acked, will incorporate that.
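
One possible shape for this, as a dependency-free sketch: the patch uses
anyhow's `Context`, while this version wraps the `std::io::Error`
directly. The helper name `stat_with_stage` and the message wording are
assumptions for illustration.

```rust
use std::fs::{self, Metadata};
use std::io;
use std::path::Path;

/// Stat `path` and tag any error with the stage of the check
/// (for example "before" or "after") in which it occurred.
pub fn stat_with_stage(path: &Path, stage: &str) -> io::Result<Metadata> {
    fs::metadata(path).map_err(|err| {
        // Keep the original error kind, extend only the message.
        io::Error::new(
            err.kind(),
            format!("failed to get metadata for {path:?} {stage} atime update: {err}"),
        )
    })
}
```

The two stat calls would then become `stat_with_stage(&path, "before")`
and `stat_with_stage(&path, "after")`, so a failure in either one is
distinguishable from the message alone.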

> 
>> +
>> +        // Check for the unlikely case that the file changed in-between the
>> +        // two metadata calls, try to check once again on changed file
>> +        if metadata_before.ino() != metadata_now.ino() {
>> +            if retry_on_file_changed {
>> +                return self.check_fs_atime_updates(false);
>> +            }
>> +            bail!("chunk {path:?} changed twice during access time safety check, cannot proceed.");
>> +        }
>> +
>> +        if metadata_before.accessed()? >= metadata_now.accessed()? {
>> +            let chunk_info_str = if pre_existing {
>> +                "pre-existing"
>> +            } else {
>> +                "newly inserted"
>> +            };
>> +            warn!("Chunk metadata was not correctly updated during access time safety check:");
>> +            info!(
>> +                "Timestamps before update: accessed {:?}, modified {:?}, created {:?}",
>> +                metadata_before.accessed().unwrap_or(UNIX_EPOCH),
>> +                metadata_before.modified().unwrap_or(UNIX_EPOCH),
>> +                metadata_before.created().unwrap_or(UNIX_EPOCH),
>> +            );
>> +            info!(
>> +                "Timestamps after update: accessed {:?}, modified {:?}, created {:?}",
>> +                metadata_now.accessed().unwrap_or(UNIX_EPOCH),
>> +                metadata_now.modified().unwrap_or(UNIX_EPOCH),
>> +                metadata_now.created().unwrap_or(UNIX_EPOCH),
>> +            );
>> +            bail!("access time safety check using {chunk_info_str} chunk failed, aborting GC!");
>> +        }
>> +
>> +        Ok(())
>> +    }
>> +
>>       pub fn insert_chunk(&self, chunk: &DataBlob, digest: &[u8; 32]) -> Result<(bool, u64), Error> {
>>           // unwrap: only `None` in unit tests
>>           assert!(self.locker.is_some());
> 
> 
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 75c0c16ab..5558bb1ac 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1170,6 +1170,19 @@ impl DataStore {
>>                   upid: Some(upid.to_string()),
>>                   ..Default::default()
>>               };
>> +            let tuning: DatastoreTuning = serde_json::from_value(
>> +                DatastoreTuning::API_SCHEMA
>> +                    .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
>> +            )?;
>> +            if tuning.gc_atime_safety_check.unwrap_or(true) {
>> +                self.inner
>> +                    .chunk_store
>> +                    .check_fs_atime_updates(true)
>> +                    .map_err(|err| format_err!("atime safety check failed - {err:#}"))?;
>> +                info!("Access time safety check successful, proceeding with GC.");
>> +            } else {
>> +                info!("Filesystem atime safety check disabled by datastore tuning options.");
> 
> Sounds IMO scarier than what this is, i.e. like the whole GC would not
> be safe anymore, or would not use atime safety checks anymore.
> 
> I'd rather either drop that or reword it to something else, maybe just doing
> s/safety/update/ would be enough.

Agreed, will replace as suggested.

>> +            }
>>   
>>               info!("Start GC phase1 (mark used chunks)");
>>   
>> diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
>> index fe3260f6d..35847fc45 100644
>> --- a/src/api2/config/datastore.rs
>> +++ b/src/api2/config/datastore.rs
>> @@ -119,6 +119,7 @@ pub(crate) fn do_create_datastore(
>>                   backup_user.uid,
>>                   backup_user.gid,
>>                   tuning.sync_level.unwrap_or_default(),
>> +                tuning.gc_atime_safety_check.unwrap_or(true),
>>               )
>>               .map(|_| ())
>>           } else {



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

Thread overview: 20+ messages
2025-03-06 14:52 [pbs-devel] [PATCH v5 proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox 1/8] pbs api types: add garbage collection atime safety check flag Christian Ebner
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox 2/8] pbs api types: add option to set GC chunk cleanup atime cutoff Christian Ebner
2025-03-19  8:10   ` Thomas Lamprecht
2025-03-19  8:48     ` Christian Ebner
2025-03-19  9:01       ` Thomas Lamprecht
2025-03-19  9:08         ` Christian Ebner
2025-03-19  9:13           ` Thomas Lamprecht
2025-03-19  9:25             ` Christian Ebner
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored Christian Ebner
2025-03-19  8:45   ` Thomas Lamprecht
2025-03-19  9:22     ` Christian Ebner [this message]
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 4/8] ui: expose GC atime safety check flag in datastore tuning options Christian Ebner
2025-03-19  8:45   ` Thomas Lamprecht
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 5/8] docs: mention GC atime update check for " Christian Ebner
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 6/8] datastore: use custom GC atime cutoff if set Christian Ebner
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 7/8] ui: expose GC atime cutoff in datastore tuning option Christian Ebner
2025-03-19  8:49   ` Thomas Lamprecht
2025-03-06 14:52 ` [pbs-devel] [PATCH v5 proxmox-backup 8/8] docs: mention gc-atime-cutoff as " Christian Ebner
2025-03-19 17:26 ` [pbs-devel] [PATCH v5 proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
