public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
Date: Mon, 17 Feb 2025 14:12:08 +0100	[thread overview]
Message-ID: <20250217131208.265219-3-c.ebner@proxmox.com> (raw)
In-Reply-To: <20250217131208.265219-1-c.ebner@proxmox.com>

Check if the filesystem the chunk store is located on actually
updates the atime when performing the marking of the chunks in
phase 1 of the garbage collection. Since it is not enough to check if
a single/first chunks atime is updated, since the filesystem can be
mounted via the `relatime` option, find the first chunk which is'
outside the relatime's 24 hour cutoff window and check the update on
that chunk only.

Allow to disable the atime update checks via the optional datastores
tuning parameter, but enable them by default.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 50 +++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 75c0c16ab..9aa509e27 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -4,6 +4,7 @@ use std::os::unix::ffi::OsStrExt;
 use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, LazyLock, Mutex};
+use std::time::UNIX_EPOCH;
 
 use anyhow::{bail, format_err, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
@@ -1029,13 +1030,15 @@ impl DataStore {
         Ok(list)
     }
 
-    // mark chunks  used by ``index`` as used
+    // mark chunks  used by ``index`` as used,
+    // fail if the chunks atime is not updated as expected by the filesystem.
     fn index_mark_used_chunks<I: IndexFile>(
         &self,
         index: I,
         file_name: &Path, // only used for error reporting
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        min_atime: &mut Option<i64>,
     ) -> Result<(), Error> {
         status.index_file_count += 1;
         status.index_data_bytes += index.index_bytes();
@@ -1044,6 +1047,20 @@ impl DataStore {
             worker.check_abort()?;
             worker.fail_on_shutdown()?;
             let digest = index.index_digest(pos).unwrap();
+
+            // Check if the chunk store actually honors `atime` updates by checking the first chunk
+            // outside of the cutoff range. If atime is not honored garbage collection must fail,
+            // as otherwise chunks still in use can be lost.
+            let mut old_atime = None;
+            if let Some(min_atime) = min_atime {
+                let metadata = self.stat_chunk(digest)?;
+                let atime = metadata.accessed()?;
+                let atime_epoch: i64 = atime.duration_since(UNIX_EPOCH)?.as_secs().try_into()?;
+                if atime_epoch < *min_atime {
+                    old_atime = Some(atime)
+                }
+            }
+
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
                 let hex = hex::encode(digest);
                 warn!(
@@ -1061,6 +1078,19 @@ impl DataStore {
                     self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
                 }
             }
+
+            // `atime` was outside range for this chunk, check that it has been updated.
+            if let Some(old_atime) = old_atime {
+                let metadata = self.stat_chunk(digest)?;
+                let atime = metadata.accessed()?;
+                if old_atime < atime {
+                    // atime was updated, do not check further chunks
+                    *min_atime = None;
+                } else {
+                    let hex = hex::encode(digest);
+                    bail!("atime not updated for chunk {hex}, is atime support enabled?");
+                }
+            }
         }
         Ok(())
     }
@@ -1069,6 +1099,7 @@ impl DataStore {
         &self,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        min_atime: &mut Option<i64>,
     ) -> Result<(), Error> {
         let image_list = self.list_images()?;
         let image_count = image_list.len();
@@ -1097,12 +1128,12 @@ impl DataStore {
                             let index = FixedIndexReader::new(file).map_err(|e| {
                                 format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
                             })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
+                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
                         } else if archive_type == ArchiveType::DynamicIndex {
                             let index = DynamicIndexReader::new(file).map_err(|e| {
                                 format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
                             })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
+                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
                         }
                     }
                 }
@@ -1170,10 +1201,21 @@ impl DataStore {
                 upid: Some(upid.to_string()),
                 ..Default::default()
             };
+            let tuning: DatastoreTuning = serde_json::from_value(
+                DatastoreTuning::API_SCHEMA
+                    .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
+            )?;
+            let mut atime_check_ref = if tuning.check_gc_atime.unwrap_or(true) {
+                // Cutoff atime including a 24h grace period for filesystems mounted with
+                // `relatime` option.
+                Some(phase1_start_time - 3600 * 24)
+            } else {
+                None
+            };
 
             info!("Start GC phase1 (mark used chunks)");
 
-            self.mark_used_chunks(&mut gc_status, worker)?;
+            self.mark_used_chunks(&mut gc_status, worker, &mut atime_check_ref)?;
 
             info!("Start GC phase2 (sweep unused chunks)");
             self.inner.chunk_store.sweep_unused_chunks(
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


  parent reply	other threads:[~2025-02-17 13:12 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-17 13:12 [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag Christian Ebner
2025-02-17 13:12 ` Christian Ebner [this message]
2025-02-17 15:36   ` [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored Fabian Grünbichler
2025-02-17 15:57     ` Christian Ebner
2025-02-18 11:53     ` Thomas Lamprecht
2025-02-18 12:39       ` Christian Ebner
2025-02-18 13:17         ` Christian Ebner
2025-02-18 13:31         ` Thomas Lamprecht
2025-02-18 13:38           ` Christian Ebner
2025-02-19 16:54 ` [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250217131208.265219-3-c.ebner@proxmox.com \
    --to=c.ebner@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal