public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored
@ 2025-02-17 13:12 Christian Ebner
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag Christian Ebner
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-17 13:12 UTC (permalink / raw)
  To: pbs-devel

These patches add a check to phase 1 of garbage collection in order
to detect when the filesystem backing the chunk store does not honor
atime updates. This avoids possible data loss in situations where
garbage collection could otherwise delete chunks still referenced by
a backup snapshot's index file.

To reduce overhead, and since filesystems mounted with relatime are
not guaranteed to update the atime on every access, only the first
chunk whose atime lies outside the grace period is checked; the check
is skipped for all subsequent chunks. Since finding this first chunk
also has an overhead (stat-ing chunks before the atime update), the
check can be disabled altogether by setting a datastore tuning
parameter flag.

Link to the issue in the bugtracker:
https://bugzilla.proxmox.com/show_bug.cgi?id=5982

proxmox:

Christian Ebner (1):
  pbs api types: Add check garbage collection atime updates flag

 pbs-api-types/src/datastore.rs | 8 ++++++++
 1 file changed, 8 insertions(+)

proxmox-backup:

Christian Ebner (1):
  fix #5982: garbage collection: check atime updates are honored

 pbs-datastore/src/datastore.rs | 50 +++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 4 deletions(-)

-- 
2.39.5




* [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag
  2025-02-17 13:12 [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
@ 2025-02-17 13:12 ` Christian Ebner
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored Christian Ebner
  2025-02-19 16:54 ` [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
  2 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-17 13:12 UTC (permalink / raw)
  To: pbs-devel

Add the `check-gc-atime` flag to the datastore tuning parameters.
This flag allows enabling/disabling a check that detects whether the
filesystem backing the chunk store honors atime updates during phase
1 of garbage collection. The check is performed by default.
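
For example, assuming the usual property-string syntax for datastore
tuning options, the check could then be disabled with something like
`proxmox-backup-manager datastore update <name> --tuning
'check-gc-atime=false'` (invocation shown for illustration only).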

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-api-types/src/datastore.rs | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index ddd8d3c6..7b2a59cc 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -229,6 +229,12 @@ pub enum DatastoreFSyncLevel {
             type: ChunkOrder,
             optional: true,
         },
+        "check-gc-atime": {
+            description: "If enabled, check that the filesystem updates the atime during phase 1 of garbage collection",
+            optional: true,
+            default: true,
+            type: bool,
+        },
     },
 )]
 #[derive(Serialize, Deserialize, Default)]
@@ -240,6 +246,8 @@ pub struct DatastoreTuning {
     pub chunk_order: Option<ChunkOrder>,
     #[serde(skip_serializing_if = "Option::is_none")]
     pub sync_level: Option<DatastoreFSyncLevel>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub check_gc_atime: Option<bool>,
 }
 
 pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
-- 
2.39.5




* [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-17 13:12 [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag Christian Ebner
@ 2025-02-17 13:12 ` Christian Ebner
  2025-02-17 15:36   ` Fabian Grünbichler
  2025-02-19 16:54 ` [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
  2 siblings, 1 reply; 11+ messages in thread
From: Christian Ebner @ 2025-02-17 13:12 UTC (permalink / raw)
  To: pbs-devel

Check whether the filesystem the chunk store is located on actually
updates the atime when marking chunks in phase 1 of garbage
collection. Checking a single/first chunk's atime is not enough,
since the filesystem can be mounted with the `relatime` option, so
find the first chunk whose atime is outside relatime's 24 hour
cutoff window and check the update on that chunk only.

Allow disabling the atime update check via the optional datastore
tuning parameter, but enable it by default.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 50 +++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 75c0c16ab..9aa509e27 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -4,6 +4,7 @@ use std::os::unix::ffi::OsStrExt;
 use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, LazyLock, Mutex};
+use std::time::UNIX_EPOCH;
 
 use anyhow::{bail, format_err, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
@@ -1029,13 +1030,15 @@ impl DataStore {
         Ok(list)
     }
 
-    // mark chunks  used by ``index`` as used
+    // mark chunks used by ``index`` as used,
+    // fail if a chunk's atime is not updated as expected by the filesystem.
     fn index_mark_used_chunks<I: IndexFile>(
         &self,
         index: I,
         file_name: &Path, // only used for error reporting
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        min_atime: &mut Option<i64>,
     ) -> Result<(), Error> {
         status.index_file_count += 1;
         status.index_data_bytes += index.index_bytes();
@@ -1044,6 +1047,20 @@ impl DataStore {
             worker.check_abort()?;
             worker.fail_on_shutdown()?;
             let digest = index.index_digest(pos).unwrap();
+
+            // Check if the chunk store actually honors `atime` updates by checking the first chunk
+            // outside of the cutoff range. If atime is not honored, garbage collection must fail,
+            // as otherwise chunks still in use can be lost.
+            let mut old_atime = None;
+            if let Some(min_atime) = min_atime {
+                let metadata = self.stat_chunk(digest)?;
+                let atime = metadata.accessed()?;
+                let atime_epoch: i64 = atime.duration_since(UNIX_EPOCH)?.as_secs().try_into()?;
+                if atime_epoch < *min_atime {
+                    old_atime = Some(atime)
+                }
+            }
+
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
                 let hex = hex::encode(digest);
                 warn!(
@@ -1061,6 +1078,19 @@ impl DataStore {
                     self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
                 }
             }
+
+            // `atime` was outside range for this chunk, check that it has been updated.
+            if let Some(old_atime) = old_atime {
+                let metadata = self.stat_chunk(digest)?;
+                let atime = metadata.accessed()?;
+                if old_atime < atime {
+                    // atime was updated, do not check further chunks
+                    *min_atime = None;
+                } else {
+                    let hex = hex::encode(digest);
+                    bail!("atime not updated for chunk {hex}, is atime support enabled?");
+                }
+            }
         }
         Ok(())
     }
@@ -1069,6 +1099,7 @@ impl DataStore {
         &self,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        min_atime: &mut Option<i64>,
     ) -> Result<(), Error> {
         let image_list = self.list_images()?;
         let image_count = image_list.len();
@@ -1097,12 +1128,12 @@ impl DataStore {
                             let index = FixedIndexReader::new(file).map_err(|e| {
                                 format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
                             })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
+                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
                         } else if archive_type == ArchiveType::DynamicIndex {
                             let index = DynamicIndexReader::new(file).map_err(|e| {
                                 format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
                             })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
+                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
                         }
                     }
                 }
@@ -1170,10 +1201,21 @@ impl DataStore {
                 upid: Some(upid.to_string()),
                 ..Default::default()
             };
+            let tuning: DatastoreTuning = serde_json::from_value(
+                DatastoreTuning::API_SCHEMA
+                    .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
+            )?;
+            let mut atime_check_ref = if tuning.check_gc_atime.unwrap_or(true) {
+                // Cutoff atime including a 24h grace period for filesystems mounted with
+                // `relatime` option.
+                Some(phase1_start_time - 3600 * 24)
+            } else {
+                None
+            };
 
             info!("Start GC phase1 (mark used chunks)");
 
-            self.mark_used_chunks(&mut gc_status, worker)?;
+            self.mark_used_chunks(&mut gc_status, worker, &mut atime_check_ref)?;
 
             info!("Start GC phase2 (sweep unused chunks)");
             self.inner.chunk_store.sweep_unused_chunks(
-- 
2.39.5




* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored Christian Ebner
@ 2025-02-17 15:36   ` Fabian Grünbichler
  2025-02-17 15:57     ` Christian Ebner
  2025-02-18 11:53     ` Thomas Lamprecht
  0 siblings, 2 replies; 11+ messages in thread
From: Fabian Grünbichler @ 2025-02-17 15:36 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On February 17, 2025 2:12 pm, Christian Ebner wrote:
> Check whether the filesystem the chunk store is located on actually
> updates the atime when marking chunks in phase 1 of garbage
> collection. Checking a single/first chunk's atime is not enough,
> since the filesystem can be mounted with the `relatime` option, so
> find the first chunk whose atime is outside relatime's 24 hour
> cutoff window and check the update on that chunk only.

given that our touching should punch through relatime (and does so on
all filesystems we tested so far), couldn't we just

- stat the first chunk
- touch the first chunk
- check if timestamps have been updated
- print a warning about the filesystem being potentially broken, and
- if the option is enabled, suggest the user report the details to us
- only continue if the option is explicitly disabled
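
a rough sketch of that sequence, reusing the stat_chunk /
cond_touch_chunk helpers from the patch (exact names and signatures
assumed, for illustration only):

    use anyhow::{bail, Error};

    // stat the chunk, touch it via an explicit timestamp update,
    // stat again and fail if the atime did not move forward
    fn gc_atime_self_test(store: &DataStore, digest: &[u8; 32]) -> Result<(), Error> {
        let old_atime = store.stat_chunk(digest)?.accessed()?;
        store.cond_touch_chunk(digest, true)?; // utimensat under the hood
        let new_atime = store.stat_chunk(digest)?.accessed()?;
        if new_atime <= old_atime {
            bail!("atime update not honored, filesystem may be broken");
        }
        Ok(())
    }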

that way we should get a real world survey of broken file systems that
could inform our decision to drop the 24h window (or keep it).. if we
introduce an option (defaulting to yes for now) conditionalizing the 24h
window, we could even tell users with semi-broken storages (see below)
to explicitly set that option in case we later switch the default,
although I am not sure whether such storages exist for real.

the only downside would be a potential slew of reports if we missed some
prominent filesystem that applies relatime to explicit timestamp updates
(any prominent storage ignoring explicit timestamp updates altogether
would have already cropped up in our support channels after causing
fatal data loss, and we only had a handful such reports so far, usually
involving proprietary storage appliances).

> 
> Allow disabling the atime update check via the optional datastore
> tuning parameter, but enable it by default.
> 
> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 50 +++++++++++++++++++++++++++++++---
>  1 file changed, 46 insertions(+), 4 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 75c0c16ab..9aa509e27 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -4,6 +4,7 @@ use std::os::unix::ffi::OsStrExt;
>  use std::os::unix::io::AsRawFd;
>  use std::path::{Path, PathBuf};
>  use std::sync::{Arc, LazyLock, Mutex};
> +use std::time::UNIX_EPOCH;
>  
>  use anyhow::{bail, format_err, Error};
>  use nix::unistd::{unlinkat, UnlinkatFlags};
> @@ -1029,13 +1030,15 @@ impl DataStore {
>          Ok(list)
>      }
>  
> -    // mark chunks  used by ``index`` as used
> +    // mark chunks used by ``index`` as used,
> +    // fail if a chunk's atime is not updated as expected by the filesystem.
>      fn index_mark_used_chunks<I: IndexFile>(
>          &self,
>          index: I,
>          file_name: &Path, // only used for error reporting
>          status: &mut GarbageCollectionStatus,
>          worker: &dyn WorkerTaskContext,
> +        min_atime: &mut Option<i64>,
>      ) -> Result<(), Error> {
>          status.index_file_count += 1;
>          status.index_data_bytes += index.index_bytes();
> @@ -1044,6 +1047,20 @@ impl DataStore {
>              worker.check_abort()?;
>              worker.fail_on_shutdown()?;
>              let digest = index.index_digest(pos).unwrap();
> +
> +            // Check if the chunk store actually honors `atime` updates by checking the first chunk
> +            // outside of the cutoff range. If atime is not honored, garbage collection must fail,
> +            // as otherwise chunks still in use can be lost.
> +            let mut old_atime = None;
> +            if let Some(min_atime) = min_atime {
> +                let metadata = self.stat_chunk(digest)?;
> +                let atime = metadata.accessed()?;
> +                let atime_epoch: i64 = atime.duration_since(UNIX_EPOCH)?.as_secs().try_into()?;
> +                if atime_epoch < *min_atime {
> +                    old_atime = Some(atime)
> +                }
> +            }
> +
>              if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>                  let hex = hex::encode(digest);
>                  warn!(
> @@ -1061,6 +1078,19 @@ impl DataStore {
>                      self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
>                  }
>              }
> +
> +            // `atime` was outside range for this chunk, check that it has been updated.
> +            if let Some(old_atime) = old_atime {
> +                let metadata = self.stat_chunk(digest)?;
> +                let atime = metadata.accessed()?;
> +                if old_atime < atime {
> +                    // atime was updated, do not check further chunks
> +                    *min_atime = None;
> +                } else {
> +                    let hex = hex::encode(digest);
> +                    bail!("atime not updated for chunk {hex}, is atime support enabled?");
> +                }
> +            }
>          }
>          Ok(())
>      }
> @@ -1069,6 +1099,7 @@ impl DataStore {
>          &self,
>          status: &mut GarbageCollectionStatus,
>          worker: &dyn WorkerTaskContext,
> +        min_atime: &mut Option<i64>,
>      ) -> Result<(), Error> {
>          let image_list = self.list_images()?;
>          let image_count = image_list.len();
> @@ -1097,12 +1128,12 @@ impl DataStore {
>                              let index = FixedIndexReader::new(file).map_err(|e| {
>                                  format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
>                              })?;
> -                            self.index_mark_used_chunks(index, &img, status, worker)?;
> +                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
>                          } else if archive_type == ArchiveType::DynamicIndex {
>                              let index = DynamicIndexReader::new(file).map_err(|e| {
>                                  format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
>                              })?;
> -                            self.index_mark_used_chunks(index, &img, status, worker)?;
> +                            self.index_mark_used_chunks(index, &img, status, worker, min_atime)?;
>                          }
>                      }
>                  }
> @@ -1170,10 +1201,21 @@ impl DataStore {
>                  upid: Some(upid.to_string()),
>                  ..Default::default()
>              };
> +            let tuning: DatastoreTuning = serde_json::from_value(
> +                DatastoreTuning::API_SCHEMA
> +                    .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
> +            )?;
> +            let mut atime_check_ref = if tuning.check_gc_atime.unwrap_or(true) {
> +                // Cutoff atime including a 24h grace period for filesystems mounted with
> +                // `relatime` option.
> +                Some(phase1_start_time - 3600 * 24)
> +            } else {
> +                None
> +            };
>  
>              info!("Start GC phase1 (mark used chunks)");
>  
> -            self.mark_used_chunks(&mut gc_status, worker)?;
> +            self.mark_used_chunks(&mut gc_status, worker, &mut atime_check_ref)?;
>  
>              info!("Start GC phase2 (sweep unused chunks)");
>              self.inner.chunk_store.sweep_unused_chunks(
> -- 
> 2.39.5



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-17 15:36   ` Fabian Grünbichler
@ 2025-02-17 15:57     ` Christian Ebner
  2025-02-18 11:53     ` Thomas Lamprecht
  1 sibling, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-17 15:57 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 2/17/25 16:36, Fabian Grünbichler wrote:
> On February 17, 2025 2:12 pm, Christian Ebner wrote:
>> Check whether the filesystem the chunk store is located on actually
>> updates the atime when marking chunks in phase 1 of garbage
>> collection. Checking a single/first chunk's atime is not enough,
>> since the filesystem can be mounted with the `relatime` option, so
>> find the first chunk whose atime is outside relatime's 24 hour
>> cutoff window and check the update on that chunk only.
> 
> given that our touching should punch through relatime (and does so on
> all filesystems we tested so far), couldn't we just
> 
> - stat the first chunk
> - touch the first chunk
> - check if timestamps have been updated
> - print a warning about the filesystem being potentially broken, and
> - if the option is enabled, suggest the user report the details to us
> - only continue if the option is explicitly disabled
> 
> that way we should get a real world survey of broken file systems that
> could inform our decision to drop the 24h window (or keep it).. if we
> introduce an option (defaulting to yes for now) conditionalizing the 24h
> window, we could even tell users with semi-broken storages (see below)
> to explicitly set that option in case we later switch the default,
> although I am not sure whether such storages exist for real.

Hmm, that is a good idea!

> the only downside would be a potential slew of reports if we missed some
> prominent filesystem that applies relatime to explicit timestamp updates
> (any prominent storage ignoring explicit timestamp updates altogether
> would have already cropped up in our support channels after causing
> fatal data loss, and we only had a handful such reports so far, usually
> involving proprietary storage appliances).

Of course not ideal, but not too big of an issue if this check is
implemented as opt-out. Giving the user a way to disable such a check
and generating data to potentially drop the 24h grace period
altogether in the future sounds good to me!

Will adapt the patches accordingly in version 2, thanks!



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-17 15:36   ` Fabian Grünbichler
  2025-02-17 15:57     ` Christian Ebner
@ 2025-02-18 11:53     ` Thomas Lamprecht
  2025-02-18 12:39       ` Christian Ebner
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Lamprecht @ 2025-02-18 11:53 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

Am 17.02.25 um 16:36 schrieb Fabian Grünbichler:
> On February 17, 2025 2:12 pm, Christian Ebner wrote:
>> Check whether the filesystem the chunk store is located on actually
>> updates the atime when marking chunks in phase 1 of garbage
>> collection. Checking a single/first chunk's atime is not enough,
>> since the filesystem can be mounted with the `relatime` option, so
>> find the first chunk whose atime is outside relatime's 24 hour
>> cutoff window and check the update on that chunk only.
> 
> given that our touching should punch through relatime (and does so on
> all filesystems we tested so far), couldn't we just
> 
> - stat the first chunk
> - touch the first chunk
> - check if timestamps have been updated
> - print a warning about the filesystem being potentially broken, and
> - if the option is enabled, suggest the user report the details to us
> - only continue if the option is explicitly disabled
> 
> that way we should get a real world survey of broken file systems that
> could inform our decision to drop the 24h window (or keep it).. if we
> introduce an option (defaulting to yes for now) conditionalizing the 24h
> window, we could even tell users with semi-broken storages (see below)
> to explicitly set that option in case we later switch the default,
> although I am not sure whether such storages exist for real.

+1; one (additional) option _might_ be to trigger such a check on
datastore creation, e.g. create the all-zero chunk and then do that
test. As of now that probably would not win us much, but if we make
the 24h-wait opt-in then users would be warned early enough, or we
could even auto-set that option in such a case.
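
For illustration, a rough sketch of such an on-create check (all
names hypothetical, building on the self-test sketch above):

    use anyhow::Error;

    // Hypothetical sketch: insert an all-zero fixed-size chunk at
    // datastore creation and run the atime self-test against it.
    fn check_atime_on_create(store: &DataStore) -> Result<(), Error> {
        let zero_data = vec![0u8; 4 * 1024 * 1024]; // 4M all-zero chunk
        let digest = openssl::sha::sha256(&zero_data);
        store.insert_chunk_raw(&zero_data, &digest)?; // hypothetical helper
        gc_atime_self_test(store, &digest)
    }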



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-18 11:53     ` Thomas Lamprecht
@ 2025-02-18 12:39       ` Christian Ebner
  2025-02-18 13:17         ` Christian Ebner
  2025-02-18 13:31         ` Thomas Lamprecht
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-18 12:39 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Thomas Lamprecht,
	Fabian Grünbichler

On 2/18/25 12:53, Thomas Lamprecht wrote:
> Am 17.02.25 um 16:36 schrieb Fabian Grünbichler:
>> On February 17, 2025 2:12 pm, Christian Ebner wrote:
>>> Check whether the filesystem the chunk store is located on actually
>>> updates the atime when marking chunks in phase 1 of garbage
>>> collection. Checking a single/first chunk's atime is not enough,
>>> since the filesystem can be mounted with the `relatime` option, so
>>> find the first chunk whose atime is outside relatime's 24 hour
>>> cutoff window and check the update on that chunk only.
>>
>> given that our touching should punch through relatime (and does so on
>> all filesystems we tested so far), couldn't we just
>>
>> - stat the first chunk
>> - touch the first chunk
>> - check if timestamps have been updated
>> - print a warning about the filesystem being potentially broken, and
>> - if the option is enabled, suggest the user report the details to us
>> - only continue if the option is explicitly disabled
>>
>> that way we should get a real world survey of broken file systems that
>> could inform our decision to drop the 24h window (or keep it).. if we
>> introduce an option (defaulting to yes for now) conditionalizing the 24h
>> window, we could even tell users with semi-broken storages (see below)
>> to explicitly set that option in case we later switch the default,
>> although I am not sure whether such storages exist for real.
> 
> +1; one (additional) option _might_ be to trigger such a check on
> datastore creation, e.g. create the all-zero chunk and then do that
> test. As of now that probably would not win us much, but if we make
> the 24h-wait opt-in then users would be warned early enough, or we
> could even auto-set that option in such a case.

Only performing the atime update check on datastore creation is not
enough I think, as the backing filesystem might get remounted with
changed mount parameters? Or do you mean to *also* check on datastore
creation, to detect issues early on? Although, in my testing even
with `noatime` the atime update seems to be honored by the way the
garbage collection performs the time updates (further details see below).

Anyways, creating the all-zero chunk and using it for the check
sounds like a good optimization to me, as that avoids conditional
checking in phase 1 of garbage collection. However, at the cost of
having to make sure that it is never cleaned up by phase 2...

Regarding the 24 hour waiting period: as mentioned above, I noted
that atime updates are honored even if I set `noatime` for ext4 or
`atime=off` on ZFS.
It seems utimensat() bypasses this directly, as it calls into
vfs_utimes() [0], which marks this as an explicit time update,
followed by notify_change() [1], which then calls the setattr() of
the corresponding filesystem [2] via the given inode.
This seems to bypass atime_needs_update() [3], which is only called
by touch_atime() and which also checks relatime_needs_update() [4].

Although not conclusive (yet).

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/utimes.c#n20
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/attr.c#n426
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/attr.c#n552
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/inode.c#n2139
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/inode.c#n2008
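
For reference, the kind of explicit per-chunk timestamp update that
takes this vfs_utimes() path can be reproduced with a few lines of
Rust via the libc crate (minimal sketch, not the actual PBS code):

    // Explicitly set atime to "now" while leaving mtime untouched;
    // this goes through vfs_utimes()/notify_change() and is therefore
    // not subject to the relatime/noatime read-path logic.
    fn touch_atime(path: &str) -> std::io::Result<()> {
        let c_path = std::ffi::CString::new(path)?;
        let times = [
            libc::timespec { tv_sec: 0, tv_nsec: libc::UTIME_NOW },
            libc::timespec { tv_sec: 0, tv_nsec: libc::UTIME_OMIT },
        ];
        let rc = unsafe { libc::utimensat(libc::AT_FDCWD, c_path.as_ptr(), times.as_ptr(), 0) };
        if rc != 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }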



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-18 12:39       ` Christian Ebner
@ 2025-02-18 13:17         ` Christian Ebner
  2025-02-18 13:31         ` Thomas Lamprecht
  1 sibling, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-18 13:17 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Thomas Lamprecht,
	Fabian Grünbichler

On 2/18/25 13:39, Christian Ebner wrote:

> Anyways, creating the all-zero chunk and using it for the check
> sounds like a good optimization to me, as that avoids conditional
> checking in phase 1 of garbage collection. However, at the cost of
> having to make sure that it is never cleaned up by phase 2...

Well, it obviously will not be, since it is touched at the start of
garbage collection. So no extra check is needed for that :)



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-18 12:39       ` Christian Ebner
  2025-02-18 13:17         ` Christian Ebner
@ 2025-02-18 13:31         ` Thomas Lamprecht
  2025-02-18 13:38           ` Christian Ebner
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Lamprecht @ 2025-02-18 13:31 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion,
	Fabian Grünbichler

Am 18.02.25 um 13:39 schrieb Christian Ebner:
> On 2/18/25 12:53, Thomas Lamprecht wrote:
>> +1; one (additional) option _might_ be to trigger such a check on
>> datastore creation, e.g. create the all-zero chunk and then do that
>> test. As of now that probably would not win us much, but if we make
>> the 24h-wait opt-in then users would be warned early enough, or we
>> could even auto-set that option in such a case.
> 
> Only performing the atime update check on datastore creation is not
> enough I think, as the backing filesystem might get remounted with
> changed mount parameters? Or do you mean to *also* check on datastore
> creation, to detect issues early on? Although, in my testing even
> with `noatime` the atime update seems to be honored by the way the
> garbage collection performs the time updates (further details see below).

yes, I meant doing that in addition to the check during GC.

> Anyways, creating the all-zero chunk and using it for the check
> sounds like a good optimization to me, as that avoids conditional
> checking in phase 1 of garbage collection. However, at the cost of
> having to make sure that it is never cleaned up by phase 2...

I saw your second reply already, but even without that in mind it
would IMO be fine to use the all-zero chunk only for the on-create
check, as I would not see it as a big problem if it then gets pruned
during GC, provided the latter uses an actually existing chunk. But
no hard feelings here at all either way.

> Regarding the 24 hour waiting period: as mentioned above, I noted
> that atime updates are honored even if I set `noatime` for ext4 or
> `atime=off` on ZFS.
> It seems utimensat() bypasses this directly, as it calls into
> vfs_utimes() [0], which marks this as an explicit time update,
> followed by notify_change() [1], which then calls the setattr() of
> the corresponding filesystem [2] via the given inode.
> This seems to bypass atime_needs_update() [3], which is only called
> by touch_atime() and which also checks relatime_needs_update() [4].
> 
> Although not conclusive (yet).
> 

Yeah, that would support making this opt-in. FWIW, we could maybe
"sell" this as a sort of feature by not just turning it into a
boolean "24h-wait period <true|false>" option, but rather a more
generic "wait-period <X hours>" option that defaults to 0 hours (or
maybe a few minutes if we want to support minute granularity). Not
sure if there are enough (real world) use cases to warrant this, so
this is mostly mentioned for the sake of completeness.
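
Something along these lines in the tuning schema (naming and defaults
purely hypothetical):

    "gc-atime-wait-period": {
        description: "Grace period (in hours) before the atime update check considers a chunk",
        optional: true,
        default: 0,
        type: u64,
    },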



* Re: [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored
  2025-02-18 13:31         ` Thomas Lamprecht
@ 2025-02-18 13:38           ` Christian Ebner
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-18 13:38 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Backup Server development discussion,
	Fabian Grünbichler

On 2/18/25 14:31, Thomas Lamprecht wrote:
> Am 18.02.25 um 13:39 schrieb Christian Ebner:
>> On 2/18/25 12:53, Thomas Lamprecht wrote:
>>> +1; one (additional) option _might_ be to trigger such a check on
>>> datastore creation, e.g. create the all-zero chunk and then do that
>>> test. As of now that probably would not win us much, but if we make
>>> the 24h-wait opt-in then users would be warned early enough, or we
>>> could even auto-set that option in such a case.
>>
>> Only performing the atime update check on datastore creation is not
>> enough I think, as the backing filesystem might get remounted with
>> changed mount parameters? Or do you mean to *also* check on datastore
>> creation, to detect issues early on? Although, in my testing even
>> with `noatime` the atime update seems to be honored by the way the
>> garbage collection performs the time updates (further details see below).
> 
> yes, I meant doing that additionally to checking on GC.
> 
>> Anyways, creating the all-zero chunk and using it for the check
>> sounds like a good optimization to me, as that avoids conditional
>> checking in phase 1 of garbage collection. However, at the cost of
>> having to make sure that it is never cleaned up by phase 2...
> 
> I saw your second reply already, but even without that in mind it
> would IMO be fine to use the all-zero chunk only for the on-create
> check, as I would not see it as a big problem if it then gets pruned
> during GC, provided the latter uses an actually existing chunk. But
> no hard feelings here at all either way.

I think using a 4M fixed size chunk makes it even more elegant, as
one can then use the same logic for both cases, the check on
datastore creation and the check on garbage collection. And to be
backwards compatible, this simply creates the zero chunk if it does
not exist, covering existing datastores already.

>> Regarding the 24 hour waiting period: as mentioned above, I noted
>> that atime updates are honored even if I set `noatime` for ext4 or
>> `atime=off` on ZFS.
>> It seems utimensat() bypasses this directly, as it calls into
>> vfs_utimes() [0], which marks this as an explicit time update,
>> followed by notify_change() [1], which then calls the setattr() of
>> the corresponding filesystem [2] via the given inode.
>> This seems to bypass atime_needs_update() [3], which is only called
>> by touch_atime() and which also checks relatime_needs_update() [4].
>>
>> Although not conclusive (yet).
>>
> 
> Yeah, that would support making this opt-in. FWIW, we could maybe
> "sell" this as a sort of feature by not just turning it into a
> boolean "24h-wait period <true|false>" option, but rather a more
> generic "wait-period <X hours>" option that defaults to 0 hours (or
> maybe a few minutes if we want to support minute granularity). Not
> sure if there are enough (real world) use cases to warrant this, so
> this is mostly mentioned for the sake of completeness.

Yes, that sounds good to me!




* Re: [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored
  2025-02-17 13:12 [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag Christian Ebner
  2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored Christian Ebner
@ 2025-02-19 16:54 ` Christian Ebner
  2 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-02-19 16:54 UTC (permalink / raw)
  To: pbs-devel

Superseded-by version 2:
https://lore.proxmox.com/pbs-devel/20250219164847.757184-1-c.ebner@proxmox.com/T/

Just noticed that I forgot to add the v2 prefix and that there was a
typo in the issue number of the cover-letter title, which I also
copy/pasted without noticing. This should be issue 5982.



Thread overview: 11+ messages
2025-02-17 13:12 [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox 1/2] pbs api types: Add check garbage collection atime updates flag Christian Ebner
2025-02-17 13:12 ` [pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored Christian Ebner
2025-02-17 15:36   ` Fabian Grünbichler
2025-02-17 15:57     ` Christian Ebner
2025-02-18 11:53     ` Thomas Lamprecht
2025-02-18 12:39       ` Christian Ebner
2025-02-18 13:17         ` Christian Ebner
2025-02-18 13:31         ` Thomas Lamprecht
2025-02-18 13:38           ` Christian Ebner
2025-02-19 16:54 ` [pbs-devel] [PATCH proxmox proxmox-backup 0/2] fix #5892: check atime update is honored Christian Ebner
