* [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored
@ 2025-03-19 17:24 Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox 1/2] pbs api types: add garbage collection atime safety check flag Christian Ebner
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
These patches add a check to phase 1 of garbage collection and datastore
creation in order to detect when the filesystem backing the chunk store
does not honor atime updates. This avoids possible data loss for situations
where garbage collection could otherwise delete chunks still referenced by
a backup snapshot's index file.
The check is performed on a fixed-size 4 MiB, unencrypted and compressed
all-zero chunk, inserted if not yet present.
The Linux kernel's timestamp granularity is taken into account by sleeping
for 1 second, so that atime updates via utimensat calls are not discarded.
The test is enabled by default, but an opt-out option can be set via the
datastore tuning parameters for backwards compatibility.
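As a rough illustration of the decision logic described above, here is a simplified model with plain integer timestamps and inode numbers (this is not the code from the series; the actual patch stats a freshly inserted 4 MiB zero chunk before and after a utimensat call, with a 1 second sleep in between to respect timestamp granularity):

```rust
/// Outcome of one check attempt over (inode, atime) observations taken
/// before and after an explicit atime update on the test chunk.
fn check_attempt(
    ino_before: u64,
    atime_before: i64,
    ino_after: u64,
    atime_after: i64,
) -> Result<(), &'static str> {
    if ino_before != ino_after {
        // the file was replaced while testing; the real check retries once
        return Err("file changed during check, retry once");
    }
    if atime_after <= atime_before {
        // filesystem did not honor the atime update; GC must not proceed
        return Err("atime not updated, abort GC");
    }
    Ok(())
}

fn main() {
    // atime moved forward after the touch: check passes
    assert!(check_attempt(1, 100, 1, 101).is_ok());
    // same atime (e.g. an atime-suppressing mount): check fails
    assert!(check_attempt(1, 100, 1, 100).is_err());
    // inode changed in-between the two stat calls: retry signalled
    assert!(check_attempt(1, 100, 2, 101).is_err());
    println!("atime safety check model ok");
}
```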
Further, add a datastore tuning parameter to reduce the wait period for
chunk removal in phase 2 of garbage collection.
Most notable changes since version 5 (thanks Thomas for feedback):
- Move atime update check on datastore creation from chunk store to data
store creation.
- Check atime updates also when reusing an existing datastore
- Shorten field labels and add more info in qtip.
- Add additional context to errors.
- Improve readability of constants by writing them as their factors.
Note: In the end, adapting the cutoff to HumanTime and rendering it
accordingly in the UI turned out to be too much of a hassle, as expected,
so I stuck with the integer schema.
Most notable changes since version 4 (thanks Fabian for feedback):
- Decouple gc-atime-safety-check from gc-atime-cutoff
- Improve logging and error handling
- Fixed typo in docs (thanks Maximilliano for noticing).
Most notable changes since version 3 (thanks Fabian for feedback):
- Drop check for relatime-like behaviour, as this is not supported and
does not show up in any of the tests performed on btrfs, cephfs, ext4,
NFS3, NFS4, ntfs, SMB3_11, xfs or ZFS.
- Additionally check chunk inode to detect possible but very unlikely
file changes, perform check once again in that case.
- Move atime cutoff selection and min_atime calculation to the same
location, as they are logically related.
Most notable changes since version 2 (thanks Fabian and Thomas for
comments and suggestions):
- Take into account Linux timestamp granularity, do not set timestamp
to the past, as that introduces other error paths such as lack of
permissions or fs limitations.
- Check relatime behavior if the atime update is not honored; fall back
to the original cutoff in that case.
- Adapt tuning parameter names.
Most notable changes since version 1 (thanks Fabian and Thomas for
comments and suggestions):
- Optimize check by using the all zero chunk
- Enable the check by default and fail the GC job if not honored, but
allow opting out
- Add GC wait period tuning option
Link to the issue in the bugtracker:
https://bugzilla.proxmox.com/show_bug.cgi?id=5982
proxmox:
Christian Ebner (2):
pbs api types: add garbage collection atime safety check flag
pbs api types: add option to set GC chunk cleanup atime cutoff
pbs-api-types/src/datastore.rs | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
proxmox-backup:
Christian Ebner (6):
fix #5982: garbage collection: check atime updates are honored
ui: expose GC atime safety check flag in datastore tuning options
docs: mention GC atime update check for tuning options
datastore: use custom GC atime cutoff if set
ui: expose GC atime cutoff in datastore tuning option
docs: mention gc-atime-cutoff as datastore tuning option
docs/storage.rst | 16 ++++++-
pbs-datastore/src/chunk_store.rs | 82 +++++++++++++++++++++++++++-----
pbs-datastore/src/datastore.rs | 39 ++++++++++++++-
src/api2/config/datastore.rs | 35 ++++++++++----
www/Utils.js | 9 ++++
www/datastore/OptionView.js | 23 +++++++++
6 files changed, 181 insertions(+), 23 deletions(-)
--
2.39.5
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* [pbs-devel] [PATCH v6 proxmox 1/2] pbs api types: add garbage collection atime safety check flag
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox 2/2] pbs api types: add option to set GC chunk cleanup atime cutoff Christian Ebner
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Add the `gc-atime-safety-check` flag to the datastore tuning
parameters. This flag allows enabling or disabling a check to detect if
the atime update is honored by the filesystem backing the chunk store
during phase 1 of garbage collection and during datastore creation.
The default is to perform the check.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- no changes
pbs-api-types/src/datastore.rs | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index ddd8d3c6..1a4d9e2d 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -229,6 +229,14 @@ pub enum DatastoreFSyncLevel {
type: ChunkOrder,
optional: true,
},
+ "gc-atime-safety-check": {
+ description:
+ "Check filesystem atime updates are honored during store creation and garbage \
+ collection",
+ optional: true,
+ default: true,
+ type: bool,
+ },
},
)]
#[derive(Serialize, Deserialize, Default)]
@@ -240,6 +248,8 @@ pub struct DatastoreTuning {
pub chunk_order: Option<ChunkOrder>,
#[serde(skip_serializing_if = "Option::is_none")]
pub sync_level: Option<DatastoreFSyncLevel>,
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub gc_atime_safety_check: Option<bool>,
}
pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox 2/2] pbs api types: add option to set GC chunk cleanup atime cutoff
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox 1/2] pbs api types: add garbage collection atime safety check flag Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored Christian Ebner
` (6 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Add the `gc-atime-cutoff` option to the datastore tuning parameters.
This allows specifying the timespan after which chunks are no longer
considered in use if their atime has not been updated within it.
The default is to keep chunks within the 24h 5m timespan (given no
active writers).
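The default and maximum from the schema below, written out as minutes (a tiny sanity check of the factored constants, not part of the patch):

```rust
fn main() {
    // Default cutoff: 24 hours plus a 5 minute safety margin.
    let default_min = 24 * 60 + 5;
    assert_eq!(default_min, 1445);

    // Maximum allowed cutoff: two full days.
    let max_min = 2 * 24 * 60;
    assert_eq!(max_min, 2880);

    println!("default {default_min} min, max {max_min} min");
}
```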
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- Show maximum and default values as factors to convey their meaning
instead of adding a comment.
pbs-api-types/src/datastore.rs | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 1a4d9e2d..b174a047 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -223,6 +223,15 @@ pub enum DatastoreFSyncLevel {
Filesystem,
}
+pub const GC_ATIME_CUTOFF_SCHEMA: Schema = IntegerSchema::new(
+ "Cutoff (in minutes) for chunk cleanup atime check in garbage collection phase 2 \
+ (default 24h 5m)",
+)
+.minimum(1) // safety margin for kernel timestamp granularity, but stay within minute range
+.maximum(2 * 24 * 60)
+.default(24 * 60 + 5)
+.schema();
+
#[api(
properties: {
"chunk-order": {
@@ -237,6 +246,10 @@ pub enum DatastoreFSyncLevel {
default: true,
type: bool,
},
+ "gc-atime-cutoff": {
+ schema: GC_ATIME_CUTOFF_SCHEMA,
+ optional: true,
+ },
},
)]
#[derive(Serialize, Deserialize, Default)]
@@ -250,6 +263,8 @@ pub struct DatastoreTuning {
pub sync_level: Option<DatastoreFSyncLevel>,
#[serde(skip_serializing_if = "Option::is_none")]
pub gc_atime_safety_check: Option<bool>,
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub gc_atime_cutoff: Option<usize>,
}
pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox 1/2] pbs api types: add garbage collection atime safety check flag Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox 2/2] pbs api types: add option to set GC chunk cleanup atime cutoff Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-20 8:36 ` Wolfgang Bumiller
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 4/8] ui: expose GC atime safety check flag in datastore tuning options Christian Ebner
` (5 subsequent siblings)
8 siblings, 1 reply; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Check if the filesystem backing the chunk store actually updates the
atime to avoid potential data loss in phase 2 of garbage collection,
in case the atime update is not honored.
Perform the check before phase 1 of garbage collection, as well as
on datastore creation. The latter allows early detection, disallowing
datastore creation on filesystem configurations which would otherwise
most likely lead to data loss. To also perform the check when reusing
an existing datastore, open the chunk store on reuse as well.
Enable the atime update check by default, but allow opting out by
setting a datastore tuning parameter flag, for backwards compatibility.
This is honored by both garbage collection and datastore creation.
The check uses a 4 MiB fixed-size, unencrypted and compressed chunk
as test marker, inserted if not present. This all-zero chunk is very
likely present anyway for unencrypted backup contents with large
all-zero regions using fixed-size chunking (e.g. VMs).
To avoid cases where the timestamp is not updated because of the
Linux kernel's timestamp granularity, sleep for 1 second between the
chunk insert (including an atime update if pre-existing) and the
subsequent stat + utimensat calls.
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- Also perform the check when reusing datastores, not just new ones.
- Expand on comment for sleep, fix incorrect wording in commit message.
- Add additional error context to before/after stat calls.
- Soften wording of error messages if check is skipped because disabled.
pbs-datastore/src/chunk_store.rs | 72 ++++++++++++++++++++++++++++++--
pbs-datastore/src/datastore.rs | 13 ++++++
src/api2/config/datastore.rs | 35 ++++++++++++----
3 files changed, 109 insertions(+), 11 deletions(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 5e02909a1..dcb499426 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -1,9 +1,11 @@
+use std::os::unix::fs::MetadataExt;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
+use std::time::{Duration, UNIX_EPOCH};
-use anyhow::{bail, format_err, Error};
-use tracing::info;
+use anyhow::{bail, format_err, Context, Error};
+use tracing::{info, warn};
use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
use proxmox_io::ReadExt;
@@ -13,6 +15,7 @@ use proxmox_sys::process_locker::{
};
use proxmox_worker_task::WorkerTaskContext;
+use crate::data_blob::DataChunkBuilder;
use crate::file_formats::{
COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
};
@@ -177,7 +180,7 @@ impl ChunkStore {
/// Note that this must be used with care, as it's dangerous to create two instances on the
/// same base path, as closing the underlying ProcessLocker drops all locks from this process
/// on the lockfile (even if separate FDs)
- pub(crate) fn open<P: Into<PathBuf>>(
+ pub fn open<P: Into<PathBuf>>(
name: &str,
base: P,
sync_level: DatastoreFSyncLevel,
@@ -442,6 +445,69 @@ impl ChunkStore {
Ok(())
}
+ /// Check if atime updates are honored by the filesystem backing the chunk store.
+ ///
+ /// Checks if the atime is always updated by utimensat taking into consideration the Linux
+ /// kernel timestamp granularity.
+ /// If `retry_on_file_changed` is set to true, the check is performed again on the changed file
+ /// if a file change while testing is detected by differences in birth time or inode number.
+ /// Uses a 4 MiB fixed size, compressed but unencrypted chunk to test. The chunk is inserted in
+ /// the chunk store if not yet present.
+ /// Returns with error if the check could not be performed.
+ pub fn check_fs_atime_updates(&self, retry_on_file_changed: bool) -> Result<(), Error> {
+ let (zero_chunk, digest) = DataChunkBuilder::build_zero_chunk(None, 4096 * 1024, true)?;
+ let (pre_existing, _) = self.insert_chunk(&zero_chunk, &digest)?;
+ let (path, _digest) = self.chunk_path(&digest);
+
+ // Take into account timestamp update granularity in the kernel
+ // Blocking the thread is fine here since this runs in a worker.
+ std::thread::sleep(Duration::from_secs(1));
+
+ let metadata_before = std::fs::metadata(&path).context(format!(
+ "failed to get metadata for {path:?} before atime update"
+ ))?;
+
+ // Second atime update if chunk pre-existed, insert_chunk already updates pre-existing ones
+ self.cond_touch_path(&path, true)?;
+
+ let metadata_now = std::fs::metadata(&path).context(format!(
+ "failed to get metadata for {path:?} after atime update"
+ ))?;
+
+ // Check for the unlikely case that the file changed in-between the
+ // two metadata calls, try to check once again on changed file
+ if metadata_before.ino() != metadata_now.ino() {
+ if retry_on_file_changed {
+ return self.check_fs_atime_updates(false);
+ }
+ bail!("chunk {path:?} changed twice during access time safety check, cannot proceed.");
+ }
+
+ if metadata_before.accessed()? >= metadata_now.accessed()? {
+ let chunk_info_str = if pre_existing {
+ "pre-existing"
+ } else {
+ "newly inserted"
+ };
+ warn!("Chunk metadata was not correctly updated during access time safety check:");
+ info!(
+ "Timestamps before update: accessed {:?}, modified {:?}, created {:?}",
+ metadata_before.accessed().unwrap_or(UNIX_EPOCH),
+ metadata_before.modified().unwrap_or(UNIX_EPOCH),
+ metadata_before.created().unwrap_or(UNIX_EPOCH),
+ );
+ info!(
+ "Timestamps after update: accessed {:?}, modified {:?}, created {:?}",
+ metadata_now.accessed().unwrap_or(UNIX_EPOCH),
+ metadata_now.modified().unwrap_or(UNIX_EPOCH),
+ metadata_now.created().unwrap_or(UNIX_EPOCH),
+ );
+ bail!("access time safety check using {chunk_info_str} chunk failed, aborting GC!");
+ }
+
+ Ok(())
+ }
+
pub fn insert_chunk(&self, chunk: &DataBlob, digest: &[u8; 32]) -> Result<(bool, u64), Error> {
// unwrap: only `None` in unit tests
assert!(self.locker.is_some());
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a6a91ca79..09b5808e0 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1170,6 +1170,19 @@ impl DataStore {
upid: Some(upid.to_string()),
..Default::default()
};
+ let tuning: DatastoreTuning = serde_json::from_value(
+ DatastoreTuning::API_SCHEMA
+ .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
+ )?;
+ if tuning.gc_atime_safety_check.unwrap_or(true) {
+ self.inner
+ .chunk_store
+ .check_fs_atime_updates(true)
+ .map_err(|err| format_err!("atime safety check failed - {err:#}"))?;
+ info!("Access time update check successful, proceeding with GC.");
+ } else {
+ info!("Access time update check disabled by datastore tuning options.");
+ }
info!("Start GC phase1 (mark used chunks)");
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index fe3260f6d..7e97a7de3 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -4,7 +4,7 @@ use ::serde::{Deserialize, Serialize};
use anyhow::{bail, format_err, Error};
use hex::FromHex;
use serde_json::Value;
-use tracing::warn;
+use tracing::{info, warn};
use proxmox_router::{http_bail, Permission, Router, RpcEnvironment, RpcEnvironmentType};
use proxmox_schema::{api, param_bail, ApiType};
@@ -98,7 +98,15 @@ pub(crate) fn do_create_datastore(
)?;
let res = if reuse_datastore {
- ChunkStore::verify_chunkstore(&path)
+ ChunkStore::verify_chunkstore(&path).and_then(|_| {
+ // Must be the only instance accessing and locking the chunk store,
+ // dropping will close all other locks from this process on the lockfile as well.
+ ChunkStore::open(
+ &datastore.name,
+ &path,
+ tuning.sync_level.unwrap_or_default(),
+ )
+ })
} else {
let mut is_empty = true;
if let Ok(dir) = std::fs::read_dir(&path) {
@@ -120,19 +128,30 @@ pub(crate) fn do_create_datastore(
backup_user.gid,
tuning.sync_level.unwrap_or_default(),
)
- .map(|_| ())
} else {
Err(format_err!("datastore path not empty"))
}
};
- if res.is_err() {
- if need_unmount {
- if let Err(e) = unmount_by_mountpoint(&path) {
- warn!("could not unmount device: {e}");
+ match res {
+ Ok(chunk_store) => {
+ if tuning.gc_atime_safety_check.unwrap_or(true) {
+ chunk_store
+ .check_fs_atime_updates(true)
+ .map_err(|err| format_err!("access time safety check failed - {err:#}"))?;
+ info!("Access time update check successful.");
+ } else {
+ info!("Access time update check skipped.");
+ }
+ }
+ Err(err) => {
+ if need_unmount {
+ if let Err(e) = unmount_by_mountpoint(&path) {
+ warn!("could not unmount device: {e}");
+ }
}
+ return Err(err);
}
- return res;
}
config.set_data(&datastore.name, "datastore", &datastore)?;
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 4/8] ui: expose GC atime safety check flag in datastore tuning options
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (2 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 5/8] docs: mention GC atime update check for " Christian Ebner
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Allow editing the atime safety check flag via the datastore tuning
options edit window.
Do not expose the flag for datastore creation, as it is strongly
discouraged to create datastores on filesystems that do not handle
atime updates as garbage collection expects. It is nevertheless
still possible to create a datastore via the CLI and pass the
`--tuning gc-atime-safety-check=false` option.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- Shorten field label, add qtip with longer info instead.
www/Utils.js | 4 ++++
www/datastore/OptionView.js | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/www/Utils.js b/www/Utils.js
index 2746ef0b5..9bd7e1615 100644
--- a/www/Utils.js
+++ b/www/Utils.js
@@ -846,6 +846,10 @@ Ext.define('PBS.Utils', {
sync = PBS.Utils.tuningOptions['sync-level'][sync ?? '__default__'];
options.push(`${gettext('Sync Level')}: ${sync}`);
+ let gc_atime_safety_check = tuning['gc-atime-safety-check'];
+ delete tuning['gc-atime-safety-check'];
+ options.push(`${gettext('GC Access Time Safety Check')}: ${gc_atime_safety_check ?? true}`);
+
for (const [k, v] of Object.entries(tuning)) {
options.push(`${k}: ${v}`);
}
diff --git a/www/datastore/OptionView.js b/www/datastore/OptionView.js
index e1f38af6f..eb846ab76 100644
--- a/www/datastore/OptionView.js
+++ b/www/datastore/OptionView.js
@@ -271,6 +271,18 @@ Ext.define('PBS.Datastore.Options', {
deleteEmpty: true,
value: '__default__',
},
+ {
+ xtype: 'proxmoxcheckbox',
+ name: 'gc-atime-safety-check',
+ fieldLabel: gettext('GC atime Check'),
+ autoEl: {
+ tag: 'div',
+ 'data-qtip': gettext('Ensure underlying storage honors access time updates'),
+ },
+ defaultValue: 1,
+ uncheckedValue: 0,
+ deleteDefaultValue: true,
+ },
],
},
},
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 5/8] docs: mention GC atime update check for tuning options
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (3 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 4/8] ui: expose GC atime safety check flag in datastore tuning options Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 6/8] datastore: use custom GC atime cutoff if set Christian Ebner
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Document the gc-atime-safety-check flag and describe the behavior it
controls, by adding it as an additional bullet point to the documented
datastore tuning options.
This also fixes the indentation of the CLI example showing how to set
the sync level, to make it clear that it still belongs to the previous
bullet point.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- no changes
docs/storage.rst | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/docs/storage.rst b/docs/storage.rst
index 490302955..db61ca58f 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -435,9 +435,16 @@ There are some tuning related options for the datastore that are more advanced:
This can be set with:
-.. code-block:: console
+ .. code-block:: console
+
+ # proxmox-backup-manager datastore update <storename> --tuning 'sync-level=filesystem'
- # proxmox-backup-manager datastore update <storename> --tuning 'sync-level=filesystem'
+* ``gc-atime-safety-check``: Datastore GC atime update safety check:
+ You can explicitly `enable` or `disable` the atime update safety check
+ performed on datastore creation and garbage collection. This checks if atime
+ updates are handled as expected by garbage collection and therefore avoids the
+ risk of data loss by unexpected filesystem behavior. It is recommended to set
+ this to enabled, which is also the default value.
If you want to set multiple tuning options simultaneously, you can separate them
with a comma, like this:
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 6/8] datastore: use custom GC atime cutoff if set
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (4 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 5/8] docs: mention GC atime update check for " Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 7/8] ui: expose GC atime cutoff in datastore tuning option Christian Ebner
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Use the user configured atime cutoff over the default 24h 5m
margin if explicitly set, otherwise fallback to the default.
Move the minimum atime calculation based on the atime cutoff to the
sweep_unused_chunks() call side and pass in the calculated values, so
the logic lives in one place.
Add log output showing which cutoff and minimum access time are used
by the garbage collection.
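In isolation, the minimum-atime computation this patch moves to the call side can be sketched as follows (hypothetical standalone function mirroring the diff's logic; all epoch values in the usage example are made up):

```rust
/// Compute the minimum atime below which a chunk is considered unused.
/// Mirrors the GC logic: subtract the cutoff from the phase 1 start time,
/// and extend the window for a writer older than that, plus a 5 min gap.
fn min_atime(phase1_start: i64, cutoff_secs: i64, oldest_writer: i64) -> i64 {
    let mut min_atime = phase1_start - cutoff_secs;
    if oldest_writer < min_atime {
        min_atime = oldest_writer - 300; // account for 5 min safety gap
    }
    min_atime
}

fn main() {
    let start = 1_000_000; // hypothetical phase 1 start (epoch seconds)
    let cutoff = 1445 * 60; // default 24h 5m cutoff, in seconds

    // No writer older than the cutoff window: plain subtraction.
    assert_eq!(min_atime(start, cutoff, start), start - 86_700);

    // A long-running writer extends the window, minus the safety gap.
    assert_eq!(min_atime(start, cutoff, 900_000), 899_700);

    println!("min_atime sketch ok");
}
```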
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- no changes
pbs-datastore/src/chunk_store.rs | 10 +---------
pbs-datastore/src/datastore.rs | 26 +++++++++++++++++++++++++-
2 files changed, 26 insertions(+), 10 deletions(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index dcb499426..1cc26104f 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -356,7 +356,7 @@ impl ChunkStore {
pub fn sweep_unused_chunks(
&self,
oldest_writer: i64,
- phase1_start_time: i64,
+ min_atime: i64,
status: &mut GarbageCollectionStatus,
worker: &dyn WorkerTaskContext,
) -> Result<(), Error> {
@@ -366,14 +366,6 @@ impl ChunkStore {
use nix::sys::stat::fstatat;
use nix::unistd::{unlinkat, UnlinkatFlags};
- let mut min_atime = phase1_start_time - 3600 * 24; // at least 24h (see mount option relatime)
-
- if oldest_writer < min_atime {
- min_atime = oldest_writer;
- }
-
- min_atime -= 300; // add 5 mins gap for safety
-
let mut last_percentage = 0;
let mut chunk_count = 0;
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 09b5808e0..aa61c5077 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -4,6 +4,7 @@ use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
use std::sync::{Arc, LazyLock, Mutex};
+use std::time::Duration;
use anyhow::{bail, format_err, Error};
use nix::unistd::{unlinkat, UnlinkatFlags};
@@ -17,6 +18,7 @@ use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
use proxmox_sys::fs::{lock_dir_noblock, DirLockGuard};
use proxmox_sys::linux::procfs::MountInfo;
use proxmox_sys::process_locker::ProcessLockSharedGuard;
+use proxmox_time::TimeSpan;
use proxmox_worker_task::WorkerTaskContext;
use pbs_api_types::{
@@ -1174,6 +1176,7 @@ impl DataStore {
DatastoreTuning::API_SCHEMA
.parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
)?;
+
if tuning.gc_atime_safety_check.unwrap_or(true) {
self.inner
.chunk_store
@@ -1182,6 +1185,27 @@ impl DataStore {
info!("Access time update check successful, proceeding with GC.");
} else {
info!("Access time update check disabled by datastore tuning options.");
+ };
+
+ // Fallback to default 24h 5m if not set
+ let cutoff = tuning
+ .gc_atime_cutoff
+ .map(|cutoff| cutoff * 60)
+ .unwrap_or(3600 * 24 + 300);
+
+ let mut min_atime = phase1_start_time - cutoff as i64;
+ info!(
+ "Using access time cutoff {}, minimum access time is {}",
+ TimeSpan::from(Duration::from_secs(cutoff as u64)),
+ proxmox_time::epoch_to_rfc3339_utc(min_atime)?,
+ );
+ if oldest_writer < min_atime {
+ min_atime = oldest_writer - 300; // account for 5 min safety gap
+ info!(
+ "Oldest backup writer started at {}, extending minimum access time to {}",
+ TimeSpan::from(Duration::from_secs(oldest_writer as u64)),
+ proxmox_time::epoch_to_rfc3339_utc(min_atime)?,
+ );
}
info!("Start GC phase1 (mark used chunks)");
@@ -1191,7 +1215,7 @@ impl DataStore {
info!("Start GC phase2 (sweep unused chunks)");
self.inner.chunk_store.sweep_unused_chunks(
oldest_writer,
- phase1_start_time,
+ min_atime,
&mut gc_status,
worker,
)?;
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 7/8] ui: expose GC atime cutoff in datastore tuning option
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (5 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 6/8] datastore: use custom GC atime cutoff if set Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 8/8] docs: mention gc-atime-cutoff as " Christian Ebner
2025-03-20 13:18 ` [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Allows setting the atime cutoff for phase 2 of garbage collection in
the datastore's tuning parameters. This value changes the time after
which a chunk is no longer considered in use if its atime falls
outside of the cutoff window.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- shorten field label, add additional info in qtip. This was chosen
over the suggestion to add a boxLabel, as that did not render nicely.
- adapt empty text to better show actual default
www/Utils.js | 5 +++++
www/datastore/OptionView.js | 11 +++++++++++
2 files changed, 16 insertions(+)
diff --git a/www/Utils.js b/www/Utils.js
index 9bd7e1615..13b5eceda 100644
--- a/www/Utils.js
+++ b/www/Utils.js
@@ -850,6 +850,11 @@ Ext.define('PBS.Utils', {
delete tuning['gc-atime-safety-check'];
options.push(`${gettext('GC Access Time Safety Check')}: ${gc_atime_safety_check ?? true}`);
+ let gc_atime_cutoff = tuning['gc-atime-cutoff'];
+ delete tuning['gc-atime-cutoff'];
+ gc_atime_cutoff = gc_atime_cutoff ?? '1445';
+ options.push(`${gettext('GC Access Time Cutoff')}: ${gc_atime_cutoff}m`);
+
for (const [k, v] of Object.entries(tuning)) {
options.push(`${k}: ${v}`);
}
diff --git a/www/datastore/OptionView.js b/www/datastore/OptionView.js
index eb846ab76..f29574169 100644
--- a/www/datastore/OptionView.js
+++ b/www/datastore/OptionView.js
@@ -283,6 +283,17 @@ Ext.define('PBS.Datastore.Options', {
uncheckedValue: 0,
deleteDefaultValue: true,
},
+ {
+ xtype: 'proxmoxintegerfield',
+ name: 'gc-atime-cutoff',
+ emptyText: gettext('1445 (24 hours 5 minutes)'),
+ fieldLabel: gettext('GC atime Cutoff'),
+ autoEl: {
+ tag: 'div',
+ 'data-qtip': gettext('Cutoff for atime in minutes'),
+ },
+ deleteEmpty: true,
+ },
],
},
},
--
2.39.5
* [pbs-devel] [PATCH v6 proxmox-backup 8/8] docs: mention gc-atime-cutoff as datastore tuning option
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (6 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 7/8] ui: expose GC atime cutoff in datastore tuning option Christian Ebner
@ 2025-03-19 17:24 ` Christian Ebner
2025-03-20 13:18 ` [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-19 17:24 UTC (permalink / raw)
To: pbs-devel
Document the gc-atime-cutoff option and describe the behavior it
controls by adding it as an additional bullet point to the
documented datastore tuning options.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 5:
- no changes
docs/storage.rst | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/docs/storage.rst b/docs/storage.rst
index db61ca58f..da748c682 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -446,6 +446,11 @@ There are some tuning related options for the datastore that are more advanced:
risk of data loss by unexpected filesystem behavior. It is recommended to set
this to enabled, which is also the default value.
+* ``gc-atime-cutoff``: Datastore GC atime cutoff for chunk cleanup:
+  This allows setting the cutoff for which a chunk is still considered in-use
+  during phase 2 of garbage collection (given no older writers). If the
+  ``atime`` of the chunk is older than the cutoff, it will be removed.
+
If you want to set multiple tuning options simultaneously, you can separate them
with a comma, like this:
--
2.39.5
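The cutoff semantics documented above can be sketched roughly like this. This is a hypothetical standalone sketch, not the actual phase-2 code; `chunk_is_garbage` and its parameters are illustrative names only:

```rust
use std::time::{Duration, SystemTime};

/// Sketch of the phase-2 decision: a chunk is only removed if its atime
/// is older than the cutoff, measured back from the garbage collection
/// start (or the oldest active writer, whichever is older).
fn chunk_is_garbage(
    atime: SystemTime,
    min_atime_base: SystemTime,
    cutoff_minutes: u64,
) -> bool {
    let cutoff = min_atime_base - Duration::from_secs(cutoff_minutes * 60);
    atime < cutoff
}
```

With the default of 1445 minutes (24 hours and 5 minutes), a chunk whose atime lies within roughly one day before GC start is always kept.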
* Re: [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored Christian Ebner
@ 2025-03-20 8:36 ` Wolfgang Bumiller
2025-03-20 8:44 ` Christian Ebner
0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Bumiller @ 2025-03-20 8:36 UTC (permalink / raw)
To: Christian Ebner; +Cc: pbs-devel
On Wed, Mar 19, 2025 at 06:24:27PM +0100, Christian Ebner wrote:
> Check if the filesystem backing the chunk store actually updates the
> atime to avoid potential data loss in phase 2 of garbage collection,
> in case the atime update is not honored.
>
> Perform the check before phase 1 of garbage collection, as well as
> on datastore creation. The latter to early detect and disallow
> datastore creation on filesystem configurations which otherwise most
> likely would lead to data losses. To perform the check also when
> reusing an existing datastore, open the chunks store also on reuse.
>
> Enable the atime update check by default, but allow opting out by
> setting a datastore tuning parameter flag for backwards compatibility.
> This is honored by both garbage collection and datastore creation.
>
> The check uses a 4 MiB fixed-size, unencrypted and compressed chunk
> as test marker, inserted if not present. This all-zero chunk is very
> likely present anyway for unencrypted backup contents with large
> all-zero regions using fixed-size chunking (e.g. VMs).
>
> To avoid cases where the timestamp will not be updated because of the
> Linux kernel's timestamp granularity, sleep for 1 second in-between
> the chunk insert (including an atime update if pre-existing) and the
> subsequent stat + utimensat calls.
>
> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5982
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
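The check described in the commit message can be sketched independently of PBS roughly as follows. This is a standalone sketch using only the standard library; `File::set_times` stands in for the `utimensat` call, a plain marker file stands in for the zero chunk, and `atime_updates_honored` is an illustrative name, not a PBS function:

```rust
use std::fs::{self, File, FileTimes};
use std::path::Path;
use std::thread;
use std::time::{Duration, SystemTime};

/// Sketch: returns Ok(true) if an explicit atime update on `path` is
/// visible afterwards, i.e. the filesystem honors atime updates.
fn atime_updates_honored(path: &Path) -> std::io::Result<bool> {
    // Stand-in for the all-zero marker chunk inserted into the chunk store.
    fs::write(path, b"marker")?;

    // Sleep past the kernel's timestamp granularity so that a successful
    // update is guaranteed to yield a strictly larger timestamp.
    thread::sleep(Duration::from_secs(1));
    let before = fs::metadata(path)?.accessed()?;

    // Explicitly bump the atime; the patch does this via utimensat.
    let file = File::options().write(true).open(path)?;
    file.set_times(FileTimes::new().set_accessed(SystemTime::now()))?;

    let after = fs::metadata(path)?.accessed()?;
    Ok(after > before)
}
```

On a filesystem that silently discards the update, `after` equals `before` and the check fails, which is exactly the situation the patch guards against.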
> ---
> changes since version 5:
> - Also perform the check when reusing datastores, not just new ones.
> - Expand on comment for sleep, fix incorrect wording in commit message.
> - Add additional error context to before/after stat calls.
> - Soften wording of error messages if check is skipped because disabled.
>
> pbs-datastore/src/chunk_store.rs | 72 ++++++++++++++++++++++++++++++--
> pbs-datastore/src/datastore.rs | 13 ++++++
> src/api2/config/datastore.rs | 35 ++++++++++++----
> 3 files changed, 109 insertions(+), 11 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 5e02909a1..dcb499426 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -1,9 +1,11 @@
> +use std::os::unix::fs::MetadataExt;
> use std::os::unix::io::AsRawFd;
> use std::path::{Path, PathBuf};
> use std::sync::{Arc, Mutex};
> +use std::time::{Duration, UNIX_EPOCH};
>
> -use anyhow::{bail, format_err, Error};
> -use tracing::info;
> +use anyhow::{bail, format_err, Context, Error};
> +use tracing::{info, warn};
>
> use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
> use proxmox_io::ReadExt;
> @@ -13,6 +15,7 @@ use proxmox_sys::process_locker::{
> };
> use proxmox_worker_task::WorkerTaskContext;
>
> +use crate::data_blob::DataChunkBuilder;
> use crate::file_formats::{
> COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
> };
> @@ -177,7 +180,7 @@ impl ChunkStore {
> /// Note that this must be used with care, as it's dangerous to create two instances on the
> /// same base path, as closing the underlying ProcessLocker drops all locks from this process
> /// on the lockfile (even if separate FDs)
> - pub(crate) fn open<P: Into<PathBuf>>(
> + pub fn open<P: Into<PathBuf>>(
> name: &str,
> base: P,
> sync_level: DatastoreFSyncLevel,
> @@ -442,6 +445,69 @@ impl ChunkStore {
> Ok(())
> }
>
> + /// Check if atime updates are honored by the filesystem backing the chunk store.
> + ///
> + /// Checks if the atime is always updated by utimensat taking into consideration the Linux
> + /// kernel timestamp granularity.
> + /// If `retry_on_file_changed` is set to true, the check is performed again on the changed file
> + /// if a file change during testing is detected by differences in birth time or inode number.
> + /// Uses a 4 MiB fixed size, compressed but unencrypted chunk to test. The chunk is inserted in
> + /// the chunk store if not yet present.
> + /// Returns with error if the check could not be performed.
> + pub fn check_fs_atime_updates(&self, retry_on_file_changed: bool) -> Result<(), Error> {
> + let (zero_chunk, digest) = DataChunkBuilder::build_zero_chunk(None, 4096 * 1024, true)?;
> + let (pre_existing, _) = self.insert_chunk(&zero_chunk, &digest)?;
> + let (path, _digest) = self.chunk_path(&digest);
> +
> + // Take into account timestamp update granularity in the kernel
> + // Blocking the thread is fine here since this runs in a worker.
> + std::thread::sleep(Duration::from_secs(1));
> +
> + let metadata_before = std::fs::metadata(&path).context(format!(
> + "failed to get metadata for {path:?} before atime update"
> + ))?;
> +
> + // Second atime update if chunk pre-existed, insert_chunk already updates pre-existing ones
> + self.cond_touch_path(&path, true)?;
> +
> + let metadata_now = std::fs::metadata(&path).context(format!(
> + "failed to get metadata for {path:?} after atime update"
> + ))?;
> +
> + // Check for the unlikely case that the file changed in-between the
> + // two metadata calls, try to check once again on changed file
> + if metadata_before.ino() != metadata_now.ino() {
> + if retry_on_file_changed {
> + return self.check_fs_atime_updates(false);
> + }
> + bail!("chunk {path:?} changed twice during access time safety check, cannot proceed.");
> + }
> +
> + if metadata_before.accessed()? >= metadata_now.accessed()? {
> + let chunk_info_str = if pre_existing {
> + "pre-existing"
> + } else {
> + "newly inserted"
> + };
> + warn!("Chunk metadata was not correctly updated during access time safety check:");
> + info!(
> + "Timestamps before update: accessed {:?}, modified {:?}, created {:?}",
> + metadata_before.accessed().unwrap_or(UNIX_EPOCH),
> + metadata_before.modified().unwrap_or(UNIX_EPOCH),
> + metadata_before.created().unwrap_or(UNIX_EPOCH),
I wonder if rendering `UNIX_EPOCH` makes sense for failed values, we
could just use `.ok()` and have it debug-render the `Option`?
> + );
> + info!(
> + "Timestamps after update: accessed {:?}, modified {:?}, created {:?}",
> + metadata_now.accessed().unwrap_or(UNIX_EPOCH),
> + metadata_now.modified().unwrap_or(UNIX_EPOCH),
> + metadata_now.created().unwrap_or(UNIX_EPOCH),
> + );
> + bail!("access time safety check using {chunk_info_str} chunk failed, aborting GC!");
> + }
> +
> + Ok(())
> + }
> +
> pub fn insert_chunk(&self, chunk: &DataBlob, digest: &[u8; 32]) -> Result<(bool, u64), Error> {
> // unwrap: only `None` in unit tests
> assert!(self.locker.is_some());
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index a6a91ca79..09b5808e0 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1170,6 +1170,19 @@ impl DataStore {
> upid: Some(upid.to_string()),
> ..Default::default()
> };
> + let tuning: DatastoreTuning = serde_json::from_value(
> + DatastoreTuning::API_SCHEMA
> + .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
> + )?;
> + if tuning.gc_atime_safety_check.unwrap_or(true) {
> + self.inner
> + .chunk_store
> + .check_fs_atime_updates(true)
> + .map_err(|err| format_err!("atime safety check failed - {err:#}"))?;
^ I see we have `{:#}` only exactly once in our code base from one of
your patches. Do we really want to use this form?
From what I can tell, this chains contexts with colons.
I think we mostly use `.context`/`.with_context` nowadays instead?
That way the choice of formatting is left to the caller and they can
choose between `{}`, `{:#}` and `{:?}` and `{:#?}`...
> + info!("Access time update check successful, proceeding with GC.");
> + } else {
> + info!("Access time update check disabled by datastore tuning options.");
> + }
>
> info!("Start GC phase1 (mark used chunks)");
>
> diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
> index fe3260f6d..7e97a7de3 100644
> --- a/src/api2/config/datastore.rs
> +++ b/src/api2/config/datastore.rs
> @@ -4,7 +4,7 @@ use ::serde::{Deserialize, Serialize};
> use anyhow::{bail, format_err, Error};
> use hex::FromHex;
> use serde_json::Value;
> -use tracing::warn;
> +use tracing::{info, warn};
>
> use proxmox_router::{http_bail, Permission, Router, RpcEnvironment, RpcEnvironmentType};
> use proxmox_schema::{api, param_bail, ApiType};
> @@ -98,7 +98,15 @@ pub(crate) fn do_create_datastore(
> )?;
>
> let res = if reuse_datastore {
> - ChunkStore::verify_chunkstore(&path)
> + ChunkStore::verify_chunkstore(&path).and_then(|_| {
> + // Must be the only instance accessing and locking the chunk store,
> + // dropping will close all other locks from this process on the lockfile as well.
> + ChunkStore::open(
> + &datastore.name,
> + &path,
> + tuning.sync_level.unwrap_or_default(),
> + )
> + })
> } else {
> let mut is_empty = true;
> if let Ok(dir) = std::fs::read_dir(&path) {
> @@ -120,19 +128,30 @@ pub(crate) fn do_create_datastore(
> backup_user.gid,
> tuning.sync_level.unwrap_or_default(),
> )
> - .map(|_| ())
> } else {
> Err(format_err!("datastore path not empty"))
> }
> };
>
> - if res.is_err() {
> - if need_unmount {
> - if let Err(e) = unmount_by_mountpoint(&path) {
> - warn!("could not unmount device: {e}");
> + match res {
> + Ok(chunk_store) => {
> + if tuning.gc_atime_safety_check.unwrap_or(true) {
> + chunk_store
> + .check_fs_atime_updates(true)
> + .map_err(|err| format_err!("access time safety check failed - {err:#}"))?;
^ same here
> + info!("Access time update check successful.");
> + } else {
> + info!("Access time update check skipped.");
> + }
> + }
> + Err(err) => {
> + if need_unmount {
> + if let Err(e) = unmount_by_mountpoint(&path) {
> + warn!("could not unmount device: {e}");
> + }
> }
> + return Err(err);
> }
> - return res;
> }
>
> config.set_data(&datastore.name, "datastore", &datastore)?;
> --
> 2.39.5
* Re: [pbs-devel] [PATCH v6 proxmox-backup 3/8] fix #5982: garbage collection: check atime updates are honored
2025-03-20 8:36 ` Wolfgang Bumiller
@ 2025-03-20 8:44 ` Christian Ebner
0 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-20 8:44 UTC (permalink / raw)
To: Wolfgang Bumiller; +Cc: pbs-devel
On 3/20/25 09:36, Wolfgang Bumiller wrote:
> On Wed, Mar 19, 2025 at 06:24:27PM +0100, Christian Ebner wrote:
>> [...]
>> + info!(
>> + "Timestamps before update: accessed {:?}, modified {:?}, created {:?}",
>> + metadata_before.accessed().unwrap_or(UNIX_EPOCH),
>> + metadata_before.modified().unwrap_or(UNIX_EPOCH),
>> + metadata_before.created().unwrap_or(UNIX_EPOCH),
>
> I wonder if rendering `UNIX_EPOCH` makes sense for failed values, we
> could just use `.ok()` and have it debug-render the `Option`?
Okay, will adapt that
>
>> + );
>> + info!(
>> + "Timestamps after update: accessed {:?}, modified {:?}, created {:?}",
>> + metadata_now.accessed().unwrap_or(UNIX_EPOCH),
>> + metadata_now.modified().unwrap_or(UNIX_EPOCH),
>> + metadata_now.created().unwrap_or(UNIX_EPOCH),
>> + );
>> + bail!("access time safety check using {chunk_info_str} chunk failed, aborting GC!");
>> + }
>> +
>> + Ok(())
>> + }
>> +
>> pub fn insert_chunk(&self, chunk: &DataBlob, digest: &[u8; 32]) -> Result<(bool, u64), Error> {
>> // unwrap: only `None` in unit tests
>> assert!(self.locker.is_some());
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index a6a91ca79..09b5808e0 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1170,6 +1170,19 @@ impl DataStore {
>> upid: Some(upid.to_string()),
>> ..Default::default()
>> };
>> + let tuning: DatastoreTuning = serde_json::from_value(
>> + DatastoreTuning::API_SCHEMA
>> + .parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
>> + )?;
>> + if tuning.gc_atime_safety_check.unwrap_or(true) {
>> + self.inner
>> + .chunk_store
>> + .check_fs_atime_updates(true)
>> + .map_err(|err| format_err!("atime safety check failed - {err:#}"))?;
>
> ^ I see we have `{:#}` only exactly once in our code base from one of
> your patches. Do we really want to use this form?
>
> From what I can tell, this chains contexts with colons.
>
> I think we mostly use `.context`/`.with_context` nowadays instead?
> That way the choice of formatting is left to the caller and they can
> choose between `{}`, `{:#}` and `{:?}` and `{:#?}`...
But exactly that was the intention here? The check_fs_atime_updates()
returns some errors by adding context instead of reformatting the error
itself.
I didn't adapt this for the whole datastore logic though, therefore
limited it to this call site instead. I could of course bubble this up
further, but who would then be the caller defining the error format?
>> [...]
* Re: [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored
2025-03-19 17:24 [pbs-devel] [PATCH v6 proxmox proxmox-backup 0/8] fix #5982: check atime update is honored Christian Ebner
` (7 preceding siblings ...)
2025-03-19 17:24 ` [pbs-devel] [PATCH v6 proxmox-backup 8/8] docs: mention gc-atime-cutoff as " Christian Ebner
@ 2025-03-20 13:18 ` Christian Ebner
8 siblings, 0 replies; 12+ messages in thread
From: Christian Ebner @ 2025-03-20 13:18 UTC (permalink / raw)
To: pbs-devel
superseded-by version 7:
https://lore.proxmox.com/pbs-devel/20250320131711.312894-1-c.ebner@proxmox.com/T/