public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH proxmox-backup v1 0/3] refactor datastore locking to use tmpfs
@ 2022-04-06  9:39 Stefan Sterz
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Stefan Sterz @ 2022-04-06  9:39 UTC (permalink / raw)
  To: pbs-devel

This patch series refactors locking within the datastore trait to use
the tmpfs-backed /run directory. Instead of locking the actual
directory hierarchy, each backup group and snapshot is assigned a
lock file named after a hash of its relative path within the given
datastore. A second commit also refactors the manifest locks in the
same manner.

To avoid a race condition when deleting a lock (e.g. upon group
deletion), each lock file will be stat'ed before and after acquiring
a lock. If the inodes match, we did not encounter a race condition
and are allowed to continue using the lock. Locks will be cleaned up
when their respective group/snapshot/manifest is removed.
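The double-stat procedure can be sketched as follows. This is a
minimal stand-alone illustration using only the standard library; the
real implementation uses `nix::sys::stat::stat` and
`open_backup_lockfile`, and the plain `open` below merely stands in
for the actual flock-based lock acquisition:

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::MetadataExt; // for .ino()
use std::path::Path;

/// Sketch of the double-stat check: stat before locking, acquire the
/// lock, stat again. If the inode changed in between, another thread
/// deleted and recreated the lock file, so our lock may refer to a
/// stale inode and must not be used.
fn lock_with_double_stat(path: &Path) -> io::Result<File> {
    // inode before acquiring the lock (None if the file doesn't exist
    // yet, in which case the check is skipped, as in the patch)
    let ino_before = std::fs::metadata(path).ok().map(|m| m.ino());

    // stand-in for the real flock()-based `open_backup_lockfile`
    let lock = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;

    if let Some(before) = ino_before {
        if before != std::fs::metadata(path)?.ino() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                "lock file was replaced while acquiring the lock",
            ));
        }
    }

    Ok(lock)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("double-stat-demo.lck");
    let _guard = lock_with_double_stat(&path)?;
    println!("lock acquired on {:?}", path);
    Ok(())
}
```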

The last commit in this series fixes a race condition that occurs when
changing the owner of a datastore.

Stefan Sterz (3):
  fix #3935: datastore/api2/backup: move datastore locking to '/run'
  fix: datastore: move manifest locking to new locking method
  fix: api: avoid race condition in set_backup_owner

 pbs-datastore/src/datastore.rs       | 171 ++++++++++++++++++---------
 pbs-datastore/src/snapshot_reader.rs |  30 +++--
 src/api2/admin/datastore.rs          |   9 +-
 src/api2/backup/environment.rs       |   4 +-
 src/api2/backup/mod.rs               |   5 +-
 src/api2/reader/mod.rs               |   7 +-
 src/backup/verify.rs                 |  12 +-
 7 files changed, 155 insertions(+), 83 deletions(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 9+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run'
  2022-04-06  9:39 [pbs-devel] [PATCH proxmox-backup v1 0/3] refactor datastore locking to use tmpfs Stefan Sterz
@ 2022-04-06  9:39 ` Stefan Sterz
  2022-04-06 14:51   ` Thomas Lamprecht
  2022-04-07  8:45   ` Wolfgang Bumiller
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method Stefan Sterz
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 3/3] fix: api: avoid race condition in set_backup_owner Stefan Sterz
  2 siblings, 2 replies; 9+ messages in thread
From: Stefan Sterz @ 2022-04-06  9:39 UTC (permalink / raw)
  To: pbs-devel

to avoid issues when removing a group or snapshot directory where two
threads hold a lock to the same directory, move locking to the tmpfs
backed '/run' directory. also adds double stat'ing to make it
possible to remove locks without certain race condition issues.

Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
---
i chose to have the lock be the sha256 hash of the relative path of a
given group/snapshot/manifest in a datastore, because this way we
never run into issues where file names might be too long. this has
been discussed previously[1]. i could also encode them using
proxmox_sys::systemd::escape_unit, but that would use two characters
for most ascii characters and, thus, we would run into the 255 byte
filename limit even faster in most cases. however, i have no set
opinion here; if the consensus is that this is a non-issue, i'll
gladly send a v2.

[1]: https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001669.html
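to illustrate the length argument: the scheme below is only a rough
approximation of `escape_unit` (it assumes every byte outside
`[a-zA-Z0-9]` expands to a 4-byte `\xNN` sequence), while a sha256 hex
digest is always 64 bytes regardless of input length; the example
paths are hypothetical:

```rust
// Rough length comparison for lock-file names. `escaped_len` only
// approximates `proxmox_sys::systemd::escape_unit`: it assumes every
// byte outside [a-zA-Z0-9] expands to a 4-byte "\xNN" sequence. A
// sha256 hex digest, by contrast, is always 64 bytes long.
fn escaped_len(path: &str) -> usize {
    path.bytes()
        .map(|b| if b.is_ascii_alphanumeric() { 1 } else { 4 })
        .sum()
}

fn main() {
    // hypothetical relative snapshot path
    let rpath = "vm/100/2022-04-06T09:39:00Z";
    println!("escaped: ~{} bytes, hashed: 64 bytes", escaped_len(rpath));

    // a long (but still valid on most filesystems) backup id pushes
    // the escaped form past the 255-byte filename limit
    let long = format!("ct/{}", "x".repeat(250));
    assert!(long.len() <= 255);
    assert!(escaped_len(&long) > 255);
}
```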

 pbs-datastore/src/datastore.rs       | 129 +++++++++++++++++++++------
 pbs-datastore/src/snapshot_reader.rs |  30 ++++---
 src/api2/backup/environment.rs       |   4 +-
 src/api2/backup/mod.rs               |   5 +-
 src/api2/reader/mod.rs               |   7 +-
 src/backup/verify.rs                 |  12 +--
 6 files changed, 131 insertions(+), 56 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index d416c8d8..cb2a8a4e 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -15,7 +15,7 @@ use proxmox_sys::fs::{replace_file, file_read_optional_string, CreateOptions};
 use proxmox_sys::process_locker::ProcessLockSharedGuard;
 use proxmox_sys::WorkerTaskContext;
 use proxmox_sys::{task_log, task_warn};
-use proxmox_sys::fs::{lock_dir_noblock, DirLockGuard};
+use proxmox_sys::error::SysError;
 
 use pbs_api_types::{
     UPID, DataStoreConfig, Authid, GarbageCollectionStatus, HumanByte,
@@ -39,6 +39,8 @@ lazy_static! {
     static ref DATASTORE_MAP: Mutex<HashMap<String, Arc<DataStore>>> = Mutex::new(HashMap::new());
 }
 
+pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
+
 /// checks if auth_id is owner, or, if owner is a token, if
 /// auth_id is the user of the token
 pub fn check_backup_owner(
@@ -288,11 +290,9 @@ impl DataStore {
     /// Remove a complete backup group including all snapshots, returns true
     /// if all snapshots were removed, and false if some were protected
     pub fn remove_backup_group(&self, backup_group: &BackupGroup) ->  Result<bool, Error> {
-
+        let _guard = self.lock_group(backup_group)?;
         let full_path = self.group_path(backup_group);
 
-        let _guard = proxmox_sys::fs::lock_dir_noblock(&full_path, "backup group", "possible running backup")?;
-
         log::info!("removing backup group {:?}", full_path);
 
         let mut removed_all = true;
@@ -308,27 +308,28 @@ impl DataStore {
 
         if removed_all {
             // no snapshots left, we can now safely remove the empty folder
-            std::fs::remove_dir_all(&full_path)
-                .map_err(|err| {
-                    format_err!(
-                        "removing backup group directory {:?} failed - {}",
-                        full_path,
-                        err,
-                    )
-                })?;
+            std::fs::remove_dir_all(&full_path).map_err(|err| {
+                format_err!(
+                    "removing backup group directory {:?} failed - {}",
+                    full_path,
+                    err,
+                )
+            })?;
+
+            if let Ok(path) = self.group_lock_path(backup_group) {
+                let _ = std::fs::remove_file(path);
+            }
         }
 
         Ok(removed_all)
     }
 
     /// Remove a backup directory including all content
-    pub fn remove_backup_dir(&self, backup_dir: &BackupDir, force: bool) ->  Result<(), Error> {
-
-        let full_path = self.snapshot_path(backup_dir);
-
+    pub fn remove_backup_dir(&self, backup_dir: &BackupDir, force: bool) -> Result<(), Error> {
         let (_guard, _manifest_guard);
+
         if !force {
-            _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
+            _guard = self.lock_snapshot(backup_dir)?;
             _manifest_guard = self.lock_manifest(backup_dir)?;
         }
 
@@ -336,6 +337,8 @@ impl DataStore {
             bail!("cannot remove protected snapshot");
         }
 
+        let full_path = self.snapshot_path(backup_dir);
+
         log::info!("removing backup snapshot {:?}", full_path);
         std::fs::remove_dir_all(&full_path)
             .map_err(|err| {
@@ -352,6 +355,10 @@ impl DataStore {
             let _ = std::fs::remove_file(path);
         }
 
+        if let Ok(path) = self.snapshot_lock_path(backup_dir) {
+            let _ = std::fs::remove_file(path);
+        }
+
         Ok(())
     }
 
@@ -427,7 +434,7 @@ impl DataStore {
         &self,
         backup_group: &BackupGroup,
         auth_id: &Authid,
-    ) -> Result<(Authid, DirLockGuard), Error> {
+    ) -> Result<(Authid, BackupLockGuard), Error> {
         // create intermediate path first:
         let mut full_path = self.base_path();
         full_path.push(backup_group.backup_type());
@@ -438,13 +445,13 @@ impl DataStore {
         // create the last component now
         match std::fs::create_dir(&full_path) {
             Ok(_) => {
-                let guard = lock_dir_noblock(&full_path, "backup group", "another backup is already running")?;
+                let guard = self.lock_group(backup_group)?;
                 self.set_owner(backup_group, auth_id, false)?;
                 let owner = self.get_owner(backup_group)?; // just to be sure
                 Ok((owner, guard))
             }
             Err(ref err) if err.kind() == io::ErrorKind::AlreadyExists => {
-                let guard = lock_dir_noblock(&full_path, "backup group", "another backup is already running")?;
+                let guard = self.lock_group(backup_group)?;
                 let owner = self.get_owner(backup_group)?; // just to be sure
                 Ok((owner, guard))
             }
@@ -456,14 +463,13 @@ impl DataStore {
     ///
     /// The BackupGroup directory needs to exist.
     pub fn create_locked_backup_dir(&self, backup_dir: &BackupDir)
-        -> Result<(PathBuf, bool, DirLockGuard), Error>
+        -> Result<(PathBuf, bool, BackupLockGuard), Error>
     {
         let relative_path = backup_dir.relative_path();
         let mut full_path = self.base_path();
         full_path.push(&relative_path);
 
-        let lock = ||
-            lock_dir_noblock(&full_path, "snapshot", "internal error - tried creating snapshot that's already in use");
+        let lock = || self.lock_snapshot(backup_dir);
 
         match std::fs::create_dir(&full_path) {
             Ok(_) => Ok((relative_path, true, lock()?)),
@@ -810,7 +816,8 @@ impl DataStore {
         backup_dir: &BackupDir,
     ) -> Result<String, Error> {
         let mut path = format!(
-            "/run/proxmox-backup/locks/{}/{}/{}",
+            "{}/{}/{}/{}",
+            DATASTORE_LOCKS_DIR,
             self.name(),
             backup_dir.group().backup_type(),
             backup_dir.group().backup_id(),
@@ -883,9 +890,7 @@ impl DataStore {
         backup_dir: &BackupDir,
         protection: bool
     ) -> Result<(), Error> {
-        let full_path = self.snapshot_path(backup_dir);
-
-        let _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
+        let _guard = self.lock_snapshot(backup_dir)?;
 
         let protected_path = backup_dir.protected_file(self.base_path());
         if protection {
@@ -952,4 +957,72 @@ impl DataStore {
 
         Ok(chunk_list)
     }
-}
+
+    fn snapshot_lock_path(&self, backup_dir: &BackupDir) -> Result<PathBuf, Error> {
+        let path = Path::new(DATASTORE_LOCKS_DIR).join(self.name());
+        std::fs::create_dir_all(&path)?;
+
+        // hash relative path to get a fixed length unique file name
+        let rpath_hash =
+            openssl::sha::sha256(backup_dir.relative_path().to_str().unwrap().as_bytes());
+        Ok(path.join(hex::encode(rpath_hash)))
+    }
+
+    fn group_lock_path(&self, group: &BackupGroup) -> Result<PathBuf, Error> {
+        let path = Path::new(DATASTORE_LOCKS_DIR).join(self.name());
+        std::fs::create_dir_all(&path)?;
+
+        // hash relative path to get a fixed length unique file name
+        let rpath_hash = openssl::sha::sha256(group.group_path().to_str().unwrap().as_bytes());
+        Ok(path.join(hex::encode(rpath_hash)))
+    }
+
+    // this function helps implement the double stat'ing procedure. it
+    // avoids certain race conditions upon lock deletion.
+    fn lock_helper<F>(&self, path: &std::path::Path, lock_fn: F) -> Result<BackupLockGuard, Error>
+    where
+        F: Fn(&std::path::Path) -> Result<BackupLockGuard, Error>,
+    {
+        let stat = match nix::sys::stat::stat(path) {
+            Ok(result) => result.st_ino,
+            // lock hasn't been created yet, ignore check
+            Err(e) if e.not_found() => return Ok(lock_fn(path)?),
+            Err(e) => bail!("could not stat lock file before locking! - {}", e),
+        };
+
+        let lock = lock_fn(path)?;
+
+        if stat != nix::sys::stat::stat(path)?.st_ino {
+            bail!("could not acquire lock, another thread modified the lock file");
+        }
+
+        Ok(lock)
+    }
+
+    /// Lock a snapshot exclusively
+    pub fn lock_snapshot(&self, backup_dir: &BackupDir) -> Result<BackupLockGuard, Error> {
+        self.lock_helper(&self.snapshot_lock_path(backup_dir)?, |p| {
+            open_backup_lockfile(p, Some(Duration::from_secs(0)), true).map_err(|err| {
+                format_err!("unable to acquire snapshot lock {:?} - {}", &p, err)
+            })
+        })
+    }
+
+    /// Acquire a shared lock on a snapshot
+    pub fn lock_snapshot_shared(&self, backup_dir: &BackupDir) -> Result<BackupLockGuard, Error> {
+        self.lock_helper(&self.snapshot_lock_path(backup_dir)?, |p| {
+            open_backup_lockfile(p, Some(Duration::from_secs(0)), false).map_err(|err| {
+                format_err!("unable to acquire shared snapshot lock {:?} - {}", &p, err)
+            })
+        })
+    }
+
+    /// Lock a group exclusively
+    pub fn lock_group(&self, group: &BackupGroup) -> Result<BackupLockGuard, Error> {
+        self.lock_helper(&self.group_lock_path(group)?, |p| {
+            open_backup_lockfile(p, Some(Duration::from_secs(0)), true).map_err(|err| {
+                format_err!("unable to acquire backup group lock {:?} - {}", &p, err)
+            })
+        })
+    }
+}
\ No newline at end of file
diff --git a/pbs-datastore/src/snapshot_reader.rs b/pbs-datastore/src/snapshot_reader.rs
index 1bbf57e7..3773804d 100644
--- a/pbs-datastore/src/snapshot_reader.rs
+++ b/pbs-datastore/src/snapshot_reader.rs
@@ -3,10 +3,11 @@ use std::sync::Arc;
 use std::os::unix::io::{AsRawFd, FromRawFd};
 use std::fs::File;
 
-use anyhow::{bail, Error};
+use anyhow::{bail, format_err, Error};
 use nix::dir::Dir;
-
-use proxmox_sys::fs::lock_dir_noblock_shared;
+use nix::sys::stat::Mode;
+use nix::fcntl::OFlag;
+use pbs_config::BackupLockGuard;
 
 use crate::backup_info::BackupDir;
 use crate::index::IndexFile;
@@ -23,19 +24,22 @@ pub struct SnapshotReader {
     datastore_name: String,
     file_list: Vec<String>,
     locked_dir: Dir,
+
+    // while this is never read, the lock needs to be kept until the
+    // reader is dropped to ensure valid locking semantics
+    _lock: BackupLockGuard,
 }
 
 impl SnapshotReader {
 
     /// Lock snapshot, reads the manifest and returns a new instance
     pub fn new(datastore: Arc<DataStore>, snapshot: BackupDir) -> Result<Self, Error> {
+        let lock = datastore.lock_snapshot_shared(&snapshot)?;
 
-        let snapshot_path = datastore.snapshot_path(&snapshot);
+        let path = datastore.snapshot_path(&snapshot);
 
-        let locked_dir = lock_dir_noblock_shared(
-            &snapshot_path,
-            "snapshot",
-            "locked by another operation")?;
+        let locked_dir = Dir::open(&path, OFlag::O_RDONLY, Mode::empty())
+            .map_err(|err| format_err!("unable to open snapshot directory {:?} - {}", path, err))?;
 
         let datastore_name = datastore.name().to_string();
 
@@ -47,7 +51,7 @@ impl SnapshotReader {
             }
         };
 
-        let mut client_log_path = snapshot_path;
+        let mut client_log_path = datastore.snapshot_path(&snapshot);
         client_log_path.push(CLIENT_LOG_BLOB_NAME);
 
         let mut file_list = Vec::new();
@@ -57,7 +61,13 @@ impl SnapshotReader {
             file_list.push(CLIENT_LOG_BLOB_NAME.to_string());
         }
 
-        Ok(Self { snapshot, datastore_name, file_list, locked_dir })
+        Ok(Self {
+            snapshot,
+            datastore_name,
+            file_list,
+            locked_dir,
+            _lock: lock,
+        })
     }
 
     /// Return the snapshot directory
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 3e23840f..b4557d42 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -1,7 +1,7 @@
 use anyhow::{bail, format_err, Error};
+use pbs_config::BackupLockGuard;
 use std::sync::{Arc, Mutex};
 use std::collections::HashMap;
-use nix::dir::Dir;
 
 use ::serde::{Serialize};
 use serde_json::{json, Value};
@@ -501,7 +501,7 @@ impl BackupEnvironment {
     /// If verify-new is set on the datastore, this will run a new verify task
     /// for the backup. If not, this will return and also drop the passed lock
     /// immediately.
-    pub fn verify_after_complete(&self, snap_lock: Dir) -> Result<(), Error> {
+    pub fn verify_after_complete(&self, snap_lock: BackupLockGuard) -> Result<(), Error> {
         self.ensure_finished()?;
 
         if !self.datastore.verify_new() {
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index 395edd3d..6e38f3da 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -20,7 +20,7 @@ use pbs_api_types::{
     BACKUP_ID_SCHEMA, BACKUP_TIME_SCHEMA, BACKUP_TYPE_SCHEMA, DATASTORE_SCHEMA,
     CHUNK_DIGEST_SCHEMA, PRIV_DATASTORE_BACKUP, BACKUP_ARCHIVE_NAME_SCHEMA,
 };
-use proxmox_sys::fs::lock_dir_noblock_shared;
+
 use pbs_tools::json::{required_array_param, required_integer_param, required_string_param};
 use pbs_config::CachedUserInfo;
 use pbs_datastore::{DataStore, PROXMOX_BACKUP_PROTOCOL_ID_V1};
@@ -157,8 +157,7 @@ async move {
         }
 
         // lock last snapshot to prevent forgetting/pruning it during backup
-        let full_path = datastore.snapshot_path(&last.backup_dir);
-        Some(lock_dir_noblock_shared(&full_path, "snapshot", "base snapshot is already locked by another operation")?)
+        Some(datastore.lock_snapshot_shared(&last.backup_dir)?)
     } else {
         None
     };
diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
index 2b11d1b1..f472686a 100644
--- a/src/api2/reader/mod.rs
+++ b/src/api2/reader/mod.rs
@@ -20,7 +20,7 @@ use pbs_api_types::{
     CHUNK_DIGEST_SCHEMA, PRIV_DATASTORE_READ, PRIV_DATASTORE_BACKUP,
     BACKUP_ARCHIVE_NAME_SCHEMA,
 };
-use proxmox_sys::fs::lock_dir_noblock_shared;
+
 use pbs_tools::json::{required_integer_param, required_string_param};
 use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
 use pbs_datastore::backup_info::BackupDir;
@@ -114,10 +114,7 @@ fn upgrade_to_backup_reader_protocol(
             }
         }
 
-        let _guard = lock_dir_noblock_shared(
-            &datastore.snapshot_path(&backup_dir),
-            "snapshot",
-            "locked by another operation")?;
+        let _guard = datastore.lock_snapshot_shared(&backup_dir)?;
 
         let path = datastore.base_path();
 
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 307d366c..f4375daa 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -1,4 +1,4 @@
-use nix::dir::Dir;
+use pbs_config::BackupLockGuard;
 use std::collections::HashSet;
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
@@ -13,7 +13,6 @@ use pbs_datastore::{DataStore, DataBlob, StoreProgress};
 use pbs_datastore::backup_info::{BackupGroup, BackupDir, BackupInfo};
 use pbs_datastore::index::IndexFile;
 use pbs_datastore::manifest::{archive_type, ArchiveType, BackupManifest, FileInfo};
-use proxmox_sys::fs::lock_dir_noblock_shared;
 
 use crate::tools::parallel_handler::ParallelHandler;
 
@@ -300,11 +299,8 @@ pub fn verify_backup_dir(
     upid: UPID,
     filter: Option<&dyn Fn(&BackupManifest) -> bool>,
 ) -> Result<bool, Error> {
-    let snap_lock = lock_dir_noblock_shared(
-        &verify_worker.datastore.snapshot_path(backup_dir),
-        "snapshot",
-        "locked by another operation",
-    );
+    let snap_lock = verify_worker.datastore.lock_snapshot_shared(&backup_dir);
+
     match snap_lock {
         Ok(snap_lock) => {
             verify_backup_dir_with_lock(verify_worker, backup_dir, upid, filter, snap_lock)
@@ -328,7 +324,7 @@ pub fn verify_backup_dir_with_lock(
     backup_dir: &BackupDir,
     upid: UPID,
     filter: Option<&dyn Fn(&BackupManifest) -> bool>,
-    _snap_lock: Dir,
+    _snap_lock: BackupLockGuard,
 ) -> Result<bool, Error> {
     let manifest = match verify_worker.datastore.load_manifest(backup_dir) {
         Ok((manifest, _)) => manifest,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 9+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method
  2022-04-06  9:39 [pbs-devel] [PATCH proxmox-backup v1 0/3] refactor datastore locking to use tmpfs Stefan Sterz
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
@ 2022-04-06  9:39 ` Stefan Sterz
  2022-04-07  8:52   ` Wolfgang Bumiller
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 3/3] fix: api: avoid race condition in set_backup_owner Stefan Sterz
  2 siblings, 1 reply; 9+ messages in thread
From: Stefan Sterz @ 2022-04-06  9:39 UTC (permalink / raw)
  To: pbs-devel

adds double stat'ing and removes the directory hierarchy to bring
manifest locking in line with the other locks used by the datastore
trait.

Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 46 +++++++++++++---------------------
 1 file changed, 18 insertions(+), 28 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index cb2a8a4e..619e2710 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -30,9 +30,7 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
 use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
 use crate::index::IndexFile;
 use crate::manifest::{
-    MANIFEST_BLOB_NAME, MANIFEST_LOCK_NAME, CLIENT_LOG_BLOB_NAME,
-    ArchiveType, BackupManifest,
-    archive_type,
+    archive_type, ArchiveType, BackupManifest, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME,
 };
 
 lazy_static! {
@@ -811,38 +809,30 @@ impl DataStore {
     ///
     /// Also creates the basedir. The lockfile is located in
     /// '/run/proxmox-backup/locks/{datastore}/{type}/{id}/{timestamp}.index.json.lck'
-    fn manifest_lock_path(
-        &self,
-        backup_dir: &BackupDir,
-    ) -> Result<String, Error> {
-        let mut path = format!(
-            "{}/{}/{}/{}",
-            DATASTORE_LOCKS_DIR,
-            self.name(),
+    fn manifest_lock_path(&self, backup_dir: &BackupDir) -> Result<PathBuf, Error> {
+        let path = Path::new(DATASTORE_LOCKS_DIR).join(self.name());
+
+        let rpath = format!(
+            "{}/{}/{}-manifest",
             backup_dir.group().backup_type(),
             backup_dir.group().backup_id(),
+            backup_dir.backup_time_string(),
         );
+
         std::fs::create_dir_all(&path)?;
-        use std::fmt::Write;
-        write!(path, "/{}{}", backup_dir.backup_time_string(), &MANIFEST_LOCK_NAME)?;
+        let rpath_hash = openssl::sha::sha256(rpath.as_bytes());
 
-        Ok(path)
+        Ok(path.join(hex::encode(rpath_hash)))
     }
 
-    fn lock_manifest(
-        &self,
-        backup_dir: &BackupDir,
-    ) -> Result<BackupLockGuard, Error> {
-        let path = self.manifest_lock_path(backup_dir)?;
-
-        // update_manifest should never take a long time, so if someone else has
-        // the lock we can simply block a bit and should get it soon
-        open_backup_lockfile(&path, Some(Duration::from_secs(5)), true)
-            .map_err(|err| {
-                format_err!(
-                    "unable to acquire manifest lock {:?} - {}", &path, err
-                )
-            })
+    fn lock_manifest(&self, backup_dir: &BackupDir) -> Result<BackupLockGuard, Error> {
+        self.lock_helper(&self.manifest_lock_path(backup_dir)?, |p| {
+            // update_manifest should never take a long time, so if
+            // someone else has the lock we can simply block a bit
+            // and should get it soon
+            open_backup_lockfile(&p, Some(Duration::from_secs(5)), true)
+                .map_err(|err| format_err!("unable to acquire manifest lock {:?} - {}", &p, err))
+        })
     }
 
     /// Load the manifest without a lock. Must not be written back.
-- 
2.30.2





^ permalink raw reply	[flat|nested] 9+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v1 3/3] fix: api: avoid race condition in set_backup_owner
  2022-04-06  9:39 [pbs-devel] [PATCH proxmox-backup v1 0/3] refactor datastore locking to use tmpfs Stefan Sterz
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method Stefan Sterz
@ 2022-04-06  9:39 ` Stefan Sterz
  2 siblings, 0 replies; 9+ messages in thread
From: Stefan Sterz @ 2022-04-06  9:39 UTC (permalink / raw)
  To: pbs-devel

when two clients change the owner of a backup group at the same time,
a race condition can arise. add locking to avoid this.

Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
---
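the ordering matters here: read the owner, perform the permission
checks, take the group lock, then re-verify the owner before writing.
a minimal sketch of this check-after-lock pattern, using hypothetical
in-memory stand-ins for the owner file and the group lock:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for the on-disk owner file and the group
// lock; the real code uses datastore.get_owner()/set_owner() and
// datastore.lock_group().
struct Store {
    owners: Mutex<HashMap<String, String>>,
    group_lock: Mutex<()>,
}

impl Store {
    fn get_owner(&self, group: &str) -> Option<String> {
        self.owners.lock().unwrap().get(group).cloned()
    }

    /// Change the owner only if it still matches what the caller saw
    /// before acquiring the lock (check-after-lock pattern).
    fn set_owner_checked(&self, group: &str, seen: &str, new: &str) -> Result<(), String> {
        let _guard = self.group_lock.lock().unwrap(); // lock_group()
        let mut owners = self.owners.lock().unwrap();
        match owners.get(group) {
            Some(cur) if cur == seen => {
                owners.insert(group.to_string(), new.to_string());
                Ok(())
            }
            _ => Err(format!("{} does not own this group anymore", seen)),
        }
    }
}

fn main() {
    let store = Store {
        owners: Mutex::new(HashMap::from([("vm/100".into(), "alice".into())])),
        group_lock: Mutex::new(()),
    };
    assert!(store.set_owner_checked("vm/100", "alice", "bob").is_ok());
    assert_eq!(store.get_owner("vm/100").as_deref(), Some("bob"));
    // a stale caller who still believes "alice" owns the group fails
    assert!(store.set_owner_checked("vm/100", "alice", "carol").is_err());
}
```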
 src/api2/admin/datastore.rs | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index ef82b426..0c297937 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1889,11 +1889,12 @@ pub fn set_backup_owner(
 
     let privs = user_info.lookup_privs(&auth_id, &["datastore", &store]);
 
+    let owner = datastore.get_owner(&backup_group)?;
+
     let allowed = if (privs & PRIV_DATASTORE_MODIFY) != 0 {
         // High-privilege user/token
         true
     } else if (privs & PRIV_DATASTORE_BACKUP) != 0 {
-        let owner = datastore.get_owner(&backup_group)?;
 
         match (owner.is_token(), new_owner.is_token()) {
             (true, true) => {
@@ -1940,6 +1941,12 @@ pub fn set_backup_owner(
               new_owner);
     }
 
+    let _guard = datastore.lock_group(&backup_group)?;
+
+    if owner != datastore.get_owner(&backup_group)? {
+        bail!("{} does not own this group anymore", owner);
+    }
+
     datastore.set_owner(&backup_group, &new_owner, true)?;
 
     Ok(())
-- 
2.30.2





^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run'
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
@ 2022-04-06 14:51   ` Thomas Lamprecht
  2022-04-07  8:10     ` Wolfgang Bumiller
  2022-04-07  8:45   ` Wolfgang Bumiller
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Lamprecht @ 2022-04-06 14:51 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Stefan Sterz

Thanks for this, no in-depth review/comments, but some higher-level/meta ones,
sorry for that ;-)

On 06.04.22 11:39, Stefan Sterz wrote:
> to avoid issues when removing a group or snapshot directory where two
> threads hold a lock to the same directory, move locking to the tmpfs
> backed '/run' directory. also adds double stat'ing to make it
> possible to remove locks without certain race condition issues.

maybe it would be slightly easier to review (or "conserver" in the git
history) if adding the lock methods and changing to /run + double-stat
would be in two separate patches?

> 
> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> ---
> i chose to have the lock be the sha256 hash of the relative path of a

while really not user friendly it /could/ be ok to do and from a technical
POV it would make it a bit simpler.

> given group/snapshot/manifest in a datastore, because this way we
> never run into issues where file names might be too long. this has
> been discussed in previously[1]. i could also encode them using

don't see the hash stuff discussed anywhere in [1]? Or did you mean the
"it can get long" point?

> proxmox_sys::systemd::escape_unit, but that would use two characters
> for most ascii characters and, thus, we would run into the 255 byte

how would that use two chars for every ascii?  As mostly / and - would be
affected?

> filename limit even faster in most cases. however, i have no set
> opinion here, if the consesus is, that this is a non-issue ill gladly
> send a v2.
> 
> [1]: https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001669.html
> 
>  pbs-datastore/src/datastore.rs       | 129 +++++++++++++++++++++------
>  pbs-datastore/src/snapshot_reader.rs |  30 ++++---
>  src/api2/backup/environment.rs       |   4 +-
>  src/api2/backup/mod.rs               |   5 +-
>  src/api2/reader/mod.rs               |   7 +-
>  src/backup/verify.rs                 |  12 +--
>  6 files changed, 131 insertions(+), 56 deletions(-)





^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run'
  2022-04-06 14:51   ` Thomas Lamprecht
@ 2022-04-07  8:10     ` Wolfgang Bumiller
  2022-04-07  8:35       ` Stefan Sterz
  0 siblings, 1 reply; 9+ messages in thread
From: Wolfgang Bumiller @ 2022-04-07  8:10 UTC (permalink / raw)
  To: Thomas Lamprecht
  Cc: Proxmox Backup Server development discussion, Stefan Sterz

On Wed, Apr 06, 2022 at 04:51:50PM +0200, Thomas Lamprecht wrote:
> Thanks for this, no in-depth review/comments, but some higher-level/meta ones,
> sorry for that ;-)
> 
> On 06.04.22 11:39, Stefan Sterz wrote:
> > to avoid issues when removing a group or snapshot directory where two
> > threads hold a lock to the same directory, move locking to the tmpfs
> > backed '/run' directory. also adds double stat'ing to make it
> > possible to remove locks without certain race condition issues.
> 
> maybe it would be slightly easier to review (or "conserver" in the git
> history) if adding the lock methods and changing to /run + double-stat
> would be in two separate patches?
> 
> > 
> > Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> > ---
> > i chose to have the lock be the sha256 hash of the relative path of a
> 
> while really not user friendly it /could/ be ok to do and from a technical
> POV it would make it a bit simpler.

I really dislike hashes here very much 😕.

> 
> > given group/snapshot/manifest in a datastore, because this way we
> > never run into issues where file names might be too long. this has
> > been discussed in previously[1]. i could also encode them using
> 
> don't see the hash stuff discussed anywhere in [1]? Or did you mean the
> "it can get long" point?
> 
> > proxmox_sys::systemd::escape_unit, but that would use two characters
> > for most ascii characters and, thus, we would run into the 255 byte
> 
> how would that use two chars for every ascii?  As mostly / and - would be
> affected?

Given our schemas it wouldn't be all that bad I think?  But it really
does escape a *lot* and we may as well just use percent encoding with a
less inconvenient escape list (eg. '/', '\', '%' & control chars).
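A minimal percent encoding along those lines could look roughly like
this (a sketch only; `percent_encode` is a hypothetical helper, not an
existing proxmox function):

```rust
/// Sketch of a percent encoding with a minimal escape list: only '/',
/// '\', '%', control characters and non-ASCII bytes are escaped, so
/// typical backup ids barely grow in length.
fn percent_encode(input: &str) -> String {
    let mut out = String::new();
    for b in input.bytes() {
        match b {
            b'/' | b'\\' | b'%' => out.push_str(&format!("%{:02X}", b)),
            0x00..=0x1F | 0x7F => out.push_str(&format!("%{:02X}", b)),
            b if b.is_ascii() => out.push(b as char),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

fn main() {
    assert_eq!(percent_encode("vm/100"), "vm%2F100");
    assert_eq!(percent_encode("50%-done"), "50%25-done");
    println!("{}", percent_encode("host/backup-2022"));
}
```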




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run'
  2022-04-07  8:10     ` Wolfgang Bumiller
@ 2022-04-07  8:35       ` Stefan Sterz
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Sterz @ 2022-04-07  8:35 UTC (permalink / raw)
  To: Wolfgang Bumiller, Thomas Lamprecht
  Cc: Proxmox Backup Server development discussion

On 07.04.22 10:10, Wolfgang Bumiller wrote:
> On Wed, Apr 06, 2022 at 04:51:50PM +0200, Thomas Lamprecht wrote:
>> Thanks for this, no in-depth review/comments, but some higher-level/meta ones,
>> sorry for that ;-)
>>
>> On 06.04.22 11:39, Stefan Sterz wrote:
>>> to avoid issues when removing a group or snapshot directory where two
>>> threads hold a lock to the same directory, move locking to the tmpfs
>>> backed '/run' directory. also adds double stat'ing to make it
>>> possible to remove locks without certain race condition issues.
>>
>> maybe it would be slightly easier to review (or "conserve" in the git
>> history) if adding the lock methods and changing to /run + double-stat
>> would be in two separate patches?
>>

sure, i'll add that to a v2

>>>
>>> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
>>> ---
>>> i chose to have the lock be the sha256 hash of the relative path of a
>>
>> while really not user friendly it /could/ be ok to do and from a technical
>> POV it would make it a bit simpler.
> 
> I really dislike hashes here very much 😕.
> 
>>
>>> given group/snapshot/manifest in a datastore, because this way we
>>> never run into issues where file names might be too long. this has
>>> been discussed previously[1]. i could also encode them using
>>
>> don't see the hash stuff discussed anywhere in [1]? Or did you mean the
>> "it can get long" point?
>>
>>> proxmox_sys::systemd::escape_unit, but that would use two characters
>>> for most ascii characters and, thus, we would run into the 255 byte
>>
>> how would that use two chars for every ascii?  As mostly / and - would be
>> affected?
> 
> Given our schemas it wouldn't be all that bad I think?  But it really
> does escape a *lot* and we may as well just use percent encoding with a
> less inconvenient escape list (eg. '/', '\', '%' & control chars).

ok, personally i don't have a preference, although i don't
understand the strong dislike for hashes. anyway, what i meant by "has
been discussed previously" is this [1]:

>> So if I see this right, the file will then be
>>
>> /run/proxmox-backup/.locks/$store/${type}/${id}/${timestamp}.index.json.lck
>>
>> seems reasonable apart from the dot in `.locks` ;-)
>>
>> However, do we really need the directory structure here?
>> Shouldn't a flat `.../locks/${type}.${id}.${timestamp}.index.json` be
>> fine as well? (I don't really mind, it would just be less code ;-) )
>
> for now, ids do not really have a length limit besides the fs filename
> limit of 255 bytes
> and since i had to factor that out, i did for datastore/type as well
> (would look even weirder to use something like:
> .../locks/${datastore}.${type}/${id}/${timestamp}.index.json.lck
> )
>
> though we probably should limit the id length anyway...

so if i understand this correctly, the reasoning for why manifest
locks are deeply nested was that the backup id could by itself reach
the file name limit. i might have missed something, but afaict this is
still the case? if not, please correct me. so any encoding or hashing
would need to deal with this in one way or another. hashing has the
advantage of fixed-length file names, while any encoding will likely
make this problem worse, as some characters will always need more than
one character to encode. at least that's what it looks like to me, but
i'd be happily corrected :)

[1]:
https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001669.html
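to illustrate the fixed-length property argued for here, a minimal dependency-free sketch (std's DefaultHasher is used purely for brevity; the patch itself uses openssl's sha256, giving 64 hex chars instead of 16):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

// Illustrative only: the lock file name is a fixed-length hex digest of
// the relative path, so arbitrarily long backup ids cannot overflow the
// 255-byte file name limit. DefaultHasher keeps the sketch dependency
// free; the real code uses a cryptographic hash.
fn lock_path(locks_dir: &Path, store: &str, relative_path: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    relative_path.hash(&mut hasher);
    // always exactly 16 hex characters, regardless of input length
    locks_dir.join(store).join(format!("{:016x}", hasher.finish()))
}
```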





* Re: [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run'
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
  2022-04-06 14:51   ` Thomas Lamprecht
@ 2022-04-07  8:45   ` Wolfgang Bumiller
  1 sibling, 0 replies; 9+ messages in thread
From: Wolfgang Bumiller @ 2022-04-07  8:45 UTC (permalink / raw)
  To: Stefan Sterz; +Cc: pbs-devel

On Wed, Apr 06, 2022 at 11:39:53AM +0200, Stefan Sterz wrote:
(...)
> @@ -308,27 +308,28 @@ impl DataStore {
>  
>          if removed_all {
>              // no snapshots left, we can now safely remove the empty folder
> -            std::fs::remove_dir_all(&full_path)
> -                .map_err(|err| {
> -                    format_err!(
> -                        "removing backup group directory {:?} failed - {}",
> -                        full_path,
> -                        err,
> -                    )
> -                })?;
> +            std::fs::remove_dir_all(&full_path).map_err(|err| {
> +                format_err!(
> +                    "removing backup group directory {:?} failed - {}",
> +                    full_path,
> +                    err,
> +                )
> +            })?;

The above looks like a pure indentation change, unless I'm missing
something. If I'm right, please try to avoid this in an otherwise
functionality-changing patch.

> +
> +            if let Ok(path) = self.group_lock_path(backup_group) {

The only possible failure here is if *creating* the intermediate paths
fails. I'm not a fan of this; see my comment at the `_path()`
functions below...

> +                let _ = std::fs::remove_file(path);
> +            }
>          }
>  
>          Ok(removed_all)
>      }
>  
>      /// Remove a backup directory including all content
> -    pub fn remove_backup_dir(&self, backup_dir: &BackupDir, force: bool) ->  Result<(), Error> {
> -
> -        let full_path = self.snapshot_path(backup_dir);
> -
> +    pub fn remove_backup_dir(&self, backup_dir: &BackupDir, force: bool) -> Result<(), Error> {
>          let (_guard, _manifest_guard);
> +
>          if !force {
> -            _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
> +            _guard = self.lock_snapshot(backup_dir)?;
>              _manifest_guard = self.lock_manifest(backup_dir)?;
>          }
>  
> @@ -336,6 +337,8 @@ impl DataStore {
>              bail!("cannot remove protected snapshot");
>          }
>  
> +        let full_path = self.snapshot_path(backup_dir);
> +
>          log::info!("removing backup snapshot {:?}", full_path);
>          std::fs::remove_dir_all(&full_path)
>              .map_err(|err| {
> @@ -352,6 +355,10 @@ impl DataStore {
>              let _ = std::fs::remove_file(path);
>          }
>  
> +        if let Ok(path) = self.snapshot_lock_path(backup_dir) {

(same)

> +            let _ = std::fs::remove_file(path);
> +        }
> +
>          Ok(())
>      }
>  
(...)
> @@ -952,4 +957,72 @@ impl DataStore {
>  
>          Ok(chunk_list)
>      }
> -}
> +
> +    fn snapshot_lock_path(&self, backup_dir: &BackupDir) -> Result<PathBuf, Error> {
> +        let path = Path::new(DATASTORE_LOCKS_DIR).join(self.name());
> +        std::fs::create_dir_all(&path)?;

Given the name, this function is to query a path. Apart from the helpers
which also create the lock files, the only other current user doesn't
actually *need* the path created, since it only wants to remove it.

I think we should leave the mkdir up to the `fn lock_helper` and drop
the `Result` here.

> +
> +        // hash relative path to get a fixed length unique file name
> +        let rpath_hash =
> +            openssl::sha::sha256(backup_dir.relative_path().to_str().unwrap().as_bytes());

Please use `.as_os_str().as_bytes()` via the `OsStrExt` trait.
Never use `.to_str().unwrap()` on paths.

> +        Ok(path.join(hex::encode(rpath_hash)))
> +    }
> +
> +    fn group_lock_path(&self, group: &BackupGroup) -> Result<PathBuf, Error> {

Same for this function as for `snapshot_lock_path()` above.

> +        let path = Path::new(DATASTORE_LOCKS_DIR).join(self.name());
> +        std::fs::create_dir_all(&path)?;
> +
> +        // hash relative path to get a fixed length unique file name
> +        let rpath_hash = openssl::sha::sha256(group.group_path().to_str().unwrap().as_bytes());
> +        Ok(path.join(hex::encode(rpath_hash)))
> +    }
> +
> +    // this function helps implement the double stat'ing procedure. it
> +    // avoids certain race conditions upon lock deletion.
> +    fn lock_helper<F>(&self, path: &std::path::Path, lock_fn: F) -> Result<BackupLockGuard, Error>
> +    where
> +        F: Fn(&std::path::Path) -> Result<BackupLockGuard, Error>,
> +    {

This is where I'd put the `create_dir_all` part instead.

Also, instead of a path based stat here:

> +        let stat = match nix::sys::stat::stat(path) {
> +            Ok(result) => result.st_ino,
> +            // lock hasn't been created yet, ignore check
> +            Err(e) if e.not_found() => return Ok(lock_fn(path)?),
> +            Err(e) => bail!("could not stat lock file before locking! - {}", e),
> +        };
> +
> +        let lock = lock_fn(path)?;

I'd prefer an `fstat` on the file handle here so the path access happens
only twice instead of 3 times.

For that, you can `impl AsRawFd for BackupLockGuard` (which can just
return -1 when its inner option is `None`). It shouldn't even have
that option, actually; I wonder if the `cfg(test)` in the tape
inventory code (which is apparently the reason for the `Option`) could
be changed further so that it doesn't use this particular type in
tests at all, but that's a different issue.

> +
> +        if stat != nix::sys::stat::stat(path)?.st_ino {

Please don't use '?' here because if the file is currently removed we
also want the same nice error message. Something like
`stat().map(|st| st.st_ino != stat).unwrap_or(true)` might work.
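A rough std-only sketch of the resulting flow (a plain `open` stands in for the actual flock-based `open_backup_lockfile`, and `File::metadata()` plays the role of fstat on the held handle; all names are illustrative):

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Illustrative sketch: stat the path once before acquiring, then fstat
// the held handle, so the path is only accessed twice instead of three
// times. A missing or replaced file counts as a lost race.
fn open_checked(path: &Path) -> io::Result<File> {
    // first (and only) path-based stat; a missing file just means the
    // lock hasn't been created yet
    let ino_before = match fs::metadata(path) {
        Ok(st) => Some(st.ino()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };

    // stands in for the actual lock acquisition
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;

    // fstat on the handle instead of a second path-based stat
    if let Some(before) = ino_before {
        if file.metadata()?.ino() != before {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                "could not acquire lock, another thread replaced the lock file",
            ));
        }
    }

    Ok(file)
}
```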

> +            bail!("could not acquire lock, another thread modified the lock file");
> +        }
> +
> +        Ok(lock)
> +    }
> +
> +    /// Lock a snapshot exclusively
> +    pub fn lock_snapshot(&self, backup_dir: &BackupDir) -> Result<BackupLockGuard, Error> {
> +        self.lock_helper(&self.snapshot_lock_path(backup_dir)?, |p| {
> +            open_backup_lockfile(p, Some(Duration::from_secs(0)), true).map_err(|err| {
> +                format_err!("unable to acquire snapshot lock {:?} - {}", &p, err)
> +            })
> +        })
> +    }
> +
> +    /// Acquire a shared lock on a snapshot
> +    pub fn lock_snapshot_shared(&self, backup_dir: &BackupDir) -> Result<BackupLockGuard, Error> {
> +        self.lock_helper(&self.snapshot_lock_path(backup_dir)?, |p| {
> +            open_backup_lockfile(p, Some(Duration::from_secs(0)), false).map_err(|err| {
> +                format_err!("unable to acquire shared snapshot lock {:?} - {}", &p, err)
> +            })
> +        })
> +    }
> +
> +    /// Lock a group exclusively
> +    pub fn lock_group(&self, group: &BackupGroup) -> Result<BackupLockGuard, Error> {
> +        self.lock_helper(&self.group_lock_path(group)?, |p| {
> +            open_backup_lockfile(p, Some(Duration::from_secs(0)), true).map_err(|err| {
> +                format_err!("unable to acquire backup group lock {:?} - {}", &p, err)
> +            })
> +        })
> +    }
> +}
> \ No newline at end of file





* Re: [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method
  2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method Stefan Sterz
@ 2022-04-07  8:52   ` Wolfgang Bumiller
  0 siblings, 0 replies; 9+ messages in thread
From: Wolfgang Bumiller @ 2022-04-07  8:52 UTC (permalink / raw)
  To: Stefan Sterz; +Cc: pbs-devel

On Wed, Apr 06, 2022 at 11:39:54AM +0200, Stefan Sterz wrote:
> adds double stat'ing and removes directory hierarchy to bring manifest
> locking in-line with other locks used by the datastore trait.
> 
> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 46 +++++++++++++---------------------
>  1 file changed, 18 insertions(+), 28 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index cb2a8a4e..619e2710 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -30,9 +30,7 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
>  use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
>  use crate::index::IndexFile;
>  use crate::manifest::{
> -    MANIFEST_BLOB_NAME, MANIFEST_LOCK_NAME, CLIENT_LOG_BLOB_NAME,
> -    ArchiveType, BackupManifest,
> -    archive_type,
> +    archive_type, ArchiveType, BackupManifest, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME,
>  };
>  
>  lazy_static! {
> @@ -811,38 +809,30 @@ impl DataStore {
>      ///
>      /// Also creates the basedir. The lockfile is located in
>      /// '/run/proxmox-backup/locks/{datastore}/{type}/{id}/{timestamp}.index.json.lck'

^ The comment is now wrong ;-)





end of thread, other threads:[~2022-04-07  8:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-06  9:39 [pbs-devel] [PATCH proxmox-backup v1 0/3] refactor datastore locking to use tmpfs Stefan Sterz
2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 1/3] fix #3935: datastore/api2/backup: move datastore locking to '/run' Stefan Sterz
2022-04-06 14:51   ` Thomas Lamprecht
2022-04-07  8:10     ` Wolfgang Bumiller
2022-04-07  8:35       ` Stefan Sterz
2022-04-07  8:45   ` Wolfgang Bumiller
2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 2/3] fix: datastore: move manifest locking to new locking method Stefan Sterz
2022-04-07  8:52   ` Wolfgang Bumiller
2022-04-06  9:39 ` [pbs-devel] [PATCH proxmox-backup v1 3/3] fix: api: avoid race condition in set_backup_owner Stefan Sterz
