* [pbs-devel] [PATCH proxmox-backup 01/11] prune: respect snapshot flock
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:11 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 02/11] prune: never fail, just warn about failed removals Stefan Reiter
` (9 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
A snapshot that's currently being read can still appear in the prune
list, but should not be removed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/admin/datastore.rs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 405a3c24..79add173 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -727,7 +727,7 @@ fn prune(
}));
if !(dry_run || keep) {
- datastore.remove_backup_dir(&info.backup_dir, true)?;
+ datastore.remove_backup_dir(&info.backup_dir, false)?;
}
}
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 02/11] prune: never fail, just warn about failed removals
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 01/11] prune: respect snapshot flock Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:12 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 03/11] backup: use shared flock for base snapshot Stefan Reiter
` (8 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
A removal can fail if the snapshot is already gone (this is fine, our
job is done either way) or if we couldn't get a lock (also fine, it
can't be removed right now; just warn the user so they know what
happened and why it wasn't removed) - keep going either way.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/admin/datastore.rs | 74 ++++++++++++++++++-------------------
1 file changed, 36 insertions(+), 38 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 79add173..3bfd807c 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -16,7 +16,6 @@ use proxmox::api::{
use proxmox::api::router::SubdirMap;
use proxmox::api::schema::*;
use proxmox::tools::fs::{replace_file, CreateOptions};
-use proxmox::try_block;
use proxmox::{http_err, identity, list_subdirs_api_method, sortable};
use pxar::accessor::aio::Accessor;
@@ -692,53 +691,52 @@ fn prune(
// We use a WorkerTask just to have a task log, but run synchrounously
let worker = WorkerTask::new("prune", Some(worker_id), Userid::root_userid().clone(), true)?;
- let result = try_block! {
- if keep_all {
- worker.log("No prune selection - keeping all files.");
- } else {
- worker.log(format!("retention options: {}", prune_options.cli_options_string()));
- worker.log(format!("Starting prune on store \"{}\" group \"{}/{}\"",
- store, backup_type, backup_id));
- }
+ if keep_all {
+ worker.log("No prune selection - keeping all files.");
+ } else {
+ worker.log(format!("retention options: {}", prune_options.cli_options_string()));
+ worker.log(format!("Starting prune on store \"{}\" group \"{}/{}\"",
+ store, backup_type, backup_id));
+ }
- for (info, mut keep) in prune_info {
- if keep_all { keep = true; }
+ for (info, mut keep) in prune_info {
+ if keep_all { keep = true; }
- let backup_time = info.backup_dir.backup_time();
- let timestamp = info.backup_dir.backup_time_string();
- let group = info.backup_dir.group();
+ let backup_time = info.backup_dir.backup_time();
+ let timestamp = info.backup_dir.backup_time_string();
+ let group = info.backup_dir.group();
- let msg = format!(
- "{}/{}/{} {}",
- group.backup_type(),
- group.backup_id(),
- timestamp,
- if keep { "keep" } else { "remove" },
- );
+ let msg = format!(
+ "{}/{}/{} {}",
+ group.backup_type(),
+ group.backup_id(),
+ timestamp,
+ if keep { "keep" } else { "remove" },
+ );
- worker.log(msg);
+ worker.log(msg);
- prune_result.push(json!({
- "backup-type": group.backup_type(),
- "backup-id": group.backup_id(),
- "backup-time": backup_time,
- "keep": keep,
- }));
+ prune_result.push(json!({
+ "backup-type": group.backup_type(),
+ "backup-id": group.backup_id(),
+ "backup-time": backup_time,
+ "keep": keep,
+ }));
- if !(dry_run || keep) {
- datastore.remove_backup_dir(&info.backup_dir, false)?;
+ if !(dry_run || keep) {
+ if let Err(err) = datastore.remove_backup_dir(&info.backup_dir, false) {
+ worker.warn(
+ format!(
+ "failed to remove dir {:?}: {}",
+ info.backup_dir.relative_path(), err
+ )
+ );
}
}
+ }
- Ok(())
- };
-
- worker.log_result(&result);
-
- if let Err(err) = result {
- bail!("prune failed - {}", err);
- };
+ worker.log_result(&Ok(()));
Ok(json!(prune_result))
}
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 03/11] backup: use shared flock for base snapshot
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 01/11] prune: respect snapshot flock Stefan Reiter
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 02/11] prune: never fail, just warn about failed removals Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:12 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 04/11] reader: acquire shared flock on open snapshot Stefan Reiter
` (7 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
This allows other reading operations on the base snapshot as well. No
semantic change with this patch alone, since all other locks on
snapshots are still exclusive.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/backup.rs | 4 ++--
src/tools/fs.rs | 22 +++++++++++++++++++++-
2 files changed, 23 insertions(+), 3 deletions(-)
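For context (not part of the patch): the point of the shared/exclusive
distinction is that any number of shared flocks can coexist, while an
exclusive one conflicts with everything. A minimal standalone sketch of
that behaviour, using the nix crate directly instead of the
proxmox::tools::fs::lock_file helper used below:

    use std::fs::File;
    use std::os::unix::io::AsRawFd;
    use nix::fcntl::{flock, FlockArg};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let path = std::env::temp_dir().join("flock-demo");
        // three independent open file descriptions, as three processes would have
        let a = File::create(&path)?;
        let b = File::open(&path)?;
        let c = File::open(&path)?;

        // any number of shared ("reader") locks can be held at the same time
        flock(a.as_raw_fd(), FlockArg::LockSharedNonblock)?;
        flock(b.as_raw_fd(), FlockArg::LockSharedNonblock)?;

        // but a non-blocking exclusive lock is refused while shared locks exist
        assert!(flock(c.as_raw_fd(), FlockArg::LockExclusiveNonblock).is_err());
        Ok(())
    }

With lock_dir_noblock_shared() the backup holds only such a shared lock
on the base snapshot, so other readers can take one too, while
forget/prune still use the exclusive variant and are kept out.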
diff --git a/src/api2/backup.rs b/src/api2/backup.rs
index 8a2a2409..8b2b0e1a 100644
--- a/src/api2/backup.rs
+++ b/src/api2/backup.rs
@@ -16,7 +16,7 @@ use crate::backup::*;
use crate::api2::types::*;
use crate::config::acl::PRIV_DATASTORE_BACKUP;
use crate::config::cached_user_info::CachedUserInfo;
-use crate::tools::fs::lock_dir_noblock;
+use crate::tools::fs::lock_dir_noblock_shared;
mod environment;
use environment::*;
@@ -144,7 +144,7 @@ async move {
// lock last snapshot to prevent forgetting/pruning it during backup
let full_path = datastore.snapshot_path(&last.backup_dir);
- Some(lock_dir_noblock(&full_path, "snapshot", "base snapshot is already locked by another operation")?)
+ Some(lock_dir_noblock_shared(&full_path, "snapshot", "base snapshot is already locked by another operation")?)
} else {
None
};
diff --git a/src/tools/fs.rs b/src/tools/fs.rs
index 5aaaecda..72a530d8 100644
--- a/src/tools/fs.rs
+++ b/src/tools/fs.rs
@@ -265,11 +265,31 @@ impl Default for FSXAttr {
}
}
+/// Attempt to acquire a shared flock on the given path, 'what' and
+/// 'would_block_message' are used for error formatting.
+pub fn lock_dir_noblock_shared(
+ path: &std::path::Path,
+ what: &str,
+ would_block_msg: &str,
+) -> Result<DirLockGuard, Error> {
+ do_lock_dir_noblock(path, what, would_block_msg, false)
+}
+/// Attempt to acquire an exclusive flock on the given path, 'what' and
+/// 'would_block_message' are used for error formatting.
pub fn lock_dir_noblock(
path: &std::path::Path,
what: &str,
would_block_msg: &str,
+) -> Result<DirLockGuard, Error> {
+ do_lock_dir_noblock(path, what, would_block_msg, true)
+}
+
+fn do_lock_dir_noblock(
+ path: &std::path::Path,
+ what: &str,
+ would_block_msg: &str,
+ exclusive: bool,
) -> Result<DirLockGuard, Error> {
let mut handle = Dir::open(path, OFlag::O_RDONLY, Mode::empty())
.map_err(|err| {
@@ -278,7 +298,7 @@ pub fn lock_dir_noblock(
// acquire in non-blocking mode, no point in waiting here since other
// backups could still take a very long time
- proxmox::tools::fs::lock_file(&mut handle, true, Some(std::time::Duration::from_nanos(0)))
+ proxmox::tools::fs::lock_file(&mut handle, exclusive, Some(std::time::Duration::from_nanos(0)))
.map_err(|err| {
format_err!(
"unable to acquire lock on {} directory {:?} - {}", what, path,
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 04/11] reader: acquire shared flock on open snapshot
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (2 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 03/11] backup: use shared flock for base snapshot Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:13 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 05/11] verify: acquire shared snapshot flock and skip on error Stefan Reiter
` (6 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
...to avoid it being forgotten or pruned while in use.
Also update the lock error message for deletions to be consistent.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/reader.rs | 19 ++++++++++++++-----
src/backup/datastore.rs | 2 +-
2 files changed, 15 insertions(+), 6 deletions(-)
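The interesting detail in the reader.rs hunk below is the
`let _guard = _guard;` line: moving the guard into the closure ties the
flock's lifetime to the whole reader connection instead of dropping it
when the setup code returns. A minimal sketch of that ownership pattern
(the Guard type here is a made-up stand-in for DirLockGuard):

    struct Guard; // stand-in for the guard returned by lock_dir_noblock_shared
    impl Drop for Guard {
        fn drop(&mut self) {
            println!("lock released");
        }
    }

    fn setup() -> impl FnOnce() {
        let guard = Guard;
        // `move` transfers ownership of the guard into the closure; without
        // that, the guard would be dropped as soon as setup() returns
        move || {
            let _guard = guard; // keep the lock alive until the work is done
            println!("still holding the lock while serving the reader");
        }
    }

    fn main() {
        let serve = setup();
        println!("setup returned, lock still held");
        serve(); // prints the work line, then "lock released"
    }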
diff --git a/src/api2/reader.rs b/src/api2/reader.rs
index 4c870eda..8f47047e 100644
--- a/src/api2/reader.rs
+++ b/src/api2/reader.rs
@@ -17,6 +17,7 @@ use crate::tools;
use crate::config::acl::{PRIV_DATASTORE_READ, PRIV_DATASTORE_BACKUP};
use crate::config::cached_user_info::CachedUserInfo;
use crate::api2::helpers;
+use crate::tools::fs::lock_dir_noblock_shared;
mod environment;
use environment::*;
@@ -98,6 +99,11 @@ fn upgrade_to_backup_reader_protocol(
}
}
+ let _guard = lock_dir_noblock_shared(
+ &datastore.snapshot_path(&backup_dir),
+ "snapshot",
+ "locked by another operation")?;
+
let path = datastore.base_path();
//let files = BackupInfo::list_files(&path, &backup_dir)?;
@@ -146,11 +152,14 @@ fn upgrade_to_backup_reader_protocol(
use futures::future::Either;
futures::future::select(req_fut, abort_future)
- .map(|res| match res {
- Either::Left((Ok(res), _)) => Ok(res),
- Either::Left((Err(err), _)) => Err(err),
- Either::Right((Ok(res), _)) => Ok(res),
- Either::Right((Err(err), _)) => Err(err),
+ .map(move |res| {
+ let _guard = _guard;
+ match res {
+ Either::Left((Ok(res), _)) => Ok(res),
+ Either::Left((Err(err), _)) => Err(err),
+ Either::Right((Ok(res), _)) => Ok(res),
+ Either::Right((Err(err), _)) => Err(err),
+ }
})
.map_ok(move |_| env.log("reader finished successfully"))
})?;
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index 1f708293..a77c5f1d 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -227,7 +227,7 @@ impl DataStore {
let _guard;
if !force {
- _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or used as base")?;
+ _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
}
log::info!("removing backup snapshot {:?}", full_path);
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 05/11] verify: acquire shared snapshot flock and skip on error
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (3 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 04/11] reader: acquire shared flock on open snapshot Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:13 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 06/11] gc: avoid race between phase1 and forget/prune Stefan Reiter
` (5 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
If we can't acquire a lock (either because the snapshot disappeared, it
is about to be forgotten/pruned, or it is currently still running),
skip the snapshot. Hold the lock during verification so that it cannot
be deleted while we are still verifying.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/backup/verify.rs | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index f2c38eae..5e1391d9 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -23,6 +23,7 @@ use crate::{
task::TaskState,
task_log,
tools::ParallelHandler,
+ tools::fs::lock_dir_noblock_shared,
};
fn verify_blob(datastore: Arc<DataStore>, backup_dir: &BackupDir, info: &FileInfo) -> Result<(), Error> {
@@ -284,6 +285,21 @@ pub fn verify_backup_dir(
upid: UPID,
) -> Result<bool, Error> {
+ let _guard_res = lock_dir_noblock_shared(
+ &datastore.snapshot_path(&backup_dir),
+ "snapshot",
+ "locked by another operation");
+ if let Err(err) = _guard_res {
+ task_log!(
+ worker,
+ "SKIPPED: verify {}:{} - could not acquire snapshot lock: {}",
+ datastore.name(),
+ backup_dir,
+ err,
+ );
+ return Ok(true);
+ }
+
let mut manifest = match datastore.load_manifest(&backup_dir) {
Ok((manifest, _)) => manifest,
Err(err) => {
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 06/11] gc: avoid race between phase1 and forget/prune
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (4 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 05/11] verify: acquire shared snapshot flock and skip on error Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:17 ` Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 07/11] datastore: remove load_manifest_json Stefan Reiter
` (4 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
...by saving the paths of all snapshots forgotten while a GC phase1 is
running into a HashSet. If GC then hits a NotFound, it can check
whether the snapshot was simply forgotten behind its back, or whether
an actual error occurred and it needs to abort (since it might
otherwise delete still-referenced chunks, e.g. if the error is
transient and the index file is actually still there).
We have to attach the error message in {fixed,dynamic}_index via the
.context() method, otherwise the original std::io::Error gets lost and
we can't check for NotFound.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
@Dietmar: I went with the anyhow .context() method again - I know we talked
about this, but using a std::io::Error directly also doesn't solve the problem,
since we do a (io_)format_err(...) which removes the context again. And if we
return the IO error directly, we'd have to add that text formatting to all call
sites, or drop it entirely, both of which I rather dislike.
src/backup/datastore.rs | 56 +++++++++++++++++++++++++++++++++----
src/backup/dynamic_index.rs | 5 +++-
src/backup/fixed_index.rs | 7 +++--
3 files changed, 60 insertions(+), 8 deletions(-)
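To illustrate the .context() point from the commit message (standalone
example, not part of the patch): wrapping an io::Error in a brand-new
error via format_err! makes a later downcast_ref::<std::io::Error>()
fail, while attaching the message as context keeps the original error
in the chain, so NotFound can still be detected.

    use anyhow::{format_err, Error};
    use std::io;

    fn main() {
        // format_err! builds a completely new error; the io::Error is only
        // rendered into the message and can no longer be downcast
        let ioerr = io::Error::new(io::ErrorKind::NotFound, "no such file");
        let lost: Error = format_err!("unable to open index - {}", ioerr);
        assert!(lost.downcast_ref::<io::Error>().is_none());

        // .context() wraps the original error instead, so the source stays
        // reachable and GC can still check for ErrorKind::NotFound
        let ioerr = io::Error::new(io::ErrorKind::NotFound, "no such file");
        let kept = Error::from(ioerr).context("unable to open index");
        assert_eq!(
            kept.downcast_ref::<io::Error>().unwrap().kind(),
            io::ErrorKind::NotFound
        );
    }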
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index a77c5f1d..d1758408 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -37,6 +37,9 @@ pub struct DataStore {
chunk_store: Arc<ChunkStore>,
gc_mutex: Mutex<bool>,
last_gc_status: Mutex<GarbageCollectionStatus>,
+
+ // bool indicates if phase1 is currently active
+ removed_during_gc: Mutex<(bool, HashSet<PathBuf>)>,
}
impl DataStore {
@@ -83,6 +86,7 @@ impl DataStore {
chunk_store: Arc::new(chunk_store),
gc_mutex: Mutex::new(false),
last_gc_status: Mutex::new(gc_status),
+ removed_during_gc: Mutex::new((false, HashSet::new())),
})
}
@@ -230,6 +234,14 @@ impl DataStore {
_guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
}
+ // Acquire lock and keep it during remove operation, so there's no
+ // chance for a race between adding to the hash and actually removing
+ // the dir (phase1 might otherwise start in-between)
+ let mut removed_guard = self.removed_during_gc.lock().unwrap();
+ if removed_guard.0 {
+ removed_guard.1.insert(self.snapshot_path(&backup_dir));
+ }
+
log::info!("removing backup snapshot {:?}", full_path);
std::fs::remove_dir_all(&full_path)
.map_err(|err| {
@@ -449,6 +461,21 @@ impl DataStore {
let mut last_percentage: usize = 0;
+ let handle_notfound = |err: Error, path: &Path| {
+ if let Some(ioerr) = err.downcast_ref::<std::io::Error>() {
+ if ioerr.kind() == std::io::ErrorKind::NotFound {
+ let (_, removed_hash) = &*self.removed_during_gc.lock().unwrap();
+ let backup_dir = path.parent();
+ if backup_dir.is_some() && removed_hash.contains(backup_dir.unwrap()) {
+ // index file not found, but we know that it was deleted by
+ // a concurrent 'forget', so we can safely ignore
+ return Ok(())
+ }
+ }
+ }
+ Err(err)
+ };
+
for path in image_list {
worker.check_abort()?;
@@ -456,11 +483,17 @@ impl DataStore {
if let Ok(archive_type) = archive_type(&path) {
if archive_type == ArchiveType::FixedIndex {
- let index = self.open_fixed_reader(&path)?;
- self.index_mark_used_chunks(index, &path, status, worker)?;
+ let index = self.open_fixed_reader(&path);
+ match index {
+ Ok(index) => self.index_mark_used_chunks(index, &path, status, worker)?,
+ Err(err) => handle_notfound(err, &path)?
+ }
} else if archive_type == ArchiveType::DynamicIndex {
- let index = self.open_dynamic_reader(&path)?;
- self.index_mark_used_chunks(index, &path, status, worker)?;
+ let index = self.open_dynamic_reader(&path);
+ match index {
+ Ok(index) => self.index_mark_used_chunks(index, &path, status, worker)?,
+ Err(err) => handle_notfound(err, &path)?
+ }
}
}
done += 1;
@@ -506,7 +539,20 @@ impl DataStore {
crate::task_log!(worker, "Start GC phase1 (mark used chunks)");
- self.mark_used_chunks(&mut gc_status, worker)?;
+ {
+ let mut guard = self.removed_during_gc.lock().unwrap();
+ guard.0 = true;
+ }
+ let mark_res = self.mark_used_chunks(&mut gc_status, worker);
+ {
+ let mut guard = self.removed_during_gc.lock().unwrap();
+ guard.0 = false;
+ guard.1.clear();
+ }
+
+ if let Err(err) = mark_res {
+ bail!(err);
+ }
crate::task_log!(worker, "Start GC phase2 (sweep unused chunks)");
self.chunk_store.sweep_unused_chunks(
diff --git a/src/backup/dynamic_index.rs b/src/backup/dynamic_index.rs
index 8731a418..0776209c 100644
--- a/src/backup/dynamic_index.rs
+++ b/src/backup/dynamic_index.rs
@@ -86,7 +86,10 @@ impl DynamicIndexReader {
File::open(path)
.map_err(Error::from)
.and_then(Self::new)
- .map_err(|err| format_err!("Unable to open dynamic index {:?} - {}", path, err))
+ .map_err(|err| {
+ let msg = format!("Unable to open dynamic index {:?} - {}", path, &err);
+ err.context(msg)
+ })
}
pub fn new(mut file: std::fs::File) -> Result<Self, Error> {
diff --git a/src/backup/fixed_index.rs b/src/backup/fixed_index.rs
index eff50055..6a1e66f5 100644
--- a/src/backup/fixed_index.rs
+++ b/src/backup/fixed_index.rs
@@ -1,4 +1,4 @@
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, Error};
use std::io::{Seek, SeekFrom};
use super::chunk_stat::*;
@@ -61,7 +61,10 @@ impl FixedIndexReader {
File::open(path)
.map_err(Error::from)
.and_then(|file| Self::new(file))
- .map_err(|err| format_err!("Unable to open fixed index {:?} - {}", path, err))
+ .map_err(|err| {
+ let msg = format!("Unable to open fixed index {:?} - {}", path, &err);
+ err.context(msg)
+ })
}
pub fn new(mut file: std::fs::File) -> Result<Self, Error> {
--
2.20.1
* Re: [pbs-devel] [PATCH proxmox-backup 06/11] gc: avoid race between phase1 and forget/prune
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 06/11] gc: avoid race between phase1 and forget/prune Stefan Reiter
@ 2020-10-15 5:17 ` Dietmar Maurer
0 siblings, 0 replies; 26+ messages in thread
From: Dietmar Maurer @ 2020-10-15 5:17 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Stefan Reiter
> @Dietmar: I went with the anyhow .context() method again - I know we talked
> about this, but using a std::io::Error directly also doesn't solve the problem,
> since we do a (io_)format_err(...) which removes the context again. And if we
> return the IO error directly, we'd have to add that text formatting to all call
> sites, or drop it entirely, which I both rather dislike.
Simply use File::open ??
let file = match std::fs::File::open(...) {
// check errors
};
let index = FixedIndexReader::new(file)?;
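Spelled out a bit more (just a sketch of the suggestion above, not
tested against the tree; the helper name is made up): handling NotFound
at open time means the GC code never has to downcast anything.

    use std::io::ErrorKind;
    use std::path::Path;
    use anyhow::{format_err, Error};

    // hypothetical helper: Ok(None) means "index is gone" (e.g. a concurrent
    // forget removed it), any other open failure stays a hard error
    fn open_index_file(path: &Path) -> Result<Option<std::fs::File>, Error> {
        match std::fs::File::open(path) {
            Ok(file) => Ok(Some(file)),
            Err(err) if err.kind() == ErrorKind::NotFound => Ok(None),
            Err(err) => Err(format_err!("unable to open index {:?} - {}", path, err)),
        }
    }

The caller would then only construct the reader (FixedIndexReader::new
or DynamicIndexReader::new) when a file actually came back, and treat
None as "forgotten behind our back".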
* [pbs-devel] [PATCH proxmox-backup 07/11] datastore: remove load_manifest_json
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (5 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 06/11] gc: avoid race between phase1 and forget/prune Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:28 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking Stefan Reiter
` (3 subsequent siblings)
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
There's no point in having that as a separate method; just parse the
thing into a struct and write it back out correctly.
This also makes further changes to the method simpler.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/admin/datastore.rs | 8 ++++----
src/api2/backup/environment.rs | 4 ++--
src/backup/datastore.rs | 15 ++-------------
src/backup/verify.rs | 2 +-
4 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 3bfd807c..5824611b 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1428,9 +1428,9 @@ fn get_notes(
let allowed = (user_privs & PRIV_DATASTORE_READ) != 0;
if !allowed { check_backup_owner(&datastore, backup_dir.group(), &userid)?; }
- let manifest = datastore.load_manifest_json(&backup_dir)?;
+ let (manifest, _) = datastore.load_manifest(&backup_dir)?;
- let notes = manifest["unprotected"]["notes"]
+ let notes = manifest.unprotected["notes"]
.as_str()
.unwrap_or("");
@@ -1481,9 +1481,9 @@ fn set_notes(
let allowed = (user_privs & PRIV_DATASTORE_READ) != 0;
if !allowed { check_backup_owner(&datastore, backup_dir.group(), &userid)?; }
- let mut manifest = datastore.load_manifest_json(&backup_dir)?;
+ let (mut manifest, _) = datastore.load_manifest(&backup_dir)?;
- manifest["unprotected"]["notes"] = notes.into();
+ manifest.unprotected["notes"] = notes.into();
datastore.store_manifest(&backup_dir, manifest)?;
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index b035f1e2..f00c2cd3 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -473,12 +473,12 @@ impl BackupEnvironment {
}
// check manifest
- let mut manifest = self.datastore.load_manifest_json(&self.backup_dir)
+ let (mut manifest, _) = self.datastore.load_manifest(&self.backup_dir)
.map_err(|err| format_err!("unable to load manifest blob - {}", err))?;
let stats = serde_json::to_value(state.backup_stat)?;
- manifest["unprotected"]["chunk_upload_stats"] = stats;
+ manifest.unprotected["chunk_upload_stats"] = stats;
self.datastore.store_manifest(&self.backup_dir, manifest)
.map_err(|err| format_err!("unable to store manifest blob - {}", err))?;
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index d1758408..8ea9311a 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -6,7 +6,6 @@ use std::convert::TryFrom;
use anyhow::{bail, format_err, Error};
use lazy_static::lazy_static;
-use serde_json::Value;
use proxmox::tools::fs::{replace_file, CreateOptions};
@@ -669,22 +668,12 @@ impl DataStore {
Ok((manifest, raw_size))
}
- pub fn load_manifest_json(
- &self,
- backup_dir: &BackupDir,
- ) -> Result<Value, Error> {
- let blob = self.load_blob(backup_dir, MANIFEST_BLOB_NAME)?;
- // no expected digest available
- let manifest_data = blob.decode(None, None)?;
- let manifest: Value = serde_json::from_slice(&manifest_data[..])?;
- Ok(manifest)
- }
-
pub fn store_manifest(
&self,
backup_dir: &BackupDir,
- manifest: Value,
+ manifest: BackupManifest,
) -> Result<(), Error> {
+ let manifest = serde_json::to_value(manifest)?;
let manifest = serde_json::to_string_pretty(&manifest)?;
let blob = DataBlob::encode(manifest.as_bytes(), None, true)?;
let raw_data = blob.raw_data();
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 5e1391d9..05b6ba86 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -368,7 +368,7 @@ pub fn verify_backup_dir(
upid,
};
manifest.unprotected["verify_state"] = serde_json::to_value(verify_state)?;
- datastore.store_manifest(&backup_dir, serde_json::to_value(manifest)?)
+ datastore.store_manifest(&backup_dir, manifest)
.map_err(|err| format_err!("unable to store manifest blob - {}", err))?;
Ok(error_count == 0)
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (6 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 07/11] datastore: remove load_manifest_json Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:25 ` Dietmar Maurer
` (2 more replies)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 09/11] datastore: remove individual snapshots before group Stefan Reiter
` (2 subsequent siblings)
10 siblings, 3 replies; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
Avoid races when updating manifest data by flocking the manifest file
itself. store_manifest is changed to require such a lock and takes it
by value, so Rust's compiler guarantees the lock is held for the write
and automatically dropped right afterwards.
Snapshot deletion also acquires the lock, so it cannot interfere with an
outstanding manifest write.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/api2/admin/datastore.rs | 4 +--
src/api2/backup/environment.rs | 4 +--
src/backup/datastore.rs | 50 ++++++++++++++++++++++++++++++++--
src/backup/verify.rs | 6 ++--
4 files changed, 55 insertions(+), 9 deletions(-)
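The intended usage pattern of the new API (mirroring the set_notes()
hunk further down; crate paths are assumed, this is just an
illustration) is: load under the manifest flock, edit the unprotected
part, and hand the guard back so the write can only happen while the
lock is held.

    use anyhow::Error;
    use crate::backup::{BackupDir, DataStore};

    fn update_notes(
        datastore: &DataStore,
        backup_dir: &BackupDir,
        notes: &str,
    ) -> Result<(), Error> {
        // load_manifest_locked() flocks the manifest and returns the guard...
        let (mut manifest, guard) = datastore.load_manifest_locked(backup_dir)?;
        // ...only the unprotected (unsigned) part is edited in place...
        manifest.unprotected["notes"] = notes.into();
        // ...and store_manifest() consumes the guard, so the lock is provably
        // held for the write and released right afterwards
        datastore.store_manifest(backup_dir, manifest, guard)
    }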
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 5824611b..11223e6a 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1481,11 +1481,11 @@ fn set_notes(
let allowed = (user_privs & PRIV_DATASTORE_READ) != 0;
if !allowed { check_backup_owner(&datastore, backup_dir.group(), &userid)?; }
- let (mut manifest, _) = datastore.load_manifest(&backup_dir)?;
+ let (mut manifest, manifest_guard) = datastore.load_manifest_locked(&backup_dir)?;
manifest.unprotected["notes"] = notes.into();
- datastore.store_manifest(&backup_dir, manifest)?;
+ datastore.store_manifest(&backup_dir, manifest, manifest_guard)?;
Ok(())
}
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index f00c2cd3..0e672d8e 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -473,14 +473,14 @@ impl BackupEnvironment {
}
// check manifest
- let (mut manifest, _) = self.datastore.load_manifest(&self.backup_dir)
+ let (mut manifest, manifest_guard) = self.datastore.load_manifest_locked(&self.backup_dir)
.map_err(|err| format_err!("unable to load manifest blob - {}", err))?;
let stats = serde_json::to_value(state.backup_stat)?;
manifest.unprotected["chunk_upload_stats"] = stats;
- self.datastore.store_manifest(&self.backup_dir, manifest)
+ self.datastore.store_manifest(&self.backup_dir, manifest, manifest_guard)
.map_err(|err| format_err!("unable to store manifest blob - {}", err))?;
if let Some(base) = &self.last_backup {
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index 8ea9311a..f8c228fc 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -3,6 +3,8 @@ use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::convert::TryFrom;
+use std::time::Duration;
+use std::fs::File;
use anyhow::{bail, format_err, Error};
use lazy_static::lazy_static;
@@ -24,6 +26,8 @@ use crate::tools::fs::{lock_dir_noblock, DirLockGuard};
use crate::api2::types::{GarbageCollectionStatus, Userid};
use crate::server::UPID;
+pub type ManifestLock = File;
+
lazy_static! {
static ref DATASTORE_MAP: Mutex<HashMap<String, Arc<DataStore>>> = Mutex::new(HashMap::new());
}
@@ -228,9 +232,10 @@ impl DataStore {
let full_path = self.snapshot_path(backup_dir);
- let _guard;
+ let (_guard, _manifest_guard);
if !force {
_guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
+ _manifest_guard = self.lock_manifest(backup_dir);
}
// Acquire lock and keep it during remove operation, so there's no
@@ -656,8 +661,47 @@ impl DataStore {
digest_str,
err,
))
- }
+ }
+ fn lock_manifest(
+ &self,
+ backup_dir: &BackupDir,
+ ) -> Result<ManifestLock, Error> {
+ let mut path = self.base_path();
+ path.push(backup_dir.relative_path());
+ path.push(MANIFEST_BLOB_NAME);
+
+ let mut handle = File::open(&path)
+ .map_err(|err| {
+ format_err!("unable to open manifest {:?} for locking - {}", &path, err)
+ })?;
+
+ proxmox::tools::fs::lock_file(&mut handle, true, Some(Duration::from_secs(5)))
+ .map_err(|err| {
+ format_err!(
+ "unable to acquire lock on manifest {:?} - {}", &path, err
+ )
+ })?;
+
+ Ok(handle)
+ }
+
+ /// Load the manifest with a lock, so it can be safely written back again.
+ /// Most operations consist of "load -> edit unprotected -> write back" so the lock is not held
+ /// for long - thus we wait a few seconds for the lock to become available before giving up. In
+ /// case of verify it might take longer, so all callers must either be able to cope with a
+ /// failure or ensure that they are exclusive with verify.
+ pub fn load_manifest_locked(
+ &self,
+ backup_dir: &BackupDir,
+ ) -> Result<(BackupManifest, ManifestLock), Error> {
+ let guard = self.lock_manifest(backup_dir)?;
+ let blob = self.load_blob(backup_dir, MANIFEST_BLOB_NAME)?;
+ let manifest = BackupManifest::try_from(blob)?;
+ Ok((manifest, guard))
+ }
+
+ /// Load the manifest without a lock. Cannot be edited and written back.
pub fn load_manifest(
&self,
backup_dir: &BackupDir,
@@ -668,10 +712,12 @@ impl DataStore {
Ok((manifest, raw_size))
}
+ /// Store a given manifest. Requires a lock acquired with load_manifest_locked for safety.
pub fn store_manifest(
&self,
backup_dir: &BackupDir,
manifest: BackupManifest,
+ _manifest_lock: ManifestLock,
) -> Result<(), Error> {
let manifest = serde_json::to_value(manifest)?;
let manifest = serde_json::to_string_pretty(&manifest)?;
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 05b6ba86..839987e1 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -300,8 +300,8 @@ pub fn verify_backup_dir(
return Ok(true);
}
- let mut manifest = match datastore.load_manifest(&backup_dir) {
- Ok((manifest, _)) => manifest,
+ let (mut manifest, manifest_guard) = match datastore.load_manifest_locked(&backup_dir) {
+ Ok((manifest, guard)) => (manifest, guard),
Err(err) => {
task_log!(
worker,
@@ -368,7 +368,7 @@ pub fn verify_backup_dir(
upid,
};
manifest.unprotected["verify_state"] = serde_json::to_value(verify_state)?;
- datastore.store_manifest(&backup_dir, manifest)
+ datastore.store_manifest(&backup_dir, manifest, manifest_guard)
.map_err(|err| format_err!("unable to store manifest blob - {}", err))?;
Ok(error_count == 0)
--
2.20.1
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking Stefan Reiter
@ 2020-10-15 5:25 ` Dietmar Maurer
2020-10-15 7:04 ` Fabian Grünbichler
2020-10-15 5:39 ` Dietmar Maurer
2020-10-15 5:43 ` Dietmar Maurer
2 siblings, 1 reply; 26+ messages in thread
From: Dietmar Maurer @ 2020-10-15 5:25 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Stefan Reiter
AFAIR flock with atomically replaced files has problems/races.
Unfortunately, I do not remember why. Does somebody remember why? Or is it really safe?
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-15 5:25 ` Dietmar Maurer
@ 2020-10-15 7:04 ` Fabian Grünbichler
0 siblings, 0 replies; 26+ messages in thread
From: Fabian Grünbichler @ 2020-10-15 7:04 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Stefan Reiter
On October 15, 2020 7:25 am, Dietmar Maurer wrote:
> AFAIR flock with atomically replaced files has problems/races.
>
> Unfortunately, I do not remember why. Somebody remember why? Or is it really safe?
You can obtain the flock a second time after replacing the original
(still locked) file. Using a separate lock file that is not touched
other than for flocking solves this problem.
$ flock --verbose testfile -c 'sleep 60'
flock: getting lock took 0.000003 seconds
flock: executing /bin/zsh
while that is running and holding the lock:
$ flock --verbose --wait 5 testfile -c 'sleep 60'
flock: timeout while waiting to get lock
$ mv testfile2 testfile
$ flock --verbose --wait 5 testfile -c 'sleep 60'
flock: getting lock took 0.000003 seconds
flock: executing /bin/zsh
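One way to apply that here (purely illustrative - the lock file name
and helper are made up, and this is not what the series does at this
point) would be to flock a dedicated file next to the manifest that is
only ever created, never replaced:

    use std::fs::OpenOptions;
    use std::path::Path;
    use std::time::Duration;
    use anyhow::{format_err, Error};

    // hypothetical: lock "<snapshot>/.manifest.lck" instead of the manifest
    // blob itself, so atomically replacing the manifest can never bypass a
    // lock someone else is holding
    fn lock_manifest_via_lockfile(snapshot_path: &Path) -> Result<std::fs::File, Error> {
        let path = snapshot_path.join(".manifest.lck");
        let mut file = OpenOptions::new()
            .create(true) // created once, never renamed over
            .read(true)
            .write(true)
            .open(&path)
            .map_err(|err| format_err!("unable to open lock file {:?} - {}", path, err))?;

        proxmox::tools::fs::lock_file(&mut file, true, Some(Duration::from_secs(5)))
            .map_err(|err| format_err!("unable to lock {:?} - {}", path, err))?;

        Ok(file)
    }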
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking Stefan Reiter
2020-10-15 5:25 ` Dietmar Maurer
@ 2020-10-15 5:39 ` Dietmar Maurer
2020-10-15 7:53 ` Stefan Reiter
2020-10-15 5:43 ` Dietmar Maurer
2 siblings, 1 reply; 26+ messages in thread
From: Dietmar Maurer @ 2020-10-15 5:39 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Stefan Reiter
Holding the lock for such a long time seems wrong to me.
Instead, can we simply reload the manifest before we update
the verify_state?
> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
> index 05b6ba86..839987e1 100644
> --- a/src/backup/verify.rs
> +++ b/src/backup/verify.rs
> @@ -300,8 +300,8 @@ pub fn verify_backup_dir(
> return Ok(true);
> }
>
> - let mut manifest = match datastore.load_manifest(&backup_dir) {
> - Ok((manifest, _)) => manifest,
> + let (mut manifest, manifest_guard) = match datastore.load_manifest_locked(&backup_dir) {
> + Ok((manifest, guard)) => (manifest, guard),
> Err(err) => {
> task_log!(
> worker,
> @@ -368,7 +368,7 @@ pub fn verify_backup_dir(
> upid,
> };
> manifest.unprotected["verify_state"] = serde_json::to_value(verify_state)?;
> - datastore.store_manifest(&backup_dir, manifest)
> + datastore.store_manifest(&backup_dir, manifest, manifest_guard)
> .map_err(|err| format_err!("unable to store manifest blob - {}", err))?;
>
> Ok(error_count == 0)
> --
> 2.20.1
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-15 5:39 ` Dietmar Maurer
@ 2020-10-15 7:53 ` Stefan Reiter
0 siblings, 0 replies; 26+ messages in thread
From: Stefan Reiter @ 2020-10-15 7:53 UTC (permalink / raw)
To: Dietmar Maurer, Proxmox Backup Server development discussion
On 10/15/20 7:39 AM, Dietmar Maurer wrote:
> Holding the lock for such a long time seems wrong to me.
>
> Instead, can we simply reload the manifest before we update
> the verify_state?
>
I did so to avoid the manifest changing while we're using the snapshot,
but I suppose we already hold the flock on the snapshot dir anyway. So
yes, can do.
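What that could look like (rough sketch only, assuming the locked-load
API from this patch; not what this series actually commits): run the
verification without holding the manifest lock, and re-load the
manifest under its lock only for the short verify_state update at the
end.

    use anyhow::Error;
    use crate::backup::{BackupDir, DataStore};

    // hypothetical helper: the manifest lock is held only for this short
    // load/edit/store round trip
    fn store_verify_state(
        datastore: &DataStore,
        backup_dir: &BackupDir,
        verify_state: serde_json::Value,
    ) -> Result<(), Error> {
        let (mut manifest, guard) = datastore.load_manifest_locked(backup_dir)?;
        manifest.unprotected["verify_state"] = verify_state;
        datastore.store_manifest(backup_dir, manifest, guard)
    }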
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking Stefan Reiter
2020-10-15 5:25 ` Dietmar Maurer
2020-10-15 5:39 ` Dietmar Maurer
@ 2020-10-15 5:43 ` Dietmar Maurer
2020-10-15 7:53 ` Stefan Reiter
2 siblings, 1 reply; 26+ messages in thread
From: Dietmar Maurer @ 2020-10-15 5:43 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Stefan Reiter
Note: This API does not work if you want to create a new manifest (because you
do not have a lock).
> + /// Store a given manifest. Requires a lock acquired with load_manifest_locked for safety.
> pub fn store_manifest(
> &self,
> backup_dir: &BackupDir,
> manifest: BackupManifest,
> + _manifest_lock: ManifestLock,
> ) -> Result<(), Error> {
* Re: [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking
2020-10-15 5:43 ` Dietmar Maurer
@ 2020-10-15 7:53 ` Stefan Reiter
0 siblings, 0 replies; 26+ messages in thread
From: Stefan Reiter @ 2020-10-15 7:53 UTC (permalink / raw)
To: Dietmar Maurer, Proxmox Backup Server development discussion
On 10/15/20 7:43 AM, Dietmar Maurer wrote:
> Note: This api does not work if you want to create a new manifest (because you
> do not have a lock)
>
>> + /// Store a given manifest. Requires a lock acquired with load_manifest_locked for safety.
>> pub fn store_manifest(
>> &self,
>> backup_dir: &BackupDir,
>> manifest: BackupManifest,
>> + _manifest_lock: ManifestLock,
>> ) -> Result<(), Error> {
Correct me if I'm wrong, but I believe we never create a manifest,
except when we upload it from the client the first time (or on pull,
which is the same thing).
That's also why I consider the flock directly on the manifest safe: the
only path where we write a new manifest is on creation, where we
couldn't possibly have held a lock on it before, and whenever we change
it afterwards we use this API, meaning the flock is required to be
held, and thus nothing can concurrently replace the file underneath us.
* [pbs-devel] [PATCH proxmox-backup 09/11] datastore: remove individual snapshots before group
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (7 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 08/11] datastore: add manifest locking Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-15 5:51 ` [pbs-devel] applied: " Dietmar Maurer
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 10/11] rustdoc: add crate level doc Stefan Reiter
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 11/11] rustdoc: overhaul backup rustdoc and add locking table Stefan Reiter
10 siblings, 1 reply; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
Removing a snapshot has additional safety checks that we don't want to
skip when removing an entire group (i.e. locking the manifest and
notifying GC).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/backup/datastore.rs | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index f8c228fc..30e69e6f 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -215,10 +215,17 @@ impl DataStore {
let _guard = tools::fs::lock_dir_noblock(&full_path, "backup group", "possible running backup")?;
log::info!("removing backup group {:?}", full_path);
+
+ // remove all individual backup dirs first to ensure nothing is using them
+ for snap in backup_group.list_backups(&self.base_path())? {
+ self.remove_backup_dir(&snap.backup_dir, false)?;
+ }
+
+ // no snapshots left, we can now safely remove the empty folder
std::fs::remove_dir_all(&full_path)
.map_err(|err| {
format_err!(
- "removing backup group {:?} failed - {}",
+ "removing backup group directory {:?} failed - {}",
full_path,
err,
)
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 10/11] rustdoc: add crate level doc
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (8 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 09/11] datastore: remove individual snapshots before group Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 11/11] rustdoc: overhaul backup rustdoc and add locking table Stefan Reiter
10 siblings, 0 replies; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
Contains a link to the 'backup' module's doc, as that explains a lot
about the inner workings of PBS and probably marks a good entry point
for new readers.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
src/lib.rs | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/lib.rs b/src/lib.rs
index c44c9ae1..cc51b2e0 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,3 +1,8 @@
+//! See the different modules for documentation on their usage.
+//!
+//! The [backup](backup/index.html) module contains some detailed information
+//! on the inner workings of the backup server regarding data storage.
+
pub mod task;
#[macro_use]
--
2.20.1
* [pbs-devel] [PATCH proxmox-backup 11/11] rustdoc: overhaul backup rustdoc and add locking table
2020-10-14 12:16 [pbs-devel] [PATCH 00/11] Locking and rustdoc improvements Stefan Reiter
` (9 preceding siblings ...)
2020-10-14 12:16 ` [pbs-devel] [PATCH proxmox-backup 10/11] rustdoc: add crate level doc Stefan Reiter
@ 2020-10-14 12:16 ` Stefan Reiter
10 siblings, 0 replies; 26+ messages in thread
From: Stefan Reiter @ 2020-10-14 12:16 UTC (permalink / raw)
To: pbs-devel
Rewrite most of the documentation to be more readable and correct
(according to the current implementation).
Add a table visualizing all the different locks used to synchronize
concurrent operations.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
FYI: I used https://www.tablesgenerator.com/markdown_tables for the table
src/backup.rs | 201 ++++++++++++++++++++++++++++++--------------------
1 file changed, 121 insertions(+), 80 deletions(-)
diff --git a/src/backup.rs b/src/backup.rs
index 1b2180bc..95e45d10 100644
--- a/src/backup.rs
+++ b/src/backup.rs
@@ -1,107 +1,148 @@
-//! This module implements the proxmox backup data storage
+//! This module implements the data storage and access layer.
//!
-//! Proxmox backup splits large files into chunks, and stores them
-//! deduplicated using a content addressable storage format.
+//! # Data formats
//!
-//! A chunk is simply defined as binary blob, which is stored inside a
-//! `ChunkStore`, addressed by the SHA256 digest of the binary blob.
+//! PBS splits large files into chunks, and stores them deduplicated using
+//! a content addressable storage format.
//!
-//! Index files are used to reconstruct the original file. They
-//! basically contain a list of SHA256 checksums. The `DynamicIndex*`
-//! format is able to deal with dynamic chunk sizes, whereas the
-//! `FixedIndex*` format is an optimization to store a list of equal
-//! sized chunks.
+//! Backup snapshots are stored as folders containing a manifest file and
+//! potentially one or more index or blob files.
//!
-//! # ChunkStore Locking
+//! The manifest contains hashes of all other files and can be signed by
+//! the client.
//!
-//! We need to be able to restart the proxmox-backup service daemons,
-//! so that we can update the software without rebooting the host. But
-//! such restarts must not abort running backup jobs, so we need to
-//! keep the old service running until those jobs are finished. This
-//! implies that we need some kind of locking for the
-//! ChunkStore. Please note that it is perfectly valid to have
-//! multiple parallel ChunkStore writers, even when they write the
-//! same chunk (because the chunk would have the same name and the
-//! same data). The only real problem is garbage collection, because
-//! we need to avoid deleting chunks which are still referenced.
+//! Blob files contain data directly. They are used for config files and
+//! the like.
//!
-//! * Read Index Files:
+//! Index files are used to reconstruct an original file. They contain a
+//! list of SHA256 checksums. The `DynamicIndex*` format is able to deal
+//! with dynamic chunk sizes (CT and host backups), whereas the
+//! `FixedIndex*` format is an optimization to store a list of equal sized
+//! chunks (VMs, whole block devices).
//!
-//! Acquire shared lock for .idx files.
-//!
-//!
-//! * Delete Index Files:
-//!
-//! Acquire exclusive lock for .idx files. This makes sure that we do
-//! not delete index files while they are still in use.
-//!
-//!
-//! * Create Index Files:
-//!
-//! Acquire shared lock for ChunkStore (process wide).
-//!
-//! Note: When creating .idx files, we create temporary a (.tmp) file,
-//! then do an atomic rename ...
-//!
-//!
-//! * Garbage Collect:
-//!
-//! Acquire exclusive lock for ChunkStore (process wide). If we have
-//! already a shared lock for the ChunkStore, try to upgrade that
-//! lock.
-//!
-//!
-//! * Server Restart
-//!
-//! Try to abort the running garbage collection to release exclusive
-//! ChunkStore locks ASAP. Start the new service with the existing listening
-//! socket.
+//! A chunk is defined as a binary blob, which is stored inside a
+//! [ChunkStore](struct.ChunkStore.html) instead of the backup directory
+//! directly, and can be addressed by its SHA256 digest.
//!
//!
//! # Garbage Collection (GC)
//!
-//! Deleting backups is as easy as deleting the corresponding .idx
-//! files. Unfortunately, this does not free up any storage, because
-//! those files just contain references to chunks.
+//! Deleting backups is as easy as deleting the corresponding .idx files.
+//! However, this does not free up any storage, because those files just
+//! contain references to chunks.
//!
//! To free up some storage, we run a garbage collection process at
-//! regular intervals. The collector uses a mark and sweep
-//! approach. In the first phase, it scans all .idx files to mark used
-//! chunks. The second phase then removes all unmarked chunks from the
-//! store.
+//! regular intervals. The collector uses a mark and sweep approach. In
+//! the first phase, it scans all .idx files to mark used chunks. The
+//! second phase then removes all unmarked chunks from the store.
//!
-//! The above locking mechanism makes sure that we are the only
-//! process running GC. But we still want to be able to create backups
-//! during GC, so there may be multiple backup threads/tasks
-//! running. Either started before GC started, or started while GC is
-//! running.
+//! The locking mechanisms mentioned below make sure that we are the only
+//! process running GC. We still want to be able to create backups during
+//! GC, so there may be multiple backup threads/tasks running, either
+//! started before GC, or while GC is running.
//!
//! ## `atime` based GC
//!
//! The idea here is to mark chunks by updating the `atime` (access
-//! timestamp) on the chunk file. This is quite simple and does not
-//! need additional RAM.
+//! timestamp) on the chunk file. This is quite simple and does not need
+//! additional RAM.
//!
//! One minor problem is that recent Linux versions use the `relatime`
-//! mount flag by default for performance reasons (yes, we want
-//! that). When enabled, `atime` data is written to the disk only if
-//! the file has been modified since the `atime` data was last updated
-//! (`mtime`), or if the file was last accessed more than a certain
-//! amount of time ago (by default 24h). So we may only delete chunks
-//! with `atime` older than 24 hours.
-//!
-//! Another problem arises from running backups. The mark phase does
-//! not find any chunks from those backups, because there is no .idx
-//! file for them (created after the backup). Chunks created or
-//! touched by those backups may have an `atime` as old as the start
-//! time of those backups. Please note that the backup start time may
-//! predate the GC start time. So we may only delete chunks older than
-//! the start time of those running backup jobs.
+//! mount flag by default for performance reasons (and we want that). When
+//! enabled, `atime` data is written to the disk only if the file has been
+//! modified since the `atime` data was last updated (`mtime`), or if the
+//! file was last accessed more than a certain amount of time ago (by
+//! default 24h). So we may only delete chunks with `atime` older than 24
+//! hours.
//!
+//! Another problem arises from running backups. The mark phase does not
+//! find any chunks from those backups, because there is no .idx file for
+//! them (created after the backup). Chunks created or touched by those
+//! backups may have an `atime` as old as the start time of those backups.
+//! Please note that the backup start time may predate the GC start time.
+//! So we may only delete chunks older than the start time of those
+//! running backup jobs, which might be more than 24h back (this is the
+//! reason why ProcessLocker exclusive locks only have to be exclusive
+//! between processes, since within one we can determine the age of the
+//! oldest shared lock).
//!
//! ## Store `marks` in RAM using a HASH
//!
-//! Not sure if this is better. TODO
+//! Might be better. Under investigation.
+//!
+//!
+//! # Locking
+//!
+//! Since PBS allows multiple potentially interfering operations at the
+//! same time (e.g. garbage collect, prune, multiple backup creations
+//! (only in seperate groups), forget, ...), these need to lock against
+//! each other in certain scenarios. There is no overarching global lock
+//! though, instead always the finest grained lock possible is used,
+//! because running these operations concurrently is treated as a feature
+//! on its own.
+//!
+//! ## Inter-process Locking
+//!
+//! We need to be able to restart the proxmox-backup service daemons, so
+//! that we can update the software without rebooting the host. But such
+//! restarts must not abort running backup jobs, so we need to keep the
+//! old service running until those jobs are finished. This implies that
+//! we need some kind of locking for modifying chunks and indices in the
+//! ChunkStore.
+//!
+//! Please note that it is perfectly valid to have multiple
+//! parallel ChunkStore writers, even when they write the same chunk
+//! (because the chunk would have the same name and the same data, and
+//! writes are completed atomically via a rename). The only problem is
+//! garbage collection, because we need to avoid deleting chunks which are
+//! still referenced.
+//!
+//! To do this we use the
+//! [ProcessLocker](../tools/struct.ProcessLocker.html).
+//!
+//! ### ChunkStore-wide
+//!
+//! * Create Index Files:
+//!
+//! Acquire shared lock for ChunkStore.
+//!
+//! Note: When creating .idx files, we create a temporary .tmp file,
+//! then do an atomic rename.
+//!
+//! * Garbage Collect:
+//!
+//! Acquire exclusive lock for ChunkStore. If we have
+//! already a shared lock for the ChunkStore, try to upgrade that
+//! lock.
+//!
+//! Exclusive locks only work _between processes_. It is valid to have an
+//! exclusive and one or more shared locks held within one process. Writing
+//! chunks within one process is synchronized using the gc_mutex.
+//!
+//! On server restart, we stop any running GC in the old process to avoid
+//! having the exclusive lock held for too long.
+//!
+//! ## Locking table
+//!
+//! Below table shows all operations that play a role in locking, and which
+//! mechanisms are used to make their concurrent usage safe.
+//!
+//! | starting ><br>v during | read index file | create index file | GC mark | GC sweep | read manifest | write manifest | forget | prune | create backup | verify | reader api |
+//! |-|-|-|-|-|-|-|-|-|-|-|-|
+//! | **read index file** | / | / | / | / | / | / | mmap stays valid, oldest_shared_lock prevents GC | see forget column | / | / | / |
+//! | **create index file** | / | / | / | / | / | / | / | / | /, happens at the end, after all chunks are touched | /, only happens without a manifest | / |
+//! | **GC mark** | / | Datastore process-lock shared | gc_mutex, exclusive ProcessLocker | gc_mutex | / | /, GC only cares about index files, not manifests | tells GC about removed chunks | see forget column | /, index files don’t exist yet | / | / |
+//! | **GC sweep** | / | Datastore process-lock shared | gc_mutex, exclusive ProcessLocker | gc_mutex | / | / | /, chunks already marked | see forget column | chunks get touched; chunk_store.mutex; oldest PL lock | / | / |
+//! | **read manifest** | / | / | / | / | update_manifest lock | update_manifest lock | update_manifest lock, remove dir under lock | see forget column | / | / | / |
+//! | **write manifest** | / | / | / | / | update_manifest lock | update_manifest lock | update_manifest lock | see forget column | /, “write manifest” happens at the end | /, can call “write manifest”, see that column | / |
+//! | **forget** | / | / | removed_during_gc mutex is held during unlink | marking done, doesn’t matter if forgotten now | / | update_manifest lock | /, unlink is atomic | causes forget to fail, but that’s OK | running backup has snapshot flock | /, potentially detects missing folder | shared snap flock |
+//! | **prune** | / | / | see forget row | see forget row | / | see forget row | causes warn in prune, but no error | see forget column | running and last non-running can’t be pruned | see forget row | shared snap flock |
+//! | **create backup** | / | only time this happens, thus has snapshot flock | / | chunks get touched; chunk_store.mutex; oldest PL lock | / | see “write manifest” row | snapshot flock, can’t be forgotten | running and last non-running can’t be pruned | snapshot group flock, only one running per group | /, won’t be verified since manifest missing | / |
+//! | **verify** | / | / | / | / | / | see “write manifest” row | /, potentially detects missing folder | see forget column | / | /, but useless (“write manifest” protects itself) | / |
+//! | **reader api** | / | / | / | /, open snap can’t be forgotten, so ref must exist | / | / | prevented by shared snap flock | prevented by shared snap flock | / | / | /, lock is shared |
+//!
+//! * / = no interaction
+//! * shared/exclusive from POV of 'starting' process
use anyhow::{bail, Error};
--
2.20.1