public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
@ 2024-09-12 14:32 Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description Christian Ebner
                   ` (35 more replies)
  0 siblings, 36 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

This patch series implements the functionality to extend the current
sync jobs in pull direction by an additional push direction, allowing
the contents of a local source datastore to be pushed to a remote
target.

The series implements this by using the REST API of the remote target
for fetching, creating and/or deleting namespaces, groups and backups,
and reuses the client's backup writer functionality to create snapshots
by writing a manifest on the remote target and syncing the fixed
indexes, dynamic indexes or blobs contained in the source manifest to
the remote, also preserving encryption information.
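
As a rough, non-authoritative orientation, a sketch of that per-snapshot
push flow follows; all helper names and accessors in it (`LocalSnapshot`,
`load_manifest`, `files`, `is_blob`, `push_blob`, `push_index_and_chunks`,
`finish_with_manifest`) are illustrative placeholders, not the actual API
introduced by this series:

```
// Illustrative sketch only; the real implementation lands in src/server/push.rs (patch 15).
async fn push_snapshot_sketch(
    writer: &BackupWriter,    // pbs-client backup writer, connected to the remote target
    snapshot: &LocalSnapshot, // hypothetical handle to the local source snapshot
) -> Result<(), Error> {
    // read the source manifest (hypothetical helper)
    let manifest = snapshot.load_manifest()?;
    for file in manifest.files() {
        if file.is_blob() {
            // blobs are uploaded as-is, preserving encryption information
            writer.push_blob(snapshot, file).await?; // hypothetical helper
        } else {
            // fixed/dynamic indexes: upload the index plus any chunks missing on the remote
            writer.push_index_and_chunks(snapshot, file).await?; // hypothetical helper
        }
    }
    // finally write the manifest on the remote to complete the snapshot
    writer.finish_with_manifest(&manifest).await?; // hypothetical helper
    Ok(())
}
```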

Thanks to Fabian for the further feedback on the previous version of
the patches, especially regarding users and ACLs.

Most notable changes since version 2 of the patch series include:
- Add checks and extend roles and privs to allow restricting a local
  user's access to remote datastore operations. In order to perform a
  full sync in push direction, including permissions for namespace
  creation and for deleting contents with remove vanished, an acl.cfg
  looks like the following:
  ```
  acl:1:/datastore/datastore:syncoperator@pbs:DatastoreAudit
  acl:1:/remote:syncoperator@pbs:RemoteSyncOperator
  acl:1:/remote/local/pushme:syncoperator@pbs:RemoteDatastoreModify,RemoteDatastorePrune,RemoteSyncPushOperator
  ```
  Based on further feedback, privs might get grouped further or an
  additional role containing most of these might be created.
- Drop the patch introducing a `no-timestamp-check` flag for the backup
  client; as pointed out by Fabian, this is not needed, since only
  backups newer than the currently last available one will be pushed.
- Fix reading snapshots from the source by using the correct namespace.
- Rename PullParameters' `owner` to the more fitting `local_user`.
- Fix typos in remote sync push operator comment.
- Fix comments not matching the functionality of the CLI implementations.

The patch series is structured as follows in this version:
- patch 1 is a cleanup patch fixing typos in api documentation.
- patches 2 to 7 restructure the current code so that functionality of
  the current pull implementation can also be reused for the push
  implementation.
- patch 8 extends the backup writer's functionality to be able to push
  snapshots to the target.
- patches 9 to 11 are once again preparatory patches for shared
  implementation of sync jobs in pull and push direction.
- patches 12 to 14 define the required permission acls and roles.
- patch 15 implements almost all of the logic required for the push,
  including pushing of the datastore, namespaces, groups and snapshots,
  also taking into account filters and additional sync flags.
- patch 16 extends the current sync job configuration by a new config
  type `sync-push`, allowing sync jobs in push direction to be
  configured while limiting possible misconfiguration errors.
- patches 17 to 28 expose the new sync job direction via the API, CLI
  and WebUI.
- patches 29 to 33, finally, are follow-up patches, changing the return
  type of the backup group and namespace delete REST API endpoints to
  return statistics on the deleted snapshots, groups and namespaces,
  which are then used to include this information in the task log (a
  hedged sketch of such a stats type follows below). As this is an
  API-breaking change, these patches are kept independent from the
  others.
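
The following is only a hedged sketch of what such a delete-stats type
might look like; the field names are guessed from the accessors already
used in this series (`removed_snapshots()`, `all_removed()`) and are not
the final `BackupGroupDeleteStats` api type from patches 29 to 33:

```
// Hedged sketch only; see patches 29 to 33 for the actual api type.
#[derive(Default)]
pub struct BackupGroupDeleteStats {
    /// Number of snapshots removed from the group
    removed_snapshots: usize,
    /// Number of protected snapshots that were kept
    protected_snapshots: usize,
}

impl BackupGroupDeleteStats {
    pub fn all_removed(&self) -> bool {
        self.protected_snapshots == 0
    }

    pub fn removed_snapshots(&self) -> usize {
        self.removed_snapshots
    }
}
```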

Link to issue on bugtracker:
https://bugzilla.proxmox.com/show_bug.cgi?id=3044

Christian Ebner (33):
  api: datastore: add missing whitespace in description
  server: sync: move sync related stats to common module
  server: sync: move reader trait to common sync module
  server: sync: move source to common sync module
  client: backup writer: bundle upload stats counters
  client: backup writer: factor out merged chunk stream upload
  client: backup writer: add chunk count and duration stats
  client: backup writer: allow push uploading index and chunks
  server: sync: move skip info/reason to common sync module
  server: sync: make skip reason message more generic
  server: sync: factor out namespace depth check into sync module
  config: acl: mention optional namespace acl path component
  config: acl: allow namespace components for remote datastores
  api types: define remote permissions and roles for push sync
  fix #3044: server: implement push support for sync operations
  config: jobs: add `sync-push` config type for push sync jobs
  api: push: implement endpoint for sync in push direction
  api: sync: move sync job invocation to server sync module
  api: sync jobs: expose optional `sync-direction` parameter
  api: sync: add permission checks for push sync jobs
  bin: manager: add datastore push cli command
  ui: group filter: allow to set namespace for local datastore
  ui: sync edit: source group filters based on sync direction
  ui: add view with separate grids for pull and push sync jobs
  ui: sync job: adapt edit window to be used for pull and push
  ui: sync: pass sync-direction to allow removing push jobs
  ui: sync view: do not use data model proxy for store
  ui: sync view: set sync direction when invoking run task via api
  datastore: move `BackupGroupDeleteStats` to api types
  api types: implement api type for `BackupGroupDeleteStats`
  datastore: increment deleted group counter when removing group
  api: datastore/namespace: return backup groups delete stats on remove
  server: sync job: use delete stats provided by the api

 pbs-api-types/src/acl.rs             |  32 +
 pbs-api-types/src/datastore.rs       |  64 ++
 pbs-api-types/src/jobs.rs            |  52 ++
 pbs-client/src/backup_writer.rs      | 228 +++++--
 pbs-config/src/acl.rs                |   7 +-
 pbs-config/src/sync.rs               |  11 +-
 pbs-datastore/src/backup_info.rs     |  34 +-
 pbs-datastore/src/datastore.rs       |  27 +-
 src/api2/admin/datastore.rs          |  24 +-
 src/api2/admin/namespace.rs          |  20 +-
 src/api2/admin/sync.rs               |  45 +-
 src/api2/config/datastore.rs         |  22 +-
 src/api2/config/notifications/mod.rs |  15 +-
 src/api2/config/sync.rs              |  84 ++-
 src/api2/mod.rs                      |   2 +
 src/api2/pull.rs                     | 108 ----
 src/api2/push.rs                     | 182 ++++++
 src/bin/proxmox-backup-manager.rs    | 216 +++++--
 src/bin/proxmox-backup-proxy.rs      |  25 +-
 src/server/mod.rs                    |   3 +
 src/server/pull.rs                   | 658 ++------------------
 src/server/push.rs                   | 883 +++++++++++++++++++++++++++
 src/server/sync.rs                   | 700 +++++++++++++++++++++
 www/Makefile                         |   1 +
 www/config/SyncPullPushView.js       |  60 ++
 www/config/SyncView.js               |  47 +-
 www/datastore/DataStoreList.js       |   2 +-
 www/datastore/Panel.js               |   2 +-
 www/form/GroupFilter.js              |  18 +-
 www/window/SyncJobEdit.js            |  45 +-
 30 files changed, 2706 insertions(+), 911 deletions(-)
 create mode 100644 src/api2/push.rs
 create mode 100644 src/server/push.rs
 create mode 100644 src/server/sync.rs
 create mode 100644 www/config/SyncPullPushView.js

-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel



* [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 02/33] server: sync: move sync related stats to common module Christian Ebner
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/api2/admin/datastore.rs | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index b66449bde..0a5af1e76 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -271,7 +271,7 @@ pub fn list_groups(
     },
     access: {
         permission: &Permission::Anybody,
-        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any\
+        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any \
             or DATASTORE_PRUNE and being the owner of the group",
     },
 )]
@@ -378,7 +378,7 @@ pub async fn list_snapshot_files(
     },
     access: {
         permission: &Permission::Anybody,
-        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any\
+        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any \
             or DATASTORE_PRUNE and being the owner of the group",
     },
 )]
@@ -958,7 +958,7 @@ pub fn verify(
     returns: pbs_api_types::ADMIN_DATASTORE_PRUNE_RETURN_TYPE,
     access: {
         permission: &Permission::Anybody,
-        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any\
+        description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any \
             or DATASTORE_PRUNE and being the owner of the group",
     },
 )]
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 02/33] server: sync: move sync related stats to common module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 03/33] server: sync: move reader trait to common sync module Christian Ebner
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Move and rename `PullStats` to `SyncStats` and also move
`RemovedVanishedStats`, making them reusable for sync operations in
push as well as pull direction.
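
A minimal usage sketch of the moved types, mirroring how the pull code
accumulates per-archive stats into a job-wide total (assumes the
`SyncStats`/`RemovedVanishedStats` definitions added to
src/server/sync.rs by this patch):

```
use std::time::Duration;

let mut total = SyncStats::default();
// per-archive chunk statistics are simply summed up
total.add(SyncStats {
    chunk_count: 10,
    bytes: 4096,
    elapsed: Duration::from_secs(1),
    removed: None,
});
// removals fold in via the From<RemovedVanishedStats> conversion
total.add(SyncStats::from(RemovedVanishedStats {
    groups: 0,
    snapshots: 1,
    namespaces: 0,
}));
assert_eq!(total.chunk_count, 10);
assert_eq!(total.removed.as_ref().map(|r| r.snapshots), Some(1));
```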

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/mod.rs  |   1 +
 src/server/pull.rs | 121 ++++++++++++++-------------------------------
 src/server/sync.rs |  51 +++++++++++++++++++
 3 files changed, 89 insertions(+), 84 deletions(-)
 create mode 100644 src/server/sync.rs

diff --git a/src/server/mod.rs b/src/server/mod.rs
index 7f845e5b8..468847c2e 100644
--- a/src/server/mod.rs
+++ b/src/server/mod.rs
@@ -34,6 +34,7 @@ pub use report::*;
 pub mod auth;
 
 pub(crate) mod pull;
+pub(crate) mod sync;
 
 pub(crate) async fn reload_proxy_certificate() -> Result<(), Error> {
     let proxy_pid = proxmox_rest_server::read_pid(pbs_buildcfg::PROXMOX_BACKUP_PROXY_PID_FN)?;
diff --git a/src/server/pull.rs b/src/server/pull.rs
index de1bb5d5f..4a97bfaa3 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -5,7 +5,7 @@ use std::io::{Seek, Write};
 use std::path::{Path, PathBuf};
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
-use std::time::{Duration, SystemTime};
+use std::time::SystemTime;
 
 use anyhow::{bail, format_err, Error};
 use http::StatusCode;
@@ -34,6 +34,7 @@ use pbs_datastore::{
 };
 use pbs_tools::sha::sha256;
 
+use super::sync::{RemovedVanishedStats, SyncStats};
 use crate::backup::{check_ns_modification_privs, check_ns_privs, ListAccessibleBackupGroups};
 use crate::tools::parallel_handler::ParallelHandler;
 
@@ -64,54 +65,6 @@ pub(crate) struct LocalSource {
     ns: BackupNamespace,
 }
 
-#[derive(Default)]
-pub(crate) struct RemovedVanishedStats {
-    pub(crate) groups: usize,
-    pub(crate) snapshots: usize,
-    pub(crate) namespaces: usize,
-}
-
-impl RemovedVanishedStats {
-    fn add(&mut self, rhs: RemovedVanishedStats) {
-        self.groups += rhs.groups;
-        self.snapshots += rhs.snapshots;
-        self.namespaces += rhs.namespaces;
-    }
-}
-
-#[derive(Default)]
-pub(crate) struct PullStats {
-    pub(crate) chunk_count: usize,
-    pub(crate) bytes: usize,
-    pub(crate) elapsed: Duration,
-    pub(crate) removed: Option<RemovedVanishedStats>,
-}
-
-impl From<RemovedVanishedStats> for PullStats {
-    fn from(removed: RemovedVanishedStats) -> Self {
-        Self {
-            removed: Some(removed),
-            ..Default::default()
-        }
-    }
-}
-
-impl PullStats {
-    fn add(&mut self, rhs: PullStats) {
-        self.chunk_count += rhs.chunk_count;
-        self.bytes += rhs.bytes;
-        self.elapsed += rhs.elapsed;
-
-        if let Some(rhs_removed) = rhs.removed {
-            if let Some(ref mut removed) = self.removed {
-                removed.add(rhs_removed);
-            } else {
-                self.removed = Some(rhs_removed);
-            }
-        }
-    }
-}
-
 #[async_trait::async_trait]
 /// `PullSource` is a trait that provides an interface for pulling data/information from a source.
 /// The trait includes methods for listing namespaces, groups, and backup directories,
@@ -576,7 +529,7 @@ async fn pull_index_chunks<I: IndexFile>(
     target: Arc<DataStore>,
     index: I,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
-) -> Result<PullStats, Error> {
+) -> Result<SyncStats, Error> {
     use futures::stream::{self, StreamExt, TryStreamExt};
 
     let start_time = SystemTime::now();
@@ -663,7 +616,7 @@ async fn pull_index_chunks<I: IndexFile>(
         HumanByte::new_binary(bytes as f64 / elapsed.as_secs_f64()),
     );
 
-    Ok(PullStats {
+    Ok(SyncStats {
         chunk_count,
         bytes,
         elapsed,
@@ -701,7 +654,7 @@ async fn pull_single_archive<'a>(
     snapshot: &'a pbs_datastore::BackupDir,
     archive_info: &'a FileInfo,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
-) -> Result<PullStats, Error> {
+) -> Result<SyncStats, Error> {
     let archive_name = &archive_info.filename;
     let mut path = snapshot.full_path();
     path.push(archive_name);
@@ -709,7 +662,7 @@ async fn pull_single_archive<'a>(
     let mut tmp_path = path.clone();
     tmp_path.set_extension("tmp");
 
-    let mut pull_stats = PullStats::default();
+    let mut sync_stats = SyncStats::default();
 
     info!("sync archive {archive_name}");
 
@@ -735,7 +688,7 @@ async fn pull_single_archive<'a>(
                     downloaded_chunks,
                 )
                 .await?;
-                pull_stats.add(stats);
+                sync_stats.add(stats);
             }
         }
         ArchiveType::FixedIndex => {
@@ -755,7 +708,7 @@ async fn pull_single_archive<'a>(
                     downloaded_chunks,
                 )
                 .await?;
-                pull_stats.add(stats);
+                sync_stats.add(stats);
             }
         }
         ArchiveType::Blob => {
@@ -767,7 +720,7 @@ async fn pull_single_archive<'a>(
     if let Err(err) = std::fs::rename(&tmp_path, &path) {
         bail!("Atomic rename file {:?} failed - {}", path, err);
     }
-    Ok(pull_stats)
+    Ok(sync_stats)
 }
 
 /// Actual implementation of pulling a snapshot.
@@ -783,8 +736,8 @@ async fn pull_snapshot<'a>(
     reader: Arc<dyn PullReader + 'a>,
     snapshot: &'a pbs_datastore::BackupDir,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
-) -> Result<PullStats, Error> {
-    let mut pull_stats = PullStats::default();
+) -> Result<SyncStats, Error> {
+    let mut sync_stats = SyncStats::default();
     let mut manifest_name = snapshot.full_path();
     manifest_name.push(MANIFEST_BLOB_NAME);
 
@@ -800,7 +753,7 @@ async fn pull_snapshot<'a>(
     {
         tmp_manifest_blob = data;
     } else {
-        return Ok(pull_stats);
+        return Ok(sync_stats);
     }
 
     if manifest_name.exists() {
@@ -822,7 +775,7 @@ async fn pull_snapshot<'a>(
             };
             info!("no data changes");
             let _ = std::fs::remove_file(&tmp_manifest_name);
-            return Ok(pull_stats); // nothing changed
+            return Ok(sync_stats); // nothing changed
         }
     }
 
@@ -869,7 +822,7 @@ async fn pull_snapshot<'a>(
 
         let stats =
             pull_single_archive(reader.clone(), snapshot, item, downloaded_chunks.clone()).await?;
-        pull_stats.add(stats);
+        sync_stats.add(stats);
     }
 
     if let Err(err) = std::fs::rename(&tmp_manifest_name, &manifest_name) {
@@ -883,7 +836,7 @@ async fn pull_snapshot<'a>(
         .cleanup_unreferenced_files(&manifest)
         .map_err(|err| format_err!("failed to cleanup unreferenced files - {err}"))?;
 
-    Ok(pull_stats)
+    Ok(sync_stats)
 }
 
 /// Pulls a `snapshot`, removing newly created ones on error, but keeping existing ones in any case.
@@ -894,12 +847,12 @@ async fn pull_snapshot_from<'a>(
     reader: Arc<dyn PullReader + 'a>,
     snapshot: &'a pbs_datastore::BackupDir,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
-) -> Result<PullStats, Error> {
+) -> Result<SyncStats, Error> {
     let (_path, is_new, _snap_lock) = snapshot
         .datastore()
         .create_locked_backup_dir(snapshot.backup_ns(), snapshot.as_ref())?;
 
-    let pull_stats = if is_new {
+    let sync_stats = if is_new {
         info!("sync snapshot {}", snapshot.dir());
 
         match pull_snapshot(reader, snapshot, downloaded_chunks).await {
@@ -913,9 +866,9 @@ async fn pull_snapshot_from<'a>(
                 }
                 return Err(err);
             }
-            Ok(pull_stats) => {
+            Ok(sync_stats) => {
                 info!("sync snapshot {} done", snapshot.dir());
-                pull_stats
+                sync_stats
             }
         }
     } else {
@@ -923,7 +876,7 @@ async fn pull_snapshot_from<'a>(
         pull_snapshot(reader, snapshot, downloaded_chunks).await?
     };
 
-    Ok(pull_stats)
+    Ok(sync_stats)
 }
 
 #[derive(PartialEq, Eq)]
@@ -1027,7 +980,7 @@ async fn pull_group(
     source_namespace: &BackupNamespace,
     group: &BackupGroup,
     progress: &mut StoreProgress,
-) -> Result<PullStats, Error> {
+) -> Result<SyncStats, Error> {
     let mut already_synced_skip_info = SkipInfo::new(SkipReason::AlreadySynced);
     let mut transfer_last_skip_info = SkipInfo::new(SkipReason::TransferLast);
 
@@ -1084,7 +1037,7 @@ async fn pull_group(
 
     progress.group_snapshots = list.len() as u64;
 
-    let mut pull_stats = PullStats::default();
+    let mut sync_stats = SyncStats::default();
 
     for (pos, from_snapshot) in list.into_iter().enumerate() {
         let to_snapshot = params
@@ -1102,7 +1055,7 @@ async fn pull_group(
         info!("percentage done: {progress}");
 
         let stats = result?; // stop on error
-        pull_stats.add(stats);
+        sync_stats.add(stats);
     }
 
     if params.remove_vanished {
@@ -1128,7 +1081,7 @@ async fn pull_group(
                 .target
                 .store
                 .remove_backup_dir(&target_ns, snapshot.as_ref(), false)?;
-            pull_stats.add(PullStats::from(RemovedVanishedStats {
+            sync_stats.add(SyncStats::from(RemovedVanishedStats {
                 snapshots: 1,
                 groups: 0,
                 namespaces: 0,
@@ -1136,7 +1089,7 @@ async fn pull_group(
         }
     }
 
-    Ok(pull_stats)
+    Ok(sync_stats)
 }
 
 fn check_and_create_ns(params: &PullParameters, ns: &BackupNamespace) -> Result<bool, Error> {
@@ -1253,7 +1206,7 @@ fn check_and_remove_vanished_ns(
 /// - remote namespaces are filtered by remote
 /// - creation and removal of sub-NS checked here
 /// - access to sub-NS checked here
-pub(crate) async fn pull_store(mut params: PullParameters) -> Result<PullStats, Error> {
+pub(crate) async fn pull_store(mut params: PullParameters) -> Result<SyncStats, Error> {
     // explicit create shared lock to prevent GC on newly created chunks
     let _shared_store_lock = params.target.store.try_shared_chunk_store_lock()?;
     let mut errors = false;
@@ -1286,7 +1239,7 @@ pub(crate) async fn pull_store(mut params: PullParameters) -> Result<PullStats,
 
     let (mut groups, mut snapshots) = (0, 0);
     let mut synced_ns = HashSet::with_capacity(namespaces.len());
-    let mut pull_stats = PullStats::default();
+    let mut sync_stats = SyncStats::default();
 
     for namespace in namespaces {
         let source_store_ns_str = print_store_and_ns(params.source.get_store(), &namespace);
@@ -1310,10 +1263,10 @@ pub(crate) async fn pull_store(mut params: PullParameters) -> Result<PullStats,
         }
 
         match pull_ns(&namespace, &mut params).await {
-            Ok((ns_progress, ns_pull_stats, ns_errors)) => {
+            Ok((ns_progress, ns_sync_stats, ns_errors)) => {
                 errors |= ns_errors;
 
-                pull_stats.add(ns_pull_stats);
+                sync_stats.add(ns_sync_stats);
 
                 if params.max_depth != Some(0) {
                     groups += ns_progress.done_groups;
@@ -1342,14 +1295,14 @@ pub(crate) async fn pull_store(mut params: PullParameters) -> Result<PullStats,
     if params.remove_vanished {
         let (has_errors, stats) = check_and_remove_vanished_ns(&params, synced_ns)?;
         errors |= has_errors;
-        pull_stats.add(PullStats::from(stats));
+        sync_stats.add(SyncStats::from(stats));
     }
 
     if errors {
         bail!("sync failed with some errors.");
     }
 
-    Ok(pull_stats)
+    Ok(sync_stats)
 }
 
 /// Pulls a namespace according to `params`.
@@ -1367,7 +1320,7 @@ pub(crate) async fn pull_store(mut params: PullParameters) -> Result<PullStats,
 pub(crate) async fn pull_ns(
     namespace: &BackupNamespace,
     params: &mut PullParameters,
-) -> Result<(StoreProgress, PullStats, bool), Error> {
+) -> Result<(StoreProgress, SyncStats, bool), Error> {
     let mut list: Vec<BackupGroup> = params.source.list_groups(namespace, &params.owner).await?;
 
     list.sort_unstable_by(|a, b| {
@@ -1397,7 +1350,7 @@ pub(crate) async fn pull_ns(
     }
 
     let mut progress = StoreProgress::new(list.len() as u64);
-    let mut pull_stats = PullStats::default();
+    let mut sync_stats = SyncStats::default();
 
     let target_ns = namespace.map_prefix(&params.source.get_ns(), &params.target.ns)?;
 
@@ -1432,7 +1385,7 @@ pub(crate) async fn pull_ns(
             errors = true; // do not stop here, instead continue
         } else {
             match pull_group(params, namespace, &group, &mut progress).await {
-                Ok(stats) => pull_stats.add(stats),
+                Ok(stats) => sync_stats.add(stats),
                 Err(err) => {
                     info!("sync group {} failed - {err}", &group);
                     errors = true; // do not stop here, instead continue
@@ -1466,13 +1419,13 @@ pub(crate) async fn pull_ns(
                     Ok(stats) => {
                         if !stats.all_removed() {
                             info!("kept some protected snapshots of group '{local_group}'");
-                            pull_stats.add(PullStats::from(RemovedVanishedStats {
+                            sync_stats.add(SyncStats::from(RemovedVanishedStats {
                                 snapshots: stats.removed_snapshots(),
                                 groups: 0,
                                 namespaces: 0,
                             }));
                         } else {
-                            pull_stats.add(PullStats::from(RemovedVanishedStats {
+                            sync_stats.add(SyncStats::from(RemovedVanishedStats {
                                 snapshots: stats.removed_snapshots(),
                                 groups: 1,
                                 namespaces: 0,
@@ -1493,5 +1446,5 @@ pub(crate) async fn pull_ns(
         };
     }
 
-    Ok((progress, pull_stats, errors))
+    Ok((progress, sync_stats, errors))
 }
diff --git a/src/server/sync.rs b/src/server/sync.rs
new file mode 100644
index 000000000..5f143ef63
--- /dev/null
+++ b/src/server/sync.rs
@@ -0,0 +1,51 @@
+//! Sync datastore contents from source to target, either in push or pull direction
+
+use std::time::Duration;
+
+#[derive(Default)]
+pub(crate) struct RemovedVanishedStats {
+    pub(crate) groups: usize,
+    pub(crate) snapshots: usize,
+    pub(crate) namespaces: usize,
+}
+
+impl RemovedVanishedStats {
+    pub(crate) fn add(&mut self, rhs: RemovedVanishedStats) {
+        self.groups += rhs.groups;
+        self.snapshots += rhs.snapshots;
+        self.namespaces += rhs.namespaces;
+    }
+}
+
+#[derive(Default)]
+pub(crate) struct SyncStats {
+    pub(crate) chunk_count: usize,
+    pub(crate) bytes: usize,
+    pub(crate) elapsed: Duration,
+    pub(crate) removed: Option<RemovedVanishedStats>,
+}
+
+impl From<RemovedVanishedStats> for SyncStats {
+    fn from(removed: RemovedVanishedStats) -> Self {
+        Self {
+            removed: Some(removed),
+            ..Default::default()
+        }
+    }
+}
+
+impl SyncStats {
+    pub(crate) fn add(&mut self, rhs: SyncStats) {
+        self.chunk_count += rhs.chunk_count;
+        self.bytes += rhs.bytes;
+        self.elapsed += rhs.elapsed;
+
+        if let Some(rhs_removed) = rhs.removed {
+            if let Some(ref mut removed) = self.removed {
+                removed.add(rhs_removed);
+            } else {
+                self.removed = Some(rhs_removed);
+            }
+        }
+    }
+}
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 03/33] server: sync: move reader trait to common sync module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 02/33] server: sync: move sync related stats to common module Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 04/33] server: sync: move source " Christian Ebner
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Move the `PullReader` trait and the types implementing it to the
common sync module, so they can be reused by the push direction variant
of a sync job as well.

Adapt the naming to be direction-agnostic by renaming the `PullReader`
trait to `SyncSourceReader`, `LocalReader` to `LocalSourceReader` and
`RemoteReader` to `RemoteSourceReader`.
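
For illustration, a minimal sketch of driving the renamed trait follows;
the helper itself is hypothetical, the real call sites are
`pull_snapshot()` and `pull_single_archive()` in src/server/pull.rs:

```
// Hypothetical helper, assuming the SyncSourceReader trait as moved by this patch.
async fn fetch_file(
    reader: Arc<dyn SyncSourceReader>,
    filename: &str,
    tmp_path: &Path,
) -> Result<Option<DataBlob>, Error> {
    // returns Ok(None) if the snapshot vanished on the source since the sync started
    reader.load_file_into(filename, tmp_path).await
}
```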

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/pull.rs | 167 +++++----------------------------------------
 src/server/sync.rs | 152 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 168 insertions(+), 151 deletions(-)

diff --git a/src/server/pull.rs b/src/server/pull.rs
index 4a97bfaa3..20b5d9af8 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -1,8 +1,7 @@
 //! Sync datastore by pulling contents from remote server
 
-use std::collections::{HashMap, HashSet};
-use std::io::{Seek, Write};
-use std::path::{Path, PathBuf};
+use std::collections::HashSet;
+use std::io::Seek;
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
 use std::time::SystemTime;
@@ -15,11 +14,11 @@ use serde_json::json;
 use tracing::{info, warn};
 
 use pbs_api_types::{
-    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
+    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, GroupFilter,
     GroupListItem, Operation, RateLimitConfig, Remote, SnapshotListItem, MAX_NAMESPACE_DEPTH,
     PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_READ,
 };
-use pbs_client::{BackupReader, BackupRepository, HttpClient, RemoteChunkReader};
+use pbs_client::{BackupReader, BackupRepository, HttpClient};
 use pbs_config::CachedUserInfo;
 use pbs_datastore::data_blob::DataBlob;
 use pbs_datastore::dynamic_index::DynamicIndexReader;
@@ -29,26 +28,15 @@ use pbs_datastore::manifest::{
     ArchiveType, BackupManifest, FileInfo, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME,
 };
 use pbs_datastore::read_chunk::AsyncReadChunk;
-use pbs_datastore::{
-    check_backup_owner, DataStore, ListNamespacesRecursive, LocalChunkReader, StoreProgress,
-};
+use pbs_datastore::{check_backup_owner, DataStore, ListNamespacesRecursive, StoreProgress};
 use pbs_tools::sha::sha256;
 
-use super::sync::{RemovedVanishedStats, SyncStats};
+use super::sync::{
+    LocalSourceReader, RemoteSourceReader, RemovedVanishedStats, SyncSourceReader, SyncStats,
+};
 use crate::backup::{check_ns_modification_privs, check_ns_privs, ListAccessibleBackupGroups};
 use crate::tools::parallel_handler::ParallelHandler;
 
-struct RemoteReader {
-    backup_reader: Arc<BackupReader>,
-    dir: BackupDir,
-}
-
-struct LocalReader {
-    _dir_lock: Arc<Mutex<proxmox_sys::fs::DirLockGuard>>,
-    path: PathBuf,
-    datastore: Arc<DataStore>,
-}
-
 pub(crate) struct PullTarget {
     store: Arc<DataStore>,
     ns: BackupNamespace,
@@ -97,7 +85,7 @@ trait PullSource: Send + Sync {
         &self,
         ns: &BackupNamespace,
         dir: &BackupDir,
-    ) -> Result<Arc<dyn PullReader>, Error>;
+    ) -> Result<Arc<dyn SyncSourceReader>, Error>;
 }
 
 #[async_trait::async_trait]
@@ -230,7 +218,7 @@ impl PullSource for RemoteSource {
         &self,
         ns: &BackupNamespace,
         dir: &BackupDir,
-    ) -> Result<Arc<dyn PullReader>, Error> {
+    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
         let backup_reader = BackupReader::start(
             &self.client,
             None,
@@ -240,7 +228,7 @@ impl PullSource for RemoteSource {
             tracing::enabled!(tracing::Level::DEBUG),
         )
         .await?;
-        Ok(Arc::new(RemoteReader {
+        Ok(Arc::new(RemoteSourceReader {
             backup_reader,
             dir: dir.clone(),
         }))
@@ -305,14 +293,14 @@ impl PullSource for LocalSource {
         &self,
         ns: &BackupNamespace,
         dir: &BackupDir,
-    ) -> Result<Arc<dyn PullReader>, Error> {
+    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
         let dir = self.store.backup_dir(ns.clone(), dir.clone())?;
         let dir_lock = proxmox_sys::fs::lock_dir_noblock_shared(
             &dir.full_path(),
             "snapshot",
             "locked by another operation",
         )?;
-        Ok(Arc::new(LocalReader {
+        Ok(Arc::new(LocalSourceReader {
             _dir_lock: Arc::new(Mutex::new(dir_lock)),
             path: dir.full_path(),
             datastore: dir.datastore().clone(),
@@ -320,129 +308,6 @@ impl PullSource for LocalSource {
     }
 }
 
-#[async_trait::async_trait]
-/// `PullReader` is a trait that provides an interface for reading data from a source.
-/// The trait includes methods for getting a chunk reader, loading a file, downloading client log, and checking whether chunk sync should be skipped.
-trait PullReader: Send + Sync {
-    /// Returns a chunk reader with the specified encryption mode.
-    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk>;
-
-    /// Asynchronously loads a file from the source into a local file.
-    /// `filename` is the name of the file to load from the source.
-    /// `into` is the path of the local file to load the source file into.
-    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error>;
-
-    /// Tries to download the client log from the source and save it into a local file.
-    async fn try_download_client_log(&self, to_path: &Path) -> Result<(), Error>;
-
-    fn skip_chunk_sync(&self, target_store_name: &str) -> bool;
-}
-
-#[async_trait::async_trait]
-impl PullReader for RemoteReader {
-    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk> {
-        Arc::new(RemoteChunkReader::new(
-            self.backup_reader.clone(),
-            None,
-            crypt_mode,
-            HashMap::new(),
-        ))
-    }
-
-    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error> {
-        let mut tmp_file = std::fs::OpenOptions::new()
-            .write(true)
-            .create(true)
-            .truncate(true)
-            .read(true)
-            .open(into)?;
-        let download_result = self.backup_reader.download(filename, &mut tmp_file).await;
-        if let Err(err) = download_result {
-            match err.downcast_ref::<HttpError>() {
-                Some(HttpError { code, message }) => match *code {
-                    StatusCode::NOT_FOUND => {
-                        info!(
-                            "skipping snapshot {} - vanished since start of sync",
-                            &self.dir,
-                        );
-                        return Ok(None);
-                    }
-                    _ => {
-                        bail!("HTTP error {code} - {message}");
-                    }
-                },
-                None => {
-                    return Err(err);
-                }
-            };
-        };
-        tmp_file.rewind()?;
-        Ok(DataBlob::load_from_reader(&mut tmp_file).ok())
-    }
-
-    async fn try_download_client_log(&self, to_path: &Path) -> Result<(), Error> {
-        let mut tmp_path = to_path.to_owned();
-        tmp_path.set_extension("tmp");
-
-        let tmpfile = std::fs::OpenOptions::new()
-            .write(true)
-            .create(true)
-            .read(true)
-            .open(&tmp_path)?;
-
-        // Note: be silent if there is no log - only log successful download
-        if let Ok(()) = self
-            .backup_reader
-            .download(CLIENT_LOG_BLOB_NAME, tmpfile)
-            .await
-        {
-            if let Err(err) = std::fs::rename(&tmp_path, to_path) {
-                bail!("Atomic rename file {:?} failed - {}", to_path, err);
-            }
-            info!("got backup log file {CLIENT_LOG_BLOB_NAME:?}");
-        }
-
-        Ok(())
-    }
-
-    fn skip_chunk_sync(&self, _target_store_name: &str) -> bool {
-        false
-    }
-}
-
-#[async_trait::async_trait]
-impl PullReader for LocalReader {
-    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk> {
-        Arc::new(LocalChunkReader::new(
-            self.datastore.clone(),
-            None,
-            crypt_mode,
-        ))
-    }
-
-    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error> {
-        let mut tmp_file = std::fs::OpenOptions::new()
-            .write(true)
-            .create(true)
-            .truncate(true)
-            .read(true)
-            .open(into)?;
-        let mut from_path = self.path.clone();
-        from_path.push(filename);
-        tmp_file.write_all(std::fs::read(from_path)?.as_slice())?;
-        tmp_file.rewind()?;
-        Ok(DataBlob::load_from_reader(&mut tmp_file).ok())
-    }
-
-    async fn try_download_client_log(&self, _to_path: &Path) -> Result<(), Error> {
-        Ok(())
-    }
-
-    fn skip_chunk_sync(&self, target_store_name: &str) -> bool {
-        self.datastore.name() == target_store_name
-    }
-}
-
 /// Parameters for a pull operation.
 pub(crate) struct PullParameters {
     /// Where data is pulled from
@@ -650,7 +515,7 @@ fn verify_archive(info: &FileInfo, csum: &[u8; 32], size: u64) -> Result<(), Err
 /// - if archive is an index, pull referenced chunks
 /// - Rename tmp file into real path
 async fn pull_single_archive<'a>(
-    reader: Arc<dyn PullReader + 'a>,
+    reader: Arc<dyn SyncSourceReader + 'a>,
     snapshot: &'a pbs_datastore::BackupDir,
     archive_info: &'a FileInfo,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
@@ -733,7 +598,7 @@ async fn pull_single_archive<'a>(
 /// -- if not, pull it from the remote
 /// - Download log if not already existing
 async fn pull_snapshot<'a>(
-    reader: Arc<dyn PullReader + 'a>,
+    reader: Arc<dyn SyncSourceReader + 'a>,
     snapshot: &'a pbs_datastore::BackupDir,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
 ) -> Result<SyncStats, Error> {
@@ -844,7 +709,7 @@ async fn pull_snapshot<'a>(
 /// The `reader` is configured to read from the source backup directory, while the
 /// `snapshot` is pointing to the local datastore and target namespace.
 async fn pull_snapshot_from<'a>(
-    reader: Arc<dyn PullReader + 'a>,
+    reader: Arc<dyn SyncSourceReader + 'a>,
     snapshot: &'a pbs_datastore::BackupDir,
     downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
 ) -> Result<SyncStats, Error> {
diff --git a/src/server/sync.rs b/src/server/sync.rs
index 5f143ef63..323bc1a27 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -1,7 +1,24 @@
 //! Sync datastore contents from source to target, either in push or pull direction
 
+use std::collections::HashMap;
+use std::io::{Seek, Write};
+use std::path::{Path, PathBuf};
+use std::sync::{Arc, Mutex};
 use std::time::Duration;
 
+use anyhow::{bail, Error};
+use http::StatusCode;
+use tracing::info;
+
+use proxmox_router::HttpError;
+
+use pbs_api_types::{BackupDir, CryptMode};
+use pbs_client::{BackupReader, RemoteChunkReader};
+use pbs_datastore::data_blob::DataBlob;
+use pbs_datastore::manifest::CLIENT_LOG_BLOB_NAME;
+use pbs_datastore::read_chunk::AsyncReadChunk;
+use pbs_datastore::{DataStore, LocalChunkReader};
+
 #[derive(Default)]
 pub(crate) struct RemovedVanishedStats {
     pub(crate) groups: usize,
@@ -49,3 +66,138 @@ impl SyncStats {
         }
     }
 }
+
+#[async_trait::async_trait]
+/// `SyncReader` is a trait that provides an interface for reading data from a source.
+/// The trait includes methods for getting a chunk reader, loading a file, downloading client log,
+/// and checking whether chunk sync should be skipped.
+pub(crate) trait SyncSourceReader: Send + Sync {
+    /// Returns a chunk reader with the specified encryption mode.
+    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk>;
+
+    /// Asynchronously loads a file from the source into a local file.
+    /// `filename` is the name of the file to load from the source.
+    /// `into` is the path of the local file to load the source file into.
+    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error>;
+
+    /// Tries to download the client log from the source and save it into a local file.
+    async fn try_download_client_log(&self, to_path: &Path) -> Result<(), Error>;
+
+    fn skip_chunk_sync(&self, target_store_name: &str) -> bool;
+}
+
+pub(crate) struct RemoteSourceReader {
+    pub(crate) backup_reader: Arc<BackupReader>,
+    pub(crate) dir: BackupDir,
+}
+
+pub(crate) struct LocalSourceReader {
+    pub(crate) _dir_lock: Arc<Mutex<proxmox_sys::fs::DirLockGuard>>,
+    pub(crate) path: PathBuf,
+    pub(crate) datastore: Arc<DataStore>,
+}
+
+#[async_trait::async_trait]
+impl SyncSourceReader for RemoteSourceReader {
+    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk> {
+        Arc::new(RemoteChunkReader::new(
+            self.backup_reader.clone(),
+            None,
+            crypt_mode,
+            HashMap::new(),
+        ))
+    }
+
+    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error> {
+        let mut tmp_file = std::fs::OpenOptions::new()
+            .write(true)
+            .create(true)
+            .truncate(true)
+            .read(true)
+            .open(into)?;
+        let download_result = self.backup_reader.download(filename, &mut tmp_file).await;
+        if let Err(err) = download_result {
+            match err.downcast_ref::<HttpError>() {
+                Some(HttpError { code, message }) => match *code {
+                    StatusCode::NOT_FOUND => {
+                        info!(
+                            "skipping snapshot {} - vanished since start of sync",
+                            &self.dir
+                        );
+                        return Ok(None);
+                    }
+                    _ => {
+                        bail!("HTTP error {code} - {message}");
+                    }
+                },
+                None => {
+                    return Err(err);
+                }
+            };
+        };
+        tmp_file.rewind()?;
+        Ok(DataBlob::load_from_reader(&mut tmp_file).ok())
+    }
+
+    async fn try_download_client_log(&self, to_path: &Path) -> Result<(), Error> {
+        let mut tmp_path = to_path.to_owned();
+        tmp_path.set_extension("tmp");
+
+        let tmpfile = std::fs::OpenOptions::new()
+            .write(true)
+            .create(true)
+            .read(true)
+            .open(&tmp_path)?;
+
+        // Note: be silent if there is no log - only log successful download
+        if let Ok(()) = self
+            .backup_reader
+            .download(CLIENT_LOG_BLOB_NAME, tmpfile)
+            .await
+        {
+            if let Err(err) = std::fs::rename(&tmp_path, to_path) {
+                bail!("Atomic rename file {to_path:?} failed - {err}");
+            }
+            info!("got backup log file {CLIENT_LOG_BLOB_NAME:?}");
+        }
+
+        Ok(())
+    }
+
+    fn skip_chunk_sync(&self, _target_store_name: &str) -> bool {
+        false
+    }
+}
+
+#[async_trait::async_trait]
+impl SyncSourceReader for LocalSourceReader {
+    fn chunk_reader(&self, crypt_mode: CryptMode) -> Arc<dyn AsyncReadChunk> {
+        Arc::new(LocalChunkReader::new(
+            self.datastore.clone(),
+            None,
+            crypt_mode,
+        ))
+    }
+
+    async fn load_file_into(&self, filename: &str, into: &Path) -> Result<Option<DataBlob>, Error> {
+        let mut tmp_file = std::fs::OpenOptions::new()
+            .write(true)
+            .create(true)
+            .truncate(true)
+            .read(true)
+            .open(into)?;
+        let mut from_path = self.path.clone();
+        from_path.push(filename);
+        tmp_file.write_all(std::fs::read(from_path)?.as_slice())?;
+        tmp_file.rewind()?;
+        Ok(DataBlob::load_from_reader(&mut tmp_file).ok())
+    }
+
+    async fn try_download_client_log(&self, _to_path: &Path) -> Result<(), Error> {
+        Ok(())
+    }
+
+    fn skip_chunk_sync(&self, target_store_name: &str) -> bool {
+        self.datastore.name() == target_store_name
+    }
+}
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 04/33] server: sync: move source to common sync module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (2 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 03/33] server: sync: move reader trait to common sync module Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters Christian Ebner
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Rename the `PullSource` trait to `SyncSource` and move the trait and
types implementing it to the common sync module, making them
reusable for both sync directions, push and pull.
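
A short, hedged sketch of consuming the renamed trait generically,
whether the source is local (push) or remote (pull); the helper below is
hypothetical and only uses the trait methods as moved by this patch:

```
// Hypothetical helper over the SyncSource trait.
async fn count_source_groups(source: &dyn SyncSource, owner: &Authid) -> Result<usize, Error> {
    let mut max_depth = None; // let the source decide how deep to recurse
    let mut total = 0;
    for ns in source.list_namespaces(&mut max_depth).await? {
        total += source.list_groups(&ns, owner).await?.len();
    }
    Ok(total)
}
```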

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/pull.rs | 288 ++-------------------------------------------
 src/server/sync.rs | 276 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 280 insertions(+), 284 deletions(-)

diff --git a/src/server/pull.rs b/src/server/pull.rs
index 20b5d9af8..c6932dcc5 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -7,18 +7,14 @@ use std::sync::{Arc, Mutex};
 use std::time::SystemTime;
 
 use anyhow::{bail, format_err, Error};
-use http::StatusCode;
 use proxmox_human_byte::HumanByte;
-use proxmox_router::HttpError;
-use serde_json::json;
-use tracing::{info, warn};
+use tracing::info;
 
 use pbs_api_types::{
-    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, GroupFilter,
-    GroupListItem, Operation, RateLimitConfig, Remote, SnapshotListItem, MAX_NAMESPACE_DEPTH,
-    PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_READ,
+    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, GroupFilter, Operation,
+    RateLimitConfig, Remote, MAX_NAMESPACE_DEPTH, PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP,
 };
-use pbs_client::{BackupReader, BackupRepository, HttpClient};
+use pbs_client::BackupRepository;
 use pbs_config::CachedUserInfo;
 use pbs_datastore::data_blob::DataBlob;
 use pbs_datastore::dynamic_index::DynamicIndexReader;
@@ -28,13 +24,13 @@ use pbs_datastore::manifest::{
     ArchiveType, BackupManifest, FileInfo, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME,
 };
 use pbs_datastore::read_chunk::AsyncReadChunk;
-use pbs_datastore::{check_backup_owner, DataStore, ListNamespacesRecursive, StoreProgress};
+use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
 use pbs_tools::sha::sha256;
 
 use super::sync::{
-    LocalSourceReader, RemoteSourceReader, RemovedVanishedStats, SyncSourceReader, SyncStats,
+    LocalSource, RemoteSource, RemovedVanishedStats, SyncSource, SyncSourceReader, SyncStats,
 };
-use crate::backup::{check_ns_modification_privs, check_ns_privs, ListAccessibleBackupGroups};
+use crate::backup::{check_ns_modification_privs, check_ns_privs};
 use crate::tools::parallel_handler::ParallelHandler;
 
 pub(crate) struct PullTarget {
@@ -42,276 +38,10 @@ pub(crate) struct PullTarget {
     ns: BackupNamespace,
 }
 
-pub(crate) struct RemoteSource {
-    repo: BackupRepository,
-    ns: BackupNamespace,
-    client: HttpClient,
-}
-
-pub(crate) struct LocalSource {
-    store: Arc<DataStore>,
-    ns: BackupNamespace,
-}
-
-#[async_trait::async_trait]
-/// `PullSource` is a trait that provides an interface for pulling data/information from a source.
-/// The trait includes methods for listing namespaces, groups, and backup directories,
-/// as well as retrieving a reader for reading data from the source
-trait PullSource: Send + Sync {
-    /// Lists namespaces from the source.
-    async fn list_namespaces(
-        &self,
-        max_depth: &mut Option<usize>,
-    ) -> Result<Vec<BackupNamespace>, Error>;
-
-    /// Lists groups within a specific namespace from the source.
-    async fn list_groups(
-        &self,
-        namespace: &BackupNamespace,
-        owner: &Authid,
-    ) -> Result<Vec<BackupGroup>, Error>;
-
-    /// Lists backup directories for a specific group within a specific namespace from the source.
-    async fn list_backup_dirs(
-        &self,
-        namespace: &BackupNamespace,
-        group: &BackupGroup,
-    ) -> Result<Vec<BackupDir>, Error>;
-    fn get_ns(&self) -> BackupNamespace;
-    fn get_store(&self) -> &str;
-
-    /// Returns a reader for reading data from a specific backup directory.
-    async fn reader(
-        &self,
-        ns: &BackupNamespace,
-        dir: &BackupDir,
-    ) -> Result<Arc<dyn SyncSourceReader>, Error>;
-}
-
-#[async_trait::async_trait]
-impl PullSource for RemoteSource {
-    async fn list_namespaces(
-        &self,
-        max_depth: &mut Option<usize>,
-    ) -> Result<Vec<BackupNamespace>, Error> {
-        if self.ns.is_root() && max_depth.map_or(false, |depth| depth == 0) {
-            return Ok(vec![self.ns.clone()]);
-        }
-
-        let path = format!("api2/json/admin/datastore/{}/namespace", self.repo.store());
-        let mut data = json!({});
-        if let Some(max_depth) = max_depth {
-            data["max-depth"] = json!(max_depth);
-        }
-
-        if !self.ns.is_root() {
-            data["parent"] = json!(self.ns);
-        }
-        self.client.login().await?;
-
-        let mut result = match self.client.get(&path, Some(data)).await {
-            Ok(res) => res,
-            Err(err) => match err.downcast_ref::<HttpError>() {
-                Some(HttpError { code, message }) => match code {
-                    &StatusCode::NOT_FOUND => {
-                        if self.ns.is_root() && max_depth.is_none() {
-                            warn!("Could not query remote for namespaces (404) -> temporarily switching to backwards-compat mode");
-                            warn!("Either make backwards-compat mode explicit (max-depth == 0) or upgrade remote system.");
-                            max_depth.replace(0);
-                        } else {
-                            bail!("Remote namespace set/recursive sync requested, but remote does not support namespaces.")
-                        }
-
-                        return Ok(vec![self.ns.clone()]);
-                    }
-                    _ => {
-                        bail!("Querying namespaces failed - HTTP error {code} - {message}");
-                    }
-                },
-                None => {
-                    bail!("Querying namespaces failed - {err}");
-                }
-            },
-        };
-
-        let list: Vec<BackupNamespace> =
-            serde_json::from_value::<Vec<pbs_api_types::NamespaceListItem>>(result["data"].take())?
-                .into_iter()
-                .map(|list_item| list_item.ns)
-                .collect();
-
-        Ok(list)
-    }
-
-    async fn list_groups(
-        &self,
-        namespace: &BackupNamespace,
-        _owner: &Authid,
-    ) -> Result<Vec<BackupGroup>, Error> {
-        let path = format!("api2/json/admin/datastore/{}/groups", self.repo.store());
-
-        let args = if !namespace.is_root() {
-            Some(json!({ "ns": namespace.clone() }))
-        } else {
-            None
-        };
-
-        self.client.login().await?;
-        let mut result =
-            self.client.get(&path, args).await.map_err(|err| {
-                format_err!("Failed to retrieve backup groups from remote - {}", err)
-            })?;
-
-        Ok(
-            serde_json::from_value::<Vec<GroupListItem>>(result["data"].take())
-                .map_err(Error::from)?
-                .into_iter()
-                .map(|item| item.backup)
-                .collect::<Vec<BackupGroup>>(),
-        )
-    }
-
-    async fn list_backup_dirs(
-        &self,
-        namespace: &BackupNamespace,
-        group: &BackupGroup,
-    ) -> Result<Vec<BackupDir>, Error> {
-        let path = format!("api2/json/admin/datastore/{}/snapshots", self.repo.store());
-
-        let mut args = json!({
-            "backup-type": group.ty,
-            "backup-id": group.id,
-        });
-
-        if !namespace.is_root() {
-            args["ns"] = serde_json::to_value(namespace)?;
-        }
-
-        self.client.login().await?;
-
-        let mut result = self.client.get(&path, Some(args)).await?;
-        let snapshot_list: Vec<SnapshotListItem> = serde_json::from_value(result["data"].take())?;
-        Ok(snapshot_list
-            .into_iter()
-            .filter_map(|item: SnapshotListItem| {
-                let snapshot = item.backup;
-                // in-progress backups can't be synced
-                if item.size.is_none() {
-                    info!("skipping snapshot {snapshot} - in-progress backup");
-                    return None;
-                }
-
-                Some(snapshot)
-            })
-            .collect::<Vec<BackupDir>>())
-    }
-
-    fn get_ns(&self) -> BackupNamespace {
-        self.ns.clone()
-    }
-
-    fn get_store(&self) -> &str {
-        self.repo.store()
-    }
-
-    async fn reader(
-        &self,
-        ns: &BackupNamespace,
-        dir: &BackupDir,
-    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
-        let backup_reader = BackupReader::start(
-            &self.client,
-            None,
-            self.repo.store(),
-            ns,
-            dir,
-            tracing::enabled!(tracing::Level::DEBUG),
-        )
-        .await?;
-        Ok(Arc::new(RemoteSourceReader {
-            backup_reader,
-            dir: dir.clone(),
-        }))
-    }
-}
-
-#[async_trait::async_trait]
-impl PullSource for LocalSource {
-    async fn list_namespaces(
-        &self,
-        max_depth: &mut Option<usize>,
-    ) -> Result<Vec<BackupNamespace>, Error> {
-        ListNamespacesRecursive::new_max_depth(
-            self.store.clone(),
-            self.ns.clone(),
-            max_depth.unwrap_or(MAX_NAMESPACE_DEPTH),
-        )?
-        .collect()
-    }
-
-    async fn list_groups(
-        &self,
-        namespace: &BackupNamespace,
-        owner: &Authid,
-    ) -> Result<Vec<BackupGroup>, Error> {
-        Ok(ListAccessibleBackupGroups::new_with_privs(
-            &self.store,
-            namespace.clone(),
-            0,
-            Some(PRIV_DATASTORE_READ),
-            Some(PRIV_DATASTORE_BACKUP),
-            Some(owner),
-        )?
-        .filter_map(Result::ok)
-        .map(|backup_group| backup_group.group().clone())
-        .collect::<Vec<pbs_api_types::BackupGroup>>())
-    }
-
-    async fn list_backup_dirs(
-        &self,
-        namespace: &BackupNamespace,
-        group: &BackupGroup,
-    ) -> Result<Vec<BackupDir>, Error> {
-        Ok(self
-            .store
-            .backup_group(namespace.clone(), group.clone())
-            .iter_snapshots()?
-            .filter_map(Result::ok)
-            .map(|snapshot| snapshot.dir().to_owned())
-            .collect::<Vec<BackupDir>>())
-    }
-
-    fn get_ns(&self) -> BackupNamespace {
-        self.ns.clone()
-    }
-
-    fn get_store(&self) -> &str {
-        self.store.name()
-    }
-
-    async fn reader(
-        &self,
-        ns: &BackupNamespace,
-        dir: &BackupDir,
-    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
-        let dir = self.store.backup_dir(ns.clone(), dir.clone())?;
-        let dir_lock = proxmox_sys::fs::lock_dir_noblock_shared(
-            &dir.full_path(),
-            "snapshot",
-            "locked by another operation",
-        )?;
-        Ok(Arc::new(LocalSourceReader {
-            _dir_lock: Arc::new(Mutex::new(dir_lock)),
-            path: dir.full_path(),
-            datastore: dir.datastore().clone(),
-        }))
-    }
-}
-
 /// Parameters for a pull operation.
 pub(crate) struct PullParameters {
     /// Where data is pulled from
-    source: Arc<dyn PullSource>,
+    source: Arc<dyn SyncSource>,
     /// Where data should be pulled into
     target: PullTarget,
     /// Owner of synced groups (needs to match local owner of pre-existing groups)
@@ -348,7 +78,7 @@ impl PullParameters {
         };
         let remove_vanished = remove_vanished.unwrap_or(false);
 
-        let source: Arc<dyn PullSource> = if let Some(remote) = remote {
+        let source: Arc<dyn SyncSource> = if let Some(remote) = remote {
             let (remote_config, _digest) = pbs_config::remote::config()?;
             let remote: Remote = remote_config.lookup("remote", remote)?;
 
diff --git a/src/server/sync.rs b/src/server/sync.rs
index 323bc1a27..f8a1e133d 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -6,18 +6,24 @@ use std::path::{Path, PathBuf};
 use std::sync::{Arc, Mutex};
 use std::time::Duration;
 
-use anyhow::{bail, Error};
+use anyhow::{bail, format_err, Error};
 use http::StatusCode;
-use tracing::info;
+use serde_json::json;
+use tracing::{info, warn};
 
 use proxmox_router::HttpError;
 
-use pbs_api_types::{BackupDir, CryptMode};
-use pbs_client::{BackupReader, RemoteChunkReader};
+use pbs_api_types::{
+    Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupListItem, SnapshotListItem,
+    MAX_NAMESPACE_DEPTH, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_READ,
+};
+use pbs_client::{BackupReader, BackupRepository, HttpClient, RemoteChunkReader};
 use pbs_datastore::data_blob::DataBlob;
 use pbs_datastore::manifest::CLIENT_LOG_BLOB_NAME;
 use pbs_datastore::read_chunk::AsyncReadChunk;
-use pbs_datastore::{DataStore, LocalChunkReader};
+use pbs_datastore::{DataStore, ListNamespacesRecursive, LocalChunkReader};
+
+use crate::backup::ListAccessibleBackupGroups;
 
 #[derive(Default)]
 pub(crate) struct RemovedVanishedStats {
@@ -201,3 +207,263 @@ impl SyncSourceReader for LocalSourceReader {
         self.datastore.name() == target_store_name
     }
 }
+
+#[async_trait::async_trait]
+/// `SyncSource` is a trait that provides an interface for synchronizing data/information from a
+/// source.
+/// The trait includes methods for listing namespaces, groups, and backup directories,
+/// as well as retrieving a reader for reading data from the source.
+pub(crate) trait SyncSource: Send + Sync {
+    /// Lists namespaces from the source.
+    async fn list_namespaces(
+        &self,
+        max_depth: &mut Option<usize>,
+    ) -> Result<Vec<BackupNamespace>, Error>;
+
+    /// Lists groups within a specific namespace from the source.
+    async fn list_groups(
+        &self,
+        namespace: &BackupNamespace,
+        owner: &Authid,
+    ) -> Result<Vec<BackupGroup>, Error>;
+
+    /// Lists backup directories for a specific group within a specific namespace from the source.
+    async fn list_backup_dirs(
+        &self,
+        namespace: &BackupNamespace,
+        group: &BackupGroup,
+    ) -> Result<Vec<BackupDir>, Error>;
+    fn get_ns(&self) -> BackupNamespace;
+    fn get_store(&self) -> &str;
+
+    /// Returns a reader for reading data from a specific backup directory.
+    async fn reader(
+        &self,
+        ns: &BackupNamespace,
+        dir: &BackupDir,
+    ) -> Result<Arc<dyn SyncSourceReader>, Error>;
+}
+
+pub(crate) struct RemoteSource {
+    pub(crate) repo: BackupRepository,
+    pub(crate) ns: BackupNamespace,
+    pub(crate) client: HttpClient,
+}
+
+pub(crate) struct LocalSource {
+    pub(crate) store: Arc<DataStore>,
+    pub(crate) ns: BackupNamespace,
+}
+
+#[async_trait::async_trait]
+impl SyncSource for RemoteSource {
+    async fn list_namespaces(
+        &self,
+        max_depth: &mut Option<usize>,
+    ) -> Result<Vec<BackupNamespace>, Error> {
+        if self.ns.is_root() && max_depth.map_or(false, |depth| depth == 0) {
+            return Ok(vec![self.ns.clone()]);
+        }
+
+        let path = format!("api2/json/admin/datastore/{}/namespace", self.repo.store());
+        let mut data = json!({});
+        if let Some(max_depth) = max_depth {
+            data["max-depth"] = json!(max_depth);
+        }
+
+        if !self.ns.is_root() {
+            data["parent"] = json!(self.ns);
+        }
+        self.client.login().await?;
+
+        let mut result = match self.client.get(&path, Some(data)).await {
+            Ok(res) => res,
+            Err(err) => match err.downcast_ref::<HttpError>() {
+                Some(HttpError { code, message }) => match code {
+                    &StatusCode::NOT_FOUND => {
+                        if self.ns.is_root() && max_depth.is_none() {
+                            warn!("Could not query remote for namespaces (404) -> temporarily switching to backwards-compat mode");
+                            warn!("Either make backwards-compat mode explicit (max-depth == 0) or upgrade remote system.");
+                            max_depth.replace(0);
+                        } else {
+                            bail!("Remote namespace set/recursive sync requested, but remote does not support namespaces.")
+                        }
+
+                        return Ok(vec![self.ns.clone()]);
+                    }
+                    _ => {
+                        bail!("Querying namespaces failed - HTTP error {code} - {message}");
+                    }
+                },
+                None => {
+                    bail!("Querying namespaces failed - {err}");
+                }
+            },
+        };
+
+        let list: Vec<BackupNamespace> =
+            serde_json::from_value::<Vec<pbs_api_types::NamespaceListItem>>(result["data"].take())?
+                .into_iter()
+                .map(|list_item| list_item.ns)
+                .collect();
+
+        Ok(list)
+    }
+
+    async fn list_groups(
+        &self,
+        namespace: &BackupNamespace,
+        _owner: &Authid,
+    ) -> Result<Vec<BackupGroup>, Error> {
+        let path = format!("api2/json/admin/datastore/{}/groups", self.repo.store());
+
+        let args = if !namespace.is_root() {
+            Some(json!({ "ns": namespace.clone() }))
+        } else {
+            None
+        };
+
+        self.client.login().await?;
+        let mut result =
+            self.client.get(&path, args).await.map_err(|err| {
+                format_err!("Failed to retrieve backup groups from remote - {}", err)
+            })?;
+
+        Ok(
+            serde_json::from_value::<Vec<GroupListItem>>(result["data"].take())
+                .map_err(Error::from)?
+                .into_iter()
+                .map(|item| item.backup)
+                .collect::<Vec<BackupGroup>>(),
+        )
+    }
+
+    async fn list_backup_dirs(
+        &self,
+        namespace: &BackupNamespace,
+        group: &BackupGroup,
+    ) -> Result<Vec<BackupDir>, Error> {
+        let path = format!("api2/json/admin/datastore/{}/snapshots", self.repo.store());
+
+        let mut args = json!({
+            "backup-type": group.ty,
+            "backup-id": group.id,
+        });
+
+        if !namespace.is_root() {
+            args["ns"] = serde_json::to_value(namespace)?;
+        }
+
+        self.client.login().await?;
+
+        let mut result = self.client.get(&path, Some(args)).await?;
+        let snapshot_list: Vec<SnapshotListItem> = serde_json::from_value(result["data"].take())?;
+        Ok(snapshot_list
+            .into_iter()
+            .filter_map(|item: SnapshotListItem| {
+                let snapshot = item.backup;
+                // in-progress backups can't be synced
+                if item.size.is_none() {
+                    info!("skipping snapshot {snapshot} - in-progress backup");
+                    return None;
+                }
+
+                Some(snapshot)
+            })
+            .collect::<Vec<BackupDir>>())
+    }
+
+    fn get_ns(&self) -> BackupNamespace {
+        self.ns.clone()
+    }
+
+    fn get_store(&self) -> &str {
+        self.repo.store()
+    }
+
+    async fn reader(
+        &self,
+        ns: &BackupNamespace,
+        dir: &BackupDir,
+    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
+        let backup_reader =
+            BackupReader::start(&self.client, None, self.repo.store(), ns, dir, true).await?;
+        Ok(Arc::new(RemoteSourceReader {
+            backup_reader,
+            dir: dir.clone(),
+        }))
+    }
+}
+
+#[async_trait::async_trait]
+impl SyncSource for LocalSource {
+    async fn list_namespaces(
+        &self,
+        max_depth: &mut Option<usize>,
+    ) -> Result<Vec<BackupNamespace>, Error> {
+        ListNamespacesRecursive::new_max_depth(
+            self.store.clone(),
+            self.ns.clone(),
+            max_depth.unwrap_or(MAX_NAMESPACE_DEPTH),
+        )?
+        .collect()
+    }
+
+    async fn list_groups(
+        &self,
+        namespace: &BackupNamespace,
+        owner: &Authid,
+    ) -> Result<Vec<BackupGroup>, Error> {
+        Ok(ListAccessibleBackupGroups::new_with_privs(
+            &self.store,
+            namespace.clone(),
+            0,
+            Some(PRIV_DATASTORE_READ),
+            Some(PRIV_DATASTORE_BACKUP),
+            Some(owner),
+        )?
+        .filter_map(Result::ok)
+        .map(|backup_group| backup_group.group().clone())
+        .collect::<Vec<pbs_api_types::BackupGroup>>())
+    }
+
+    async fn list_backup_dirs(
+        &self,
+        namespace: &BackupNamespace,
+        group: &BackupGroup,
+    ) -> Result<Vec<BackupDir>, Error> {
+        Ok(self
+            .store
+            .backup_group(namespace.clone(), group.clone())
+            .iter_snapshots()?
+            .filter_map(Result::ok)
+            .map(|snapshot| snapshot.dir().to_owned())
+            .collect::<Vec<BackupDir>>())
+    }
+
+    fn get_ns(&self) -> BackupNamespace {
+        self.ns.clone()
+    }
+
+    fn get_store(&self) -> &str {
+        self.store.name()
+    }
+
+    async fn reader(
+        &self,
+        ns: &BackupNamespace,
+        dir: &BackupDir,
+    ) -> Result<Arc<dyn SyncSourceReader>, Error> {
+        let dir = self.store.backup_dir(ns.clone(), dir.clone())?;
+        let dir_lock = proxmox_sys::fs::lock_dir_noblock_shared(
+            &dir.full_path(),
+            "snapshot",
+            "locked by another operation",
+        )?;
+        Ok(Arc::new(LocalSourceReader {
+            _dir_lock: Arc::new(Mutex::new(dir_lock)),
+            path: dir.full_path(),
+            datastore: dir.datastore().clone(),
+        }))
+    }
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (3 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 04/33] server: sync: move source " Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-10-10 14:49   ` Fabian Grünbichler
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 06/33] client: backup writer: factor out merged chunk stream upload Christian Ebner
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

In preparation for push support in sync jobs.

Introduce an `UploadStatsCounters` struct to hold the Arc clones of
the chunk upload statistics counters. By bundling them into a struct,
they can be passed as a single function parameter when factoring out
the common stream future implementation, which the subsequent
implementation of the chunk upload for push support in sync jobs will
reuse.
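
For illustration only (the names below are made up, this is not the
patch code), the pattern amounts to cloning the shared counters into a
single struct and handing that struct to the consumer:

```rust
// Minimal sketch of the counter-bundling pattern, using only std.
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct Counters {
    total: Arc<AtomicUsize>,
    reused: Arc<AtomicUsize>,
}

// The consumer only needs the bundled clones, not one parameter per counter.
fn record_chunk(counters: &Counters, reused: bool) {
    counters.total.fetch_add(1, Ordering::SeqCst);
    if reused {
        counters.reused.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let total = Arc::new(AtomicUsize::new(0));
    let reused = Arc::new(AtomicUsize::new(0));
    let counters = Counters {
        total: total.clone(),
        reused: reused.clone(),
    };
    record_chunk(&counters, true);
    // The original Arcs still observe updates made through the clones,
    // so the final stats can be read once the stream future completed.
    assert_eq!(total.load(Ordering::SeqCst), 1);
    assert_eq!(reused.load(Ordering::SeqCst), 1);
}
```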

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-client/src/backup_writer.rs | 52 ++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index d63c09b5a..34ac47beb 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -65,6 +65,16 @@ struct UploadStats {
     csum: [u8; 32],
 }
 
+struct UploadStatsCounters {
+    injected_chunk_count: Arc<AtomicUsize>,
+    known_chunk_count: Arc<AtomicUsize>,
+    total_chunks: Arc<AtomicUsize>,
+    compressed_stream_len: Arc<AtomicU64>,
+    injected_len: Arc<AtomicUsize>,
+    reused_len: Arc<AtomicUsize>,
+    stream_len: Arc<AtomicUsize>,
+}
+
 type UploadQueueSender = mpsc::Sender<(MergedChunkInfo, Option<h2::client::ResponseFuture>)>;
 type UploadResultReceiver = oneshot::Receiver<Result<(), Error>>;
 
@@ -638,20 +648,23 @@ impl BackupWriter {
         injections: Option<std::sync::mpsc::Receiver<InjectChunks>>,
     ) -> impl Future<Output = Result<UploadStats, Error>> {
         let total_chunks = Arc::new(AtomicUsize::new(0));
-        let total_chunks2 = total_chunks.clone();
         let known_chunk_count = Arc::new(AtomicUsize::new(0));
-        let known_chunk_count2 = known_chunk_count.clone();
         let injected_chunk_count = Arc::new(AtomicUsize::new(0));
-        let injected_chunk_count2 = injected_chunk_count.clone();
 
         let stream_len = Arc::new(AtomicUsize::new(0));
-        let stream_len2 = stream_len.clone();
         let compressed_stream_len = Arc::new(AtomicU64::new(0));
-        let compressed_stream_len2 = compressed_stream_len.clone();
         let reused_len = Arc::new(AtomicUsize::new(0));
-        let reused_len2 = reused_len.clone();
         let injected_len = Arc::new(AtomicUsize::new(0));
-        let injected_len2 = injected_len.clone();
+
+        let counters = UploadStatsCounters {
+            injected_chunk_count: injected_chunk_count.clone(),
+            known_chunk_count: known_chunk_count.clone(),
+            total_chunks: total_chunks.clone(),
+            compressed_stream_len: compressed_stream_len.clone(),
+            injected_len: injected_len.clone(),
+            reused_len: reused_len.clone(),
+            stream_len: stream_len.clone(),
+        };
 
         let append_chunk_path = format!("{}_index", prefix);
         let upload_chunk_path = format!("{}_chunk", prefix);
@@ -794,27 +807,18 @@ impl BackupWriter {
             })
             .then(move |result| async move { upload_result.await?.and(result) }.boxed())
             .and_then(move |_| {
-                let duration = start_time.elapsed();
-                let chunk_count = total_chunks2.load(Ordering::SeqCst);
-                let chunk_reused = known_chunk_count2.load(Ordering::SeqCst);
-                let chunk_injected = injected_chunk_count2.load(Ordering::SeqCst);
-                let size = stream_len2.load(Ordering::SeqCst);
-                let size_reused = reused_len2.load(Ordering::SeqCst);
-                let size_injected = injected_len2.load(Ordering::SeqCst);
-                let size_compressed = compressed_stream_len2.load(Ordering::SeqCst) as usize;
-
                 let mut guard = index_csum_2.lock().unwrap();
                 let csum = guard.take().unwrap().finish();
 
                 futures::future::ok(UploadStats {
-                    chunk_count,
-                    chunk_reused,
-                    chunk_injected,
-                    size,
-                    size_reused,
-                    size_injected,
-                    size_compressed,
-                    duration,
+                    chunk_count: counters.total_chunks.load(Ordering::SeqCst),
+                    chunk_reused: counters.known_chunk_count.load(Ordering::SeqCst),
+                    chunk_injected: counters.injected_chunk_count.load(Ordering::SeqCst),
+                    size: counters.stream_len.load(Ordering::SeqCst),
+                    size_reused: counters.reused_len.load(Ordering::SeqCst),
+                    size_injected: counters.injected_len.load(Ordering::SeqCst),
+                    size_compressed: counters.compressed_stream_len.load(Ordering::SeqCst) as usize,
+                    duration: start_time.elapsed(),
                     csum,
                 })
             })
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 06/33] client: backup writer: factor out merged chunk stream upload
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (4 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 07/33] client: backup writer: add chunk count and duration stats Christian Ebner
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

In preparation for implementing push support for sync jobs.

Factor out the upload stream for merged chunks, which can be reused
to upload the local chunks to a remote target datastore during a
snapshot sync operation in push direction.
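
As a rough, generic sketch of the idea (stand-in types, not the actual
upload code, assuming the `futures` crate): a helper that consumes any
stream of already-merged chunk information can be driven both by the
regular chunked upload stream and, later, by the push sync path:

```rust
// Generic stand-in for a factored-out, stream-consuming helper.
use futures::executor::block_on;
use futures::stream::{self, Stream, TryStreamExt};

// Accepts any stream of "chunk lengths" and consumes it, returning a total.
async fn upload_stream_total(
    stream: impl Stream<Item = Result<u64, std::io::Error>>,
) -> Result<u64, std::io::Error> {
    stream
        .try_fold(0u64, |acc, chunk_len| async move {
            Ok::<_, std::io::Error>(acc + chunk_len)
        })
        .await
}

fn main() -> Result<(), std::io::Error> {
    // Any producer can feed the helper, here a trivial in-memory stream.
    let total = block_on(upload_stream_total(stream::iter([Ok(4u64), Ok(8)])))?;
    assert_eq!(total, 12);
    Ok(())
}
```

The point is only that building the stream and consuming it become
independent, so a second producer can reuse the same consumer.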

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-client/src/backup_writer.rs | 47 ++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index 34ac47beb..dceda1548 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -6,6 +6,7 @@ use std::sync::{Arc, Mutex};
 use anyhow::{bail, format_err, Error};
 use futures::future::{self, AbortHandle, Either, FutureExt, TryFutureExt};
 use futures::stream::{Stream, StreamExt, TryStreamExt};
+use openssl::sha::Sha256;
 use serde_json::{json, Value};
 use tokio::io::AsyncReadExt;
 use tokio::sync::{mpsc, oneshot};
@@ -666,19 +667,12 @@ impl BackupWriter {
             stream_len: stream_len.clone(),
         };
 
-        let append_chunk_path = format!("{}_index", prefix);
-        let upload_chunk_path = format!("{}_chunk", prefix);
         let is_fixed_chunk_size = prefix == "fixed";
 
-        let (upload_queue, upload_result) =
-            Self::append_chunk_queue(h2.clone(), wid, append_chunk_path);
-
-        let start_time = std::time::Instant::now();
-
         let index_csum = Arc::new(Mutex::new(Some(openssl::sha::Sha256::new())));
         let index_csum_2 = index_csum.clone();
 
-        stream
+        let stream = stream
             .inject_reused_chunks(injections, stream_len.clone())
             .and_then(move |chunk_info| match chunk_info {
                 InjectedChunksInfo::Known(chunks) => {
@@ -749,7 +743,28 @@ impl BackupWriter {
                     }
                 }
             })
-            .merge_known_chunks()
+            .merge_known_chunks();
+
+        Self::upload_merged_chunk_stream(h2, wid, prefix, stream, index_csum_2, counters)
+    }
+
+    fn upload_merged_chunk_stream(
+        h2: H2Client,
+        wid: u64,
+        prefix: &str,
+        stream: impl Stream<Item = Result<MergedChunkInfo, Error>>,
+        index_csum: Arc<Mutex<Option<Sha256>>>,
+        counters: UploadStatsCounters,
+    ) -> impl Future<Output = Result<UploadStats, Error>> {
+        let append_chunk_path = format!("{prefix}_index");
+        let upload_chunk_path = format!("{prefix}_chunk");
+
+        let (upload_queue, upload_result) =
+            Self::append_chunk_queue(h2.clone(), wid, append_chunk_path);
+
+        let start_time = std::time::Instant::now();
+
+        stream
             .try_for_each(move |merged_chunk_info| {
                 let upload_queue = upload_queue.clone();
 
@@ -759,10 +774,8 @@ impl BackupWriter {
                     let digest_str = hex::encode(digest);
 
                     log::trace!(
-                        "upload new chunk {} ({} bytes, offset {})",
-                        digest_str,
-                        chunk_info.chunk_len,
-                        offset
+                        "upload new chunk {digest_str} ({chunk_len} bytes, offset {offset})",
+                        chunk_len = chunk_info.chunk_len,
                     );
 
                     let chunk_data = chunk_info.chunk.into_inner();
@@ -791,9 +804,7 @@ impl BackupWriter {
                             upload_queue
                                 .send((new_info, Some(response)))
                                 .await
-                                .map_err(|err| {
-                                    format_err!("failed to send to upload queue: {}", err)
-                                })
+                                .map_err(|err| format_err!("failed to send to upload queue: {err}"))
                         },
                     ))
                 } else {
@@ -801,13 +812,13 @@ impl BackupWriter {
                         upload_queue
                             .send((merged_chunk_info, None))
                             .await
-                            .map_err(|err| format_err!("failed to send to upload queue: {}", err))
+                            .map_err(|err| format_err!("failed to send to upload queue: {err}"))
                     })
                 }
             })
             .then(move |result| async move { upload_result.await?.and(result) }.boxed())
             .and_then(move |_| {
-                let mut guard = index_csum_2.lock().unwrap();
+                let mut guard = index_csum.lock().unwrap();
                 let csum = guard.take().unwrap().finish();
 
                 futures::future::ok(UploadStats {
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 07/33] client: backup writer: add chunk count and duration stats
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (5 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 06/33] client: backup writer: factor out merged chunk stream upload Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 08/33] client: backup writer: allow push uploading index and chunks Christian Ebner
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

In addition to size and checksum, also return the chunk count and
duration in the upload stats, so this information can be shown in the
task log of sync jobs in push direction.
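
As a trivial, illustrative sketch (simplified stand-in struct, not the
real `BackupStats`), this is the kind of task log line the extra
fields make possible:

```rust
// Illustrative only: a stats struct carrying the fields added here and a
// possible log line built from them.
use std::time::{Duration, Instant};

struct BackupStats {
    size: u64,
    chunk_count: u64,
    duration: Duration,
}

fn main() {
    let start_time = Instant::now();
    // ... the actual upload would happen here ...
    let stats = BackupStats {
        size: 4096,
        chunk_count: 2,
        duration: start_time.elapsed(),
    };
    println!(
        "uploaded {} chunks ({} bytes) in {:.2}s",
        stats.chunk_count,
        stats.size,
        stats.duration.as_secs_f64()
    );
}
```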

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-client/src/backup_writer.rs | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index dceda1548..b71157164 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -2,6 +2,7 @@ use std::collections::HashSet;
 use std::future::Future;
 use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
+use std::time::{Duration, Instant};
 
 use anyhow::{bail, format_err, Error};
 use futures::future::{self, AbortHandle, Either, FutureExt, TryFutureExt};
@@ -43,6 +44,8 @@ impl Drop for BackupWriter {
 pub struct BackupStats {
     pub size: u64,
     pub csum: [u8; 32],
+    pub duration: Duration,
+    pub chunk_count: u64,
 }
 
 /// Options for uploading blobs/streams to the server
@@ -62,7 +65,7 @@ struct UploadStats {
     size_reused: usize,
     size_injected: usize,
     size_compressed: usize,
-    duration: std::time::Duration,
+    duration: Duration,
     csum: [u8; 32],
 }
 
@@ -199,6 +202,7 @@ impl BackupWriter {
         mut reader: R,
         file_name: &str,
     ) -> Result<BackupStats, Error> {
+        let start_time = Instant::now();
         let mut raw_data = Vec::new();
         // fixme: avoid loading into memory
         reader.read_to_end(&mut raw_data)?;
@@ -216,7 +220,12 @@ impl BackupWriter {
                 raw_data,
             )
             .await?;
-        Ok(BackupStats { size, csum })
+        Ok(BackupStats {
+            size,
+            csum,
+            duration: start_time.elapsed(),
+            chunk_count: 0,
+        })
     }
 
     pub async fn upload_blob_from_data(
@@ -225,6 +234,7 @@ impl BackupWriter {
         file_name: &str,
         options: UploadOptions,
     ) -> Result<BackupStats, Error> {
+        let start_time = Instant::now();
         let blob = match (options.encrypt, &self.crypt_config) {
             (false, _) => DataBlob::encode(&data, None, options.compress)?,
             (true, None) => bail!("requested encryption without a crypt config"),
@@ -248,7 +258,12 @@ impl BackupWriter {
                 raw_data,
             )
             .await?;
-        Ok(BackupStats { size, csum })
+        Ok(BackupStats {
+            size,
+            csum,
+            duration: start_time.elapsed(),
+            chunk_count: 0,
+        })
     }
 
     pub async fn upload_blob_from_file<P: AsRef<std::path::Path>>(
@@ -427,6 +442,8 @@ impl BackupWriter {
         Ok(BackupStats {
             size: upload_stats.size as u64,
             csum: upload_stats.csum,
+            duration: upload_stats.duration,
+            chunk_count: upload_stats.chunk_count as u64,
         })
     }
 
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 08/33] client: backup writer: allow push uploading index and chunks
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (6 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 07/33] client: backup writer: add chunk count and duration stats Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 09/33] server: sync: move skip info/reason to common sync module Christian Ebner
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Add a method `upload_index_chunk_info` to be used for uploading an
existing index and the corresponding chunk stream.
Instead of taking an input stream of raw bytes as `upload_stream`
does, this takes a stream of `ChunkInfo` objects provided by the local
chunk reader of the sync job's source.
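
For a rough idea of the caller side (the types and digests below are
hypothetical placeholders, not the real `ChunkInfo` or chunk reader
API, and the `futures` crate is assumed), an existing index can be
walked and wrapped into the stream shape such a method consumes:

```rust
// Hypothetical caller-side sketch only.
use futures::executor::block_on;
use futures::stream::{self, TryStreamExt};

struct ChunkRef {
    digest: [u8; 32],
    len: u64,
}

fn main() -> Result<(), std::io::Error> {
    // In the real code this list would come from reading an existing index.
    let chunks = vec![
        ChunkRef { digest: [0u8; 32], len: 1024 },
        ChunkRef { digest: [1u8; 32], len: 2048 },
    ];

    // Wrap the synchronous list into a stream of `Ok` items, the shape a
    // stream-consuming upload method expects.
    let mut stream =
        stream::iter(chunks.into_iter().map(|chunk| Ok::<_, std::io::Error>(chunk)));

    block_on(async {
        while let Some(chunk) = stream.try_next().await? {
            println!("would upload chunk {:02x?} ({} bytes)", chunk.digest, chunk.len);
        }
        Ok(())
    })
}
```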

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-client/src/backup_writer.rs | 106 ++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index b71157164..dbb7db0e8 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -288,6 +288,112 @@ impl BackupWriter {
             .await
     }
 
+    /// Upload chunks and index
+    pub async fn upload_index_chunk_info(
+        &self,
+        archive_name: &str,
+        stream: impl Stream<Item = Result<ChunkInfo, Error>>,
+        options: UploadOptions,
+        known_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
+    ) -> Result<BackupStats, Error> {
+        let mut param = json!({ "archive-name": archive_name });
+        let prefix = if let Some(size) = options.fixed_size {
+            param["size"] = size.into();
+            "fixed"
+        } else {
+            "dynamic"
+        };
+
+        if options.encrypt && self.crypt_config.is_none() {
+            bail!("requested encryption without a crypt config");
+        }
+
+        let wid = self
+            .h2
+            .post(&format!("{prefix}_index"), Some(param))
+            .await?
+            .as_u64()
+            .unwrap();
+
+        let total_chunks = Arc::new(AtomicUsize::new(0));
+        let known_chunk_count = Arc::new(AtomicUsize::new(0));
+
+        let stream_len = Arc::new(AtomicUsize::new(0));
+        let compressed_stream_len = Arc::new(AtomicU64::new(0));
+        let reused_len = Arc::new(AtomicUsize::new(0));
+
+        let counters = UploadStatsCounters {
+            injected_chunk_count: Arc::new(AtomicUsize::new(0)),
+            known_chunk_count: known_chunk_count.clone(),
+            total_chunks: total_chunks.clone(),
+            compressed_stream_len: compressed_stream_len.clone(),
+            injected_len: Arc::new(AtomicUsize::new(0)),
+            reused_len: reused_len.clone(),
+            stream_len: stream_len.clone(),
+        };
+
+        let is_fixed_chunk_size = prefix == "fixed";
+
+        let index_csum = Arc::new(Mutex::new(Some(Sha256::new())));
+        let index_csum_2 = index_csum.clone();
+
+        let stream = stream
+            .and_then(move |chunk_info| {
+                total_chunks.fetch_add(1, Ordering::SeqCst);
+                reused_len.fetch_add(chunk_info.chunk_len as usize, Ordering::SeqCst);
+                let offset = stream_len.fetch_add(chunk_info.chunk_len as usize, Ordering::SeqCst);
+
+                let end_offset = offset as u64 + chunk_info.chunk_len;
+                let mut guard = index_csum.lock().unwrap();
+                let csum = guard.as_mut().unwrap();
+                if !is_fixed_chunk_size {
+                    csum.update(&end_offset.to_le_bytes());
+                }
+                csum.update(&chunk_info.digest);
+
+                let mut known_chunks = known_chunks.lock().unwrap();
+                if known_chunks.contains(&chunk_info.digest) {
+                    known_chunk_count.fetch_add(1, Ordering::SeqCst);
+                    future::ok(MergedChunkInfo::Known(vec![(
+                        chunk_info.offset,
+                        chunk_info.digest,
+                    )]))
+                } else {
+                    known_chunks.insert(chunk_info.digest);
+                    future::ok(MergedChunkInfo::New(chunk_info))
+                }
+            })
+            .merge_known_chunks();
+
+        let upload_stats = Self::upload_merged_chunk_stream(
+            self.h2.clone(),
+            wid,
+            prefix,
+            stream,
+            index_csum_2,
+            counters,
+        )
+        .await?;
+
+        let param = json!({
+            "wid": wid ,
+            "chunk-count": upload_stats.chunk_count,
+            "size": upload_stats.size,
+            "csum": hex::encode(upload_stats.csum),
+        });
+        let _value = self
+            .h2
+            .post(&format!("{prefix}_close"), Some(param))
+            .await?;
+
+        Ok(BackupStats {
+            size: upload_stats.size as u64,
+            csum: upload_stats.csum,
+            duration: upload_stats.duration,
+            chunk_count: upload_stats.chunk_count as u64,
+        })
+    }
+
     pub async fn upload_stream(
         &self,
         archive_name: &str,
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 09/33] server: sync: move skip info/reason to common sync module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (7 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 08/33] client: backup writer: allow push uploading index and chunks Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 10/33] server: sync: make skip reason message more generic Christian Ebner
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

Make `SkipReason` and `SkipInfo` accessible for sync operations in
both directions, push and pull.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/pull.rs | 82 ++--------------------------------------------
 src/server/sync.rs | 79 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 80 deletions(-)

diff --git a/src/server/pull.rs b/src/server/pull.rs
index c6932dcc5..d18c6d643 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -28,7 +28,8 @@ use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
 use pbs_tools::sha::sha256;
 
 use super::sync::{
-    LocalSource, RemoteSource, RemovedVanishedStats, SyncSource, SyncSourceReader, SyncStats,
+    LocalSource, RemoteSource, RemovedVanishedStats, SkipInfo, SkipReason, SyncSource,
+    SyncSourceReader, SyncStats,
 };
 use crate::backup::{check_ns_modification_privs, check_ns_privs};
 use crate::tools::parallel_handler::ParallelHandler;
@@ -474,85 +475,6 @@ async fn pull_snapshot_from<'a>(
     Ok(sync_stats)
 }
 
-#[derive(PartialEq, Eq)]
-enum SkipReason {
-    AlreadySynced,
-    TransferLast,
-}
-
-impl std::fmt::Display for SkipReason {
-    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
-        write!(
-            f,
-            "{}",
-            match self {
-                SkipReason::AlreadySynced => "older than the newest local snapshot",
-                SkipReason::TransferLast => "due to transfer-last",
-            }
-        )
-    }
-}
-
-struct SkipInfo {
-    oldest: i64,
-    newest: i64,
-    count: u64,
-    skip_reason: SkipReason,
-}
-
-impl SkipInfo {
-    fn new(skip_reason: SkipReason) -> Self {
-        SkipInfo {
-            oldest: i64::MAX,
-            newest: i64::MIN,
-            count: 0,
-            skip_reason,
-        }
-    }
-
-    fn reset(&mut self) {
-        self.count = 0;
-        self.oldest = i64::MAX;
-        self.newest = i64::MIN;
-    }
-
-    fn update(&mut self, backup_time: i64) {
-        self.count += 1;
-
-        if backup_time < self.oldest {
-            self.oldest = backup_time;
-        }
-
-        if backup_time > self.newest {
-            self.newest = backup_time;
-        }
-    }
-
-    fn affected(&self) -> Result<String, Error> {
-        match self.count {
-            0 => Ok(String::new()),
-            1 => Ok(proxmox_time::epoch_to_rfc3339_utc(self.oldest)?),
-            _ => Ok(format!(
-                "{} .. {}",
-                proxmox_time::epoch_to_rfc3339_utc(self.oldest)?,
-                proxmox_time::epoch_to_rfc3339_utc(self.newest)?,
-            )),
-        }
-    }
-}
-
-impl std::fmt::Display for SkipInfo {
-    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
-        write!(
-            f,
-            "skipped: {} snapshot(s) ({}) - {}",
-            self.count,
-            self.affected().map_err(|_| std::fmt::Error)?,
-            self.skip_reason,
-        )
-    }
-}
-
 /// Pulls a group according to `params`.
 ///
 /// Pulling a group consists of the following steps:
diff --git a/src/server/sync.rs b/src/server/sync.rs
index f8a1e133d..ffc32f45f 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -467,3 +467,82 @@ impl SyncSource for LocalSource {
         }))
     }
 }
+
+#[derive(PartialEq, Eq)]
+pub(crate) enum SkipReason {
+    AlreadySynced,
+    TransferLast,
+}
+
+impl std::fmt::Display for SkipReason {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        write!(
+            f,
+            "{}",
+            match self {
+                SkipReason::AlreadySynced => "older than the newest local snapshot",
+                SkipReason::TransferLast => "due to transfer-last",
+            }
+        )
+    }
+}
+
+pub(crate) struct SkipInfo {
+    oldest: i64,
+    newest: i64,
+    pub(crate) count: u64,
+    skip_reason: SkipReason,
+}
+
+impl SkipInfo {
+    pub(crate) fn new(skip_reason: SkipReason) -> Self {
+        SkipInfo {
+            oldest: i64::MAX,
+            newest: i64::MIN,
+            count: 0,
+            skip_reason,
+        }
+    }
+
+    pub(crate) fn reset(&mut self) {
+        self.count = 0;
+        self.oldest = i64::MAX;
+        self.newest = i64::MIN;
+    }
+
+    pub(crate) fn update(&mut self, backup_time: i64) {
+        self.count += 1;
+
+        if backup_time < self.oldest {
+            self.oldest = backup_time;
+        }
+
+        if backup_time > self.newest {
+            self.newest = backup_time;
+        }
+    }
+
+    fn affected(&self) -> Result<String, Error> {
+        match self.count {
+            0 => Ok(String::new()),
+            1 => Ok(proxmox_time::epoch_to_rfc3339_utc(self.oldest)?),
+            _ => Ok(format!(
+                "{} .. {}",
+                proxmox_time::epoch_to_rfc3339_utc(self.oldest)?,
+                proxmox_time::epoch_to_rfc3339_utc(self.newest)?,
+            )),
+        }
+    }
+}
+
+impl std::fmt::Display for SkipInfo {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        write!(
+            f,
+            "skipped: {} snapshot(s) ({}) - {}",
+            self.count,
+            self.affected().map_err(|_| std::fmt::Error)?,
+            self.skip_reason,
+        )
+    }
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 10/33] server: sync: make skip reason message more generic
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (8 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 09/33] server: sync: move skip info/reason to common sync module Christian Ebner
@ 2024-09-12 14:32 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 11/33] server: sync: factor out namespace depth check into sync module Christian Ebner
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:32 UTC (permalink / raw)
  To: pbs-devel

By specifying that the snapshot is skipped because of a condition met
on the sync target instead of 'local', the same message can be reused
for sync jobs in push direction without losing its meaning.
---
changes since version 2:
- no changes

 src/server/sync.rs | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/server/sync.rs b/src/server/sync.rs
index ffc32f45f..ee40d0b9d 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -480,7 +480,8 @@ impl std::fmt::Display for SkipReason {
             f,
             "{}",
             match self {
-                SkipReason::AlreadySynced => "older than the newest local snapshot",
+                SkipReason::AlreadySynced =>
+                    "older than the newest snapshot present on sync target",
                 SkipReason::TransferLast => "due to transfer-last",
             }
         )
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 11/33] server: sync: factor out namespace depth check into sync module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (9 preceding siblings ...)
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 10/33] server: sync: make skip reason message more generic Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 12/33] config: acl: mention optional namespace acl path component Christian Ebner
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

By moving and refactoring the check for a sync job exceeding the
global maximum namespace depth limit, the same function can be reused
for sync jobs in push direction.
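
For example, if the deepest namespace to be synced lies 3 levels below
the source namespace and the target namespace itself already sits at
depth 5, the sync would need 3 + 5 = 8 namespace levels on the target
and is rejected whenever that sum exceeds MAX_NAMESPACE_DEPTH.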

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/pull.rs | 20 +++-----------------
 src/server/sync.rs | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/src/server/pull.rs b/src/server/pull.rs
index d18c6d643..3117f7d2c 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -28,8 +28,8 @@ use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
 use pbs_tools::sha::sha256;
 
 use super::sync::{
-    LocalSource, RemoteSource, RemovedVanishedStats, SkipInfo, SkipReason, SyncSource,
-    SyncSourceReader, SyncStats,
+    check_namespace_depth_limit, LocalSource, RemoteSource, RemovedVanishedStats, SkipInfo,
+    SkipReason, SyncSource, SyncSourceReader, SyncStats,
 };
 use crate::backup::{check_ns_modification_privs, check_ns_privs};
 use crate::tools::parallel_handler::ParallelHandler;
@@ -735,21 +735,7 @@ pub(crate) async fn pull_store(mut params: PullParameters) -> Result<SyncStats,
         params.source.list_namespaces(&mut params.max_depth).await?
     };
 
-    let ns_layers_to_be_pulled = namespaces
-        .iter()
-        .map(BackupNamespace::depth)
-        .max()
-        .map_or(0, |v| v - params.source.get_ns().depth());
-    let target_depth = params.target.ns.depth();
-
-    if ns_layers_to_be_pulled + target_depth > MAX_NAMESPACE_DEPTH {
-        bail!(
-            "Syncing would exceed max allowed namespace depth. ({}+{} > {})",
-            ns_layers_to_be_pulled,
-            target_depth,
-            MAX_NAMESPACE_DEPTH
-        );
-    }
+    check_namespace_depth_limit(&params.source.get_ns(), &params.target.ns, &namespaces)?;
 
     errors |= old_max_depth != params.max_depth; // fail job if we switched to backwards-compat mode
     namespaces.sort_unstable_by_key(|a| a.name_len());
diff --git a/src/server/sync.rs b/src/server/sync.rs
index ee40d0b9d..bd68dda46 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -547,3 +547,24 @@ impl std::fmt::Display for SkipInfo {
         )
     }
 }
+
+/// Check if a sync from source to target of given namespaces exceeds the global namespace depth limit
+pub(crate) fn check_namespace_depth_limit(
+    source_namespace: &BackupNamespace,
+    target_namespace: &BackupNamespace,
+    namespaces: &[BackupNamespace],
+) -> Result<(), Error> {
+    let target_ns_depth = target_namespace.depth();
+    let sync_ns_depth = namespaces
+        .iter()
+        .map(BackupNamespace::depth)
+        .max()
+        .map_or(0, |v| v - source_namespace.depth());
+
+    if sync_ns_depth + target_ns_depth > MAX_NAMESPACE_DEPTH {
+        bail!(
+            "Syncing would exceed max allowed namespace depth. ({sync_ns_depth}+{target_ns_depth} > {MAX_NAMESPACE_DEPTH})",
+        );
+    }
+    Ok(())
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 12/33] config: acl: mention optional namespace acl path component
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (10 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 11/33] server: sync: factor out namespace depth check into sync module Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores Christian Ebner
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

`datastore` ACL paths are not limited to the datastore name, but may
contain further sub-components specifying the namespace, so extend the
comment to mention this.
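
For example, an ACL path such as `/datastore/store1/ns1/ns2` (names
purely illustrative) scopes the entry to the `ns1/ns2` namespace of
datastore `store1`.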

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- not present in previous version

 pbs-config/src/acl.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pbs-config/src/acl.rs b/pbs-config/src/acl.rs
index 4ce4c13c0..6b6500f34 100644
--- a/pbs-config/src/acl.rs
+++ b/pbs-config/src/acl.rs
@@ -80,7 +80,7 @@ pub fn check_acl_path(path: &str) -> Result<(), Error> {
             }
         }
         "datastore" => {
-            // /datastore/{store}
+            // /datastore/{store}/{namespace}
             if components_len <= 2 {
                 return Ok(());
             }
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (11 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 12/33] config: acl: mention optional namespace acl path component Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-10 14:49   ` Fabian Grünbichler
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 14/33] api types: define remote permissions and roles for push sync Christian Ebner
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Extend the component limit for ACL paths of `remote` to include
possible namespace components.

This allows limiting the permissions for sync jobs in push direction
to a namespace subset on the remote datastore.
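
For example, an ACL entry on `/remote/myremote/targetstore/ns1` (names
purely illustrative) restricts the matching privileges to the `ns1`
namespace of datastore `targetstore` on the remote `myremote`.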

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- not present in previous version

 pbs-config/src/acl.rs | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/pbs-config/src/acl.rs b/pbs-config/src/acl.rs
index 6b6500f34..5177e22f0 100644
--- a/pbs-config/src/acl.rs
+++ b/pbs-config/src/acl.rs
@@ -89,10 +89,13 @@ pub fn check_acl_path(path: &str) -> Result<(), Error> {
             }
         }
         "remote" => {
-            // /remote/{remote}/{store}
+            // /remote/{remote}/{store}/{namespace}
             if components_len <= 3 {
                 return Ok(());
             }
+            if components_len > 3 && components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH {
+                return Ok(());
+            }
         }
         "system" => {
             if components_len == 1 {
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 14/33] api types: define remote permissions and roles for push sync
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (12 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations Christian Ebner
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Add the privileges to allow backup, namespace creation and pruning on
remote targets, to be used for sync jobs in push direction.

Also add dedicated roles setting the required privileges.
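
For example, granting the new `RemoteSyncPushOperator` role on a path
like `/remote/myremote/targetstore` (names purely illustrative) is
intended to be enough to create new snapshots there (subject to backup
ownership), while removing vanished groups or namespaces on the remote
additionally requires the prune privilege, e.g. via the
`RemoteDatastorePrune` role.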

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- Use `PRIV_REMOTE_DATASTORE_` prefix for datastore operation privs
- Adapt roles to also have RemoteDatastorePrune and
  RemoteDatastoreBackup
- Fix typo in comments

 pbs-api-types/src/acl.rs | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/pbs-api-types/src/acl.rs b/pbs-api-types/src/acl.rs
index a8ae57a9d..d865ad745 100644
--- a/pbs-api-types/src/acl.rs
+++ b/pbs-api-types/src/acl.rs
@@ -58,6 +58,12 @@ constnamedbitmap! {
         PRIV_REMOTE_MODIFY("Remote.Modify");
         /// Remote.Read allows reading data from a configured `Remote`
         PRIV_REMOTE_READ("Remote.Read");
+        /// Remote.DatastoreBackup allows creating new snapshots, but also requires backup ownership
+        PRIV_REMOTE_DATASTORE_BACKUP("Remote.DatastoreBackup");
+        /// Remote.DatastoreModify allows to modify remote datastores by creating new namespaces
+        PRIV_REMOTE_DATASTORE_MODIFY("Remote.DatastoreModify");
+        /// Remote.DatastorePrune allows deleting snapshots on a configured `Remote`
+        PRIV_REMOTE_DATASTORE_PRUNE("Remote.DatastorePrune");
 
         /// Sys.Console allows access to the system's console
         PRIV_SYS_CONSOLE("Sys.Console");
@@ -160,6 +166,26 @@ pub const ROLE_REMOTE_SYNC_OPERATOR: u64 = 0
     | PRIV_REMOTE_AUDIT
     | PRIV_REMOTE_READ;
 
+#[rustfmt::skip]
+#[allow(clippy::identity_op)]
+/// Remote.SyncPushOperator can read and push snapshots to the remote.
+pub const ROLE_REMOTE_SYNC_PUSH_OPERATOR: u64 = 0
+    | PRIV_REMOTE_AUDIT
+    | PRIV_REMOTE_READ
+    | PRIV_REMOTE_DATASTORE_BACKUP;
+
+#[rustfmt::skip]
+#[allow(clippy::identity_op)]
+/// Remote.DatastoreModify can create namespaces on the remote.
+pub const ROLE_REMOTE_DATASTORE_MODIFY: u64 = 0
+    | PRIV_REMOTE_DATASTORE_MODIFY;
+
+#[rustfmt::skip]
+#[allow(clippy::identity_op)]
+/// Remote.DatastorePrune can prune snapshots, groups and namespaces on the remote.
+pub const ROLE_REMOTE_DATASTORE_PRUNE: u64 = 0
+    | PRIV_REMOTE_DATASTORE_PRUNE;
+
 #[rustfmt::skip]
 #[allow(clippy::identity_op)]
 /// Tape.Audit can audit the tape backup configuration and media content
@@ -225,6 +251,12 @@ pub enum Role {
     RemoteAdmin = ROLE_REMOTE_ADMIN,
     /// Synchronization Operator
     RemoteSyncOperator = ROLE_REMOTE_SYNC_OPERATOR,
+    /// Synchronization Operator (push direction)
+    RemoteSyncPushOperator = ROLE_REMOTE_SYNC_PUSH_OPERATOR,
+    /// Remote Datastore Modify
+    RemoteDatastoreModify = ROLE_REMOTE_DATASTORE_MODIFY,
+    /// Remote Datastore Prune
+    RemoteDatastorePrune = ROLE_REMOTE_DATASTORE_PRUNE,
     /// Tape Auditor
     TapeAudit = ROLE_TAPE_AUDIT,
     /// Tape Administrator
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (13 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 14/33] api types: define remote permissions and roles for push sync Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-10 14:48   ` Fabian Grünbichler
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs Christian Ebner
                   ` (20 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Adds the functionality required to push datastore contents from a
source to a remote target.
This includes syncing of the namespaces, backup groups and snapshots
based on the provided filters as well as removing vanished contents
from the target when requested.

While mimicking the pull direction of sync jobs where possible, the
implementation differs in that the remote must be accessed via the
REST API, whereas a pull job can access the local datastore directly
via the filesystem.
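
For example, the namespaces present on the target are queried via a
GET request on `api2/json/admin/datastore/{store}/namespace` instead
of being listed directly on a locally accessible datastore.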

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- Implement additional permission checks limiting possible remote
  datastore operations.
- Rename `owner` to `local_user`; this is the user whose view of the
  local datastore is used for the push to the remote target. It can
  differ from the job user, which executes the sync job and requires
  the permissions to access the remote.

 src/server/mod.rs  |   1 +
 src/server/push.rs | 892 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 893 insertions(+)
 create mode 100644 src/server/push.rs

diff --git a/src/server/mod.rs b/src/server/mod.rs
index 468847c2e..882c5cc10 100644
--- a/src/server/mod.rs
+++ b/src/server/mod.rs
@@ -34,6 +34,7 @@ pub use report::*;
 pub mod auth;
 
 pub(crate) mod pull;
+pub(crate) mod push;
 pub(crate) mod sync;
 
 pub(crate) async fn reload_proxy_certificate() -> Result<(), Error> {
diff --git a/src/server/push.rs b/src/server/push.rs
new file mode 100644
index 000000000..cfbb88728
--- /dev/null
+++ b/src/server/push.rs
@@ -0,0 +1,892 @@
+//! Sync datastore by pushing contents to remote server
+
+use std::cmp::Ordering;
+use std::collections::HashSet;
+use std::sync::{Arc, Mutex};
+
+use anyhow::{bail, format_err, Error};
+use futures::stream::{self, StreamExt, TryStreamExt};
+use tokio::sync::mpsc;
+use tokio_stream::wrappers::ReceiverStream;
+use tracing::info;
+
+use pbs_api_types::{
+    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
+    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
+    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
+};
+use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
+use pbs_config::CachedUserInfo;
+use pbs_datastore::data_blob::ChunkInfo;
+use pbs_datastore::dynamic_index::DynamicIndexReader;
+use pbs_datastore::fixed_index::FixedIndexReader;
+use pbs_datastore::index::IndexFile;
+use pbs_datastore::manifest::{ArchiveType, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME};
+use pbs_datastore::read_chunk::AsyncReadChunk;
+use pbs_datastore::{BackupManifest, DataStore, StoreProgress};
+
+use super::sync::{
+    check_namespace_depth_limit, LocalSource, RemovedVanishedStats, SkipInfo, SkipReason,
+    SyncSource, SyncStats,
+};
+use crate::api2::config::remote;
+
+/// Target for backups to be pushed to
+pub(crate) struct PushTarget {
+    // Name of the remote as found in remote.cfg
+    remote: String,
+    // Target repository on remote
+    repo: BackupRepository,
+    // Target namespace on remote
+    ns: BackupNamespace,
+    // Http client to connect to remote
+    client: HttpClient,
+}
+
+/// Parameters for a push operation
+pub(crate) struct PushParameters {
+    /// Source of backups to be pushed to remote
+    source: Arc<LocalSource>,
+    /// Target for backups to be pushed to
+    target: PushTarget,
+    /// Local user limiting the accessible source contents, makes sure that the sync job sees the
+    /// same source content when executed by different users with different privileges
+    local_user: Authid,
+    /// User as which the job gets executed, requires the permissions on the remote
+    pub(crate) job_user: Option<Authid>,
+    /// Whether to remove groups which exist locally, but not on the remote end
+    remove_vanished: bool,
+    /// How many levels of sub-namespaces to push (0 == no recursion, None == maximum recursion)
+    max_depth: Option<usize>,
+    /// Filters for reducing the push scope
+    group_filter: Vec<GroupFilter>,
+    /// How many snapshots should be transferred at most (taking the newest N snapshots)
+    transfer_last: Option<usize>,
+}
+
+impl PushParameters {
+    /// Creates a new instance of `PushParameters`.
+    #[allow(clippy::too_many_arguments)]
+    pub(crate) fn new(
+        store: &str,
+        ns: BackupNamespace,
+        remote_id: &str,
+        remote_store: &str,
+        remote_ns: BackupNamespace,
+        local_user: Authid,
+        remove_vanished: Option<bool>,
+        max_depth: Option<usize>,
+        group_filter: Option<Vec<GroupFilter>>,
+        limit: RateLimitConfig,
+        transfer_last: Option<usize>,
+    ) -> Result<Self, Error> {
+        if let Some(max_depth) = max_depth {
+            ns.check_max_depth(max_depth)?;
+            remote_ns.check_max_depth(max_depth)?;
+        };
+        let remove_vanished = remove_vanished.unwrap_or(false);
+
+        let source = Arc::new(LocalSource {
+            store: DataStore::lookup_datastore(store, Some(Operation::Read))?,
+            ns,
+        });
+
+        let (remote_config, _digest) = pbs_config::remote::config()?;
+        let remote: Remote = remote_config.lookup("remote", remote_id)?;
+
+        let repo = BackupRepository::new(
+            Some(remote.config.auth_id.clone()),
+            Some(remote.config.host.clone()),
+            remote.config.port,
+            remote_store.to_string(),
+        );
+
+        let client = remote::remote_client_config(&remote, Some(limit))?;
+        let target = PushTarget {
+            remote: remote_id.to_string(),
+            repo,
+            ns: remote_ns,
+            client,
+        };
+        let group_filter = group_filter.unwrap_or_default();
+
+        Ok(Self {
+            source,
+            target,
+            local_user,
+            job_user: None,
+            remove_vanished,
+            max_depth,
+            group_filter,
+            transfer_last,
+        })
+    }
+}
+
+fn check_ns_remote_datastore_privs(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    privs: u64,
+) -> Result<(), Error> {
+    let auth_id = params
+        .job_user
+        .as_ref()
+        .ok_or_else(|| format_err!("missing job authid"))?;
+    let user_info = CachedUserInfo::new()?;
+    let mut acl_path: Vec<&str> = vec!["remote", &params.target.remote, params.target.repo.store()];
+
+    if !namespace.is_root() {
+        let ns_components: Vec<&str> = namespace.components().collect();
+        acl_path.extend(ns_components);
+    }
+
+    user_info.check_privs(auth_id, &acl_path, privs, false)?;
+
+    Ok(())
+}
+
+// Fetch the list of namespaces found on target
+async fn fetch_target_namespaces(params: &PushParameters) -> Result<Vec<BackupNamespace>, Error> {
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/namespace",
+        store = params.target.repo.store(),
+    );
+    let mut result = params.target.client.get(&api_path, None).await?;
+    let namespaces: Vec<NamespaceListItem> = serde_json::from_value(result["data"].take())?;
+    let mut namespaces: Vec<BackupNamespace> = namespaces
+        .into_iter()
+        .map(|namespace| namespace.ns)
+        .collect();
+    namespaces.sort_unstable_by_key(|a| a.name_len());
+
+    Ok(namespaces)
+}
+
+// Remove the provided namespace from the target
+async fn remove_target_namespace(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+) -> Result<(), Error> {
+    if namespace.is_root() {
+        bail!("cannot remove root namespace from target");
+    }
+
+    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
+        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
+
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/namespace",
+        store = params.target.repo.store(),
+    );
+
+    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+    let args = serde_json::json!({
+        "ns": target_ns.name(),
+        "delete-groups": true,
+    });
+
+    params.target.client.delete(&api_path, Some(args)).await?;
+
+    Ok(())
+}
+
+// Fetch the list of groups found on target in given namespace
+async fn fetch_target_groups(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+) -> Result<Vec<BackupGroup>, Error> {
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/groups",
+        store = params.target.repo.store(),
+    );
+
+    let args = if !namespace.is_root() {
+        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+        Some(serde_json::json!({ "ns": target_ns.name() }))
+    } else {
+        None
+    };
+
+    let mut result = params.target.client.get(&api_path, args).await?;
+    let groups: Vec<GroupListItem> = serde_json::from_value(result["data"].take())?;
+    let mut groups: Vec<BackupGroup> = groups.into_iter().map(|group| group.backup).collect();
+
+    groups.sort_unstable_by(|a, b| {
+        let type_order = a.ty.cmp(&b.ty);
+        if type_order == Ordering::Equal {
+            a.id.cmp(&b.id)
+        } else {
+            type_order
+        }
+    });
+
+    Ok(groups)
+}
+
+// Remove the provided backup group in given namespace from the target
+async fn remove_target_group(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    backup_group: &BackupGroup,
+) -> Result<(), Error> {
+    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
+        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
+
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/groups",
+        store = params.target.repo.store(),
+    );
+
+    let mut args = serde_json::json!({
+        "backup-id": backup_group.id,
+        "backup-type": backup_group.ty,
+    });
+    if !namespace.is_root() {
+        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+        args["ns"] = serde_json::to_value(target_ns.name())?;
+    }
+
+    params.target.client.delete(&api_path, Some(args)).await?;
+
+    Ok(())
+}
+
+// Check if the namespace is already present on the target, create it otherwise
+async fn check_or_create_target_namespace(
+    params: &PushParameters,
+    target_namespaces: &[BackupNamespace],
+    namespace: &BackupNamespace,
+) -> Result<bool, Error> {
+    let mut created = false;
+
+    if !namespace.is_root() && !target_namespaces.contains(namespace) {
+        // Namespace not present on target, create namespace.
+        // Sub-namespaces have to be created by creating parent components first.
+
+        check_ns_remote_datastore_privs(&params, namespace, PRIV_REMOTE_DATASTORE_MODIFY)
+            .map_err(|err| format_err!("Creating namespace not allowed - {err}"))?;
+
+        let mut parent = BackupNamespace::root();
+        for namespace_component in namespace.components() {
+            let namespace = BackupNamespace::new(namespace_component)?;
+            let api_path = format!(
+                "api2/json/admin/datastore/{store}/namespace",
+                store = params.target.repo.store(),
+            );
+            let mut args = serde_json::json!({ "name": namespace.name() });
+            if !parent.is_root() {
+                args["parent"] = serde_json::to_value(parent.clone())?;
+            }
+            if let Err(err) = params.target.client.post(&api_path, Some(args)).await {
+                let target_store_and_ns =
+                    print_store_and_ns(params.target.repo.store(), &namespace);
+                bail!("sync into {target_store_and_ns} failed - namespace creation failed: {err}");
+            }
+            parent.push(namespace.name())?;
+        }
+
+        created = true;
+    }
+
+    Ok(created)
+}
+
+/// Push contents of source datastore matched by given push parameters to target.
+pub(crate) async fn push_store(mut params: PushParameters) -> Result<SyncStats, Error> {
+    let mut errors = false;
+
+    // Generate list of source namespaces to push to target, limited by max-depth
+    let mut namespaces = params.source.list_namespaces(&mut params.max_depth).await?;
+
+    check_namespace_depth_limit(&params.source.get_ns(), &params.target.ns, &namespaces)?;
+
+    namespaces.sort_unstable_by_key(|a| a.name_len());
+
+    // Fetch all accessible namespaces already present on the target
+    let target_namespaces = fetch_target_namespaces(&params).await?;
+    // Remember synced namespaces, removing non-synced ones when remove vanished flag is set
+    let mut synced_namespaces = HashSet::with_capacity(namespaces.len());
+
+    let (mut groups, mut snapshots) = (0, 0);
+    let mut stats = SyncStats::default();
+    for namespace in namespaces {
+        let source_store_and_ns = print_store_and_ns(params.source.store.name(), &namespace);
+        let target_namespace = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+        let target_store_and_ns = print_store_and_ns(params.target.repo.store(), &target_namespace);
+
+        info!("----");
+        info!("Syncing {source_store_and_ns} into {target_store_and_ns}");
+
+        synced_namespaces.insert(target_namespace.clone());
+
+        match check_or_create_target_namespace(&params, &target_namespaces, &target_namespace).await
+        {
+            Ok(true) => info!("Created namespace {target_namespace}"),
+            Ok(false) => {}
+            Err(err) => {
+                info!("Cannot sync {source_store_and_ns} into {target_store_and_ns} - {err}");
+                errors = true;
+                continue;
+            }
+        }
+
+        match push_namespace(&namespace, &params).await {
+            Ok((sync_progress, sync_stats, sync_errors)) => {
+                errors |= sync_errors;
+                stats.add(sync_stats);
+
+                if params.max_depth != Some(0) {
+                    groups += sync_progress.done_groups;
+                    snapshots += sync_progress.done_snapshots;
+
+                    let ns = if namespace.is_root() {
+                        "root namespace".into()
+                    } else {
+                        format!("namespace {namespace}")
+                    };
+                    info!(
+                        "Finished syncing {ns}, current progress: {groups} groups, {snapshots} snapshots"
+                    );
+                }
+            }
+            Err(err) => {
+                errors = true;
+                info!("Encountered errors while syncing namespace {namespace} - {err}");
+            }
+        }
+    }
+
+    if params.remove_vanished {
+        for target_namespace in target_namespaces {
+            if synced_namespaces.contains(&target_namespace) {
+                continue;
+            }
+            if let Err(err) = remove_target_namespace(&params, &target_namespace).await {
+                info!("failed to remove vanished namespace {target_namespace} - {err}");
+                continue;
+            }
+            info!("removed vanished namespace {target_namespace}");
+        }
+    }
+
+    if errors {
+        bail!("sync failed with some errors.");
+    }
+
+    Ok(stats)
+}
+
+/// Push namespace including all backup groups to target
+///
+/// Iterate over all backup groups in the namespace and push them to the target.
+pub(crate) async fn push_namespace(
+    namespace: &BackupNamespace,
+    params: &PushParameters,
+) -> Result<(StoreProgress, SyncStats, bool), Error> {
+    // Check if user is allowed to perform backups on remote datastore
+    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_BACKUP)
+        .map_err(|err| format_err!("Pushing to remote not allowed - {err}"))?;
+
+    let mut list: Vec<BackupGroup> = params
+        .source
+        .list_groups(namespace, &params.local_user)
+        .await?;
+
+    list.sort_unstable_by(|a, b| {
+        let type_order = a.ty.cmp(&b.ty);
+        if type_order == Ordering::Equal {
+            a.id.cmp(&b.id)
+        } else {
+            type_order
+        }
+    });
+
+    let total = list.len();
+    let list: Vec<BackupGroup> = list
+        .into_iter()
+        .filter(|group| group.apply_filters(&params.group_filter))
+        .collect();
+
+    info!(
+        "found {filtered} groups to sync (out of {total} total)",
+        filtered = list.len()
+    );
+
+    let target_groups = if params.remove_vanished {
+        fetch_target_groups(params, namespace).await?
+    } else {
+        // avoid fetching groups, not required if the remove-vanished flag is not set
+        Vec::new()
+    };
+
+    let mut errors = false;
+    // Remember synced groups, remove others when the remove vanished flag is set
+    let mut synced_groups = HashSet::new();
+    let mut progress = StoreProgress::new(list.len() as u64);
+    let mut stats = SyncStats::default();
+
+    for (done, group) in list.into_iter().enumerate() {
+        progress.done_groups = done as u64;
+        progress.done_snapshots = 0;
+        progress.group_snapshots = 0;
+        synced_groups.insert(group.clone());
+
+        match push_group(params, namespace, &group, &mut progress).await {
+            Ok(sync_stats) => stats.add(sync_stats),
+            Err(err) => {
+                info!("sync group '{group}' failed  - {err}");
+                errors = true;
+            }
+        }
+    }
+
+    if params.remove_vanished {
+        for target_group in target_groups {
+            if synced_groups.contains(&target_group) {
+                continue;
+            }
+            if !target_group.apply_filters(&params.group_filter) {
+                continue;
+            }
+
+            info!("delete vanished group '{target_group}'");
+
+            let count_before = match fetch_target_groups(params, namespace).await {
+                Ok(snapshots) => snapshots.len(),
+                Err(_err) => 0, // ignore errors
+            };
+
+            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
+                info!("{err}");
+                errors = true;
+                continue;
+            }
+
+            let mut count_after = match fetch_target_groups(params, namespace).await {
+                Ok(snapshots) => snapshots.len(),
+                Err(_err) => 0, // ignore errors
+            };
+
+            let deleted_groups = if count_after > 0 {
+                info!("kept some protected snapshots of group '{target_group}'");
+                0
+            } else {
+                1
+            };
+
+            if count_after > count_before {
+                count_after = count_before;
+            }
+
+            stats.add(SyncStats::from(RemovedVanishedStats {
+                snapshots: count_before - count_after,
+                groups: deleted_groups,
+                namespaces: 0,
+            }));
+        }
+    }
+
+    Ok((progress, stats, errors))
+}
+
+async fn fetch_target_snapshots(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    group: &BackupGroup,
+) -> Result<Vec<SnapshotListItem>, Error> {
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/snapshots",
+        store = params.target.repo.store(),
+    );
+    let mut args = serde_json::to_value(group)?;
+    if !namespace.is_root() {
+        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+        args["ns"] = serde_json::to_value(target_ns)?;
+    }
+    let mut result = params.target.client.get(&api_path, Some(args)).await?;
+    let snapshots: Vec<SnapshotListItem> = serde_json::from_value(result["data"].take())?;
+
+    Ok(snapshots)
+}
+
+async fn fetch_previous_backup_time(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    group: &BackupGroup,
+) -> Result<Option<i64>, Error> {
+    let mut snapshots = fetch_target_snapshots(params, namespace, group).await?;
+    snapshots.sort_unstable_by(|a, b| a.backup.time.cmp(&b.backup.time));
+    Ok(snapshots.last().map(|snapshot| snapshot.backup.time))
+}
+
+async fn forget_target_snapshot(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    snapshot: &BackupDir,
+) -> Result<(), Error> {
+    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
+        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
+
+    let api_path = format!(
+        "api2/json/admin/datastore/{store}/snapshots",
+        store = params.target.repo.store(),
+    );
+    let mut args = serde_json::to_value(snapshot)?;
+    if !namespace.is_root() {
+        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+        args["ns"] = serde_json::to_value(target_ns)?;
+    }
+    params.target.client.delete(&api_path, Some(args)).await?;
+
+    Ok(())
+}
+
+/// Push group including all snapshots to target
+///
+/// Iterate over all snapshots in the group and push them to the target.
+/// The group sync operation consists of the following steps:
+/// - Query snapshots of given group from the source
+/// - Sort snapshots by time
+/// - Apply transfer last cutoff and filters to list
+/// - Iterate the snapshot list and push each snapshot individually
+/// - (Optional): Remove vanished snapshots if the `remove_vanished` flag is set
+pub(crate) async fn push_group(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    group: &BackupGroup,
+    progress: &mut StoreProgress,
+) -> Result<SyncStats, Error> {
+    let mut already_synced_skip_info = SkipInfo::new(SkipReason::AlreadySynced);
+    let mut transfer_last_skip_info = SkipInfo::new(SkipReason::TransferLast);
+
+    let mut snapshots: Vec<BackupDir> = params.source.list_backup_dirs(namespace, group).await?;
+    snapshots.sort_unstable_by(|a, b| a.time.cmp(&b.time));
+
+    let total_snapshots = snapshots.len();
+    let cutoff = params
+        .transfer_last
+        .map(|count| total_snapshots.saturating_sub(count))
+        .unwrap_or_default();
+
+    let last_snapshot_time = fetch_previous_backup_time(params, namespace, group)
+        .await?
+        .unwrap_or(i64::MIN);
+
+    let mut source_snapshots = HashSet::new();
+    let snapshots: Vec<BackupDir> = snapshots
+        .into_iter()
+        .enumerate()
+        .filter(|&(pos, ref snapshot)| {
+            source_snapshots.insert(snapshot.time);
+            if last_snapshot_time > snapshot.time {
+                already_synced_skip_info.update(snapshot.time);
+                return false;
+            } else if already_synced_skip_info.count > 0 {
+                info!("{already_synced_skip_info}");
+                already_synced_skip_info.reset();
+                return true;
+            }
+
+            if pos < cutoff && last_snapshot_time != snapshot.time {
+                transfer_last_skip_info.update(snapshot.time);
+                return false;
+            } else if transfer_last_skip_info.count > 0 {
+                info!("{transfer_last_skip_info}");
+                transfer_last_skip_info.reset();
+            }
+            true
+        })
+        .map(|(_, dir)| dir)
+        .collect();
+
+    progress.group_snapshots = snapshots.len() as u64;
+
+    let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
+    let target_snapshots: Vec<BackupDir> = target_snapshots
+        .into_iter()
+        .map(|snapshot| snapshot.backup)
+        .collect();
+
+    let mut stats = SyncStats::default();
+    for (pos, source_snapshot) in snapshots.into_iter().enumerate() {
+        if target_snapshots.contains(&source_snapshot) {
+            progress.done_snapshots = pos as u64 + 1;
+            info!("percentage done: {progress}");
+            continue;
+        }
+        let result = push_snapshot(params, namespace, &source_snapshot).await;
+
+        progress.done_snapshots = pos as u64 + 1;
+        info!("percentage done: {progress}");
+
+        // stop on error
+        let sync_stats = result?;
+        stats.add(sync_stats);
+    }
+
+    if params.remove_vanished {
+        let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
+        for snapshot in target_snapshots {
+            if source_snapshots.contains(&snapshot.backup.time) {
+                continue;
+            }
+            if snapshot.protected {
+                info!(
+                    "don't delete vanished snapshot {name} (protected)",
+                    name = snapshot.backup
+                );
+                continue;
+            }
+            if let Err(err) = forget_target_snapshot(params, namespace, &snapshot.backup).await {
+                info!(
+                    "could not delete vanished snapshot {name} - {err}",
+                    name = snapshot.backup
+                );
+                // don't log success or count the snapshot as removed if deletion failed
+                continue;
+            }
+            info!("delete vanished snapshot {name}", name = snapshot.backup);
+            stats.add(SyncStats::from(RemovedVanishedStats {
+                snapshots: 1,
+                groups: 0,
+                namespaces: 0,
+            }));
+        }
+    }
+
+    Ok(stats)
+}
+
+/// Push snapshot to target
+///
+/// Creates a new snapshot on the target and pushes the content of the source snapshot to the
+/// target by creating a new manifest file and connecting to the remote as backup writer client.
+/// Chunks are written by recreating the index by uploading the chunk stream as read from the
+/// source. Data blobs are uploaded as such.
+pub(crate) async fn push_snapshot(
+    params: &PushParameters,
+    namespace: &BackupNamespace,
+    snapshot: &BackupDir,
+) -> Result<SyncStats, Error> {
+    let mut stats = SyncStats::default();
+    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
+    let backup_dir = params
+        .source
+        .store
+        .backup_dir(params.source.ns.clone(), snapshot.clone())?;
+
+    let reader = params.source.reader(namespace, snapshot).await?;
+
+    // Load the source manifest, needed to find crypt mode for files
+    let mut tmp_source_manifest_path = backup_dir.full_path();
+    tmp_source_manifest_path.push(MANIFEST_BLOB_NAME);
+    tmp_source_manifest_path.set_extension("tmp");
+    let source_manifest = if let Some(manifest_blob) = reader
+        .load_file_into(MANIFEST_BLOB_NAME, &tmp_source_manifest_path)
+        .await?
+    {
+        BackupManifest::try_from(manifest_blob)?
+    } else {
+        // no manifest in snapshot, skip
+        return Ok(stats);
+    };
+
+    // Manifest to be created on target, referencing all the source archives after upload.
+    let mut manifest = BackupManifest::new(snapshot.clone());
+
+    // writer instance locks the snapshot on the remote side
+    let backup_writer = BackupWriter::start(
+        &params.target.client,
+        None,
+        params.target.repo.store(),
+        &target_ns,
+        snapshot,
+        false,
+        false,
+    )
+    .await?;
+
+    // Use manifest of previous snapshots in group on target for chunk upload deduplication
+    let previous_manifest = match backup_writer.download_previous_manifest().await {
+        Ok(manifest) => Some(Arc::new(manifest)),
+        Err(err) => {
+            log::info!("Could not download previous manifest - {err}");
+            None
+        }
+    };
+
+    let upload_options = UploadOptions {
+        compress: true,
+        encrypt: false,
+        previous_manifest,
+        ..UploadOptions::default()
+    };
+
+    // Avoid double upload penalty by remembering already seen chunks
+    let known_chunks = Arc::new(Mutex::new(HashSet::with_capacity(1024 * 1024)));
+
+    for entry in source_manifest.files() {
+        let mut path = backup_dir.full_path();
+        path.push(&entry.filename);
+        if path.try_exists()? {
+            match ArchiveType::from_path(&entry.filename)? {
+                ArchiveType::Blob => {
+                    let file = std::fs::File::open(path.clone())?;
+                    let backup_stats = backup_writer.upload_blob(file, &entry.filename).await?;
+                    manifest.add_file(
+                        entry.filename.to_string(),
+                        backup_stats.size,
+                        backup_stats.csum,
+                        entry.chunk_crypt_mode(),
+                    )?;
+                    stats.add(SyncStats {
+                        chunk_count: backup_stats.chunk_count as usize,
+                        bytes: backup_stats.size as usize,
+                        elapsed: backup_stats.duration,
+                        removed: None,
+                    });
+                }
+                ArchiveType::DynamicIndex => {
+                    let index = DynamicIndexReader::open(&path)?;
+                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
+                    let sync_stats = push_index(
+                        &entry.filename,
+                        index,
+                        chunk_reader,
+                        &backup_writer,
+                        &mut manifest,
+                        entry.chunk_crypt_mode(),
+                        None,
+                        &known_chunks,
+                    )
+                    .await?;
+                    stats.add(sync_stats);
+                }
+                ArchiveType::FixedIndex => {
+                    let index = FixedIndexReader::open(&path)?;
+                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
+                    let size = index.index_bytes();
+                    let sync_stats = push_index(
+                        &entry.filename,
+                        index,
+                        chunk_reader,
+                        &backup_writer,
+                        &mut manifest,
+                        entry.chunk_crypt_mode(),
+                        Some(size),
+                        &known_chunks,
+                    )
+                    .await?;
+                    stats.add(sync_stats);
+                }
+            }
+        } else {
+            info!("{path:?} does not exist, skipped.");
+        }
+    }
+
+    // Fetch client log from source and push to target
+    // this has to be handled individually since the log is never part of the manifest
+    let mut client_log_path = backup_dir.full_path();
+    client_log_path.push(CLIENT_LOG_BLOB_NAME);
+    if client_log_path.is_file() {
+        backup_writer
+            .upload_blob_from_file(
+                &client_log_path,
+                CLIENT_LOG_BLOB_NAME,
+                upload_options.clone(),
+            )
+            .await?;
+    } else {
+        info!("Client log at {client_log_path:?} does not exist or is not a file, skipped.");
+    }
+
+    // Rewrite manifest for pushed snapshot, re-adding the existing fingerprint and signature
+    let mut manifest_json = serde_json::to_value(manifest)?;
+    manifest_json["unprotected"] = source_manifest.unprotected;
+    if let Some(signature) = source_manifest.signature {
+        manifest_json["signature"] = serde_json::to_value(signature)?;
+    }
+    let manifest_string = serde_json::to_string_pretty(&manifest_json).unwrap();
+    let backup_stats = backup_writer
+        .upload_blob_from_data(
+            manifest_string.into_bytes(),
+            MANIFEST_BLOB_NAME,
+            upload_options,
+        )
+        .await?;
+    backup_writer.finish().await?;
+
+    stats.add(SyncStats {
+        chunk_count: backup_stats.chunk_count as usize,
+        bytes: backup_stats.size as usize,
+        elapsed: backup_stats.duration,
+        removed: None,
+    });
+
+    Ok(stats)
+}
+
+// Read fixed or dynamic index and push to target by uploading via the backup writer instance
+//
+// For fixed indexes, the size must be provided as given by the index reader.
+#[allow(clippy::too_many_arguments)]
+async fn push_index<'a>(
+    filename: &'a str,
+    index: impl IndexFile + Send + 'static,
+    chunk_reader: Arc<dyn AsyncReadChunk>,
+    backup_writer: &BackupWriter,
+    manifest: &mut BackupManifest,
+    crypt_mode: CryptMode,
+    size: Option<u64>,
+    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
+) -> Result<SyncStats, Error> {
+    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
+    let mut chunk_infos =
+        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
+
+    tokio::spawn(async move {
+        while let Some(chunk_info) = chunk_infos.next().await {
+            let chunk_info = chunk_reader
+                .read_raw_chunk(&chunk_info.digest)
+                .await
+                .map(|chunk| ChunkInfo {
+                    chunk,
+                    digest: chunk_info.digest,
+                    chunk_len: chunk_info.size(),
+                    offset: chunk_info.range.start,
+                });
+            let _ = upload_channel_tx.send(chunk_info).await;
+        }
+    });
+
+    let chunk_info_stream = ReceiverStream::new(upload_channel_rx).map_err(Error::from);
+
+    let upload_options = UploadOptions {
+        compress: true,
+        encrypt: false,
+        fixed_size: size,
+        ..UploadOptions::default()
+    };
+
+    let upload_stats = backup_writer
+        .upload_index_chunk_info(
+            filename,
+            chunk_info_stream,
+            upload_options,
+            known_chunks.clone(),
+        )
+        .await?;
+
+    manifest.add_file(
+        filename.to_string(),
+        upload_stats.size,
+        upload_stats.csum,
+        crypt_mode,
+    )?;
+
+    Ok(SyncStats {
+        chunk_count: upload_stats.chunk_count as usize,
+        bytes: upload_stats.size as usize,
+        elapsed: upload_stats.duration,
+        removed: None,
+    })
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (14 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-10 14:48   ` Fabian Grünbichler
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 17/33] api: push: implement endpoint for sync in push direction Christian Ebner
                   ` (19 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

In order for sync jobs to be either pull or push jobs, allow
configuring the direction of the job.

Adds an additional config type `sync-push` to the sync job config, to
clearly distinguish sync jobs configured in pull direction from those
configured in push direction.

This approach was chosen in order to limit possible misconfiguration,
as unintentionally switching the sync direction could potentially
delete still required snapshots.
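
For illustration, a push sync job entry in the sync job config could
then look roughly like the following (job id, datastore and remote
names are placeholders, and the exact set of keys shown is only an
assumption based on the existing sync job options):

```
sync-push: s-push-example
	store local-source
	remote my-remote
	remote-store target-store
	remove-vanished false
	schedule daily
```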

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-api-types/src/jobs.rs | 52 +++++++++++++++++++++++++++++++++++++++
 pbs-config/src/sync.rs    | 11 +++++++--
 2 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/pbs-api-types/src/jobs.rs b/pbs-api-types/src/jobs.rs
index 868702bc0..12b39782c 100644
--- a/pbs-api-types/src/jobs.rs
+++ b/pbs-api-types/src/jobs.rs
@@ -20,6 +20,8 @@ const_regex! {
     pub VERIFICATION_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"):");
     /// Regex for sync jobs '(REMOTE|\-):REMOTE_DATASTORE:LOCAL_DATASTORE:(?:LOCAL_NS_ANCHOR:)ACTUAL_JOB_ID'
     pub SYNC_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"|\-):(", PROXMOX_SAFE_ID_REGEX_STR, r"):(", PROXMOX_SAFE_ID_REGEX_STR, r")(?::(", BACKUP_NS_RE, r"))?:");
+    /// Regex for sync direction '(pull|push)'
+    pub SYNC_DIRECTION_REGEX = r"^(pull|push)$";
 }
 
 pub const JOB_ID_SCHEMA: Schema = StringSchema::new("Job ID.")
@@ -498,6 +500,56 @@ pub const TRANSFER_LAST_SCHEMA: Schema =
         .minimum(1)
         .schema();
 
+pub const SYNC_DIRECTION_SCHEMA: Schema = StringSchema::new("Sync job direction (pull|push)")
+    .format(&ApiStringFormat::Pattern(&SYNC_DIRECTION_REGEX))
+    .schema();
+
+/// Direction of the sync job, push or pull
+#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, UpdaterType)]
+pub enum SyncDirection {
+    #[default]
+    Pull,
+    Push,
+}
+
+impl ApiType for SyncDirection {
+    const API_SCHEMA: Schema = SYNC_DIRECTION_SCHEMA;
+}
+
+// used for serialization using `proxmox_serde::forward_serialize_to_display` macro
+impl std::fmt::Display for SyncDirection {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        match self {
+            SyncDirection::Pull => f.write_str("pull"),
+            SyncDirection::Push => f.write_str("push"),
+        }
+    }
+}
+
+impl std::str::FromStr for SyncDirection {
+    type Err = anyhow::Error;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        match s {
+            "pull" => Ok(SyncDirection::Pull),
+            "push" => Ok(SyncDirection::Push),
+            _ => bail!("invalid sync direction"),
+        }
+    }
+}
+
+proxmox_serde::forward_deserialize_to_from_str!(SyncDirection);
+proxmox_serde::forward_serialize_to_display!(SyncDirection);
+
+impl SyncDirection {
+    pub fn as_config_type_str(&self) -> &'static str {
+        match self {
+            SyncDirection::Pull => "sync",
+            SyncDirection::Push => "sync-push",
+        }
+    }
+}
+
 #[api(
     properties: {
         id: {
diff --git a/pbs-config/src/sync.rs b/pbs-config/src/sync.rs
index 45453abb1..143b73e78 100644
--- a/pbs-config/src/sync.rs
+++ b/pbs-config/src/sync.rs
@@ -18,9 +18,16 @@ fn init() -> SectionConfig {
         _ => unreachable!(),
     };
 
-    let plugin = SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
+    let pull_plugin =
+        SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
+    let push_plugin = SectionConfigPlugin::new(
+        "sync-push".to_string(),
+        Some(String::from("id")),
+        obj_schema,
+    );
     let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
-    config.register_plugin(plugin);
+    config.register_plugin(pull_plugin);
+    config.register_plugin(push_plugin);
 
     config
 }
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 17/33] api: push: implement endpoint for sync in push direction
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (15 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 18/33] api: sync: move sync job invocation to server sync module Christian Ebner
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Expose the sync job in push direction via a dedicated API endpoint,
analogous to the pull direction.
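
For illustration, a manual invocation of the new endpoint could look
roughly like the following (host, token and datastore/remote names are
placeholders; the parameter names correspond to the API schema below):

```
curl -k -X POST 'https://pbs.example.com:8007/api2/json/push' \
    -H 'Authorization: PBSAPIToken=syncoperator@pbs!mytoken:<secret>' \
    --data-urlencode 'store=local-source' \
    --data-urlencode 'remote=my-remote' \
    --data-urlencode 'remote-store=target-store'
```

The returned UPID can then be used to follow the task log, as with
other worker tasks.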

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- add additional permission checks for the user executing the sync job

 src/api2/mod.rs  |   2 +
 src/api2/push.rs | 182 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 184 insertions(+)
 create mode 100644 src/api2/push.rs

diff --git a/src/api2/mod.rs b/src/api2/mod.rs
index a83e4c205..03596326b 100644
--- a/src/api2/mod.rs
+++ b/src/api2/mod.rs
@@ -12,6 +12,7 @@ pub mod helpers;
 pub mod node;
 pub mod ping;
 pub mod pull;
+pub mod push;
 pub mod reader;
 pub mod status;
 pub mod tape;
@@ -29,6 +30,7 @@ const SUBDIRS: SubdirMap = &sorted!([
     ("nodes", &node::ROUTER),
     ("ping", &ping::ROUTER),
     ("pull", &pull::ROUTER),
+    ("push", &push::ROUTER),
     ("reader", &reader::ROUTER),
     ("status", &status::ROUTER),
     ("tape", &tape::ROUTER),
diff --git a/src/api2/push.rs b/src/api2/push.rs
new file mode 100644
index 000000000..49480a074
--- /dev/null
+++ b/src/api2/push.rs
@@ -0,0 +1,182 @@
+use anyhow::{format_err, Context, Error};
+use futures::{future::FutureExt, select};
+use tracing::info;
+
+use pbs_api_types::{
+    Authid, BackupNamespace, GroupFilter, RateLimitConfig, SyncJobConfig, DATASTORE_SCHEMA,
+    GROUP_FILTER_LIST_SCHEMA, NS_MAX_DEPTH_REDUCED_SCHEMA, PRIV_REMOTE_DATASTORE_MODIFY,
+    PRIV_REMOTE_DATASTORE_PRUNE, REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA,
+    TRANSFER_LAST_SCHEMA,
+};
+use proxmox_rest_server::WorkerTask;
+use proxmox_router::{Permission, Router, RpcEnvironment};
+use proxmox_schema::api;
+
+use pbs_config::CachedUserInfo;
+
+use crate::server::push::{push_store, PushParameters};
+
+pub fn check_remote_push_privs(
+    auth_id: &Authid,
+    remote: &str,
+    remote_store: &str,
+    delete: bool,
+) -> Result<(), Error> {
+    let user_info = CachedUserInfo::new()?;
+
+    user_info.check_privs(
+        auth_id,
+        &["remote", remote, remote_store],
+        PRIV_REMOTE_DATASTORE_MODIFY,
+        false,
+    )?;
+
+    if delete {
+        user_info.check_privs(
+            auth_id,
+            &["remote", remote, remote_store],
+            PRIV_REMOTE_DATASTORE_PRUNE,
+            false,
+        )?;
+    }
+
+    Ok(())
+}
+
+impl TryFrom<&SyncJobConfig> for PushParameters {
+    type Error = Error;
+
+    fn try_from(sync_job: &SyncJobConfig) -> Result<Self, Self::Error> {
+        PushParameters::new(
+            &sync_job.store,
+            sync_job.ns.clone().unwrap_or_default(),
+            sync_job
+                .remote
+                .as_deref()
+                .context("missing required remote")?,
+            &sync_job.remote_store,
+            sync_job.remote_ns.clone().unwrap_or_default(),
+            sync_job
+                .owner
+                .as_ref()
+                .unwrap_or_else(|| Authid::root_auth_id())
+                .clone(),
+            sync_job.remove_vanished,
+            sync_job.max_depth,
+            sync_job.group_filter.clone(),
+            sync_job.limit.clone(),
+            sync_job.transfer_last,
+        )
+    }
+}
+
+#[api(
+    input: {
+        properties: {
+            store: {
+                schema: DATASTORE_SCHEMA,
+            },
+            ns: {
+                type: BackupNamespace,
+                optional: true,
+            },
+            remote: {
+                schema: REMOTE_ID_SCHEMA,
+            },
+            "remote-store": {
+                schema: DATASTORE_SCHEMA,
+            },
+            "remote-ns": {
+                type: BackupNamespace,
+                optional: true,
+            },
+            "remove-vanished": {
+                schema: REMOVE_VANISHED_BACKUPS_SCHEMA,
+                optional: true,
+            },
+            "max-depth": {
+                schema: NS_MAX_DEPTH_REDUCED_SCHEMA,
+                optional: true,
+            },
+            "group-filter": {
+                schema: GROUP_FILTER_LIST_SCHEMA,
+                optional: true,
+            },
+            limit: {
+                type: RateLimitConfig,
+                flatten: true,
+            },
+            "transfer-last": {
+                schema: TRANSFER_LAST_SCHEMA,
+                optional: true,
+            },
+        },
+    },
+    access: {
+        description: r###"The user needs Remote.Backup privilege on '/remote/{remote}/{remote-store}'
+and needs to own the backup group. Datastore.Read is required on '/datastore/{store}'.
+The delete flag additionally requires the Remote.Prune privilege on '/remote/{remote}/{remote-store}'.
+"###,
+        permission: &Permission::Anybody,
+    },
+)]
+/// Push store to other repository
+#[allow(clippy::too_many_arguments)]
+async fn push(
+    store: String,
+    ns: Option<BackupNamespace>,
+    remote: String,
+    remote_store: String,
+    remote_ns: Option<BackupNamespace>,
+    remove_vanished: Option<bool>,
+    max_depth: Option<usize>,
+    group_filter: Option<Vec<GroupFilter>>,
+    limit: RateLimitConfig,
+    transfer_last: Option<usize>,
+    rpcenv: &mut dyn RpcEnvironment,
+) -> Result<String, Error> {
+    let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
+    let delete = remove_vanished.unwrap_or(false);
+    let ns = ns.unwrap_or_default();
+
+    check_remote_push_privs(&auth_id, &remote, &remote_store, delete)?;
+
+    let mut push_params = PushParameters::new(
+        &store,
+        ns,
+        &remote,
+        &remote_store,
+        remote_ns.unwrap_or_default(),
+        auth_id.clone(),
+        remove_vanished,
+        max_depth,
+        group_filter,
+        limit,
+        transfer_last,
+    )?;
+    push_params.job_user = Some(auth_id.clone());
+
+    let upid_str = WorkerTask::spawn(
+        "sync",
+        Some(store.clone()),
+        auth_id.to_string(),
+        true,
+        move |worker| async move {
+            info!("push datastore '{store}' to '{remote}/{remote_store}'");
+
+            let push_future = push_store(push_params);
+            (select! {
+                success = push_future.fuse() => success,
+                abort = worker.abort_future().map(|_| Err(format_err!("push aborted"))) => abort,
+            })?;
+
+            info!("push datastore '{store}' end");
+
+            Ok(())
+        },
+    )?;
+
+    Ok(upid_str)
+}
+
+pub const ROUTER: Router = Router::new().post(&API_METHOD_PUSH);
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 18/33] api: sync: move sync job invocation to server sync module
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (16 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 17/33] api: push: implement endpoint for sync in push direction Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter Christian Ebner
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Moves and refactors the do_sync_job function into the common server
sync module so that it can be reused for both sync directions, pull
and push.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- move common code to `server` submodule instead of `api2::sync` as this
  is only indirectly api code.
- Also pass along the job user for additional permission checks via the
  `PushParameters`

 src/api2/admin/sync.rs          |  19 +++--
 src/api2/pull.rs                | 108 --------------------------
 src/bin/proxmox-backup-proxy.rs |  15 +++-
 src/server/mod.rs               |   1 +
 src/server/sync.rs              | 132 +++++++++++++++++++++++++++++++-
 5 files changed, 156 insertions(+), 119 deletions(-)

diff --git a/src/api2/admin/sync.rs b/src/api2/admin/sync.rs
index 4e2ba0be8..be324564c 100644
--- a/src/api2/admin/sync.rs
+++ b/src/api2/admin/sync.rs
@@ -10,16 +10,16 @@ use proxmox_router::{
 use proxmox_schema::api;
 use proxmox_sortable_macro::sortable;
 
-use pbs_api_types::{Authid, SyncJobConfig, SyncJobStatus, DATASTORE_SCHEMA, JOB_ID_SCHEMA};
+use pbs_api_types::{
+    Authid, SyncDirection, SyncJobConfig, SyncJobStatus, DATASTORE_SCHEMA, JOB_ID_SCHEMA,
+};
 use pbs_config::sync;
 use pbs_config::CachedUserInfo;
 
 use crate::{
-    api2::{
-        config::sync::{check_sync_job_modify_access, check_sync_job_read_access},
-        pull::do_sync_job,
-    },
+    api2::config::sync::{check_sync_job_modify_access, check_sync_job_read_access},
     server::jobstate::{compute_schedule_status, Job, JobState},
+    server::sync::do_sync_job,
 };
 
 #[api(
@@ -116,7 +116,14 @@ pub fn run_sync_job(
 
     let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
 
-    let upid_str = do_sync_job(job, sync_job, &auth_id, None, to_stdout)?;
+    let upid_str = do_sync_job(
+        job,
+        sync_job,
+        &auth_id,
+        None,
+        SyncDirection::Pull,
+        to_stdout,
+    )?;
 
     Ok(upid_str)
 }
diff --git a/src/api2/pull.rs b/src/api2/pull.rs
index e733c9839..d039dab59 100644
--- a/src/api2/pull.rs
+++ b/src/api2/pull.rs
@@ -13,10 +13,8 @@ use pbs_api_types::{
     TRANSFER_LAST_SCHEMA,
 };
 use pbs_config::CachedUserInfo;
-use proxmox_human_byte::HumanByte;
 use proxmox_rest_server::WorkerTask;
 
-use crate::server::jobstate::Job;
 use crate::server::pull::{pull_store, PullParameters};
 
 pub fn check_pull_privs(
@@ -93,112 +91,6 @@ impl TryFrom<&SyncJobConfig> for PullParameters {
     }
 }
 
-pub fn do_sync_job(
-    mut job: Job,
-    sync_job: SyncJobConfig,
-    auth_id: &Authid,
-    schedule: Option<String>,
-    to_stdout: bool,
-) -> Result<String, Error> {
-    let job_id = format!(
-        "{}:{}:{}:{}:{}",
-        sync_job.remote.as_deref().unwrap_or("-"),
-        sync_job.remote_store,
-        sync_job.store,
-        sync_job.ns.clone().unwrap_or_default(),
-        job.jobname()
-    );
-    let worker_type = job.jobtype().to_string();
-
-    if sync_job.remote.is_none() && sync_job.store == sync_job.remote_store {
-        bail!("can't sync to same datastore");
-    }
-
-    let upid_str = WorkerTask::spawn(
-        &worker_type,
-        Some(job_id.clone()),
-        auth_id.to_string(),
-        to_stdout,
-        move |worker| async move {
-            job.start(&worker.upid().to_string())?;
-
-            let worker2 = worker.clone();
-            let sync_job2 = sync_job.clone();
-
-            let worker_future = async move {
-                let pull_params = PullParameters::try_from(&sync_job)?;
-
-                info!("Starting datastore sync job '{job_id}'");
-                if let Some(event_str) = schedule {
-                    info!("task triggered by schedule '{event_str}'");
-                }
-
-                info!(
-                    "sync datastore '{}' from '{}{}'",
-                    sync_job.store,
-                    sync_job
-                        .remote
-                        .as_deref()
-                        .map_or(String::new(), |remote| format!("{remote}/")),
-                    sync_job.remote_store,
-                );
-
-                let pull_stats = pull_store(pull_params).await?;
-
-                if pull_stats.bytes != 0 {
-                    let amount = HumanByte::from(pull_stats.bytes);
-                    let rate = HumanByte::new_binary(
-                        pull_stats.bytes as f64 / pull_stats.elapsed.as_secs_f64(),
-                    );
-                    info!(
-                        "Summary: sync job pulled {amount} in {} chunks (average rate: {rate}/s)",
-                        pull_stats.chunk_count,
-                    );
-                } else {
-                    info!("Summary: sync job found no new data to pull");
-                }
-
-                if let Some(removed) = pull_stats.removed {
-                    info!(
-                        "Summary: removed vanished: snapshots: {}, groups: {}, namespaces: {}",
-                        removed.snapshots, removed.groups, removed.namespaces,
-                    );
-                }
-
-                info!("sync job '{}' end", &job_id);
-
-                Ok(())
-            };
-
-            let mut abort_future = worker2
-                .abort_future()
-                .map(|_| Err(format_err!("sync aborted")));
-
-            let result = select! {
-                worker = worker_future.fuse() => worker,
-                abort = abort_future => abort,
-            };
-
-            let status = worker2.create_state(&result);
-
-            match job.finish(status) {
-                Ok(_) => {}
-                Err(err) => {
-                    eprintln!("could not finish job state: {}", err);
-                }
-            }
-
-            if let Err(err) = crate::server::send_sync_status(&sync_job2, &result) {
-                eprintln!("send sync notification failed: {err}");
-            }
-
-            result
-        },
-    )?;
-
-    Ok(upid_str)
-}
-
 #[api(
     input: {
         properties: {
diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
index 041f3aff9..4409234b2 100644
--- a/src/bin/proxmox-backup-proxy.rs
+++ b/src/bin/proxmox-backup-proxy.rs
@@ -46,8 +46,8 @@ use pbs_buildcfg::configdir;
 use proxmox_time::CalendarEvent;
 
 use pbs_api_types::{
-    Authid, DataStoreConfig, Operation, PruneJobConfig, SyncJobConfig, TapeBackupJobConfig,
-    VerificationJobConfig,
+    Authid, DataStoreConfig, Operation, PruneJobConfig, SyncDirection, SyncJobConfig,
+    TapeBackupJobConfig, VerificationJobConfig,
 };
 
 use proxmox_backup::auth_helpers::*;
@@ -57,9 +57,9 @@ use proxmox_backup::tools::{
     PROXMOX_BACKUP_TCP_KEEPALIVE_TIME,
 };
 
-use proxmox_backup::api2::pull::do_sync_job;
 use proxmox_backup::api2::tape::backup::do_tape_backup_job;
 use proxmox_backup::server::do_prune_job;
+use proxmox_backup::server::do_sync_job;
 use proxmox_backup::server::do_verification_job;
 
 fn main() -> Result<(), Error> {
@@ -630,7 +630,14 @@ async fn schedule_datastore_sync_jobs() {
             };
 
             let auth_id = Authid::root_auth_id().clone();
-            if let Err(err) = do_sync_job(job, job_config, &auth_id, Some(event_str), false) {
+            if let Err(err) = do_sync_job(
+                job,
+                job_config,
+                &auth_id,
+                Some(event_str),
+                SyncDirection::Pull,
+                false,
+            ) {
                 eprintln!("unable to start datastore sync job {job_id} - {err}");
             }
         };
diff --git a/src/server/mod.rs b/src/server/mod.rs
index 882c5cc10..2fd95327c 100644
--- a/src/server/mod.rs
+++ b/src/server/mod.rs
@@ -36,6 +36,7 @@ pub mod auth;
 pub(crate) mod pull;
 pub(crate) mod push;
 pub(crate) mod sync;
+pub use sync::do_sync_job;
 
 pub(crate) async fn reload_proxy_certificate() -> Result<(), Error> {
     let proxy_pid = proxmox_rest_server::read_pid(pbs_buildcfg::PROXMOX_BACKUP_PROXY_PID_FN)?;
diff --git a/src/server/sync.rs b/src/server/sync.rs
index bd68dda46..e1f1db8e0 100644
--- a/src/server/sync.rs
+++ b/src/server/sync.rs
@@ -7,15 +7,18 @@ use std::sync::{Arc, Mutex};
 use std::time::Duration;
 
 use anyhow::{bail, format_err, Error};
+use futures::{future::FutureExt, select};
 use http::StatusCode;
 use serde_json::json;
 use tracing::{info, warn};
 
+use proxmox_human_byte::HumanByte;
+use proxmox_rest_server::WorkerTask;
 use proxmox_router::HttpError;
 
 use pbs_api_types::{
     Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupListItem, SnapshotListItem,
-    MAX_NAMESPACE_DEPTH, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_READ,
+    SyncDirection, SyncJobConfig, MAX_NAMESPACE_DEPTH, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_READ,
 };
 use pbs_client::{BackupReader, BackupRepository, HttpClient, RemoteChunkReader};
 use pbs_datastore::data_blob::DataBlob;
@@ -24,6 +27,9 @@ use pbs_datastore::read_chunk::AsyncReadChunk;
 use pbs_datastore::{DataStore, ListNamespacesRecursive, LocalChunkReader};
 
 use crate::backup::ListAccessibleBackupGroups;
+use crate::server::jobstate::Job;
+use crate::server::pull::{pull_store, PullParameters};
+use crate::server::push::{push_store, PushParameters};
 
 #[derive(Default)]
 pub(crate) struct RemovedVanishedStats {
@@ -568,3 +574,127 @@ pub(crate) fn check_namespace_depth_limit(
     }
     Ok(())
 }
+
+/// Run a sync job in given direction
+pub fn do_sync_job(
+    mut job: Job,
+    sync_job: SyncJobConfig,
+    auth_id: &Authid,
+    schedule: Option<String>,
+    sync_direction: SyncDirection,
+    to_stdout: bool,
+) -> Result<String, Error> {
+    let job_id = format!(
+        "{}:{}:{}:{}:{}",
+        sync_job.remote.as_deref().unwrap_or("-"),
+        sync_job.remote_store,
+        sync_job.store,
+        sync_job.ns.clone().unwrap_or_default(),
+        job.jobname(),
+    );
+    let worker_type = job.jobtype().to_string();
+    let job_user = auth_id.clone();
+
+    if sync_job.remote.is_none() && sync_job.store == sync_job.remote_store {
+        bail!("can't sync to same datastore");
+    }
+
+    let upid_str = WorkerTask::spawn(
+        &worker_type,
+        Some(job_id.clone()),
+        auth_id.to_string(),
+        to_stdout,
+        move |worker| async move {
+            job.start(&worker.upid().to_string())?;
+
+            let worker2 = worker.clone();
+            let sync_job2 = sync_job.clone();
+
+            let worker_future = async move {
+                info!("Starting datastore sync job '{job_id}'");
+                if let Some(event_str) = schedule {
+                    info!("task triggered by schedule '{event_str}'");
+                }
+                let sync_stats = match sync_direction {
+                    SyncDirection::Pull => {
+                        info!(
+                            "sync datastore '{}' from '{}{}'",
+                            sync_job.store,
+                            sync_job
+                                .remote
+                                .as_deref()
+                                .map_or(String::new(), |remote| format!("{remote}/")),
+                            sync_job.remote_store,
+                        );
+                        let pull_params = PullParameters::try_from(&sync_job)?;
+                        pull_store(pull_params).await?
+                    }
+                    SyncDirection::Push => {
+                        info!(
+                            "sync datastore '{}' to '{}{}'",
+                            sync_job.store,
+                            sync_job
+                                .remote
+                                .as_deref()
+                                .map_or(String::new(), |remote| format!("{remote}/")),
+                            sync_job.remote_store,
+                        );
+                        let mut push_params = PushParameters::try_from(&sync_job)?;
+                        // Perform permission checks for remote operations via the user the job is
+                        // executed as. Without setting the job user, permission checks will fail.
+                        push_params.job_user = Some(job_user);
+                        push_store(push_params).await?
+                    }
+                };
+
+                if sync_stats.bytes != 0 {
+                    let amount = HumanByte::from(sync_stats.bytes);
+                    let rate = HumanByte::new_binary(
+                        sync_stats.bytes as f64 / sync_stats.elapsed.as_secs_f64(),
+                    );
+                    info!(
+                        "Summary: sync job {sync_direction}ed {amount} in {} chunks (average rate: {rate}/s)",
+                        sync_stats.chunk_count,
+                    );
+                } else {
+                    info!("Summary: sync job found no new data to {sync_direction}");
+                }
+
+                if let Some(removed) = sync_stats.removed {
+                    info!(
+                        "Summary: removed vanished: snapshots: {}, groups: {}, namespaces: {}",
+                        removed.snapshots, removed.groups, removed.namespaces,
+                    );
+                }
+
+                info!("sync job '{job_id}' end");
+
+                Ok(())
+            };
+
+            let mut abort_future = worker2
+                .abort_future()
+                .map(|_| Err(format_err!("sync aborted")));
+
+            let result = select! {
+                worker = worker_future.fuse() => worker,
+                abort = abort_future => abort,
+            };
+
+            let status = worker2.create_state(&result);
+
+            match job.finish(status) {
+                Ok(_) => {}
+                Err(err) => eprintln!("could not finish job state: {err}"),
+            }
+
+            if let Err(err) = crate::server::send_sync_status(&sync_job2, &result) {
+                eprintln!("send sync notification failed: {err}");
+            }
+
+            result
+        },
+    )?;
+
+    Ok(upid_str)
+}
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (17 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 18/33] api: sync: move sync job invocation to server sync module Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-10 14:48   ` Fabian Grünbichler
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 20/33] api: sync: add permission checks for push sync jobs Christian Ebner
                   ` (16 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Expose an optional `sync-direction` parameter and switch the config
type for sync job operations based on it. If not set, the config type
defaults to `sync` and the sync direction to `pull` for full backwards
compatibility.
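
For reference, a minimal sketch of the intended mapping between sync
direction and section config type (the actual `SyncDirection` type and
its `as_config_type_str()` helper live in pbs-api-types, so this is an
illustration, not the real definition):

```
// Sketch only - mirrors the expected behaviour of the optional parameter.
#[derive(Clone, Copy, Default)]
enum SyncDirection {
    #[default]
    Pull,
    Push,
}

impl SyncDirection {
    fn as_config_type_str(self) -> &'static str {
        match self {
            SyncDirection::Pull => "sync",
            SyncDirection::Push => "sync-push",
        }
    }
}

// An absent `sync-direction` parameter falls back to pull, i.e. config type "sync".
fn config_type(direction: Option<SyncDirection>) -> &'static str {
    direction.unwrap_or_default().as_config_type_str()
}
```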

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/api2/admin/sync.rs               | 28 +++++++++------
 src/api2/config/datastore.rs         | 22 +++++++++---
 src/api2/config/notifications/mod.rs | 15 ++++++--
 src/api2/config/sync.rs              | 53 +++++++++++++++++++++++-----
 src/bin/proxmox-backup-proxy.rs      | 12 +++++--
 5 files changed, 101 insertions(+), 29 deletions(-)

diff --git a/src/api2/admin/sync.rs b/src/api2/admin/sync.rs
index be324564c..bdbc06a8e 100644
--- a/src/api2/admin/sync.rs
+++ b/src/api2/admin/sync.rs
@@ -12,6 +12,7 @@ use proxmox_sortable_macro::sortable;
 
 use pbs_api_types::{
     Authid, SyncDirection, SyncJobConfig, SyncJobStatus, DATASTORE_SCHEMA, JOB_ID_SCHEMA,
+    SYNC_DIRECTION_SCHEMA,
 };
 use pbs_config::sync;
 use pbs_config::CachedUserInfo;
@@ -29,6 +30,10 @@ use crate::{
                 schema: DATASTORE_SCHEMA,
                 optional: true,
             },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
         },
     },
     returns: {
@@ -44,6 +49,7 @@ use crate::{
 /// List all sync jobs
 pub fn list_sync_jobs(
     store: Option<String>,
+    sync_direction: Option<SyncDirection>,
     _param: Value,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<Vec<SyncJobStatus>, Error> {
@@ -51,9 +57,10 @@ pub fn list_sync_jobs(
     let user_info = CachedUserInfo::new()?;
 
     let (config, digest) = sync::config()?;
+    let sync_direction = sync_direction.unwrap_or_default();
 
     let job_config_iter = config
-        .convert_to_typed_array("sync")?
+        .convert_to_typed_array(sync_direction.as_config_type_str())?
         .into_iter()
         .filter(|job: &SyncJobConfig| {
             if let Some(store) = &store {
@@ -88,7 +95,11 @@ pub fn list_sync_jobs(
         properties: {
             id: {
                 schema: JOB_ID_SCHEMA,
-            }
+            },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
         }
     },
     access: {
@@ -99,6 +110,7 @@ pub fn list_sync_jobs(
 /// Runs the sync jobs manually.
 pub fn run_sync_job(
     id: String,
+    sync_direction: Option<SyncDirection>,
     _info: &ApiMethod,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<String, Error> {
@@ -106,7 +118,8 @@ pub fn run_sync_job(
     let user_info = CachedUserInfo::new()?;
 
     let (config, _digest) = sync::config()?;
-    let sync_job: SyncJobConfig = config.lookup("sync", &id)?;
+    let sync_direction = sync_direction.unwrap_or_default();
+    let sync_job: SyncJobConfig = config.lookup(sync_direction.as_config_type_str(), &id)?;
 
     if !check_sync_job_modify_access(&user_info, &auth_id, &sync_job) {
         bail!("permission check failed");
@@ -116,14 +129,7 @@ pub fn run_sync_job(
 
     let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
 
-    let upid_str = do_sync_job(
-        job,
-        sync_job,
-        &auth_id,
-        None,
-        SyncDirection::Pull,
-        to_stdout,
-    )?;
+    let upid_str = do_sync_job(job, sync_job, &auth_id, None, sync_direction, to_stdout)?;
 
     Ok(upid_str)
 }
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index ca6edf05a..a01d26cad 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -13,8 +13,9 @@ use proxmox_uuid::Uuid;
 
 use pbs_api_types::{
     Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning, KeepOptions,
-    MaintenanceMode, PruneJobConfig, PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE,
-    PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
+    MaintenanceMode, PruneJobConfig, PruneJobOptions, SyncDirection, DATASTORE_SCHEMA,
+    PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY,
+    PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
 };
 use pbs_config::BackupLockGuard;
 use pbs_datastore::chunk_store::ChunkStore;
@@ -498,8 +499,21 @@ pub async fn delete_datastore(
         for job in list_verification_jobs(Some(name.clone()), Value::Null, rpcenv)? {
             delete_verification_job(job.config.id, None, rpcenv)?
         }
-        for job in list_sync_jobs(Some(name.clone()), Value::Null, rpcenv)? {
-            delete_sync_job(job.config.id, None, rpcenv)?
+        for job in list_sync_jobs(
+            Some(name.clone()),
+            Some(SyncDirection::Pull),
+            Value::Null,
+            rpcenv,
+        )? {
+            delete_sync_job(job.config.id, Some(SyncDirection::Pull), None, rpcenv)?
+        }
+        for job in list_sync_jobs(
+            Some(name.clone()),
+            Some(SyncDirection::Push),
+            Value::Null,
+            rpcenv,
+        )? {
+            delete_sync_job(job.config.id, Some(SyncDirection::Push), None, rpcenv)?
         }
         for job in list_prune_jobs(Some(name.clone()), Value::Null, rpcenv)? {
             delete_prune_job(job.config.id, None, rpcenv)?
diff --git a/src/api2/config/notifications/mod.rs b/src/api2/config/notifications/mod.rs
index dfe82ed03..9622d43ee 100644
--- a/src/api2/config/notifications/mod.rs
+++ b/src/api2/config/notifications/mod.rs
@@ -9,7 +9,7 @@ use proxmox_schema::api;
 use proxmox_sortable_macro::sortable;
 
 use crate::api2::admin::datastore::get_datastore_list;
-use pbs_api_types::PRIV_SYS_AUDIT;
+use pbs_api_types::{SyncDirection, PRIV_SYS_AUDIT};
 
 use crate::api2::admin::prune::list_prune_jobs;
 use crate::api2::admin::sync::list_sync_jobs;
@@ -154,8 +154,16 @@ pub fn get_values(
         });
     }
 
-    let sync_jobs = list_sync_jobs(None, param.clone(), rpcenv)?;
-    for job in sync_jobs {
+    let sync_jobs_pull = list_sync_jobs(None, Some(SyncDirection::Pull), param.clone(), rpcenv)?;
+    for job in sync_jobs_pull {
+        values.push(MatchableValue {
+            field: "job-id".into(),
+            value: job.config.id,
+            comment: job.config.comment,
+        });
+    }
+    let sync_jobs_push = list_sync_jobs(None, Some(SyncDirection::Push), param.clone(), rpcenv)?;
+    for job in sync_jobs_push {
         values.push(MatchableValue {
             field: "job-id".into(),
             value: job.config.id,
@@ -184,6 +192,7 @@ pub fn get_values(
         "package-updates",
         "prune",
         "sync",
+        "sync-push",
         "system-mail",
         "tape-backup",
         "tape-load",
diff --git a/src/api2/config/sync.rs b/src/api2/config/sync.rs
index 6fdc69a9e..a21e0bd6f 100644
--- a/src/api2/config/sync.rs
+++ b/src/api2/config/sync.rs
@@ -1,6 +1,7 @@
 use ::serde::{Deserialize, Serialize};
 use anyhow::{bail, Error};
 use hex::FromHex;
+use pbs_api_types::SyncDirection;
 use serde_json::Value;
 
 use proxmox_router::{http_bail, Permission, Router, RpcEnvironment};
@@ -9,7 +10,7 @@ use proxmox_schema::{api, param_bail};
 use pbs_api_types::{
     Authid, SyncJobConfig, SyncJobConfigUpdater, JOB_ID_SCHEMA, PRIV_DATASTORE_AUDIT,
     PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE, PRIV_REMOTE_AUDIT,
-    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA,
+    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA, SYNC_DIRECTION_SCHEMA,
 };
 use pbs_config::sync;
 
@@ -77,7 +78,12 @@ pub fn check_sync_job_modify_access(
 
 #[api(
     input: {
-        properties: {},
+        properties: {
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
+        },
     },
     returns: {
         description: "List configured jobs.",
@@ -92,6 +98,7 @@ pub fn check_sync_job_modify_access(
 /// List all sync jobs
 pub fn list_sync_jobs(
     _param: Value,
+    sync_direction: Option<SyncDirection>,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<Vec<SyncJobConfig>, Error> {
     let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
@@ -99,7 +106,8 @@ pub fn list_sync_jobs(
 
     let (config, digest) = sync::config()?;
 
-    let list = config.convert_to_typed_array("sync")?;
+    let sync_direction = sync_direction.unwrap_or_default();
+    let list = config.convert_to_typed_array(sync_direction.as_config_type_str())?;
 
     rpcenv["digest"] = hex::encode(digest).into();
 
@@ -118,6 +126,10 @@ pub fn list_sync_jobs(
                 type: SyncJobConfig,
                 flatten: true,
             },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
         },
     },
     access: {
@@ -128,6 +140,7 @@ pub fn list_sync_jobs(
 /// Create a new sync job.
 pub fn create_sync_job(
     config: SyncJobConfig,
+    sync_direction: Option<SyncDirection>,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<(), Error> {
     let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
@@ -158,7 +171,8 @@ pub fn create_sync_job(
         param_bail!("id", "job '{}' already exists.", config.id);
     }
 
-    section_config.set_data(&config.id, "sync", &config)?;
+    let sync_direction = sync_direction.unwrap_or_default();
+    section_config.set_data(&config.id, sync_direction.as_config_type_str(), &config)?;
 
     sync::save_config(&section_config)?;
 
@@ -173,6 +187,10 @@ pub fn create_sync_job(
             id: {
                 schema: JOB_ID_SCHEMA,
             },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
         },
     },
     returns: { type: SyncJobConfig },
@@ -182,13 +200,18 @@ pub fn create_sync_job(
     },
 )]
 /// Read a sync job configuration.
-pub fn read_sync_job(id: String, rpcenv: &mut dyn RpcEnvironment) -> Result<SyncJobConfig, Error> {
+pub fn read_sync_job(
+    id: String,
+    sync_direction: Option<SyncDirection>,
+    rpcenv: &mut dyn RpcEnvironment,
+) -> Result<SyncJobConfig, Error> {
     let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
     let user_info = CachedUserInfo::new()?;
 
     let (config, digest) = sync::config()?;
 
-    let sync_job = config.lookup("sync", &id)?;
+    let sync_direction = sync_direction.unwrap_or_default();
+    let sync_job = config.lookup(sync_direction.as_config_type_str(), &id)?;
     if !check_sync_job_read_access(&user_info, &auth_id, &sync_job) {
         bail!("permission check failed");
     }
@@ -252,6 +275,10 @@ pub enum DeletableProperty {
                     type: DeletableProperty,
                 }
             },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
             digest: {
                 optional: true,
                 schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
@@ -269,6 +296,7 @@ pub fn update_sync_job(
     id: String,
     update: SyncJobConfigUpdater,
     delete: Option<Vec<DeletableProperty>>,
+    sync_direction: Option<SyncDirection>,
     digest: Option<String>,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<(), Error> {
@@ -284,7 +312,8 @@ pub fn update_sync_job(
         crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
     }
 
-    let mut data: SyncJobConfig = config.lookup("sync", &id)?;
+    let sync_direction = sync_direction.unwrap_or_default();
+    let mut data: SyncJobConfig = config.lookup(sync_direction.as_config_type_str(), &id)?;
 
     if let Some(delete) = delete {
         for delete_prop in delete {
@@ -409,7 +438,7 @@ pub fn update_sync_job(
         bail!("permission check failed");
     }
 
-    config.set_data(&id, "sync", &data)?;
+    config.set_data(&id, sync_direction.as_config_type_str(), &data)?;
 
     sync::save_config(&config)?;
 
@@ -427,6 +456,10 @@ pub fn update_sync_job(
             id: {
                 schema: JOB_ID_SCHEMA,
             },
+            "sync-direction": {
+                schema: SYNC_DIRECTION_SCHEMA,
+                optional: true,
+            },
             digest: {
                 optional: true,
                 schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
@@ -441,6 +474,7 @@ pub fn update_sync_job(
 /// Remove a sync job configuration
 pub fn delete_sync_job(
     id: String,
+    sync_direction: Option<SyncDirection>,
     digest: Option<String>,
     rpcenv: &mut dyn RpcEnvironment,
 ) -> Result<(), Error> {
@@ -456,7 +490,8 @@ pub fn delete_sync_job(
         crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
     }
 
-    match config.lookup("sync", &id) {
+    let sync_direction = sync_direction.unwrap_or_default();
+    match config.lookup(sync_direction.as_config_type_str(), &id) {
         Ok(job) => {
             if !check_sync_job_modify_access(&user_info, &auth_id, &job) {
                 bail!("permission check failed");
diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
index 4409234b2..2b6f1c133 100644
--- a/src/bin/proxmox-backup-proxy.rs
+++ b/src/bin/proxmox-backup-proxy.rs
@@ -608,7 +608,15 @@ async fn schedule_datastore_sync_jobs() {
         Ok((config, _digest)) => config,
     };
 
-    for (job_id, (_, job_config)) in config.sections {
+    for (job_id, (job_type, job_config)) in config.sections {
+        let sync_direction = match job_type.as_str() {
+            "sync" => SyncDirection::Pull,
+            "sync-push" => SyncDirection::Push,
+            _ => {
+                eprintln!("unexpected config type in sync job config - {job_type}");
+                continue;
+            }
+        };
         let job_config: SyncJobConfig = match serde_json::from_value(job_config) {
             Ok(c) => c,
             Err(err) => {
@@ -635,7 +643,7 @@ async fn schedule_datastore_sync_jobs() {
                 job_config,
                 &auth_id,
                 Some(event_str),
-                SyncDirection::Pull,
+                sync_direction,
                 false,
             ) {
                 eprintln!("unable to start datastore sync job {job_id} - {err}");
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 20/33] api: sync: add permission checks for push sync jobs
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (18 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 21/33] bin: manager: add datastore push cli command Christian Ebner
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

For sync jobs in push direction, permissions to modify and prune the
snapshots on the remote datastore are required as well, in contrast to
sync jobs in pull direction.

Add the additional permission checks to be performed on the local
instance before attempting to operate on the remote.
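
The following sketch condenses the privilege combination checked for
push jobs, assuming the `PRIV_REMOTE_DATASTORE_BACKUP` and
`PRIV_REMOTE_DATASTORE_PRUNE` bitflag constants from pbs-api-types;
the helper name is illustrative only:

```
use pbs_api_types::{PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_PRUNE};

// Backup privileges on the remote datastore ACL path are always required,
// prune privileges only if remove-vanished is enabled for the job.
fn remote_push_access_ok(remote_privs: u64, remove_vanished: bool) -> bool {
    if remove_vanished && (remote_privs & PRIV_REMOTE_DATASTORE_PRUNE) == 0 {
        return false;
    }
    (remote_privs & PRIV_REMOTE_DATASTORE_BACKUP) != 0
}
```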

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- not present in previous version

 src/api2/admin/sync.rs  | 18 +++++++++++++++---
 src/api2/config/sync.rs | 33 ++++++++++++++++++++++++++++++++-
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/src/api2/admin/sync.rs b/src/api2/admin/sync.rs
index bdbc06a8e..0fad10d0c 100644
--- a/src/api2/admin/sync.rs
+++ b/src/api2/admin/sync.rs
@@ -18,7 +18,10 @@ use pbs_config::sync;
 use pbs_config::CachedUserInfo;
 
 use crate::{
-    api2::config::sync::{check_sync_job_modify_access, check_sync_job_read_access},
+    api2::config::sync::{
+        check_sync_job_modify_access, check_sync_job_read_access,
+        check_sync_job_remote_datastore_backup_access,
+    },
     server::jobstate::{compute_schedule_status, Job, JobState},
     server::sync::do_sync_job,
 };
@@ -121,8 +124,17 @@ pub fn run_sync_job(
     let sync_direction = sync_direction.unwrap_or_default();
     let sync_job: SyncJobConfig = config.lookup(sync_direction.as_config_type_str(), &id)?;
 
-    if !check_sync_job_modify_access(&user_info, &auth_id, &sync_job) {
-        bail!("permission check failed");
+    match sync_direction {
+        SyncDirection::Pull => {
+            if !check_sync_job_modify_access(&user_info, &auth_id, &sync_job) {
+                bail!("permission check failed, '{auth_id}' is missing access on datastore");
+            }
+        }
+        SyncDirection::Push => {
+            if !check_sync_job_remote_datastore_backup_access(&user_info, &auth_id, &sync_job) {
+                bail!("permission check failed, '{auth_id}' is missing access on remote");
+            }
+        }
     }
 
     let job = Job::new("syncjob", &id)?;
diff --git a/src/api2/config/sync.rs b/src/api2/config/sync.rs
index a21e0bd6f..5035df8c9 100644
--- a/src/api2/config/sync.rs
+++ b/src/api2/config/sync.rs
@@ -10,7 +10,8 @@ use proxmox_schema::{api, param_bail};
 use pbs_api_types::{
     Authid, SyncJobConfig, SyncJobConfigUpdater, JOB_ID_SCHEMA, PRIV_DATASTORE_AUDIT,
     PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE, PRIV_REMOTE_AUDIT,
-    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA, SYNC_DIRECTION_SCHEMA,
+    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_PRUNE, PRIV_REMOTE_READ,
+    PROXMOX_CONFIG_DIGEST_SCHEMA, SYNC_DIRECTION_SCHEMA,
 };
 use pbs_config::sync;
 
@@ -76,6 +77,36 @@ pub fn check_sync_job_modify_access(
     true
 }
 
+/// Check user privileges required to push contents to a remote datastore.
+pub fn check_sync_job_remote_datastore_backup_access(
+    user_info: &CachedUserInfo,
+    auth_id: &Authid,
+    job: &SyncJobConfig,
+) -> bool {
+    if let Some(remote) = &job.remote {
+        let mut acl_path = vec!["remote", remote, &job.remote_store];
+
+        if let Some(namespace) = job.remote_ns.as_ref() {
+            if !namespace.is_root() {
+                let ns_components: Vec<&str> = namespace.components().collect();
+                acl_path.extend(ns_components);
+            }
+        }
+
+        let remote_privs = user_info.lookup_privs(auth_id, &acl_path);
+
+        if let Some(true) = job.remove_vanished {
+            if remote_privs & PRIV_REMOTE_DATASTORE_PRUNE == 0 {
+                return false;
+            }
+        }
+
+        return remote_privs & PRIV_REMOTE_DATASTORE_BACKUP != 0;
+    }
+
+    false
+}
+
 #[api(
     input: {
         properties: {
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 21/33] bin: manager: add datastore push cli command
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (19 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 20/33] api: sync: add permission checks for push sync jobs Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 22/33] ui: group filter: allow to set namespace for local datastore Christian Ebner
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Expose the push api endpoint to be callable via the command line
interface.
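
With this in place, a push can be triggered analogously to the existing
pull command, e.g. `proxmox-backup-manager push <store> <remote>
<remote-store>` (argument order as configured via `arg_param`,
placeholder names for illustration only).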

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- fix incorrect comment (push not pull)

 src/bin/proxmox-backup-manager.rs | 216 +++++++++++++++++++++++-------
 1 file changed, 169 insertions(+), 47 deletions(-)

diff --git a/src/bin/proxmox-backup-manager.rs b/src/bin/proxmox-backup-manager.rs
index 420e96665..f91d5bf29 100644
--- a/src/bin/proxmox-backup-manager.rs
+++ b/src/bin/proxmox-backup-manager.rs
@@ -12,7 +12,7 @@ use proxmox_sys::fs::CreateOptions;
 
 use pbs_api_types::percent_encoding::percent_encode_component;
 use pbs_api_types::{
-    BackupNamespace, GroupFilter, RateLimitConfig, SyncJobConfig, DATASTORE_SCHEMA,
+    BackupNamespace, GroupFilter, RateLimitConfig, SyncDirection, SyncJobConfig, DATASTORE_SCHEMA,
     GROUP_FILTER_LIST_SCHEMA, IGNORE_VERIFIED_BACKUPS_SCHEMA, NS_MAX_DEPTH_SCHEMA,
     REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA, TRANSFER_LAST_SCHEMA, UPID_SCHEMA,
     VERIFICATION_OUTDATED_AFTER_SCHEMA,
@@ -294,6 +294,72 @@ fn task_mgmt_cli() -> CommandLineInterface {
     cmd_def.into()
 }
 
+/// Sync datastore by pulling from or pushing to another repository
+#[allow(clippy::too_many_arguments)]
+async fn sync_datastore(
+    remote: String,
+    remote_store: String,
+    remote_ns: Option<BackupNamespace>,
+    store: String,
+    ns: Option<BackupNamespace>,
+    remove_vanished: Option<bool>,
+    max_depth: Option<usize>,
+    group_filter: Option<Vec<GroupFilter>>,
+    limit: RateLimitConfig,
+    transfer_last: Option<usize>,
+    param: Value,
+    sync_direction: SyncDirection,
+) -> Result<Value, Error> {
+    let output_format = get_output_format(&param);
+
+    let client = connect_to_localhost()?;
+    let mut args = json!({
+        "store": store,
+        "remote": remote,
+        "remote-store": remote_store,
+    });
+
+    if remote_ns.is_some() {
+        args["remote-ns"] = json!(remote_ns);
+    }
+
+    if ns.is_some() {
+        args["ns"] = json!(ns);
+    }
+
+    if max_depth.is_some() {
+        args["max-depth"] = json!(max_depth);
+    }
+
+    if group_filter.is_some() {
+        args["group-filter"] = json!(group_filter);
+    }
+
+    if let Some(remove_vanished) = remove_vanished {
+        args["remove-vanished"] = Value::from(remove_vanished);
+    }
+
+    if transfer_last.is_some() {
+        args["transfer-last"] = json!(transfer_last)
+    }
+
+    let mut limit_json = json!(limit);
+    let limit_map = limit_json
+        .as_object_mut()
+        .ok_or_else(|| format_err!("limit is not an Object"))?;
+
+    args.as_object_mut().unwrap().append(limit_map);
+
+    let result = match sync_direction {
+        SyncDirection::Pull => client.post("api2/json/pull", Some(args)).await?,
+        SyncDirection::Push => client.post("api2/json/push", Some(args)).await?,
+    };
+
+    view_task_result(&client, result, &output_format).await?;
+
+    Ok(Value::Null)
+}
+
 // fixme: avoid API redefinition
 #[api(
    input: {
@@ -342,7 +408,7 @@ fn task_mgmt_cli() -> CommandLineInterface {
         }
    }
 )]
-/// Sync datastore from another repository
+/// Sync datastore by pulling from another repository
 #[allow(clippy::too_many_arguments)]
 async fn pull_datastore(
     remote: String,
@@ -357,52 +423,100 @@ async fn pull_datastore(
     transfer_last: Option<usize>,
     param: Value,
 ) -> Result<Value, Error> {
-    let output_format = get_output_format(&param);
-
-    let client = connect_to_localhost()?;
-
-    let mut args = json!({
-        "store": store,
-        "remote": remote,
-        "remote-store": remote_store,
-    });
-
-    if remote_ns.is_some() {
-        args["remote-ns"] = json!(remote_ns);
-    }
-
-    if ns.is_some() {
-        args["ns"] = json!(ns);
-    }
-
-    if max_depth.is_some() {
-        args["max-depth"] = json!(max_depth);
-    }
-
-    if group_filter.is_some() {
-        args["group-filter"] = json!(group_filter);
-    }
-
-    if let Some(remove_vanished) = remove_vanished {
-        args["remove-vanished"] = Value::from(remove_vanished);
-    }
-
-    if transfer_last.is_some() {
-        args["transfer-last"] = json!(transfer_last)
-    }
-
-    let mut limit_json = json!(limit);
-    let limit_map = limit_json
-        .as_object_mut()
-        .ok_or_else(|| format_err!("limit is not an Object"))?;
-
-    args.as_object_mut().unwrap().append(limit_map);
-
-    let result = client.post("api2/json/pull", Some(args)).await?;
-
-    view_task_result(&client, result, &output_format).await?;
+    sync_datastore(
+        remote,
+        remote_store,
+        remote_ns,
+        store,
+        ns,
+        remove_vanished,
+        max_depth,
+        group_filter,
+        limit,
+        transfer_last,
+        param,
+        SyncDirection::Pull,
+    )
+    .await
+}
 
-    Ok(Value::Null)
+#[api(
+   input: {
+        properties: {
+            "store": {
+                schema: DATASTORE_SCHEMA,
+            },
+            "ns": {
+                type: BackupNamespace,
+                optional: true,
+            },
+            remote: {
+                schema: REMOTE_ID_SCHEMA,
+            },
+            "remote-store": {
+                schema: DATASTORE_SCHEMA,
+            },
+            "remote-ns": {
+                type: BackupNamespace,
+                optional: true,
+            },
+            "remove-vanished": {
+                schema: REMOVE_VANISHED_BACKUPS_SCHEMA,
+                optional: true,
+            },
+            "max-depth": {
+                schema: NS_MAX_DEPTH_SCHEMA,
+                optional: true,
+            },
+            "group-filter": {
+                schema: GROUP_FILTER_LIST_SCHEMA,
+                optional: true,
+            },
+            limit: {
+                type: RateLimitConfig,
+                flatten: true,
+            },
+            "output-format": {
+                schema: OUTPUT_FORMAT,
+                optional: true,
+            },
+            "transfer-last": {
+                schema: TRANSFER_LAST_SCHEMA,
+                optional: true,
+            },
+        }
+   }
+)]
+/// Sync datastore by pushing to another repository
+#[allow(clippy::too_many_arguments)]
+async fn push_datastore(
+    remote: String,
+    remote_store: String,
+    remote_ns: Option<BackupNamespace>,
+    store: String,
+    ns: Option<BackupNamespace>,
+    remove_vanished: Option<bool>,
+    max_depth: Option<usize>,
+    group_filter: Option<Vec<GroupFilter>>,
+    limit: RateLimitConfig,
+    transfer_last: Option<usize>,
+    param: Value,
+) -> Result<Value, Error> {
+    sync_datastore(
+        remote,
+        remote_store,
+        remote_ns,
+        store,
+        ns,
+        remove_vanished,
+        max_depth,
+        group_filter,
+        limit,
+        transfer_last,
+        param,
+        SyncDirection::Push,
+    )
+    .await
 }
 
 #[api(
@@ -528,6 +642,14 @@ async fn run() -> Result<(), Error> {
                 .completion_cb("group-filter", complete_remote_datastore_group_filter)
                 .completion_cb("remote-ns", complete_remote_datastore_namespace),
         )
+        .insert(
+            "push",
+            CliCommand::new(&API_METHOD_PUSH_DATASTORE)
+                .arg_param(&["store", "remote", "remote-store"])
+                .completion_cb("store", pbs_config::datastore::complete_datastore_name)
+                .completion_cb("remote", pbs_config::remote::complete_remote_name)
+                .completion_cb("remote-store", complete_remote_datastore_name),
+        )
         .insert(
             "verify",
             CliCommand::new(&API_METHOD_VERIFY)
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 22/33] ui: group filter: allow to set namespace for local datastore
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (20 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 21/33] bin: manager: add datastore push cli command Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 23/33] ui: sync edit: source group filters based on sync direction Christian Ebner
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

The namespace has to be set in order to get the correct backup groups
to offer as group filter options when a local datastore is used as
source, which is required for sync jobs in push direction.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/form/GroupFilter.js | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/www/form/GroupFilter.js b/www/form/GroupFilter.js
index c9c2d913e..7275b00ed 100644
--- a/www/form/GroupFilter.js
+++ b/www/form/GroupFilter.js
@@ -258,7 +258,11 @@ Ext.define('PBS.form.GroupFilter', {
 	    return;
 	}
 	if (me.namespace) {
-	    url += `?namespace=${me.namespace}`;
+	    if (me.remote) {
+		url += `?namespace=${me.namespace}`;
+	    } else {
+		url += `?ns=${me.namespace}`;
+	    }
 	}
 	me.setDsStoreUrl(url);
 	me.dsStore.load({
@@ -279,6 +283,18 @@ Ext.define('PBS.form.GroupFilter', {
 	}
 	me.remote = undefined;
 	me.datastore = datastore;
+	me.namespace = undefined;
+	me.updateGroupSelectors();
+    },
+
+    setLocalNamespace: function(datastore, namespace) {
+	let me = this;
+	if (me.datastore === datastore && me.namespace === namespace) {
+	    return;
+	}
+	me.remote = undefined;
+	me.datastore = datastore;
+	me.namespace = namespace;
 	me.updateGroupSelectors();
     },
 
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 23/33] ui: sync edit: source group filters based on sync direction
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (21 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 22/33] ui: group filter: allow to set namespace for local datastore Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 24/33] ui: add view with separate grids for pull and push sync jobs Christian Ebner
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Switch to the local datastore, used as sync source for jobs in push
direction, to get the available group filter options.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/window/SyncJobEdit.js | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/www/window/SyncJobEdit.js b/www/window/SyncJobEdit.js
index 6543995e8..9ca79eaa9 100644
--- a/www/window/SyncJobEdit.js
+++ b/www/window/SyncJobEdit.js
@@ -238,7 +238,13 @@ Ext.define('PBS.window.SyncJobEdit', {
 				let remoteNamespaceField = me.up('pbsSyncJobEdit').down('field[name=remote-ns]');
 				remoteNamespaceField.setRemote(remote);
 				remoteNamespaceField.setRemoteStore(value);
-				me.up('tabpanel').down('pbsGroupFilter').setRemoteDatastore(remote, value);
+
+				if (!me.syncDirectionPush) {
+				    me.up('tabpanel').down('pbsGroupFilter').setRemoteDatastore(remote, value);
+				} else {
+				    let localStore = me.up('pbsSyncJobEdit').down('field[name=store]').getValue();
+				    me.up('tabpanel').down('pbsGroupFilter').setLocalDatastore(localStore);
+				}
 			    },
 			},
 		    },
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 24/33] ui: add view with separate grids for pull and push sync jobs
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (22 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 23/33] ui: sync edit: source group filters based on sync direction Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 25/33] ui: sync job: adapt edit window to be used for pull and push Christian Ebner
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Show sync jobs in pull and in push direction in two separate grids,
visually separating them to limit possible misconfiguration.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/Makefile                   |  1 +
 www/config/SyncPullPushView.js | 60 ++++++++++++++++++++++++++++++++++
 www/config/SyncView.js         | 11 ++++++-
 www/datastore/DataStoreList.js |  2 +-
 www/datastore/Panel.js         |  2 +-
 5 files changed, 73 insertions(+), 3 deletions(-)
 create mode 100644 www/config/SyncPullPushView.js

diff --git a/www/Makefile b/www/Makefile
index 609a0ba67..d35e81283 100644
--- a/www/Makefile
+++ b/www/Makefile
@@ -61,6 +61,7 @@ JSSRC=							\
 	config/TrafficControlView.js			\
 	config/ACLView.js				\
 	config/SyncView.js				\
+	config/SyncPullPushView.js			\
 	config/VerifyView.js				\
 	config/PruneView.js				\
 	config/GCView.js				\
diff --git a/www/config/SyncPullPushView.js b/www/config/SyncPullPushView.js
new file mode 100644
index 000000000..1588207c9
--- /dev/null
+++ b/www/config/SyncPullPushView.js
@@ -0,0 +1,60 @@
+Ext.define('PBS.config.SyncPullPush', {
+    extend: 'Ext.panel.Panel',
+    alias: 'widget.pbsSyncJobPullPushView',
+    title: gettext('Sync Jobs'),
+
+    mixins: ['Proxmox.Mixin.CBind'],
+
+    layout: {
+	type: 'vbox',
+	align: 'stretch',
+	multi: true,
+    },
+    defaults: {
+	collapsible: false,
+	margin: '7 10 3 10',
+    },
+    scrollable: true,
+    items: [
+	{
+	    xtype: 'pbsSyncJobView',
+	    itemId: 'syncJobsPull',
+	    syncDirection: 'pull',
+	    cbind: {
+		datastore: '{datastore}',
+	    },
+	    minHeight: 125, // shows at least one line of content
+	},
+	{
+	    xtype: 'splitter',
+	    performCollapse: false,
+	},
+	{
+	    xtype: 'pbsSyncJobView',
+	    itemId: 'syncJobsPush',
+	    syncDirection: 'push',
+	    cbind: {
+		datastore: '{datastore}',
+	    },
+	    flex: 1,
+	    minHeight: 160, // shows at least one line of content
+	},
+    ],
+    initComponent: function() {
+	let me = this;
+
+	let subPanelIds = me.items.map(el => el.itemId).filter(id => !!id);
+
+	me.callParent();
+
+	for (const itemId of subPanelIds) {
+	    let component = me.getComponent(itemId);
+	    component.relayEvents(me, ['activate', 'deactivate', 'destroy']);
+	}
+    },
+
+    cbindData: function(initialConfig) {
+        let me = this;
+        me.datastore = initialConfig.datastore ? initialConfig.datastore : undefined;
+    },
+});
diff --git a/www/config/SyncView.js b/www/config/SyncView.js
index 4669a23e2..68a147615 100644
--- a/www/config/SyncView.js
+++ b/www/config/SyncView.js
@@ -29,7 +29,7 @@ Ext.define('PBS.config.SyncJobView', {
     stateful: true,
     stateId: 'grid-sync-jobs-v1',
 
-    title: gettext('Sync Jobs'),
+    title: gettext('Sync Jobs - Pull Direction'),
 
     controller: {
 	xclass: 'Ext.app.ViewController',
@@ -39,6 +39,7 @@ Ext.define('PBS.config.SyncJobView', {
 	    let view = me.getView();
             Ext.create('PBS.window.SyncJobEdit', {
 		datastore: view.datastore,
+		syncDirection: view.syncDirection,
 		listeners: {
 		    destroy: function() {
 			me.reload();
@@ -55,6 +56,7 @@ Ext.define('PBS.config.SyncJobView', {
 
             Ext.create('PBS.window.SyncJobEdit', {
 		datastore: view.datastore,
+		syncDirection: view.syncDirection,
                 id: selection[0].data.id,
 		listeners: {
 		    destroy: function() {
@@ -117,6 +119,9 @@ Ext.define('PBS.config.SyncJobView', {
 	    if (view.datastore !== undefined) {
 		params.store = view.datastore;
 	    }
+	    if (view.syncDirection !== undefined) {
+		params["sync-direction"] = view.syncDirection;
+	    }
 	    view.getStore().rstore.getProxy().setExtraParams(params);
 	    Proxmox.Utils.monStoreErrors(view, view.getStore().rstore);
 	},
@@ -303,6 +308,10 @@ Ext.define('PBS.config.SyncJobView', {
 	    }
 	}
 
+	if (me.syncDirection === 'push') {
+	    me.title = gettext('Sync Jobs - Push Direction');
+	}
+
 	me.callParent();
     },
 });
diff --git a/www/datastore/DataStoreList.js b/www/datastore/DataStoreList.js
index fc68cfc10..22ef18540 100644
--- a/www/datastore/DataStoreList.js
+++ b/www/datastore/DataStoreList.js
@@ -239,7 +239,7 @@ Ext.define('PBS.datastore.DataStores', {
 	{
 	    iconCls: 'fa fa-refresh',
 	    itemId: 'syncjobs',
-	    xtype: 'pbsSyncJobView',
+	    xtype: 'pbsSyncJobPullPushView',
 	},
 	{
 	    iconCls: 'fa fa-check-circle',
diff --git a/www/datastore/Panel.js b/www/datastore/Panel.js
index ad9fc10fe..e1da7cfac 100644
--- a/www/datastore/Panel.js
+++ b/www/datastore/Panel.js
@@ -68,7 +68,7 @@ Ext.define('PBS.DataStorePanel', {
 	{
 	    iconCls: 'fa fa-refresh',
 	    itemId: 'syncjobs',
-	    xtype: 'pbsSyncJobView',
+	    xtype: 'pbsSyncJobPullPushView',
 	    cbind: {
 		datastore: '{datastore}',
 	    },
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 25/33] ui: sync job: adapt edit window to be used for pull and push
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (23 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 24/33] ui: add view with separate grids for pull and push sync jobs Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 26/33] ui: sync: pass sync-direction to allow removing push jobs Christian Ebner
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Switch the subject and labels to be shown based on the direction of
the sync job, and set the `sync-direction` parameter in the
submit values in case of push direction.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- switch owner label based on jobs sync direction

 www/window/SyncJobEdit.js | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/www/window/SyncJobEdit.js b/www/window/SyncJobEdit.js
index 9ca79eaa9..b3aa06057 100644
--- a/www/window/SyncJobEdit.js
+++ b/www/window/SyncJobEdit.js
@@ -9,7 +9,7 @@ Ext.define('PBS.window.SyncJobEdit', {
 
     isAdd: true,
 
-    subject: gettext('Sync Job'),
+    subject: gettext('Sync Job - Pull Direction'),
 
     bodyPadding: 0,
 
@@ -29,6 +29,26 @@ Ext.define('PBS.window.SyncJobEdit', {
 	me.scheduleValue = id ? null : 'hourly';
 	me.authid = id ? null : Proxmox.UserName;
 	me.editDatastore = me.datastore === undefined && me.isCreate;
+
+	if (me.syncDirection === 'push') {
+	    me.subject = gettext('Sync Job - Push Direction');
+	    me.syncDirectionPush = true;
+	    me.syncRemoteLabel = gettext('Target Remote');
+	    me.syncRemoteDatastore = gettext('Target Datastore');
+	    me.syncRemoteNamespace = gettext('Target Namespace');
+	    me.syncLocalOwner = gettext('Local User');
+	    me.extraRequestParams = {
+		"sync-direction": 'push',
+	    };
+	} else {
+	    me.subject = gettext('Sync Job - Pull Direction');
+	    me.syncDirectionPush = false;
+	    me.syncRemoteLabel = gettext('Source Remote');
+	    me.syncRemoteDatastore = gettext('Source Datastore');
+	    me.syncRemoteNamespace = gettext('Source Namespace');
+	    me.syncLocalOwner = gettext('Local Owner');
+	}
+
 	return { };
     },
 
@@ -118,10 +138,10 @@ Ext.define('PBS.window.SyncJobEdit', {
 			},
 		    },
 		    {
-			fieldLabel: gettext('Local Owner'),
 			xtype: 'pbsAuthidSelector',
 			name: 'owner',
 			cbind: {
+			    fieldLabel: '{syncLocalOwner}',
 			    value: '{authid}',
 			    deleteEmpty: '{!isCreate}',
 			},
@@ -151,6 +171,9 @@ Ext.define('PBS.window.SyncJobEdit', {
 			xtype: 'radiogroup',
 			fieldLabel: gettext('Location'),
 			defaultType: 'radiofield',
+			cbind: {
+			    disabled: '{syncDirectionPush}',
+			},
 			items: [
 			    {
 				boxLabel: 'Local',
@@ -201,7 +224,9 @@ Ext.define('PBS.window.SyncJobEdit', {
 			},
 		    },
 		    {
-			fieldLabel: gettext('Source Remote'),
+			cbind: {
+			    fieldLabel: '{syncRemoteLabel}',
+			},
 			xtype: 'pbsRemoteSelector',
 			allowBlank: false,
 			name: 'remote',
@@ -222,13 +247,13 @@ Ext.define('PBS.window.SyncJobEdit', {
 			},
 		    },
 		    {
-			fieldLabel: gettext('Source Datastore'),
 			xtype: 'pbsRemoteStoreSelector',
 			allowBlank: false,
 			autoSelect: false,
 			name: 'remote-store',
 			cbind: {
 			    datastore: '{datastore}',
+			    fieldLabel: '{syncRemoteDatastore}',
 			},
 			listeners: {
 			    change: function(field, value) {
@@ -249,7 +274,9 @@ Ext.define('PBS.window.SyncJobEdit', {
 			},
 		    },
 		    {
-			fieldLabel: gettext('Source Namespace'),
+			cbind: {
+			    fieldLabel: '{syncRemoteNamespace}',
+			},
 			xtype: 'pbsRemoteNamespaceSelector',
 			allowBlank: true,
 			autoSelect: false,
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 26/33] ui: sync: pass sync-direction to allow removing push jobs
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (24 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 25/33] ui: sync job: adapt edit window to be used for pull and push Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 27/33] ui: sync view: do not use data model proxy for store Christian Ebner
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Without the `sync-direction` parameter set, the job will not be
found in the config, because the `sync` config type is used instead
of the correct `sync-push` for sync jobs in push direction.
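
In other words, the view now passes `sync-direction=push` along with
the DELETE request for `/config/sync/<id>`, so the backend can look up
the job under the `sync-push` config type (placeholder id for
illustration only).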

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/config/SyncView.js | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/www/config/SyncView.js b/www/config/SyncView.js
index 68a147615..981b9b251 100644
--- a/www/config/SyncView.js
+++ b/www/config/SyncView.js
@@ -104,6 +104,26 @@ Ext.define('PBS.config.SyncJobView', {
 	    });
 	},
 
+	removeSyncJob: function(btn, event, rec) {
+	    let me = this;
+	    let view = me.getView();
+	    let params = {};
+	    if (view.syncDirection !== undefined) {
+		params["sync-direction"] = view.syncDirection;
+	    }
+	    Proxmox.Utils.API2Request({
+		url: '/config/sync/' + rec.getId(),
+		method: 'DELETE',
+		params: params,
+		callback: function(options, success, response) {
+		    Ext.callback(me.callback, me.scope, [options, success, response, 0, me]);
+		},
+		failure: function(response, opt) {
+		    Ext.Msg.alert(gettext('Error'), response.htmlStatus);
+		},
+	    });
+	},
+
 	render_optional_owner: function(value, metadata, record) {
 	    if (!value) return '-';
 	    return Ext.String.htmlEncode(value);
@@ -161,7 +181,7 @@ Ext.define('PBS.config.SyncJobView', {
 	},
 	{
 	    xtype: 'proxmoxStdRemoveButton',
-	    baseurl: '/config/sync/',
+	    handler: 'removeSyncJob',
 	    confirmMsg: gettext('Remove entry?'),
 	    callback: 'reload',
 	},
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 27/33] ui: sync view: do not use data model proxy for store
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (25 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 26/33] ui: sync: pass sync-direction to allow removing push jobs Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 28/33] ui: sync view: set sync direction when invoking run task via api Christian Ebner
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

In order to load data using the same model from different sources,
set the proxy on the store instead of the model.
This allows the view to display sync jobs in either pull or push
direction by setting the additional `sync-direction` parameter on the
proxy's api calls.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/config/SyncView.js | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/www/config/SyncView.js b/www/config/SyncView.js
index 981b9b251..39e464fc5 100644
--- a/www/config/SyncView.js
+++ b/www/config/SyncView.js
@@ -16,10 +16,6 @@ Ext.define('pbs-sync-jobs-status', {
 	'comment',
     ],
     idProperty: 'id',
-    proxy: {
-	type: 'proxmox',
-	url: '/api2/json/admin/sync',
-    },
 });
 
 Ext.define('PBS.config.SyncJobView', {
@@ -160,9 +156,12 @@ Ext.define('PBS.config.SyncJobView', {
 	sorters: 'id',
 	rstore: {
 	    type: 'update',
-	    storeid: 'pbs-sync-jobs-status',
 	    model: 'pbs-sync-jobs-status',
 	    interval: 5000,
+	    proxy: {
+		type: 'proxmox',
+		url: '/api2/json/admin/sync',
+	    },
 	},
     },
 
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 28/33] ui: sync view: set sync direction when invoking run task via api
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (26 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 27/33] ui: sync view: do not use data model proxy for store Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 29/33] datastore: move `BackupGroupDeleteStats` to api types Christian Ebner
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Set the correct sync direction for the task to be executed.
Otherwise, the task with the given id cannot be found, as the config
lookup requires the correct config type, which is derived from the
sync direction.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 www/config/SyncView.js | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/www/config/SyncView.js b/www/config/SyncView.js
index 39e464fc5..bf0c7e8c0 100644
--- a/www/config/SyncView.js
+++ b/www/config/SyncView.js
@@ -83,9 +83,14 @@ Ext.define('PBS.config.SyncJobView', {
 	    if (selection.length < 1) return;
 
 	    let id = selection[0].data.id;
+	    let params = {};
+	    if (view.syncDirection !== undefined) {
+		params["sync-direction"] = view.syncDirection;
+	    }
 	    Proxmox.Utils.API2Request({
 		method: 'POST',
 		url: `/admin/sync/${id}/run`,
+		params: params,
 		success: function(response, opt) {
 		    Ext.create('Proxmox.window.TaskViewer', {
 		        upid: response.result.data,
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 29/33] datastore: move `BackupGroupDeleteStats` to api types
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (27 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 28/33] ui: sync view: set sync direction when invoking run task via api Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 30/33] api types: implement api type for `BackupGroupDeleteStats` Christian Ebner
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Move the struct in preparation for exposing the delete stats as the
return type of the backup group delete api endpoint.

Also, rename the private field `unremoved_protected` to the better
fitting `protected_snapshots` to be in line with the method names.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- rename `unremoved_protected` private field to more fitting
  `protected_snapshots`

 pbs-api-types/src/datastore.rs   | 30 +++++++++++++++++++++++++++++
 pbs-datastore/src/backup_info.rs | 33 ++------------------------------
 pbs-datastore/src/datastore.rs   |  7 ++++---
 3 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 31767417a..c148d5dca 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -1569,3 +1569,33 @@ pub fn print_store_and_ns(store: &str, ns: &BackupNamespace) -> String {
         format!("datastore '{}', namespace '{}'", store, ns)
     }
 }
+
+#[derive(Default)]
+pub struct BackupGroupDeleteStats {
+    // Count of protected snapshots, therefore not removed
+    protected_snapshots: usize,
+    // Count of deleted snapshots
+    removed_snapshots: usize,
+}
+
+impl BackupGroupDeleteStats {
+    pub fn all_removed(&self) -> bool {
+        self.protected_snapshots == 0
+    }
+
+    pub fn removed_snapshots(&self) -> usize {
+        self.removed_snapshots
+    }
+
+    pub fn protected_snapshots(&self) -> usize {
+        self.protected_snapshots
+    }
+
+    pub fn increment_removed_snapshots(&mut self) {
+        self.removed_snapshots += 1;
+    }
+
+    pub fn increment_protected_snapshots(&mut self) {
+        self.protected_snapshots += 1;
+    }
+}
diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 414ec878d..222134074 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -8,7 +8,8 @@ use anyhow::{bail, format_err, Error};
 use proxmox_sys::fs::{lock_dir_noblock, replace_file, CreateOptions};
 
 use pbs_api_types::{
-    Authid, BackupNamespace, BackupType, GroupFilter, BACKUP_DATE_REGEX, BACKUP_FILE_REGEX,
+    Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, GroupFilter, BACKUP_DATE_REGEX,
+    BACKUP_FILE_REGEX,
 };
 use pbs_config::{open_backup_lockfile, BackupLockGuard};
 
@@ -17,36 +18,6 @@ use crate::manifest::{
 };
 use crate::{DataBlob, DataStore};
 
-#[derive(Default)]
-pub struct BackupGroupDeleteStats {
-    // Count of protected snapshots, therefore not removed
-    unremoved_protected: usize,
-    // Count of deleted snapshots
-    removed_snapshots: usize,
-}
-
-impl BackupGroupDeleteStats {
-    pub fn all_removed(&self) -> bool {
-        self.unremoved_protected == 0
-    }
-
-    pub fn removed_snapshots(&self) -> usize {
-        self.removed_snapshots
-    }
-
-    pub fn protected_snapshots(&self) -> usize {
-        self.unremoved_protected
-    }
-
-    fn increment_removed_snapshots(&mut self) {
-        self.removed_snapshots += 1;
-    }
-
-    fn increment_protected_snapshots(&mut self) {
-        self.unremoved_protected += 1;
-    }
-}
-
 /// BackupGroup is a directory containing a list of BackupDir
 #[derive(Clone)]
 pub struct BackupGroup {
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index d0f3c53ac..c8701d2dd 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -18,11 +18,12 @@ use proxmox_sys::process_locker::ProcessLockSharedGuard;
 use proxmox_worker_task::WorkerTaskContext;
 
 use pbs_api_types::{
-    Authid, BackupNamespace, BackupType, ChunkOrder, DataStoreConfig, DatastoreFSyncLevel,
-    DatastoreTuning, GarbageCollectionStatus, MaintenanceMode, MaintenanceType, Operation, UPID,
+    Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, ChunkOrder, DataStoreConfig,
+    DatastoreFSyncLevel, DatastoreTuning, GarbageCollectionStatus, MaintenanceMode,
+    MaintenanceType, Operation, UPID,
 };
 
-use crate::backup_info::{BackupDir, BackupGroup, BackupGroupDeleteStats};
+use crate::backup_info::{BackupDir, BackupGroup};
 use crate::chunk_store::ChunkStore;
 use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
 use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 30/33] api types: implement api type for `BackupGroupDeleteStats`
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (28 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 29/33] datastore: move `BackupGroupDeleteStats` to api types Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 31/33] datastore: increment deleted group counter when removing group Christian Ebner
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Make `BackupGroupDeleteStats` exposable via the API by implementing
the ApiType trait through the api macro invocation, and add an
additional field to account for the number of deleted groups.
Further, add a method to add up the statistics.
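
As a minimal usage sketch (function and variable names are illustrative
only), the new helper allows accumulating per-namespace statistics into
a total:

```
use pbs_api_types::BackupGroupDeleteStats;

// Sum up the delete statistics gathered for each namespace.
fn accumulate(per_namespace: &[BackupGroupDeleteStats]) -> BackupGroupDeleteStats {
    let mut total = BackupGroupDeleteStats::default();
    for stats in per_namespace {
        total.add(stats);
    }
    total
}
```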

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- rename `unremoved_protected` private field to more fitting
  `protected_snapshots` to be in line with patch 29.

 pbs-api-types/src/datastore.rs | 36 +++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index c148d5dca..a32a326be 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -1570,8 +1570,28 @@ pub fn print_store_and_ns(store: &str, ns: &BackupNamespace) -> String {
     }
 }
 
-#[derive(Default)]
+pub const DELETE_STATS_COUNT_SCHEMA: Schema =
+    IntegerSchema::new("Number of entities").minimum(0).schema();
+
+#[api(
+    properties: {
+        "removed-groups": {
+            schema: DELETE_STATS_COUNT_SCHEMA,
+        },
+        "protected-snapshots": {
+            schema: DELETE_STATS_COUNT_SCHEMA,
+        },
+        "removed-snapshots": {
+            schema: DELETE_STATS_COUNT_SCHEMA,
+        },
+     },
+)]
+#[derive(Default, Deserialize, Serialize)]
+#[serde(rename_all = "kebab-case")]
+/// Statistics for removed backup groups
 pub struct BackupGroupDeleteStats {
+    // Count of removed groups
+    removed_groups: usize,
     // Count of protected snapshots, therefore not removed
     protected_snapshots: usize,
     // Count of deleted snapshots
@@ -1583,6 +1603,10 @@ impl BackupGroupDeleteStats {
         self.protected_snapshots == 0
     }
 
+    pub fn removed_groups(&self) -> usize {
+        self.removed_groups
+    }
+
     pub fn removed_snapshots(&self) -> usize {
         self.removed_snapshots
     }
@@ -1591,6 +1615,16 @@ impl BackupGroupDeleteStats {
         self.protected_snapshots
     }
 
+    pub fn add(&mut self, rhs: &Self) {
+        self.removed_groups += rhs.removed_groups;
+        self.protected_snapshots += rhs.protected_snapshots;
+        self.removed_snapshots += rhs.removed_snapshots;
+    }
+
+    pub fn increment_removed_groups(&mut self) {
+        self.removed_groups += 1;
+    }
+
     pub fn increment_removed_snapshots(&mut self) {
         self.removed_snapshots += 1;
     }
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 31/33] datastore: increment deleted group counter when removing group
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (29 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 30/33] api types: implement api type for `BackupGroupDeleteStats` Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove Christian Ebner
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Increment the new counter to also account for the number of deleted
backup groups, in preparation for correctly returning the delete
statistics when removing contents via the REST API.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-datastore/src/backup_info.rs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 222134074..62d12b118 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -221,6 +221,7 @@ impl BackupGroup {
             std::fs::remove_dir_all(&path).map_err(|err| {
                 format_err!("removing group directory {:?} failed - {}", path, err)
             })?;
+            delete_stats.increment_removed_groups();
         }
 
         Ok(delete_stats)
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (30 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 31/33] datastore: increment deleted group counter when removing group Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-11  9:32   ` Fabian Grünbichler
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api Christian Ebner
                   ` (3 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Expose the backup group delete statistics by adding them as the return
type of the corresponding REST API endpoints.
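
On the client side the new return value can be consumed roughly as
follows (sketch with assumed `client`, `api_path` and `args` bindings;
the fields are the kebab-case serialization of `BackupGroupDeleteStats`):

```
// sketch only: deserialize the delete statistics returned by the endpoint
let mut result = client.delete(&api_path, Some(args)).await?;
let delete_stats: BackupGroupDeleteStats = serde_json::from_value(result["data"].take())?;
log::info!(
    "removed {} snapshot(s), kept {} protected snapshot(s)",
    delete_stats.removed_snapshots(),
    delete_stats.protected_snapshots(),
);
```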

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 pbs-datastore/src/datastore.rs | 20 ++++++++++++++------
 src/api2/admin/datastore.rs    | 18 ++++++++++--------
 src/api2/admin/namespace.rs    | 20 +++++++++++---------
 src/server/pull.rs             |  6 ++++--
 4 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index c8701d2dd..68c7f2934 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -489,16 +489,22 @@ impl DataStore {
     ///
     /// Does *not* descends into child-namespaces and doesn't remoes the namespace itself either.
     ///
-    /// Returns true if all the groups were removed, and false if some were protected.
-    pub fn remove_namespace_groups(self: &Arc<Self>, ns: &BackupNamespace) -> Result<bool, Error> {
+    /// Returns a tuple with the first item being true if all the groups were removed, and false if some were protected.
+    /// The second item returns the remove statistics.
+    pub fn remove_namespace_groups(
+        self: &Arc<Self>,
+        ns: &BackupNamespace,
+    ) -> Result<(bool, BackupGroupDeleteStats), Error> {
         // FIXME: locking? The single groups/snapshots are already protected, so may not be
         // necessary (depends on what we all allow to do with namespaces)
         log::info!("removing all groups in namespace {}:/{ns}", self.name());
 
         let mut removed_all_groups = true;
+        let mut stats = BackupGroupDeleteStats::default();
 
         for group in self.iter_backup_groups(ns.to_owned())? {
             let delete_stats = group?.destroy()?;
+            stats.add(&delete_stats);
             removed_all_groups = removed_all_groups && delete_stats.all_removed();
         }
 
@@ -515,7 +521,7 @@ impl DataStore {
             }
         }
 
-        Ok(removed_all_groups)
+        Ok((removed_all_groups, stats))
     }
 
     /// Remove a complete backup namespace optionally including all it's, and child namespaces',
@@ -527,13 +533,15 @@ impl DataStore {
         self: &Arc<Self>,
         ns: &BackupNamespace,
         delete_groups: bool,
-    ) -> Result<bool, Error> {
+    ) -> Result<(bool, BackupGroupDeleteStats), Error> {
         let store = self.name();
         let mut removed_all_requested = true;
+        let mut stats = BackupGroupDeleteStats::default();
         if delete_groups {
             log::info!("removing whole namespace recursively below {store}:/{ns}",);
             for ns in self.recursive_iter_backup_ns(ns.to_owned())? {
-                let removed_ns_groups = self.remove_namespace_groups(&ns?)?;
+                let (removed_ns_groups, delete_stats) = self.remove_namespace_groups(&ns?)?;
+                stats.add(&delete_stats);
                 removed_all_requested = removed_all_requested && removed_ns_groups;
             }
         } else {
@@ -574,7 +582,7 @@ impl DataStore {
             }
         }
 
-        Ok(removed_all_requested)
+        Ok((removed_all_requested, stats))
     }
 
     /// Remove a complete backup group including all snapshots.
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 0a5af1e76..49ff9abf0 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -34,10 +34,10 @@ use pxar::accessor::aio::Accessor;
 use pxar::EntryKind;
 
 use pbs_api_types::{
-    print_ns_and_snapshot, print_store_and_ns, Authid, BackupContent, BackupNamespace, BackupType,
-    Counts, CryptMode, DataStoreConfig, DataStoreListItem, DataStoreStatus,
-    GarbageCollectionJobStatus, GroupListItem, JobScheduleStatus, KeepOptions, Operation,
-    PruneJobOptions, SnapshotListItem, SnapshotVerifyState, BACKUP_ARCHIVE_NAME_SCHEMA,
+    print_ns_and_snapshot, print_store_and_ns, Authid, BackupContent, BackupGroupDeleteStats,
+    BackupNamespace, BackupType, Counts, CryptMode, DataStoreConfig, DataStoreListItem,
+    DataStoreStatus, GarbageCollectionJobStatus, GroupListItem, JobScheduleStatus, KeepOptions,
+    Operation, PruneJobOptions, SnapshotListItem, SnapshotVerifyState, BACKUP_ARCHIVE_NAME_SCHEMA,
     BACKUP_ID_SCHEMA, BACKUP_NAMESPACE_SCHEMA, BACKUP_TIME_SCHEMA, BACKUP_TYPE_SCHEMA,
     DATASTORE_SCHEMA, IGNORE_VERIFIED_BACKUPS_SCHEMA, MAX_NAMESPACE_DEPTH, NS_MAX_DEPTH_SCHEMA,
     PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE,
@@ -269,6 +269,9 @@ pub fn list_groups(
             },
         },
     },
+    returns: {
+        type: BackupGroupDeleteStats,
+    },
     access: {
         permission: &Permission::Anybody,
         description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any \
@@ -281,7 +284,7 @@ pub async fn delete_group(
     ns: Option<BackupNamespace>,
     group: pbs_api_types::BackupGroup,
     rpcenv: &mut dyn RpcEnvironment,
-) -> Result<Value, Error> {
+) -> Result<BackupGroupDeleteStats, Error> {
     let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
 
     tokio::task::spawn_blocking(move || {
@@ -299,10 +302,9 @@ pub async fn delete_group(
 
         let delete_stats = datastore.remove_backup_group(&ns, &group)?;
         if !delete_stats.all_removed() {
-            bail!("group only partially deleted due to protected snapshots");
+            warn!("group only partially deleted due to protected snapshots");
         }
-
-        Ok(Value::Null)
+        Ok(delete_stats)
     })
     .await?
 }
diff --git a/src/api2/admin/namespace.rs b/src/api2/admin/namespace.rs
index 889dc1a3d..adf665717 100644
--- a/src/api2/admin/namespace.rs
+++ b/src/api2/admin/namespace.rs
@@ -1,13 +1,12 @@
-use anyhow::{bail, Error};
-use serde_json::Value;
+use anyhow::Error;
 
 use pbs_config::CachedUserInfo;
 use proxmox_router::{http_bail, ApiMethod, Permission, Router, RpcEnvironment};
 use proxmox_schema::*;
 
 use pbs_api_types::{
-    Authid, BackupNamespace, NamespaceListItem, Operation, DATASTORE_SCHEMA, NS_MAX_DEPTH_SCHEMA,
-    PROXMOX_SAFE_ID_FORMAT,
+    Authid, BackupGroupDeleteStats, BackupNamespace, NamespaceListItem, Operation,
+    DATASTORE_SCHEMA, NS_MAX_DEPTH_SCHEMA, PROXMOX_SAFE_ID_FORMAT,
 };
 
 use pbs_datastore::DataStore;
@@ -151,22 +150,25 @@ pub fn delete_namespace(
     delete_groups: bool,
     _info: &ApiMethod,
     rpcenv: &mut dyn RpcEnvironment,
-) -> Result<Value, Error> {
+) -> Result<BackupGroupDeleteStats, Error> {
     let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
 
     check_ns_modification_privs(&store, &ns, &auth_id)?;
 
     let datastore = DataStore::lookup_datastore(&store, Some(Operation::Write))?;
 
-    if !datastore.remove_namespace_recursive(&ns, delete_groups)? {
+    let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups)?;
+    if !removed_all {
         if delete_groups {
-            bail!("group only partially deleted due to protected snapshots");
+            log::warn!("group only partially deleted due to protected snapshots");
         } else {
-            bail!("only partially deleted due to existing groups but `delete-groups` not true ");
+            log::warn!(
+                "only partially deleted due to existing groups but `delete-groups` not true"
+            );
         }
     }
 
-    Ok(Value::Null)
+    Ok(stats)
 }
 
 pub const ROUTER: Router = Router::new()
diff --git a/src/server/pull.rs b/src/server/pull.rs
index 3117f7d2c..d7f5c42ea 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -645,10 +645,12 @@ fn check_and_remove_ns(params: &PullParameters, local_ns: &BackupNamespace) -> R
     check_ns_modification_privs(params.target.store.name(), local_ns, &params.owner)
         .map_err(|err| format_err!("Removing {local_ns} not allowed - {err}"))?;
 
-    params
+    let (removed_all, _delete_stats) = params
         .target
         .store
-        .remove_namespace_recursive(local_ns, true)
+        .remove_namespace_recursive(local_ns, true)?;
+
+    Ok(removed_all)
 }
 
 fn check_and_remove_vanished_ns(
-- 
2.39.2




* [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (31 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove Christian Ebner
@ 2024-09-12 14:33 ` Christian Ebner
  2024-10-11  9:32   ` Fabian Grünbichler
  2024-10-10 14:48 ` [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Fabian Grünbichler
                   ` (2 subsequent siblings)
  35 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-09-12 14:33 UTC (permalink / raw)
  To: pbs-devel

Use the additional delete statistics now exposed by the API to generate
the task log output for sync jobs in push direction, instead of
fetching the contents before and after deletion.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes

 src/server/push.rs | 65 ++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 37 deletions(-)

diff --git a/src/server/push.rs b/src/server/push.rs
index cfbb88728..dbface907 100644
--- a/src/server/push.rs
+++ b/src/server/push.rs
@@ -11,9 +11,10 @@ use tokio_stream::wrappers::ReceiverStream;
 use tracing::info;
 
 use pbs_api_types::{
-    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
-    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
-    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
+    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupGroupDeleteStats, BackupNamespace,
+    CryptMode, GroupFilter, GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote,
+    SnapshotListItem, PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY,
+    PRIV_REMOTE_DATASTORE_PRUNE,
 };
 use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
 use pbs_config::CachedUserInfo;
@@ -228,7 +229,7 @@ async fn remove_target_group(
     params: &PushParameters,
     namespace: &BackupNamespace,
     backup_group: &BackupGroup,
-) -> Result<(), Error> {
+) -> Result<BackupGroupDeleteStats, Error> {
     check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
         .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
 
@@ -246,9 +247,11 @@ async fn remove_target_group(
         args["ns"] = serde_json::to_value(target_ns.name())?;
     }
 
-    params.target.client.delete(&api_path, Some(args)).await?;
+    let mut result = params.target.client.delete(&api_path, Some(args)).await?;
+    let data = result["data"].take();
+    let delete_stats: BackupGroupDeleteStats = serde_json::from_value(data)?;
 
-    Ok(())
+    Ok(delete_stats)
 }
 
 // Check if the namespace is already present on the target, create it otherwise
@@ -451,38 +454,26 @@ pub(crate) async fn push_namespace(
 
             info!("delete vanished group '{target_group}'");
 
-            let count_before = match fetch_target_groups(params, namespace).await {
-                Ok(snapshots) => snapshots.len(),
-                Err(_err) => 0, // ignore errors
-            };
-
-            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
-                info!("{err}");
-                errors = true;
-                continue;
-            }
-
-            let mut count_after = match fetch_target_groups(params, namespace).await {
-                Ok(snapshots) => snapshots.len(),
-                Err(_err) => 0, // ignore errors
-            };
-
-            let deleted_groups = if count_after > 0 {
-                info!("kept some protected snapshots of group '{target_group}'");
-                0
-            } else {
-                1
-            };
-
-            if count_after > count_before {
-                count_after = count_before;
+            match remove_target_group(params, namespace, &target_group).await {
+                Ok(delete_stats) => {
+                    if delete_stats.protected_snapshots() > 0 {
+                        info!(
+                            "kept {protected_count} protected snapshots of group '{target_group}'",
+                            protected_count = delete_stats.protected_snapshots(),
+                        );
+                    }
+                    stats.add(SyncStats::from(RemovedVanishedStats {
+                        snapshots: delete_stats.removed_snapshots(),
+                        groups: delete_stats.removed_groups(),
+                        namespaces: 0,
+                    }));
+                }
+                Err(err) => {
+                    info!("failed to delete vanished group - {err}");
+                    errors = true;
+                    continue;
+                }
             }
-
-            stats.add(SyncStats::from(RemovedVanishedStats {
-                snapshots: count_before - count_after,
-                groups: deleted_groups,
-                namespaces: 0,
-            }));
         }
     }
 
-- 
2.39.2




* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations Christian Ebner
@ 2024-10-10 14:48   ` Fabian Grünbichler
  2024-10-14  9:32     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:48 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

left some higher level comments on the cover letter as well that are
relevant for this patch!

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> Adds the functionality required to push datastore contents from a
> source to a remote target.
> This includes syncing of the namespaces, backup groups and snapshots
> based on the provided filters as well as removing vanished contents
> from the target when requested.
> 
> While trying to mimic the pull direction of sync jobs, the
> implementation is different as access to the remote must be performed
> via the REST API, not needed for the pull job which can access the
> local datastore via the filesystem directly.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - Implement additional permission checks limiting possible remote
>   datastore operations.
> - Rename `owner` to `local_user`, this is the user who's view of the
>   local datastore is used for the push to the remote target. It can be
>   different from the job user, executing the sync job and requiring the
>   permissions to access the remote.
> 
>  src/server/mod.rs  |   1 +
>  src/server/push.rs | 892 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 893 insertions(+)
>  create mode 100644 src/server/push.rs
> 
> diff --git a/src/server/mod.rs b/src/server/mod.rs
> index 468847c2e..882c5cc10 100644
> --- a/src/server/mod.rs
> +++ b/src/server/mod.rs
> @@ -34,6 +34,7 @@ pub use report::*;
>  pub mod auth;
>  
>  pub(crate) mod pull;
> +pub(crate) mod push;
>  pub(crate) mod sync;
>  
>  pub(crate) async fn reload_proxy_certificate() -> Result<(), Error> {
> diff --git a/src/server/push.rs b/src/server/push.rs
> new file mode 100644
> index 000000000..cfbb88728
> --- /dev/null
> +++ b/src/server/push.rs
> @@ -0,0 +1,892 @@
> +//! Sync datastore by pushing contents to remote server
> +
> +use std::cmp::Ordering;
> +use std::collections::HashSet;
> +use std::sync::{Arc, Mutex};
> +
> +use anyhow::{bail, format_err, Error};
> +use futures::stream::{self, StreamExt, TryStreamExt};
> +use tokio::sync::mpsc;
> +use tokio_stream::wrappers::ReceiverStream;
> +use tracing::info;
> +
> +use pbs_api_types::{
> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
> +    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
> +    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
> +};
> +use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
> +use pbs_config::CachedUserInfo;
> +use pbs_datastore::data_blob::ChunkInfo;
> +use pbs_datastore::dynamic_index::DynamicIndexReader;
> +use pbs_datastore::fixed_index::FixedIndexReader;
> +use pbs_datastore::index::IndexFile;
> +use pbs_datastore::manifest::{ArchiveType, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME};
> +use pbs_datastore::read_chunk::AsyncReadChunk;
> +use pbs_datastore::{BackupManifest, DataStore, StoreProgress};
> +
> +use super::sync::{
> +    check_namespace_depth_limit, LocalSource, RemovedVanishedStats, SkipInfo, SkipReason,
> +    SyncSource, SyncStats,
> +};
> +use crate::api2::config::remote;
> +
> +/// Target for backups to be pushed to
> +pub(crate) struct PushTarget {
> +    // Name of the remote as found in remote.cfg
> +    remote: String,
> +    // Target repository on remote
> +    repo: BackupRepository,
> +    // Target namespace on remote
> +    ns: BackupNamespace,
> +    // Http client to connect to remote
> +    client: HttpClient,
> +}
> +
> +/// Parameters for a push operation
> +pub(crate) struct PushParameters {
> +    /// Source of backups to be pushed to remote
> +    source: Arc<LocalSource>,
> +    /// Target for backups to be pushed to
> +    target: PushTarget,
> +    /// Local user limiting the accessible source contents, makes sure that the sync job sees the
> +    /// same source content when executed by different users with different privileges
> +    local_user: Authid,
> +    /// User as which the job gets executed, requires the permissions on the remote
> +    pub(crate) job_user: Option<Authid>,
> +    /// Whether to remove groups which exist locally, but not on the remote end
> +    remove_vanished: bool,
> +    /// How many levels of sub-namespaces to push (0 == no recursion, None == maximum recursion)
> +    max_depth: Option<usize>,
> +    /// Filters for reducing the push scope
> +    group_filter: Vec<GroupFilter>,
> +    /// How many snapshots should be transferred at most (taking the newest N snapshots)
> +    transfer_last: Option<usize>,
> +}
> +
> +impl PushParameters {
> +    /// Creates a new instance of `PushParameters`.
> +    #[allow(clippy::too_many_arguments)]
> +    pub(crate) fn new(
> +        store: &str,
> +        ns: BackupNamespace,
> +        remote_id: &str,
> +        remote_store: &str,
> +        remote_ns: BackupNamespace,
> +        local_user: Authid,
> +        remove_vanished: Option<bool>,
> +        max_depth: Option<usize>,
> +        group_filter: Option<Vec<GroupFilter>>,
> +        limit: RateLimitConfig,
> +        transfer_last: Option<usize>,
> +    ) -> Result<Self, Error> {
> +        if let Some(max_depth) = max_depth {
> +            ns.check_max_depth(max_depth)?;
> +            remote_ns.check_max_depth(max_depth)?;
> +        };
> +        let remove_vanished = remove_vanished.unwrap_or(false);
> +
> +        let source = Arc::new(LocalSource {
> +            store: DataStore::lookup_datastore(store, Some(Operation::Read))?,
> +            ns,
> +        });
> +
> +        let (remote_config, _digest) = pbs_config::remote::config()?;
> +        let remote: Remote = remote_config.lookup("remote", remote_id)?;
> +
> +        let repo = BackupRepository::new(
> +            Some(remote.config.auth_id.clone()),
> +            Some(remote.config.host.clone()),
> +            remote.config.port,
> +            remote_store.to_string(),
> +        );
> +
> +        let client = remote::remote_client_config(&remote, Some(limit))?;
> +        let target = PushTarget {
> +            remote: remote_id.to_string(),
> +            repo,
> +            ns: remote_ns,
> +            client,
> +        };
> +        let group_filter = group_filter.unwrap_or_default();
> +
> +        Ok(Self {
> +            source,
> +            target,
> +            local_user,
> +            job_user: None,
> +            remove_vanished,
> +            max_depth,
> +            group_filter,
> +            transfer_last,
> +        })
> +    }
> +}
> +
> +fn check_ns_remote_datastore_privs(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    privs: u64,
> +) -> Result<(), Error> {
> +    let auth_id = params
> +        .job_user
> +        .as_ref()
> +        .ok_or_else(|| format_err!("missing job authid"))?;
> +    let user_info = CachedUserInfo::new()?;
> +    let mut acl_path: Vec<&str> = vec!["remote", &params.target.remote, params.target.repo.store()];
> +
> +    if !namespace.is_root() {
> +        let ns_components: Vec<&str> = namespace.components().collect();
> +        acl_path.extend(ns_components);
> +    }
> +
> +    user_info.check_privs(auth_id, &acl_path, privs, false)?;
> +
> +    Ok(())
> +}
> +
> +// Fetch the list of namespaces found on target
> +async fn fetch_target_namespaces(params: &PushParameters) -> Result<Vec<BackupNamespace>, Error> {
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/namespace",
> +        store = params.target.repo.store(),
> +    );
> +    let mut result = params.target.client.get(&api_path, None).await?;
> +    let namespaces: Vec<NamespaceListItem> = serde_json::from_value(result["data"].take())?;
> +    let mut namespaces: Vec<BackupNamespace> = namespaces
> +        .into_iter()
> +        .map(|namespace| namespace.ns)
> +        .collect();
> +    namespaces.sort_unstable_by_key(|a| a.name_len());
> +
> +    Ok(namespaces)
> +}
> +
> +// Remove the provided namespace from the target
> +async fn remove_target_namespace(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +) -> Result<(), Error> {
> +    if namespace.is_root() {
> +        bail!("cannot remove root namespace from target");
> +    }
> +
> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;

this should be MODIFY, not PRUNE to mimic pull-based syncing, see cover letter
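
i.e. roughly (sketch, reusing the same helper):

```
// sketch: require MODIFY on the remote namespace for removing it, mirroring
// the PRIV_DATASTORE_MODIFY requirement of pull-based syncing
check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_MODIFY)
    .map_err(|err| format_err!("Modifying remote datastore contents not allowed - {err}"))?;
```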

> +
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/namespace",
> +        store = params.target.repo.store(),
> +    );
> +
> +    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +    let target_ns = params.map_namespace(namespace)?;

would it make sense to make this less verbose *and more readable* by
implementing a `map_namespace` on params? 7 call sites ;)
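
something like this, just as an illustration:

```
impl PushParameters {
    // map a source namespace to the corresponding namespace on the target,
    // wrapping the map_prefix() call repeated at the current call sites
    fn map_namespace(&self, namespace: &BackupNamespace) -> Result<BackupNamespace, Error> {
        namespace.map_prefix(&self.source.ns, &self.target.ns)
    }
}
```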

> +    let args = serde_json::json!({
> +        "ns": target_ns.name(),
> +        "delete-groups": true,
> +    });
> +
> +    params.target.client.delete(&api_path, Some(args)).await?;
> +
> +    Ok(())
> +}
> +
> +// Fetch the list of groups found on target in given namespace
> +async fn fetch_target_groups(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +) -> Result<Vec<BackupGroup>, Error> {
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/groups",
> +        store = params.target.repo.store(),
> +    );
> +
> +    let args = if !namespace.is_root() {
> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +        Some(serde_json::json!({ "ns": target_ns.name() }))
> +    } else {
> +        None
> +    };
> +
> +    let mut result = params.target.client.get(&api_path, args).await?;
> +    let groups: Vec<GroupListItem> = serde_json::from_value(result["data"].take())?;
> +    let mut groups: Vec<BackupGroup> = groups.into_iter().map(|group| group.backup).collect();
> +
> +    groups.sort_unstable_by(|a, b| {
> +        let type_order = a.ty.cmp(&b.ty);
> +        if type_order == Ordering::Equal {
> +            a.id.cmp(&b.id)
> +        } else {
> +            type_order
> +        }
> +    });
> +
> +    Ok(groups)
> +}
> +
> +// Remove the provided backup group in given namespace from the target
> +async fn remove_target_group(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    backup_group: &BackupGroup,
> +) -> Result<(), Error> {
> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
> +
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/groups",
> +        store = params.target.repo.store(),
> +    );
> +
> +    let mut args = serde_json::json!({
> +        "backup-id": backup_group.id,
> +        "backup-type": backup_group.ty,
> +    });
> +    if !namespace.is_root() {
> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +        args["ns"] = serde_json::to_value(target_ns.name())?;
> +    }
> +
> +    params.target.client.delete(&api_path, Some(args)).await?;
> +
> +    Ok(())
> +}
> +
> +// Check if the namespace is already present on the target, create it otherwise
> +async fn check_or_create_target_namespace(
> +    params: &PushParameters,
> +    target_namespaces: &[BackupNamespace],
> +    namespace: &BackupNamespace,
> +) -> Result<bool, Error> {
> +    let mut created = false;
> +
> +    if !namespace.is_root() && !target_namespaces.contains(namespace) {
> +        // Namespace not present on target, create namespace.
> +        // Sub-namespaces have to be created by creating parent components first.
> +
> +        check_ns_remote_datastore_privs(&params, namespace, PRIV_REMOTE_DATASTORE_MODIFY)
> +            .map_err(|err| format_err!("Creating namespace not allowed - {err}"))?;
> +
> +        let mut parent = BackupNamespace::root();
> +        for namespace_component in namespace.components() {
> +            let namespace = BackupNamespace::new(namespace_component)?;
> +            let api_path = format!(
> +                "api2/json/admin/datastore/{store}/namespace",
> +                store = params.target.repo.store(),
> +            );
> +            let mut args = serde_json::json!({ "name": namespace.name() });
> +            if !parent.is_root() {
> +                args["parent"] = serde_json::to_value(parent.clone())?;
> +            }
> +            if let Err(err) = params.target.client.post(&api_path, Some(args)).await {
> +                let target_store_and_ns =
> +                    print_store_and_ns(params.target.repo.store(), &namespace);
> +                bail!("sync into {target_store_and_ns} failed - namespace creation failed: {err}");
> +            }
> +            parent.push(namespace.name())?;

this tries to create every prefix of the missing namespace, instead of
just the missing lower end of the hierarchy.. which is currently fine,
since the create_namespace API endpoint doesn't fail if the namespace
already exists, but since we already have a list of existing namespaces
here, we could make it more future proof (and a bit faster ;)) by
skipping those..
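
rough sketch of how the already fetched list could be reused (helper name
made up, creation logic otherwise as in the patch):

```
// sketch: create only the namespace prefixes not yet present on the target;
// `existing` is the list fetched via fetch_target_namespaces()
async fn create_missing_prefixes(
    params: &PushParameters,
    existing: &[BackupNamespace],
    namespace: &BackupNamespace,
) -> Result<(), Error> {
    let mut parent = BackupNamespace::root();
    for component in namespace.components() {
        let component = BackupNamespace::new(component)?;
        let mut current = parent.clone();
        current.push(component.name())?;
        if !existing.contains(&current) {
            let api_path = format!(
                "api2/json/admin/datastore/{store}/namespace",
                store = params.target.repo.store(),
            );
            let mut args = serde_json::json!({ "name": component.name() });
            if !parent.is_root() {
                args["parent"] = serde_json::to_value(parent.clone())?;
            }
            params.target.client.post(&api_path, Some(args)).await?;
        }
        parent = current;
    }
    Ok(())
}
```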

> +        }
> +
> +        created = true;
> +    }
> +
> +    Ok(created)
> +}
> +
> +/// Push contents of source datastore matched by given push parameters to target.
> +pub(crate) async fn push_store(mut params: PushParameters) -> Result<SyncStats, Error> {
> +    let mut errors = false;
> +
> +    // Generate list of source namespaces to push to target, limited by max-depth
> +    let mut namespaces = params.source.list_namespaces(&mut params.max_depth).await?;
> +
> +    check_namespace_depth_limit(&params.source.get_ns(), &params.target.ns, &namespaces)?;
> +
> +    namespaces.sort_unstable_by_key(|a| a.name_len());
> +
> +    // Fetch all accessible namespaces already present on the target
> +    let target_namespaces = fetch_target_namespaces(&params).await?;
> +    // Remember synced namespaces, removing non-synced ones when remove vanished flag is set
> +    let mut synced_namespaces = HashSet::with_capacity(namespaces.len());
> +
> +    let (mut groups, mut snapshots) = (0, 0);
> +    let mut stats = SyncStats::default();
> +    for namespace in namespaces {
> +        let source_store_and_ns = print_store_and_ns(params.source.store.name(), &namespace);
> +        let target_namespace = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +        let target_store_and_ns = print_store_and_ns(params.target.repo.store(), &target_namespace);
> +
> +        info!("----");
> +        info!("Syncing {source_store_and_ns} into {target_store_and_ns}");
> +
> +        synced_namespaces.insert(target_namespace.clone());
> +
> +        match check_or_create_target_namespace(&params, &target_namespaces, &target_namespace).await
> +        {
> +            Ok(true) => info!("Created namespace {target_namespace}"),
> +            Ok(false) => {}
> +            Err(err) => {
> +                info!("Cannot sync {source_store_and_ns} into {target_store_and_ns} - {err}");
> +                errors = true;
> +                continue;
> +            }
> +        }
> +
> +        match push_namespace(&namespace, &params).await {
> +            Ok((sync_progress, sync_stats, sync_errors)) => {
> +                errors |= sync_errors;
> +                stats.add(sync_stats);
> +
> +                if params.max_depth != Some(0) {
> +                    groups += sync_progress.done_groups;
> +                    snapshots += sync_progress.done_snapshots;
> +
> +                    let ns = if namespace.is_root() {
> +                        "root namespace".into()
> +                    } else {
> +                        format!("namespace {namespace}")
> +                    };
> +                    info!(
> +                        "Finished syncing {ns}, current progress: {groups} groups, {snapshots} snapshots"
> +                    );
> +                }
> +            }
> +            Err(err) => {
> +                errors = true;
> +                info!("Encountered errors while syncing namespace {namespace} - {err}");
> +            }
> +        }
> +    }
> +
> +    if params.remove_vanished {
> +        for target_namespace in target_namespaces {
> +            if synced_namespaces.contains(&target_namespace) {
> +                continue;
> +            }
> +            if let Err(err) = remove_target_namespace(&params, &target_namespace).await {
> +                info!("failed to remove vanished namespace {target_namespace} - {err}");
> +                continue;
> +            }
> +            info!("removed vanished namespace {target_namespace}");
> +        }
> +    }
> +
> +    if errors {
> +        bail!("sync failed with some errors.");
> +    }
> +
> +    Ok(stats)
> +}
> +
> +/// Push namespace including all backup groups to target
> +///
> +/// Iterate over all backup groups in the namespace and push them to the target.
> +pub(crate) async fn push_namespace(
> +    namespace: &BackupNamespace,
> +    params: &PushParameters,
> +) -> Result<(StoreProgress, SyncStats, bool), Error> {
> +    // Check if user is allowed to perform backups on remote datastore
> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_BACKUP)
> +        .map_err(|err| format_err!("Pushing to remote not allowed - {err}"))?;
> +
> +    let mut list: Vec<BackupGroup> = params
> +        .source
> +        .list_groups(namespace, &params.local_user)
> +        .await?;
> +
> +    list.sort_unstable_by(|a, b| {
> +        let type_order = a.ty.cmp(&b.ty);
> +        if type_order == Ordering::Equal {
> +            a.id.cmp(&b.id)
> +        } else {
> +            type_order
> +        }
> +    });
> +
> +    let total = list.len();
> +    let list: Vec<BackupGroup> = list
> +        .into_iter()
> +        .filter(|group| group.apply_filters(&params.group_filter))
> +        .collect();
> +
> +    info!(
> +        "found {filtered} groups to sync (out of {total} total)",
> +        filtered = list.len()
> +    );
> +
> +    let target_groups = if params.remove_vanished {
> +        fetch_target_groups(params, namespace).await?
> +    } else {
> +        // avoid fetching of groups, not required if remove vanished not set
> +        Vec::new()
> +    };

should we then fetch them below in the if remove_vanished branch, like
we do when handling snapshots?
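
i.e. something like (sketch):

```
// sketch: fetch the target groups only when remove_vanished is actually set
if params.remove_vanished {
    let target_groups = fetch_target_groups(params, namespace).await?;
    for target_group in target_groups {
        if synced_groups.contains(&target_group) {
            continue;
        }
        // ... same filtering and removal as below ...
    }
}
```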

> +
> +    let mut errors = false;
> +    // Remember synced groups, remove others when the remove vanished flag is set
> +    let mut synced_groups = HashSet::new();
> +    let mut progress = StoreProgress::new(list.len() as u64);
> +    let mut stats = SyncStats::default();
> +
> +    for (done, group) in list.into_iter().enumerate() {
> +        progress.done_groups = done as u64;
> +        progress.done_snapshots = 0;
> +        progress.group_snapshots = 0;
> +        synced_groups.insert(group.clone());
> +
> +        match push_group(params, namespace, &group, &mut progress).await {
> +            Ok(sync_stats) => stats.add(sync_stats),
> +            Err(err) => {
> +                info!("sync group '{group}' failed  - {err}");
> +                errors = true;
> +            }
> +        }
> +    }
> +
> +    if params.remove_vanished {
> +        for target_group in target_groups {
> +            if synced_groups.contains(&target_group) {
> +                continue;
> +            }
> +            if !target_group.apply_filters(&params.group_filter) {
> +                continue;
> +            }
> +
> +            info!("delete vanished group '{target_group}'");
> +
> +            let count_before = match fetch_target_groups(params, namespace).await {
> +                Ok(snapshots) => snapshots.len(),
> +                Err(_err) => 0, // ignore errors
> +            };
> +
> +            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
> +                info!("{err}");
> +                errors = true;
> +                continue;
> +            }
> +
> +            let mut count_after = match fetch_target_groups(params, namespace).await {
> +                Ok(snapshots) => snapshots.len(),
> +                Err(_err) => 0, // ignore errors
> +            };
> +
> +            let deleted_groups = if count_after > 0 {
> +                info!("kept some protected snapshots of group '{target_group}'");
> +                0
> +            } else {
> +                1
> +            };
> +
> +            if count_after > count_before {
> +                count_after = count_before;
> +            }
> +
> +            stats.add(SyncStats::from(RemovedVanishedStats {
> +                snapshots: count_before - count_after,
> +                groups: deleted_groups,
> +                namespaces: 0,
> +            }));
> +        }
> +    }
> +
> +    Ok((progress, stats, errors))
> +}
> +
> +async fn fetch_target_snapshots(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    group: &BackupGroup,
> +) -> Result<Vec<SnapshotListItem>, Error> {
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/snapshots",
> +        store = params.target.repo.store(),
> +    );
> +    let mut args = serde_json::to_value(group)?;
> +    if !namespace.is_root() {
> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +        args["ns"] = serde_json::to_value(target_ns)?;
> +    }
> +    let mut result = params.target.client.get(&api_path, Some(args)).await?;
> +    let snapshots: Vec<SnapshotListItem> = serde_json::from_value(result["data"].take())?;
> +
> +    Ok(snapshots)
> +}
> +
> +async fn fetch_previous_backup_time(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    group: &BackupGroup,
> +) -> Result<Option<i64>, Error> {
> +    let mut snapshots = fetch_target_snapshots(params, namespace, group).await?;
> +    snapshots.sort_unstable_by(|a, b| a.backup.time.cmp(&b.backup.time));
> +    Ok(snapshots.last().map(|snapshot| snapshot.backup.time))
> +}
> +
> +async fn forget_target_snapshot(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    snapshot: &BackupDir,
> +) -> Result<(), Error> {
> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
> +
> +    let api_path = format!(
> +        "api2/json/admin/datastore/{store}/snapshots",
> +        store = params.target.repo.store(),
> +    );
> +    let mut args = serde_json::to_value(snapshot)?;
> +    if !namespace.is_root() {
> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +        args["ns"] = serde_json::to_value(target_ns)?;
> +    }
> +    params.target.client.delete(&api_path, Some(args)).await?;
> +
> +    Ok(())
> +}
> +
> +/// Push group including all snaphshots to target
> +///
> +/// Iterate over all snapshots in the group and push them to the target.
> +/// The group sync operation consists of the following steps:
> +/// - Query snapshots of given group from the source
> +/// - Sort snapshots by time
> +/// - Apply transfer last cutoff and filters to list
> +/// - Iterate the snapshot list and push each snapshot individually
> +/// - (Optional): Remove vanished groups if `remove_vanished` flag is set
> +pub(crate) async fn push_group(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    group: &BackupGroup,
> +    progress: &mut StoreProgress,
> +) -> Result<SyncStats, Error> {
> +    let mut already_synced_skip_info = SkipInfo::new(SkipReason::AlreadySynced);
> +    let mut transfer_last_skip_info = SkipInfo::new(SkipReason::TransferLast);
> +
> +    let mut snapshots: Vec<BackupDir> = params.source.list_backup_dirs(namespace, group).await?;
> +    snapshots.sort_unstable_by(|a, b| a.time.cmp(&b.time));
> +
> +    let total_snapshots = snapshots.len();
> +    let cutoff = params
> +        .transfer_last
> +        .map(|count| total_snapshots.saturating_sub(count))
> +        .unwrap_or_default();
> +
> +    let last_snapshot_time = fetch_previous_backup_time(params, namespace, group)
> +        .await?
> +        .unwrap_or(i64::MIN);
> +
> +    let mut source_snapshots = HashSet::new();
> +    let snapshots: Vec<BackupDir> = snapshots
> +        .into_iter()
> +        .enumerate()
> +        .filter(|&(pos, ref snapshot)| {
> +            source_snapshots.insert(snapshot.time);
> +            if last_snapshot_time > snapshot.time {
> +                already_synced_skip_info.update(snapshot.time);
> +                return false;
> +            } else if already_synced_skip_info.count > 0 {
> +                info!("{already_synced_skip_info}");
> +                already_synced_skip_info.reset();
> +                return true;
> +            }
> +
> +            if pos < cutoff && last_snapshot_time != snapshot.time {
> +                transfer_last_skip_info.update(snapshot.time);
> +                return false;
> +            } else if transfer_last_skip_info.count > 0 {
> +                info!("{transfer_last_skip_info}");
> +                transfer_last_skip_info.reset();
> +            }
> +            true
> +        })
> +        .map(|(_, dir)| dir)
> +        .collect();
> +
> +    progress.group_snapshots = snapshots.len() as u64;
> +
> +    let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
> +    let target_snapshots: Vec<BackupDir> = target_snapshots
> +        .into_iter()
> +        .map(|snapshot| snapshot.backup)
> +        .collect();
> +
> +    let mut stats = SyncStats::default();
> +    for (pos, source_snapshot) in snapshots.into_iter().enumerate() {
> +        if target_snapshots.contains(&source_snapshot) {
> +            progress.done_snapshots = pos as u64 + 1;
> +            info!("percentage done: {progress}");
> +            continue;
> +        }
> +        let result = push_snapshot(params, namespace, &source_snapshot).await;
> +
> +        progress.done_snapshots = pos as u64 + 1;
> +        info!("percentage done: {progress}");
> +
> +        // stop on error
> +        let sync_stats = result?;
> +        stats.add(sync_stats);
> +    }
> +
> +    if params.remove_vanished {
> +        let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
> +        for snapshot in target_snapshots {
> +            if source_snapshots.contains(&snapshot.backup.time) {
> +                continue;
> +            }
> +            if snapshot.protected {
> +                info!(
> +                    "don't delete vanished snapshot {name} (protected)",
> +                    name = snapshot.backup
> +                );
> +                continue;
> +            }
> +            if let Err(err) = forget_target_snapshot(params, namespace, &snapshot.backup).await {
> +                info!(
> +                    "could not delete vanished snapshot {name} - {err}",
> +                    name = snapshot.backup
> +                );
> +            }
> +            info!("delete vanished snapshot {name}", name = snapshot.backup);
> +            stats.add(SyncStats::from(RemovedVanishedStats {
> +                snapshots: 1,
> +                groups: 0,
> +                namespaces: 0,
> +            }));
> +        }
> +    }
> +
> +    Ok(stats)
> +}
> +
> +/// Push snapshot to target
> +///
> +/// Creates a new snapshot on the target and pushes the content of the source snapshot to the
> +/// target by creating a new manifest file and connecting to the remote as backup writer client.
> +/// Chunks are written by recreating the index by uploading the chunk stream as read from the
> +/// source. Data blobs are uploaded as such.
> +pub(crate) async fn push_snapshot(
> +    params: &PushParameters,
> +    namespace: &BackupNamespace,
> +    snapshot: &BackupDir,
> +) -> Result<SyncStats, Error> {
> +    let mut stats = SyncStats::default();
> +    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
> +    let backup_dir = params
> +        .source
> +        .store
> +        .backup_dir(params.source.ns.clone(), snapshot.clone())?;
> +
> +    let reader = params.source.reader(namespace, snapshot).await?;
> +
> +    // Load the source manifest, needed to find crypt mode for files
> +    let mut tmp_source_manifest_path = backup_dir.full_path();
> +    tmp_source_manifest_path.push(MANIFEST_BLOB_NAME);
> +    tmp_source_manifest_path.set_extension("tmp");
> +    let source_manifest = if let Some(manifest_blob) = reader
> +        .load_file_into(MANIFEST_BLOB_NAME, &tmp_source_manifest_path)
> +        .await?
> +    {
> +        BackupManifest::try_from(manifest_blob)?

why do we copy the manifest into a .tmp file path here instead of just
reading it? and if we need the copy, who's cleaning it up?

> +    } else {
> +        // no manifest in snapshot, skip
> +        return Ok(stats);
> +    };
> +
> +    // Manifest to be created on target, referencing all the source archives after upload.
> +    let mut manifest = BackupManifest::new(snapshot.clone());
> +
> +    // writer instance locks the snapshot on the remote side
> +    let backup_writer = BackupWriter::start(
> +        &params.target.client,
> +        None,
> +        params.target.repo.store(),
> +        &target_ns,
> +        snapshot,
> +        false,
> +        false,
> +    )
> +    .await?;
> +
> +    // Use manifest of previous snapshots in group on target for chunk upload deduplication
> +    let previous_manifest = match backup_writer.download_previous_manifest().await {
> +        Ok(manifest) => Some(Arc::new(manifest)),
> +        Err(err) => {
> +            log::info!("Could not download previous manifest - {err}");
> +            None
> +        }
> +    };

this should not be attempted for the first snapshot in a group, else it
does requests we already know will fail and spams the log as a result as
well..
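
e.g. along these lines (sketch, assuming push_group passes down whether the
target group already contains snapshots via a hypothetical
`target_has_snapshots` flag):

```
// sketch: skip the doomed download attempt for the first snapshot of a group
let previous_manifest = if target_has_snapshots {
    match backup_writer.download_previous_manifest().await {
        Ok(manifest) => Some(Arc::new(manifest)),
        Err(err) => {
            log::info!("Could not download previous manifest - {err}");
            None
        }
    }
} else {
    None
};
```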

> +
> +    let upload_options = UploadOptions {
> +        compress: true,
> +        encrypt: false,

this might warrant a comment why it's okay to do that (I hope it is? ;))

> +        previous_manifest,
> +        ..UploadOptions::default()
> +    };
> +
> +    // Avoid double upload penalty by remembering already seen chunks
> +    let known_chunks = Arc::new(Mutex::new(HashSet::with_capacity(1024 * 1024)));
> +
> +    for entry in source_manifest.files() {
> +        let mut path = backup_dir.full_path();
> +        path.push(&entry.filename);
> +        if path.try_exists()? {
> +            match ArchiveType::from_path(&entry.filename)? {
> +                ArchiveType::Blob => {
> +                    let file = std::fs::File::open(path.clone())?;
> +                    let backup_stats = backup_writer.upload_blob(file, &entry.filename).await?;
> +                    manifest.add_file(
> +                        entry.filename.to_string(),
> +                        backup_stats.size,
> +                        backup_stats.csum,
> +                        entry.chunk_crypt_mode(),
> +                    )?;
> +                    stats.add(SyncStats {
> +                        chunk_count: backup_stats.chunk_count as usize,
> +                        bytes: backup_stats.size as usize,
> +                        elapsed: backup_stats.duration,
> +                        removed: None,
> +                    });
> +                }
> +                ArchiveType::DynamicIndex => {
> +                    let index = DynamicIndexReader::open(&path)?;
> +                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
> +                    let sync_stats = push_index(
> +                        &entry.filename,
> +                        index,
> +                        chunk_reader,
> +                        &backup_writer,
> +                        &mut manifest,
> +                        entry.chunk_crypt_mode(),
> +                        None,
> +                        &known_chunks,
> +                    )
> +                    .await?;
> +                    stats.add(sync_stats);
> +                }
> +                ArchiveType::FixedIndex => {
> +                    let index = FixedIndexReader::open(&path)?;
> +                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
> +                    let size = index.index_bytes();
> +                    let sync_stats = push_index(
> +                        &entry.filename,
> +                        index,
> +                        chunk_reader,
> +                        &backup_writer,
> +                        &mut manifest,
> +                        entry.chunk_crypt_mode(),
> +                        Some(size),
> +                        &known_chunks,
> +                    )
> +                    .await?;
> +                    stats.add(sync_stats);
> +                }
> +            }
> +        } else {
> +            info!("{path:?} does not exist, skipped.");
> +        }
> +    }
> +
> +    // Fetch client log from source and push to target
> +    // this has to be handled individually since the log is never part of the manifest
> +    let mut client_log_path = backup_dir.full_path();
> +    client_log_path.push(CLIENT_LOG_BLOB_NAME);
> +    if client_log_path.is_file() {
> +        backup_writer
> +            .upload_blob_from_file(
> +                &client_log_path,
> +                CLIENT_LOG_BLOB_NAME,
> +                upload_options.clone(),
> +            )
> +            .await?;
> +    } else {
> +        info!("Client log at {client_log_path:?} does not exist or is not a file, skipped.");
> +    }

I am not sure this warrants a log line.. the client log is optional
after all, so this can happen quite a lot in practice (e.g., if you do
host backups without bothering to upload logs..)

I think we should follow the logic of pull based syncing here - add a
log to the last previously synced snapshot if it exists and is missing
on the other end, otherwise only attempt to upload a log if it exists
without logging its absence.
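
so roughly (sketch, leaving out the "add the log to the last previously
synced snapshot" part):

```
// sketch: upload the client log only if it exists, without logging its absence
let mut client_log_path = backup_dir.full_path();
client_log_path.push(CLIENT_LOG_BLOB_NAME);
if client_log_path.is_file() {
    backup_writer
        .upload_blob_from_file(&client_log_path, CLIENT_LOG_BLOB_NAME, upload_options.clone())
        .await?;
}
```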

> +
> +    // Rewrite manifest for pushed snapshot, re-adding the existing fingerprint and signature
> +    let mut manifest_json = serde_json::to_value(manifest)?;
> +    manifest_json["unprotected"] = source_manifest.unprotected;
> +    if let Some(signature) = source_manifest.signature {
> +        manifest_json["signature"] = serde_json::to_value(signature)?;
> +    }
> +    let manifest_string = serde_json::to_string_pretty(&manifest_json).unwrap();

couldn't we just upload the original manifest here?

> +    let backup_stats = backup_writer
> +        .upload_blob_from_data(
> +            manifest_string.into_bytes(),
> +            MANIFEST_BLOB_NAME,
> +            upload_options,
> +        )
> +        .await?;
> +    backup_writer.finish().await?;
> +
> +    stats.add(SyncStats {
> +        chunk_count: backup_stats.chunk_count as usize,
> +        bytes: backup_stats.size as usize,
> +        elapsed: backup_stats.duration,
> +        removed: None,
> +    });
> +
> +    Ok(stats)
> +}
> +
> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
> +//
> +// For fixed indexes, the size must be provided as given by the index reader.
> +#[allow(clippy::too_many_arguments)]
> +async fn push_index<'a>(
> +    filename: &'a str,
> +    index: impl IndexFile + Send + 'static,
> +    chunk_reader: Arc<dyn AsyncReadChunk>,
> +    backup_writer: &BackupWriter,
> +    manifest: &mut BackupManifest,
> +    crypt_mode: CryptMode,
> +    size: Option<u64>,
> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
> +) -> Result<SyncStats, Error> {
> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
> +    let mut chunk_infos =
> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());

so this iterates over all the chunks in the index..

> +
> +    tokio::spawn(async move {
> +        while let Some(chunk_info) = chunk_infos.next().await {
> +            let chunk_info = chunk_reader
> +                .read_raw_chunk(&chunk_info.digest)

and this reads them

> +                .await
> +                .map(|chunk| ChunkInfo {
> +                    chunk,
> +                    digest: chunk_info.digest,
> +                    chunk_len: chunk_info.size(),
> +                    offset: chunk_info.range.start,
> +                });
> +            let _ = upload_channel_tx.send(chunk_info).await;

and sends them further along to the upload code.. which will then (in
many cases) throw away all that data we just read because it's already
on the target and we know that because of the previous manifest..

wouldn't it be better to deduplicate here already, and instead of
reading known chunks over and over again, just tell the server to
re-register them? or am I missing something here? :)
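
conceptually something like this (the item enum is made up just to illustrate
splitting known from new chunks - the receiving side would need to handle such
items, and `known_chunks` would have to be pre-seeded from the previous
manifest and cloned into the task):

```
// hypothetical item type: either a digest the target can simply re-register,
// or a freshly read chunk that actually needs to be uploaded
enum ChunkItem {
    Known([u8; 32]),
    New(ChunkInfo),
}

tokio::spawn(async move {
    while let Some(chunk_info) = chunk_infos.next().await {
        if known_chunks.lock().unwrap().contains(&chunk_info.digest) {
            // skip the disk read, the chunk is already present on the target
            let _ = upload_channel_tx.send(Ok(ChunkItem::Known(chunk_info.digest))).await;
            continue;
        }
        let item = chunk_reader
            .read_raw_chunk(&chunk_info.digest)
            .await
            .map(|chunk| {
                ChunkItem::New(ChunkInfo {
                    chunk,
                    digest: chunk_info.digest,
                    chunk_len: chunk_info.size(),
                    offset: chunk_info.range.start,
                })
            });
        let _ = upload_channel_tx.send(item).await;
    }
});
```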

> +        }
> +    });
> +
> +    let chunk_info_stream = ReceiverStream::new(upload_channel_rx).map_err(Error::from);
> +
> +    let upload_options = UploadOptions {
> +        compress: true,
> +        encrypt: false,
> +        fixed_size: size,
> +        ..UploadOptions::default()
> +    };
> +
> +    let upload_stats = backup_writer
> +        .upload_index_chunk_info(
> +            filename,
> +            chunk_info_stream,
> +            upload_options,
> +            known_chunks.clone(),
> +        )
> +        .await?;
> +
> +    manifest.add_file(
> +        filename.to_string(),
> +        upload_stats.size,
> +        upload_stats.csum,
> +        crypt_mode,
> +    )?;
> +
> +    Ok(SyncStats {
> +        chunk_count: upload_stats.chunk_count as usize,
> +        bytes: upload_stats.size as usize,
> +        elapsed: upload_stats.duration,
> +        removed: None,
> +    })
> +}
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (32 preceding siblings ...)
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api Christian Ebner
@ 2024-10-10 14:48 ` Fabian Grünbichler
  2024-10-11  7:12   ` Christian Ebner
  2024-10-14 11:04 ` [pbs-devel] partially-applied: " Fabian Grünbichler
  2024-10-17 13:31 ` [pbs-devel] " Christian Ebner
  35 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:48 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

left some comments on individual patches (those that I got around to
anyway, which is roughly up to patch #20), the permissions are still not
quite right, but since those changes are spread over a few patches, I'll
leave the comment for that here in one place (existing pull priv checks
should remain as they are, the following *only* applies to push based
syncing, except maybe the first bit):

UI/UX issues:

- I can create a sync job without having DatastoreAudit, but then I
  don't see it afterwards (this affects pull and push)

usage of helpers and logic in helpers:

- I can see other people's push jobs (where local_user/owner != auth_id)
-- I can't modify them or create such jobs (unless I am highly privileged)
-- I can execute them (even if I am not highly privileged!)

the check_sync_job_remote_datastore_backup_access helper is wrong (it
doesn't account for auth_id vs local_user/owner at all). also, it is not
called when modifying a sync job or creating one, just when executing it
manually, which is probably also wrong. it also has a logic bug (missing
"not" when preparing the remote ACL path).

privileges:

- for pull-syncing, creating/removing namespaces needs PRIV_DATASTORE_MODIFY
- for push-syncing, creating namespaces needs PRIV_REMOTE_DATASTORE_MODIFY
- for push-syncing, removing namespaces needs PRIV_REMOTE_DATASTORE_PRUNE(!)
- manual push requires PRIV_REMOTE_DATASTORE_MODIFY (instead of
  PRIV_REMOTE_DATASTORE_BACKUP)

related code style nit:

since job_user is required for pushing (in
`check_ns_remote_datastore_privs`), it might make sense to not allow
creation of PushParameters without it set, e.g. by changing the TryFrom
impl to convert from (SyncJobConfig, AuthId) instead of just the
config.. or by using a custom helper.
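
e.g. (signature sketch only, the construction itself would stay as it is
today):

// sketch: make the authenticated user part of the conversion input, so
// PushParameters can never be built without a job user
impl TryFrom<(SyncJobConfig, Authid)> for PushParameters {
    type Error = anyhow::Error;

    fn try_from((config, auth_id): (SyncJobConfig, Authid)) -> Result<Self, Self::Error> {
        // ... build the parameters as before, deriving the job user from auth_id ...
        unimplemented!()
    }
}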

On September 12, 2024 4:32 pm, Christian Ebner wrote:
> This patch series implements the functionality to extend the current
> sync jobs in pull direction by an additional push direction, allowing
> to push contents of a local source datastore to a remote target.
> 
> The series implements this by using the REST API of the remote target
> for fetching, creating and/or deleting namespaces, groups and backups,
> and reuses the clients backup writer functionality to create snapshots
> by writing a manifeset on the remote target and sync the fixed index,
> dynamic index or blobs contained in the source manifest to the remote,
> preserving also encryption information.
> 
> Thanks to Fabian for further feedback to the previous version of the
> patches, especially regarding users and ACLs.
> 
> Most notable changes since version 2 of the patch series include:
> - Add checks and extend roles and privs to allow for restricting a local
>   users access to remote datastore operations. In order to perform a
>   full sync in push direction, including permissions for namespace
>   creation and deleting contents with remove vansished, a acl.cfg looks
>   like below:
>   ```
>   acl:1:/datastore/datastore:syncoperator@pbs:DatastoreAudit
>   acl:1:/remote:syncoperator@pbs:RemoteSyncOperator
>   acl:1:/remote/local/pushme:syncoperator@pbs:RemoteDatastoreModify,RemoteDatastorePrune,RemoteSyncPushOperator
>   ```
>   Based on further feedback, privs might get further grouped or an
>   additional role containing most of these can be created.
> - Drop patch introducing `no-timestamp-check` flag for backup client, as pointed
>   out by Fabian this is not needed, as only backups newer than the currently
>   last available will be pushed.
> - Fix read snapshots from source by using the correct namespace.
> - Rename PullParameters `owner` to more fitting `local_user`.
> - Fix typos in remote sync push operator comment.
> - Fix comments not matching the functionality for the cli implementations.
> 
> The patch series is structured as follows in this version:
> - patch 1 is a cleanup patch fixing typos in api documentation.
> - patches 2 to 7 are patches restructuring the current code so that
>   functionality of the current pull implementation can be reused for
>   the push implementation as well.
> - patch 8 extens the backup writers functionality to be able to push
>   snapshots to the target.
> - patches 9 to 11 are once again preparatory patches for shared
>   implementation of sync jobs in pull and push direction.
> - patches 12 to 14 define the required permission acls and roles.
> - patch 15 implements almost all of the logic required for the push,
>   including pushing of the datastore, namespace, groups and snapshots,
>   taking into account also filters and additional sync flags.
> - patch 16 extends the current sync job configuration by a new config
>   type `sync-push` allowing to configure sync jobs in push direction
>   while limiting possible misconfiguration errors.
> - patches 17 to 28 expose the new sync job direction via the API, CLI
>   and WebUI.
> - patches 29 to 33 finally are followup patches, changing the return
>   type for the backup group and namespace delete REST API endpoints
>   to return statistics on the deleted snapshots, groups and namespaces,
>   which are then used to include this information in the task log.
>   As this is an API breaking change, the patches are kept independent
>   from the other patches.
> 
> Link to issue on bugtracker:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3044
> 
> Christian Ebner (33):
>   api: datastore: add missing whitespace in description
>   server: sync: move sync related stats to common module
>   server: sync: move reader trait to common sync module
>   server: sync: move source to common sync module
>   client: backup writer: bundle upload stats counters
>   client: backup writer: factor out merged chunk stream upload
>   client: backup writer: add chunk count and duration stats
>   client: backup writer: allow push uploading index and chunks
>   server: sync: move skip info/reason to common sync module
>   server: sync: make skip reason message more genenric
>   server: sync: factor out namespace depth check into sync module
>   config: acl: mention optional namespace acl path component
>   config: acl: allow namespace components for remote datastores
>   api types: define remote permissions and roles for push sync
>   fix #3044: server: implement push support for sync operations
>   config: jobs: add `sync-push` config type for push sync jobs
>   api: push: implement endpoint for sync in push direction
>   api: sync: move sync job invocation to server sync module
>   api: sync jobs: expose optional `sync-direction` parameter
>   api: sync: add permission checks for push sync jobs
>   bin: manager: add datastore push cli command
>   ui: group filter: allow to set namespace for local datastore
>   ui: sync edit: source group filters based on sync direction
>   ui: add view with separate grids for pull and push sync jobs
>   ui: sync job: adapt edit window to be used for pull and push
>   ui: sync: pass sync-direction to allow removing push jobs
>   ui: sync view: do not use data model proxy for store
>   ui: sync view: set sync direction when invoking run task via api
>   datastore: move `BackupGroupDeleteStats` to api types
>   api types: implement api type for `BackupGroupDeleteStats`
>   datastore: increment deleted group counter when removing group
>   api: datastore/namespace: return backup groups delete stats on remove
>   server: sync job: use delete stats provided by the api
> 
>  pbs-api-types/src/acl.rs             |  32 +
>  pbs-api-types/src/datastore.rs       |  64 ++
>  pbs-api-types/src/jobs.rs            |  52 ++
>  pbs-client/src/backup_writer.rs      | 228 +++++--
>  pbs-config/src/acl.rs                |   7 +-
>  pbs-config/src/sync.rs               |  11 +-
>  pbs-datastore/src/backup_info.rs     |  34 +-
>  pbs-datastore/src/datastore.rs       |  27 +-
>  src/api2/admin/datastore.rs          |  24 +-
>  src/api2/admin/namespace.rs          |  20 +-
>  src/api2/admin/sync.rs               |  45 +-
>  src/api2/config/datastore.rs         |  22 +-
>  src/api2/config/notifications/mod.rs |  15 +-
>  src/api2/config/sync.rs              |  84 ++-
>  src/api2/mod.rs                      |   2 +
>  src/api2/pull.rs                     | 108 ----
>  src/api2/push.rs                     | 182 ++++++
>  src/bin/proxmox-backup-manager.rs    | 216 +++++--
>  src/bin/proxmox-backup-proxy.rs      |  25 +-
>  src/server/mod.rs                    |   3 +
>  src/server/pull.rs                   | 658 ++------------------
>  src/server/push.rs                   | 883 +++++++++++++++++++++++++++
>  src/server/sync.rs                   | 700 +++++++++++++++++++++
>  www/Makefile                         |   1 +
>  www/config/SyncPullPushView.js       |  60 ++
>  www/config/SyncView.js               |  47 +-
>  www/datastore/DataStoreList.js       |   2 +-
>  www/datastore/Panel.js               |   2 +-
>  www/form/GroupFilter.js              |  18 +-
>  www/window/SyncJobEdit.js            |  45 +-
>  30 files changed, 2706 insertions(+), 911 deletions(-)
>  create mode 100644 src/api2/push.rs
>  create mode 100644 src/server/push.rs
>  create mode 100644 src/server/sync.rs
>  create mode 100644 www/config/SyncPullPushView.js
> 
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter Christian Ebner
@ 2024-10-10 14:48   ` Fabian Grünbichler
  2024-10-14  8:10     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:48 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> Exposes and switch the config type for sync job operations based
> on the `sync-direction` parameter. If not set, the default config
> type is `sync` and the default sync direction is `pull` for full
> backwards compatibility.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  src/api2/admin/sync.rs               | 28 +++++++++------
>  src/api2/config/datastore.rs         | 22 +++++++++---
>  src/api2/config/notifications/mod.rs | 15 ++++++--
>  src/api2/config/sync.rs              | 53 +++++++++++++++++++++++-----
>  src/bin/proxmox-backup-proxy.rs      | 12 +++++--
>  5 files changed, 101 insertions(+), 29 deletions(-)
> 
> diff --git a/src/api2/admin/sync.rs b/src/api2/admin/sync.rs
> index be324564c..bdbc06a8e 100644
> --- a/src/api2/admin/sync.rs
> +++ b/src/api2/admin/sync.rs
> @@ -12,6 +12,7 @@ use proxmox_sortable_macro::sortable;
>  
>  use pbs_api_types::{
>      Authid, SyncDirection, SyncJobConfig, SyncJobStatus, DATASTORE_SCHEMA, JOB_ID_SCHEMA,
> +    SYNC_DIRECTION_SCHEMA,
>  };
>  use pbs_config::sync;
>  use pbs_config::CachedUserInfo;
> @@ -29,6 +30,10 @@ use crate::{
>                  schema: DATASTORE_SCHEMA,
>                  optional: true,
>              },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>          },
>      },
>      returns: {
> @@ -44,6 +49,7 @@ use crate::{
>  /// List all sync jobs
>  pub fn list_sync_jobs(
>      store: Option<String>,
> +    sync_direction: Option<SyncDirection>,

would be much nicer if the default were already encoded in the API
schema
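
e.g. something like this (assuming StringSchema supports a default value
like the other schema types do):

pub const SYNC_DIRECTION_SCHEMA: Schema = StringSchema::new("Sync job direction (pull|push)")
    .format(&ApiStringFormat::Pattern(&SYNC_DIRECTION_REGEX))
    .default("pull")
    .schema();

whether the handler then still needs to unwrap an Option depends on how
the parameter is declared, but at least the documented default and the
code can't drift apart.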

>      _param: Value,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<Vec<SyncJobStatus>, Error> {
> @@ -51,9 +57,10 @@ pub fn list_sync_jobs(
>      let user_info = CachedUserInfo::new()?;
>  
>      let (config, digest) = sync::config()?;
> +    let sync_direction = sync_direction.unwrap_or_default();

instead of unwrapping here..

>  
>      let job_config_iter = config
> -        .convert_to_typed_array("sync")?
> +        .convert_to_typed_array(sync_direction.as_config_type_str())?
>          .into_iter()
>          .filter(|job: &SyncJobConfig| {
>              if let Some(store) = &store {
> @@ -88,7 +95,11 @@ pub fn list_sync_jobs(
>          properties: {
>              id: {
>                  schema: JOB_ID_SCHEMA,
> -            }
> +            },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>          }
>      },
>      access: {
> @@ -99,6 +110,7 @@ pub fn list_sync_jobs(
>  /// Runs the sync jobs manually.
>  pub fn run_sync_job(
>      id: String,
> +    sync_direction: Option<SyncDirection>,
>      _info: &ApiMethod,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<String, Error> {
> @@ -106,7 +118,8 @@ pub fn run_sync_job(
>      let user_info = CachedUserInfo::new()?;
>  
>      let (config, _digest) = sync::config()?;
> -    let sync_job: SyncJobConfig = config.lookup("sync", &id)?;
> +    let sync_direction = sync_direction.unwrap_or_default();

same here

> +    let sync_job: SyncJobConfig = config.lookup(sync_direction.as_config_type_str(), &id)?;
>  
>      if !check_sync_job_modify_access(&user_info, &auth_id, &sync_job) {
>          bail!("permission check failed");
> @@ -116,14 +129,7 @@ pub fn run_sync_job(
>  
>      let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
>  
> -    let upid_str = do_sync_job(
> -        job,
> -        sync_job,
> -        &auth_id,
> -        None,
> -        SyncDirection::Pull,
> -        to_stdout,
> -    )?;
> +    let upid_str = do_sync_job(job, sync_job, &auth_id, None, sync_direction, to_stdout)?;
>  
>      Ok(upid_str)
>  }
> diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
> index ca6edf05a..a01d26cad 100644
> --- a/src/api2/config/datastore.rs
> +++ b/src/api2/config/datastore.rs
> @@ -13,8 +13,9 @@ use proxmox_uuid::Uuid;
>  
>  use pbs_api_types::{
>      Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning, KeepOptions,
> -    MaintenanceMode, PruneJobConfig, PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE,
> -    PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
> +    MaintenanceMode, PruneJobConfig, PruneJobOptions, SyncDirection, DATASTORE_SCHEMA,
> +    PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY,
> +    PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
>  };
>  use pbs_config::BackupLockGuard;
>  use pbs_datastore::chunk_store::ChunkStore;
> @@ -498,8 +499,21 @@ pub async fn delete_datastore(
>          for job in list_verification_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>              delete_verification_job(job.config.id, None, rpcenv)?
>          }
> -        for job in list_sync_jobs(Some(name.clone()), Value::Null, rpcenv)? {
> -            delete_sync_job(job.config.id, None, rpcenv)?
> +        for job in list_sync_jobs(
> +            Some(name.clone()),
> +            Some(SyncDirection::Pull),
> +            Value::Null,
> +            rpcenv,
> +        )? {
> +            delete_sync_job(job.config.id, Some(SyncDirection::Pull), None, rpcenv)?
> +        }
> +        for job in list_sync_jobs(
> +            Some(name.clone()),
> +            Some(SyncDirection::Push),
> +            Value::Null,
> +            rpcenv,
> +        )? {
> +            delete_sync_job(job.config.id, Some(SyncDirection::Push), None, rpcenv)?

this looks a bit weird, but I guess it's a side-effect we have to live
with if we want to separate both types of sync jobs somewhat.. could
still be a nested loop though for brevity?

for direction in .. {
    for job in list_sync_jobs(.. , direction, ..)? {
        delete_sync_job(.. , direction, ..)?;
    }
}

>          }
>          for job in list_prune_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>              delete_prune_job(job.config.id, None, rpcenv)?
> diff --git a/src/api2/config/notifications/mod.rs b/src/api2/config/notifications/mod.rs
> index dfe82ed03..9622d43ee 100644
> --- a/src/api2/config/notifications/mod.rs
> +++ b/src/api2/config/notifications/mod.rs
> @@ -9,7 +9,7 @@ use proxmox_schema::api;
>  use proxmox_sortable_macro::sortable;
>  
>  use crate::api2::admin::datastore::get_datastore_list;
> -use pbs_api_types::PRIV_SYS_AUDIT;
> +use pbs_api_types::{SyncDirection, PRIV_SYS_AUDIT};
>  
>  use crate::api2::admin::prune::list_prune_jobs;
>  use crate::api2::admin::sync::list_sync_jobs;
> @@ -154,8 +154,16 @@ pub fn get_values(
>          });
>      }
>  
> -    let sync_jobs = list_sync_jobs(None, param.clone(), rpcenv)?;
> -    for job in sync_jobs {
> +    let sync_jobs_pull = list_sync_jobs(None, Some(SyncDirection::Pull), param.clone(), rpcenv)?;
> +    for job in sync_jobs_pull {
> +        values.push(MatchableValue {
> +            field: "job-id".into(),
> +            value: job.config.id,
> +            comment: job.config.comment,
> +        });
> +    }
> +    let sync_jobs_push = list_sync_jobs(None, Some(SyncDirection::Push), param.clone(), rpcenv)?;
> +    for job in sync_jobs_push {

here as well? or alternatively, add a third SyncDirection variant Any,
but not sure if it's worth it just for those two list_sync_jobs
functions (btw, one of those might benefit from being renamed while we
are at it..).
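
e.g., similar to the nested loop above:

// sketch: cover both directions without duplicating the push/pull blocks
for direction in [SyncDirection::Pull, SyncDirection::Push] {
    for job in list_sync_jobs(None, Some(direction), param.clone(), rpcenv)? {
        values.push(MatchableValue {
            field: "job-id".into(),
            value: job.config.id,
            comment: job.config.comment,
        });
    }
}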

>          values.push(MatchableValue {
>              field: "job-id".into(),
>              value: job.config.id,
> @@ -184,6 +192,7 @@ pub fn get_values(
>          "package-updates",
>          "prune",
>          "sync",
> +        "sync-push",
>          "system-mail",
>          "tape-backup",
>          "tape-load",
> diff --git a/src/api2/config/sync.rs b/src/api2/config/sync.rs
> index 6fdc69a9e..a21e0bd6f 100644
> --- a/src/api2/config/sync.rs
> +++ b/src/api2/config/sync.rs
> @@ -1,6 +1,7 @@
>  use ::serde::{Deserialize, Serialize};
>  use anyhow::{bail, Error};
>  use hex::FromHex;
> +use pbs_api_types::SyncDirection;
>  use serde_json::Value;
>  
>  use proxmox_router::{http_bail, Permission, Router, RpcEnvironment};
> @@ -9,7 +10,7 @@ use proxmox_schema::{api, param_bail};
>  use pbs_api_types::{
>      Authid, SyncJobConfig, SyncJobConfigUpdater, JOB_ID_SCHEMA, PRIV_DATASTORE_AUDIT,
>      PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE, PRIV_REMOTE_AUDIT,
> -    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA,
> +    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA, SYNC_DIRECTION_SCHEMA,
>  };
>  use pbs_config::sync;
>  
> @@ -77,7 +78,12 @@ pub fn check_sync_job_modify_access(
>  
>  #[api(
>      input: {
> -        properties: {},
> +        properties: {
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
> +        },
>      },
>      returns: {
>          description: "List configured jobs.",
> @@ -92,6 +98,7 @@ pub fn check_sync_job_modify_access(
>  /// List all sync jobs
>  pub fn list_sync_jobs(
>      _param: Value,
> +    sync_direction: Option<SyncDirection>,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<Vec<SyncJobConfig>, Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
> @@ -99,7 +106,8 @@ pub fn list_sync_jobs(
>  
>      let (config, digest) = sync::config()?;
>  
> -    let list = config.convert_to_typed_array("sync")?;
> +    let sync_direction = sync_direction.unwrap_or_default();

this unwrap_or_default would also be better off being encoded in the
schema..

> +    let list = config.convert_to_typed_array(sync_direction.as_config_type_str())?;
>  
>      rpcenv["digest"] = hex::encode(digest).into();
>  
> @@ -118,6 +126,10 @@ pub fn list_sync_jobs(
>                  type: SyncJobConfig,
>                  flatten: true,
>              },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>          },
>      },
>      access: {
> @@ -128,6 +140,7 @@ pub fn list_sync_jobs(
>  /// Create a new sync job.
>  pub fn create_sync_job(
>      config: SyncJobConfig,
> +    sync_direction: Option<SyncDirection>,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<(), Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
> @@ -158,7 +171,8 @@ pub fn create_sync_job(
>          param_bail!("id", "job '{}' already exists.", config.id);
>      }
>  
> -    section_config.set_data(&config.id, "sync", &config)?;
> +    let sync_direction = sync_direction.unwrap_or_default();

same here

> +    section_config.set_data(&config.id, sync_direction.as_config_type_str(), &config)?;
>  
>      sync::save_config(&section_config)?;
>  
> @@ -173,6 +187,10 @@ pub fn create_sync_job(
>              id: {
>                  schema: JOB_ID_SCHEMA,
>              },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>          },
>      },
>      returns: { type: SyncJobConfig },
> @@ -182,13 +200,18 @@ pub fn create_sync_job(
>      },
>  )]
>  /// Read a sync job configuration.
> -pub fn read_sync_job(id: String, rpcenv: &mut dyn RpcEnvironment) -> Result<SyncJobConfig, Error> {
> +pub fn read_sync_job(
> +    id: String,
> +    sync_direction: Option<SyncDirection>,
> +    rpcenv: &mut dyn RpcEnvironment,
> +) -> Result<SyncJobConfig, Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>      let user_info = CachedUserInfo::new()?;
>  
>      let (config, digest) = sync::config()?;
>  
> -    let sync_job = config.lookup("sync", &id)?;
> +    let sync_direction = sync_direction.unwrap_or_default();

and here

> +    let sync_job = config.lookup(sync_direction.as_config_type_str(), &id)?;
>      if !check_sync_job_read_access(&user_info, &auth_id, &sync_job) {
>          bail!("permission check failed");
>      }
> @@ -252,6 +275,10 @@ pub enum DeletableProperty {
>                      type: DeletableProperty,
>                  }
>              },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>              digest: {
>                  optional: true,
>                  schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
> @@ -269,6 +296,7 @@ pub fn update_sync_job(
>      id: String,
>      update: SyncJobConfigUpdater,
>      delete: Option<Vec<DeletableProperty>>,
> +    sync_direction: Option<SyncDirection>,
>      digest: Option<String>,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<(), Error> {
> @@ -284,7 +312,8 @@ pub fn update_sync_job(
>          crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
>      }
>  
> -    let mut data: SyncJobConfig = config.lookup("sync", &id)?;
> +    let sync_direction = sync_direction.unwrap_or_default();

and here

> +    let mut data: SyncJobConfig = config.lookup(sync_direction.as_config_type_str(), &id)?;
>  
>      if let Some(delete) = delete {
>          for delete_prop in delete {
> @@ -409,7 +438,7 @@ pub fn update_sync_job(
>          bail!("permission check failed");
>      }
>  
> -    config.set_data(&id, "sync", &data)?;
> +    config.set_data(&id, sync_direction.as_config_type_str(), &data)?;
>  
>      sync::save_config(&config)?;
>  
> @@ -427,6 +456,10 @@ pub fn update_sync_job(
>              id: {
>                  schema: JOB_ID_SCHEMA,
>              },
> +            "sync-direction": {
> +                schema: SYNC_DIRECTION_SCHEMA,
> +                optional: true,
> +            },
>              digest: {
>                  optional: true,
>                  schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
> @@ -441,6 +474,7 @@ pub fn update_sync_job(
>  /// Remove a sync job configuration
>  pub fn delete_sync_job(
>      id: String,
> +    sync_direction: Option<SyncDirection>,
>      digest: Option<String>,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<(), Error> {
> @@ -456,7 +490,8 @@ pub fn delete_sync_job(
>          crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
>      }
>  
> -    match config.lookup("sync", &id) {
> +    let sync_direction = sync_direction.unwrap_or_default();

and here

> +    match config.lookup(sync_direction.as_config_type_str(), &id) {
>          Ok(job) => {
>              if !check_sync_job_modify_access(&user_info, &auth_id, &job) {
>                  bail!("permission check failed");
> diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
> index 4409234b2..2b6f1c133 100644
> --- a/src/bin/proxmox-backup-proxy.rs
> +++ b/src/bin/proxmox-backup-proxy.rs
> @@ -608,7 +608,15 @@ async fn schedule_datastore_sync_jobs() {
>          Ok((config, _digest)) => config,
>      };
>  
> -    for (job_id, (_, job_config)) in config.sections {
> +    for (job_id, (job_type, job_config)) in config.sections {
> +        let sync_direction = match job_type.as_str() {
> +            "sync" => SyncDirection::Pull,
> +            "sync-push" => SyncDirection::Push,
> +            _ => {
> +                eprintln!("unexpected config type in sync job config - {job_type}");
> +                continue;
> +            }
> +        };

can this even happen? we don't allow unknown section types in the
SyncJobConfig.. arguably, this should have used the `FromStr`
implementation, and might be an argument for keeping it around instead
of dropping it ;)
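
or, if the FromStr impl stays tied to "pull"/"push", a small counterpart
to as_config_type_str would at least keep the section type mapping in one
place (made-up helper name):

impl SyncDirection {
    // hypothetical inverse of as_config_type_str
    pub fn from_config_type_str(config_type: &str) -> Result<Self, anyhow::Error> {
        match config_type {
            "sync" => Ok(SyncDirection::Pull),
            "sync-push" => Ok(SyncDirection::Push),
            _ => bail!("invalid sync job config type '{config_type}'"),
        }
    }
}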

>          let job_config: SyncJobConfig = match serde_json::from_value(job_config) {
>              Ok(c) => c,
>              Err(err) => {
> @@ -635,7 +643,7 @@ async fn schedule_datastore_sync_jobs() {
>                  job_config,
>                  &auth_id,
>                  Some(event_str),
> -                SyncDirection::Pull,
> +                sync_direction,
>                  false,
>              ) {
>                  eprintln!("unable to start datastore sync job {job_id} - {err}");
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs Christian Ebner
@ 2024-10-10 14:48   ` Fabian Grünbichler
  2024-10-14  8:16     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:48 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> In order for sync jobs to be either pull or push jobs, allow to
> configure the direction of the job.
> 
> Adds an additional config type `sync-push` to the sync job config, to
> clearly distinguish sync jobs configured in pull and in push
> direction.
> 
> This approach was chosen in order to limit possible misconfiguration,
> as unintentionally switching the sync direction could potentially
> delete still required snapshots.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  pbs-api-types/src/jobs.rs | 52 +++++++++++++++++++++++++++++++++++++++
>  pbs-config/src/sync.rs    | 11 +++++++--
>  2 files changed, 61 insertions(+), 2 deletions(-)
> 
> diff --git a/pbs-api-types/src/jobs.rs b/pbs-api-types/src/jobs.rs
> index 868702bc0..12b39782c 100644
> --- a/pbs-api-types/src/jobs.rs
> +++ b/pbs-api-types/src/jobs.rs
> @@ -20,6 +20,8 @@ const_regex! {
>      pub VERIFICATION_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"):");
>      /// Regex for sync jobs '(REMOTE|\-):REMOTE_DATASTORE:LOCAL_DATASTORE:(?:LOCAL_NS_ANCHOR:)ACTUAL_JOB_ID'
>      pub SYNC_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"|\-):(", PROXMOX_SAFE_ID_REGEX_STR, r"):(", PROXMOX_SAFE_ID_REGEX_STR, r")(?::(", BACKUP_NS_RE, r"))?:");
> +    /// Regex for sync direction'(pull|push)'
> +    pub SYNC_DIRECTION_REGEX = r"^(pull|push)$";
>  }
>  
>  pub const JOB_ID_SCHEMA: Schema = StringSchema::new("Job ID.")
> @@ -498,6 +500,56 @@ pub const TRANSFER_LAST_SCHEMA: Schema =
>          .minimum(1)
>          .schema();
>  
> +pub const SYNC_DIRECTION_SCHEMA: Schema = StringSchema::new("Sync job direction (pull|push)")
> +    .format(&ApiStringFormat::Pattern(&SYNC_DIRECTION_REGEX))
> +    .schema();
> +
> +/// Direction of the sync job, push or pull
> +#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, UpdaterType)]
> +pub enum SyncDirection {
> +    #[default]
> +    Pull,
> +    Push,
> +}
> +
> +impl ApiType for SyncDirection {
> +    const API_SCHEMA: Schema = SYNC_DIRECTION_SCHEMA;
> +}
> +
> +// used for serialization using `proxmox_serde::forward_serialize_to_display` macro
> +impl std::fmt::Display for SyncDirection {
> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
> +        match self {
> +            SyncDirection::Pull => f.write_str("pull"),
> +            SyncDirection::Push => f.write_str("push"),
> +        }
> +    }
> +}
> +
> +impl std::str::FromStr for SyncDirection {
> +    type Err = anyhow::Error;
> +
> +    fn from_str(s: &str) -> Result<Self, Self::Err> {
> +        match s {
> +            "pull" => Ok(SyncDirection::Pull),
> +            "push" => Ok(SyncDirection::Push),
> +            _ => bail!("invalid sync direction"),
> +        }
> +    }
> +}
> +
> +proxmox_serde::forward_deserialize_to_from_str!(SyncDirection);
> +proxmox_serde::forward_serialize_to_display!(SyncDirection);

wouldn't just implementing Display, and deriving Serialize/Deserialize
be enough?
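
i.e. something like this (sketch - the rename_all attribute has to match
the current on-the-wire representation, and the Display impl stays for
log/worker-id formatting):

/// Direction of the sync job, push or pull
#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, UpdaterType,
    serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum SyncDirection {
    #[default]
    Pull,
    Push,
}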

> +
> +impl SyncDirection {
> +    pub fn as_config_type_str(&self) -> &'static str {
> +        match self {
> +            SyncDirection::Pull => "sync",
> +            SyncDirection::Push => "sync-push",
> +        }
> +    }
> +}
> +
>  #[api(
>      properties: {
>          id: {
> diff --git a/pbs-config/src/sync.rs b/pbs-config/src/sync.rs
> index 45453abb1..143b73e78 100644
> --- a/pbs-config/src/sync.rs
> +++ b/pbs-config/src/sync.rs
> @@ -18,9 +18,16 @@ fn init() -> SectionConfig {
>          _ => unreachable!(),
>      };
>  
> -    let plugin = SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
> +    let pull_plugin =
> +        SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
> +    let push_plugin = SectionConfigPlugin::new(
> +        "sync-push".to_string(),

these here should probably use SyncDirection's as_config_type_str to
make the connection obvious when adapting that at some point in the
future?
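
e.g. (with `use pbs_api_types::SyncDirection;` added):

let pull_plugin = SectionConfigPlugin::new(
    SyncDirection::Pull.as_config_type_str().to_string(),
    Some(String::from("id")),
    obj_schema,
);
let push_plugin = SectionConfigPlugin::new(
    SyncDirection::Push.as_config_type_str().to_string(),
    Some(String::from("id")),
    obj_schema,
);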

> +        Some(String::from("id")),
> +        obj_schema,
> +    );
>      let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
> -    config.register_plugin(plugin);
> +    config.register_plugin(pull_plugin);
> +    config.register_plugin(push_plugin);
>  
>      config
>  }
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters
  2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters Christian Ebner
@ 2024-10-10 14:49   ` Fabian Grünbichler
  0 siblings, 0 replies; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:49 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

high level nit: I wonder whether it would make sense to refactor/merge
BackupStats, UploadStats and UploadStatsCounter and add some `fn`s
(e.g., always-inline fns to add to certain counters) and move the lot to
their own module?

e.g., you could then do something like

counters.to_stats(csum, duration)

to get the UploadStats, abstract away all the Atomics, get rid of a lot
of Arc cloning boilerplate, ..
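
rough sketch of what I have in mind (based on the UploadStats fields
further down):

impl UploadStatsCounters {
    // collapse the atomic counters into the final stats struct
    fn to_stats(&self, csum: [u8; 32], duration: std::time::Duration) -> UploadStats {
        UploadStats {
            chunk_count: self.total_chunks.load(Ordering::SeqCst),
            chunk_reused: self.known_chunk_count.load(Ordering::SeqCst),
            chunk_injected: self.injected_chunk_count.load(Ordering::SeqCst),
            size: self.stream_len.load(Ordering::SeqCst),
            size_reused: self.reused_len.load(Ordering::SeqCst),
            size_injected: self.injected_len.load(Ordering::SeqCst),
            size_compressed: self.compressed_stream_len.load(Ordering::SeqCst) as usize,
            duration,
            csum,
        }
    }
}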

On September 12, 2024 4:32 pm, Christian Ebner wrote:
> In preparation for push support in sync jobs.
> 
> Introduce `UploadStatsCounters` struct to hold the Arc clones of the
> chunk upload statistics counters. By bundling them into the struct,
> they can be passed as single function parameter when factoring out
> the common stream future implementation in the subsequent
> implementation of the chunk upload for push support in sync jobs.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  pbs-client/src/backup_writer.rs | 52 ++++++++++++++++++---------------
>  1 file changed, 28 insertions(+), 24 deletions(-)
> 
> diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
> index d63c09b5a..34ac47beb 100644
> --- a/pbs-client/src/backup_writer.rs
> +++ b/pbs-client/src/backup_writer.rs
> @@ -65,6 +65,16 @@ struct UploadStats {
>      csum: [u8; 32],
>  }
>  
> +struct UploadStatsCounters {
> +    injected_chunk_count: Arc<AtomicUsize>,
> +    known_chunk_count: Arc<AtomicUsize>,
> +    total_chunks: Arc<AtomicUsize>,
> +    compressed_stream_len: Arc<AtomicU64>,
> +    injected_len: Arc<AtomicUsize>,
> +    reused_len: Arc<AtomicUsize>,
> +    stream_len: Arc<AtomicUsize>,
> +}
> +
>  type UploadQueueSender = mpsc::Sender<(MergedChunkInfo, Option<h2::client::ResponseFuture>)>;
>  type UploadResultReceiver = oneshot::Receiver<Result<(), Error>>;
>  
> @@ -638,20 +648,23 @@ impl BackupWriter {
>          injections: Option<std::sync::mpsc::Receiver<InjectChunks>>,
>      ) -> impl Future<Output = Result<UploadStats, Error>> {
>          let total_chunks = Arc::new(AtomicUsize::new(0));
> -        let total_chunks2 = total_chunks.clone();
>          let known_chunk_count = Arc::new(AtomicUsize::new(0));
> -        let known_chunk_count2 = known_chunk_count.clone();
>          let injected_chunk_count = Arc::new(AtomicUsize::new(0));
> -        let injected_chunk_count2 = injected_chunk_count.clone();
>  
>          let stream_len = Arc::new(AtomicUsize::new(0));
> -        let stream_len2 = stream_len.clone();
>          let compressed_stream_len = Arc::new(AtomicU64::new(0));
> -        let compressed_stream_len2 = compressed_stream_len.clone();
>          let reused_len = Arc::new(AtomicUsize::new(0));
> -        let reused_len2 = reused_len.clone();
>          let injected_len = Arc::new(AtomicUsize::new(0));
> -        let injected_len2 = injected_len.clone();
> +
> +        let counters = UploadStatsCounters {
> +            injected_chunk_count: injected_chunk_count.clone(),
> +            known_chunk_count: known_chunk_count.clone(),
> +            total_chunks: total_chunks.clone(),
> +            compressed_stream_len: compressed_stream_len.clone(),
> +            injected_len: injected_len.clone(),
> +            reused_len: reused_len.clone(),
> +            stream_len: stream_len.clone(),
> +        };
>  
>          let append_chunk_path = format!("{}_index", prefix);
>          let upload_chunk_path = format!("{}_chunk", prefix);
> @@ -794,27 +807,18 @@ impl BackupWriter {
>              })
>              .then(move |result| async move { upload_result.await?.and(result) }.boxed())
>              .and_then(move |_| {
> -                let duration = start_time.elapsed();
> -                let chunk_count = total_chunks2.load(Ordering::SeqCst);
> -                let chunk_reused = known_chunk_count2.load(Ordering::SeqCst);
> -                let chunk_injected = injected_chunk_count2.load(Ordering::SeqCst);
> -                let size = stream_len2.load(Ordering::SeqCst);
> -                let size_reused = reused_len2.load(Ordering::SeqCst);
> -                let size_injected = injected_len2.load(Ordering::SeqCst);
> -                let size_compressed = compressed_stream_len2.load(Ordering::SeqCst) as usize;
> -
>                  let mut guard = index_csum_2.lock().unwrap();
>                  let csum = guard.take().unwrap().finish();
>  
>                  futures::future::ok(UploadStats {
> -                    chunk_count,
> -                    chunk_reused,
> -                    chunk_injected,
> -                    size,
> -                    size_reused,
> -                    size_injected,
> -                    size_compressed,
> -                    duration,
> +                    chunk_count: counters.total_chunks.load(Ordering::SeqCst),
> +                    chunk_reused: counters.known_chunk_count.load(Ordering::SeqCst),
> +                    chunk_injected: counters.injected_chunk_count.load(Ordering::SeqCst),
> +                    size: counters.stream_len.load(Ordering::SeqCst),
> +                    size_reused: counters.reused_len.load(Ordering::SeqCst),
> +                    size_injected: counters.injected_len.load(Ordering::SeqCst),
> +                    size_compressed: counters.compressed_stream_len.load(Ordering::SeqCst) as usize,
> +                    duration: start_time.elapsed(),
>                      csum,
>                  })
>              })
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores Christian Ebner
@ 2024-10-10 14:49   ` Fabian Grünbichler
  2024-10-14  8:18     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-10 14:49 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> Extend the component limit for ACL paths of `remote` to include
> possible namespace components.
> 
> This allows to limit the permissions for sync jobs in push direction
> to a namespace subset on the remote datastore.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - not present in previous version
> 
>  pbs-config/src/acl.rs | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/pbs-config/src/acl.rs b/pbs-config/src/acl.rs
> index 6b6500f34..5177e22f0 100644
> --- a/pbs-config/src/acl.rs
> +++ b/pbs-config/src/acl.rs
> @@ -89,10 +89,13 @@ pub fn check_acl_path(path: &str) -> Result<(), Error> {
>              }
>          }
>          "remote" => {
> -            // /remote/{remote}/{store}
> +            // /remote/{remote}/{store}/{namespace}
>              if components_len <= 3 {
>                  return Ok(());
>              }
> +            if components_len > 3 && components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH {
> +                return Ok(());
> +            }

these two ifs can just be combined into a single one with

components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH

as condition. the same applies to the corresponding variant shifted by 1
for local datastores/namespaces.
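
i.e.:

"remote" => {
    // /remote/{remote}/{store}/{namespace}
    if components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH {
        return Ok(());
    }
}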

>          }
>          "system" => {
>              if components_len == 1 {
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
  2024-10-10 14:48 ` [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Fabian Grünbichler
@ 2024-10-11  7:12   ` Christian Ebner
  2024-10-11  7:51     ` Fabian Grünbichler
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-10-11  7:12 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

Thanks for tackling the review of these patches, let me look into your 
comments and come back to you with further questions when they arise 
during reworking of the series.

I think the main confusion here stems from different interpretations of 
what the `local_user` should be (as in which user's permissions to 
check). I'll try to clarify inline:

On 10/10/24 16:48, Fabian Grünbichler wrote:
> left some comments on individual patches (those that I got around to
> anyway, which is roughly up to patch #20), the permissions are still not
> quite right, but since those changes are spread over a few patches, I'll
> leave the comment for that here in one place (existing pull priv checks
> should remain as they are, the following *only* applies to push based
> syncing, except maybe the first bit):
> 
> UI/UX issues:
> 
> - I can create a sync job without having DatastoreAudit, but then I
>    don't see it afterwards (this affects pull and push)

This is a bit strange, but will see to fix it for both.

> usage of helpers and logic in helpers:
> 
> - I can see other people's push jobs (where local_user/owner != auth_id)

Local user is the one whose permissions decide which source 
snapshots/groups/namespaces are visible. I did not intend that to be 
used to define permissions for the job itself. The intention was to use 
the authenticated user for that exclusively.

> -- I can't modify them or create such jobs (unless I am highly privileged)

As you can set the sync job's local user (defining which sources can be 
read) this has to be highly privileged.

> -- I can execute them (even if I am not highly privileged!)

So that a highly privileged user can create a job for you, but you are 
only allowed to execute it.

> the check_sync_job_remote_datastore_backup_access helper is wrong (it
> doesn't account for auth_id vs local_user/owner at all). also, it is not
> called when modifying a sync job or creating one, just when executing it
> manually, which is probably also wrong. it also has a logic bug (missing
> "not" when preparing the remote ACL path).

Okay, I need to look closer at this again, but I guess there is once 
again a difference in interpretation of what the local user is/should 
be. So that needs clarification first.

> privileges:
> 
> - for pull-syncing, creating/removing namespaces needs PRIV_DATASTORE_MODIFY
> - for push-syncing, creating namespaces needs PRIV_REMOTE_DATASTORE_MODIFY
> - for push-syncing, removing namespaces needs PRIV_REMOTE_DATASTORE_PRUNE(!)
> - manual push requires PRIV_REMOTE_DATASTORE_MODIFY (instead of
>    PRIV_REMOTE_DATASTORE_BACKUP)

Okay, will double check and adapt.

> 
> related code style nit:
> 
> since job_user is required for pushing (in
> `check_ns_remote_datastore_privs`), it might make sense to not allow
> creation of PushParameters without it set, e.g. by changing the TryFrom
> impl to convert from (SyncJobConfig, AuthId) instead of just the
> config.. or by using a custom helper.
> 

OK, have a look at this as well.

Thanks!



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
  2024-10-11  7:12   ` Christian Ebner
@ 2024-10-11  7:51     ` Fabian Grünbichler
  0 siblings, 0 replies; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-11  7:51 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion

> Christian Ebner <c.ebner@proxmox.com> hat am 11.10.2024 09:12 CEST geschrieben:  
> Thanks for tackling the review of these patches, let me look into your 
> comments and come back to you with further questions when they arise 
> during reworking of the series.

I'll try to get around to the missing parts today!

> I think the main confusion here stems from different interpretations of 
> what the `local_user` should be (as in which user's permissions to 
> check). I'll try to clarify inline:
> 
> On 10/10/24 16:48, Fabian Grünbichler wrote:
> > left some comments on individual patches (those that I got around to
> > anyway, which is roughly up to patch #20), the permissions are still not
> > quite right, but since those changes are spread over a few patches, I'll
> > leave the comment for that here in one place (existing pull priv checks
> > should remain as they are, the following *only* applies to push based
> > syncing, except maybe the first bit):
> > 
> > UI/UX issues:
> > 
> > - I can create a sync job without having DatastoreAudit, but then I
> >    don't see it afterwards (this affects pull and push)
> 
> This is a bit strange, but will see to fix it for both.

I guess we either need to check for Audit as well when creating/.. a sync job, or we should drop the Audit requirement when listing.. maybe a user with just Datastore.Backup should at least see the sync jobs with themselves as owner/local_user (provided they have access to the remote, of course!)? they could after all do the same thing as a non-sync pull invocation as well..

> > usage of helpers and logic in helpers:
> > 
> > - I can see other people's push jobs (where local_user/owner != auth_id)
> 
> Local user is the one whose permissions decide which source 
> snapshots/groups/namespaces are visible. I did not intend that to be 
> used to define permissions for the job itself. The intention was to use 
> the authenticated user for that exclusively.
> 
> > -- I can't modify them or create such jobs (unless I am highly privileged)
> 
> As you can set the sync job's local user (defining which sources can be 
> read) this has to be highly privileged.

yes, this part is okay I'd say (although it does not map 1:1 to the pull check, so it might make sense to disentangle those).

if I can either read the whole datastore, or change arbitrary ownership, then I can also set local_user. I think currently only the second part is implemented (since for pull-based syncing, that makes sense for the question "am I allowed to set the owner to somebody else").
 
> > -- I can execute them (even if I am not highly privileged!)
> 
> So that a highly privileged user can create a job for you, but you are 
> only allowed to execute it.

but for pull-based syncing, the checks are the same for modification and execution. I don't think it makes much sense to allow less-privileged users to execute sync jobs that they would not be able to set up (among other things - the sync job might not have a schedule and not be intended to be run at that point in time!).

> > the check_sync_job_remote_datastore_backup_access helper is wrong (it
> > doesn't account for auth_id vs local_user/owner at all). also, it is not
> > called when modifying a sync job or creating one, just when executing it
> > manually, which is probably also wrong. it also has a logic bug (missing
> > "not" when preparing the remote ACL path).
> 
> Okay, I need to look closer at this again, but I guess there is once 
> again a difference in interpretation of what the local user is/should 
> be. So that needs clarification first.

see above :) IMHO, modifying a sync job and running it should use the same helpers/checks. the latter should disallow running a sync job with a different local_user, unless the authenticated user can either read all groups anyway, or change their ownership in arbitrary fashion, in which case the user could do that and run the equivalent sync job as themselves anyway.

> > privileges:
> > 
> > - for pull-syncing, creating/removing namespaces needs PRIV_DATASTORE_MODIFY
> > - for push-syncing, creating namespaces needs PRIV_REMOTE_DATASTORE_MODIFY
> > - for push-syncing, removing namespaces needs PRIV_REMOTE_DATASTORE_PRUNE(!)
> > - manual push requires PRIV_REMOTE_DATASTORE_MODIFY (instead of
> >    PRIV_REMOTE_DATASTORE_BACKUP)
> 
> Okay, will double check and adapt.
> 
>
> > related code style nit:
> > 
> > since job_user is required for pushing (in
> > `check_ns_remote_datastore_privs`), it might make sense to not allow
> > creation of PushParameters without it set, e.g. by changing the TryFrom
> > impl to convert from (SyncJobConfig, AuthId) instead of just the
> > config.. or by using a custom helper.
> > 
> 
> OK, have a look at this as well.
> 
> Thanks!


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api Christian Ebner
@ 2024-10-11  9:32   ` Fabian Grünbichler
  2024-10-15  7:30     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-11  9:32 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> Use the API exposed additional delete statistics to generate the
> task log output for sync jobs in push direction instead of fetching the
> contents before and after deleting.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  src/server/push.rs | 65 ++++++++++++++++++++--------------------------
>  1 file changed, 28 insertions(+), 37 deletions(-)
> 
> diff --git a/src/server/push.rs b/src/server/push.rs
> index cfbb88728..dbface907 100644
> --- a/src/server/push.rs
> +++ b/src/server/push.rs
> @@ -11,9 +11,10 @@ use tokio_stream::wrappers::ReceiverStream;
>  use tracing::info;
>  
>  use pbs_api_types::{
> -    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
> -    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
> -    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupGroupDeleteStats, BackupNamespace,
> +    CryptMode, GroupFilter, GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote,
> +    SnapshotListItem, PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY,
> +    PRIV_REMOTE_DATASTORE_PRUNE,
>  };
>  use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
>  use pbs_config::CachedUserInfo;
> @@ -228,7 +229,7 @@ async fn remove_target_group(
>      params: &PushParameters,
>      namespace: &BackupNamespace,
>      backup_group: &BackupGroup,
> -) -> Result<(), Error> {
> +) -> Result<BackupGroupDeleteStats, Error> {
>      check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>          .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
>  
> @@ -246,9 +247,11 @@ async fn remove_target_group(
>          args["ns"] = serde_json::to_value(target_ns.name())?;
>      }
>  
> -    params.target.client.delete(&api_path, Some(args)).await?;
> +    let mut result = params.target.client.delete(&api_path, Some(args)).await?;
> +    let data = result["data"].take();
> +    let delete_stats: BackupGroupDeleteStats = serde_json::from_value(data)?;

what about older target servers that return Value::Null here? from a
quick glance, nothing else requires upgrading the target server to
"enable" push support, so this should probably gracefully handle that
combination as well..
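
e.g., fall back to empty stats if the target doesn't return any:

let mut result = params.target.client.delete(&api_path, Some(args)).await?;
let data = result["data"].take();
// older targets return null here - treat that as "no stats available"
let delete_stats: BackupGroupDeleteStats = if data.is_null() {
    BackupGroupDeleteStats::default()
} else {
    serde_json::from_value(data)?
};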

>  
> -    Ok(())
> +    Ok(delete_stats)
>  }
>  
>  // Check if the namespace is already present on the target, create it otherwise
> @@ -451,38 +454,26 @@ pub(crate) async fn push_namespace(
>  
>              info!("delete vanished group '{target_group}'");
>  
> -            let count_before = match fetch_target_groups(params, namespace).await {
> -                Ok(snapshots) => snapshots.len(),
> -                Err(_err) => 0, // ignore errors
> -            };
> -
> -            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
> -                info!("{err}");
> -                errors = true;
> -                continue;
> -            }
> -
> -            let mut count_after = match fetch_target_groups(params, namespace).await {
> -                Ok(snapshots) => snapshots.len(),
> -                Err(_err) => 0, // ignore errors
> -            };
> -
> -            let deleted_groups = if count_after > 0 {
> -                info!("kept some protected snapshots of group '{target_group}'");
> -                0
> -            } else {
> -                1
> -            };
> -
> -            if count_after > count_before {
> -                count_after = count_before;
> +            match remove_target_group(params, namespace, &target_group).await {
> +                Ok(delete_stats) => {
> +                    if delete_stats.protected_snapshots() > 0 {
> +                        info!(
> +                            "kept {protected_count} protected snapshots of group '{target_group}'",
> +                            protected_count = delete_stats.protected_snapshots(),
> +                        );

should this be a warning? this kind of breaks the expectations of
syncing after all..

and wouldn't we also need a similar change for removing namespaces?

> +                    }
> +                    stats.add(SyncStats::from(RemovedVanishedStats {
> +                        snapshots: delete_stats.removed_snapshots(),
> +                        groups: delete_stats.removed_groups(),
> +                        namespaces: 0,
> +                    }));
> +                }
> +                Err(err) => {
> +                    info!("failed to delete vanished group - {err}");
> +                    errors = true;
> +                    continue;
> +                }
>              }
> -
> -            stats.add(SyncStats::from(RemovedVanishedStats {
> -                snapshots: count_before - count_after,
> -                groups: deleted_groups,
> -                namespaces: 0,
> -            }));
>          }
>      }
>  
> -- 
> 2.39.2
> 
> 
> 
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove
  2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove Christian Ebner
@ 2024-10-11  9:32   ` Fabian Grünbichler
  2024-10-14 10:24     ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-11  9:32 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On September 12, 2024 4:33 pm, Christian Ebner wrote:
> Add and expose the backup group delete statistics by adding the
> return type to the corresponding REST API endpoints.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  pbs-datastore/src/datastore.rs | 20 ++++++++++++++------
>  src/api2/admin/datastore.rs    | 18 ++++++++++--------
>  src/api2/admin/namespace.rs    | 20 +++++++++++---------
>  src/server/pull.rs             |  6 ++++--
>  4 files changed, 39 insertions(+), 25 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index c8701d2dd..68c7f2934 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -489,16 +489,22 @@ impl DataStore {
>      ///
>      /// Does *not* descend into child-namespaces and doesn't remove the namespace itself either.
>      ///
> -    /// Returns true if all the groups were removed, and false if some were protected.
> -    pub fn remove_namespace_groups(self: &Arc<Self>, ns: &BackupNamespace) -> Result<bool, Error> {
> +    /// Returns a tuple with the first item being true if all the groups were removed, and false if some were protected.
> +    /// The second item returns the remove statistics.
> +    pub fn remove_namespace_groups(
> +        self: &Arc<Self>,
> +        ns: &BackupNamespace,
> +    ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>          // FIXME: locking? The single groups/snapshots are already protected, so may not be
>          // necessary (depends on what we all allow to do with namespaces)
>          log::info!("removing all groups in namespace {}:/{ns}", self.name());
>  
>          let mut removed_all_groups = true;
> +        let mut stats = BackupGroupDeleteStats::default();
>  
>          for group in self.iter_backup_groups(ns.to_owned())? {
>              let delete_stats = group?.destroy()?;
> +            stats.add(&delete_stats);
>              removed_all_groups = removed_all_groups && delete_stats.all_removed();
>          }
>  
> @@ -515,7 +521,7 @@ impl DataStore {
>              }
>          }
>  
> -        Ok(removed_all_groups)
> +        Ok((removed_all_groups, stats))
>      }
>  
>      /// Remove a complete backup namespace optionally including all it's, and child namespaces',
> @@ -527,13 +533,15 @@ impl DataStore {
>          self: &Arc<Self>,
>          ns: &BackupNamespace,
>          delete_groups: bool,
> -    ) -> Result<bool, Error> {
> +    ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>          let store = self.name();
>          let mut removed_all_requested = true;
> +        let mut stats = BackupGroupDeleteStats::default();
>          if delete_groups {
>              log::info!("removing whole namespace recursively below {store}:/{ns}",);
>              for ns in self.recursive_iter_backup_ns(ns.to_owned())? {
> -                let removed_ns_groups = self.remove_namespace_groups(&ns?)?;
> +                let (removed_ns_groups, delete_stats) = self.remove_namespace_groups(&ns?)?;
> +                stats.add(&delete_stats);
>                  removed_all_requested = removed_all_requested && removed_ns_groups;
>              }
>          } else {
> @@ -574,7 +582,7 @@ impl DataStore {
>              }
>          }
>  
> -        Ok(removed_all_requested)
> +        Ok((removed_all_requested, stats))
>      }
>  
>      /// Remove a complete backup group including all snapshots.
> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> index 0a5af1e76..49ff9abf0 100644
> --- a/src/api2/admin/datastore.rs
> +++ b/src/api2/admin/datastore.rs
> @@ -34,10 +34,10 @@ use pxar::accessor::aio::Accessor;
>  use pxar::EntryKind;
>  
>  use pbs_api_types::{
> -    print_ns_and_snapshot, print_store_and_ns, Authid, BackupContent, BackupNamespace, BackupType,
> -    Counts, CryptMode, DataStoreConfig, DataStoreListItem, DataStoreStatus,
> -    GarbageCollectionJobStatus, GroupListItem, JobScheduleStatus, KeepOptions, Operation,
> -    PruneJobOptions, SnapshotListItem, SnapshotVerifyState, BACKUP_ARCHIVE_NAME_SCHEMA,
> +    print_ns_and_snapshot, print_store_and_ns, Authid, BackupContent, BackupGroupDeleteStats,
> +    BackupNamespace, BackupType, Counts, CryptMode, DataStoreConfig, DataStoreListItem,
> +    DataStoreStatus, GarbageCollectionJobStatus, GroupListItem, JobScheduleStatus, KeepOptions,
> +    Operation, PruneJobOptions, SnapshotListItem, SnapshotVerifyState, BACKUP_ARCHIVE_NAME_SCHEMA,
>      BACKUP_ID_SCHEMA, BACKUP_NAMESPACE_SCHEMA, BACKUP_TIME_SCHEMA, BACKUP_TYPE_SCHEMA,
>      DATASTORE_SCHEMA, IGNORE_VERIFIED_BACKUPS_SCHEMA, MAX_NAMESPACE_DEPTH, NS_MAX_DEPTH_SCHEMA,
>      PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE,
> @@ -269,6 +269,9 @@ pub fn list_groups(
>              },
>          },
>      },
> +    returns: {
> +        type: BackupGroupDeleteStats,
> +    },
>      access: {
>          permission: &Permission::Anybody,
>          description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_MODIFY for any \
> @@ -281,7 +284,7 @@ pub async fn delete_group(
>      ns: Option<BackupNamespace>,
>      group: pbs_api_types::BackupGroup,
>      rpcenv: &mut dyn RpcEnvironment,
> -) -> Result<Value, Error> {
> +) -> Result<BackupGroupDeleteStats, Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>  
>      tokio::task::spawn_blocking(move || {
> @@ -299,10 +302,9 @@ pub async fn delete_group(
>  
>          let delete_stats = datastore.remove_backup_group(&ns, &group)?;
>          if !delete_stats.all_removed() {
> -            bail!("group only partially deleted due to protected snapshots");
> +            warn!("group only partially deleted due to protected snapshots");

this not only changes the return type (from nothing to something
actionable, which is okay!) but also the behaviour..

right now with this series applied, if I remove a group with protected
snapshots, I get no indication in the UI that it failed to do so, and
the log message only ends up in the journal since there is no task
context here..

I think this would at least warrant an opt-in for the new behaviour? in
any case, the warning/error could probably be adapted to contain the
counts at least, now that we have them ;)
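e.g. (just a sketch of what such a message could look like):

```
let delete_stats = datastore.remove_backup_group(&ns, &group)?;
if !delete_stats.all_removed() {
    warn!(
        "group only partially deleted: kept {} protected snapshot(s), removed {} snapshot(s)",
        delete_stats.protected_snapshots(),
        delete_stats.removed_snapshots(),
    );
}
```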

>          }
> -
> -        Ok(Value::Null)
> +        Ok(delete_stats)
>      })
>      .await?
>  }
> diff --git a/src/api2/admin/namespace.rs b/src/api2/admin/namespace.rs
> index 889dc1a3d..adf665717 100644
> --- a/src/api2/admin/namespace.rs
> +++ b/src/api2/admin/namespace.rs
> @@ -1,13 +1,12 @@
> -use anyhow::{bail, Error};
> -use serde_json::Value;
> +use anyhow::Error;
>  
>  use pbs_config::CachedUserInfo;
>  use proxmox_router::{http_bail, ApiMethod, Permission, Router, RpcEnvironment};
>  use proxmox_schema::*;
>  
>  use pbs_api_types::{
> -    Authid, BackupNamespace, NamespaceListItem, Operation, DATASTORE_SCHEMA, NS_MAX_DEPTH_SCHEMA,
> -    PROXMOX_SAFE_ID_FORMAT,
> +    Authid, BackupGroupDeleteStats, BackupNamespace, NamespaceListItem, Operation,
> +    DATASTORE_SCHEMA, NS_MAX_DEPTH_SCHEMA, PROXMOX_SAFE_ID_FORMAT,
>  };
>  
>  use pbs_datastore::DataStore;
> @@ -151,22 +150,25 @@ pub fn delete_namespace(
>      delete_groups: bool,
>      _info: &ApiMethod,
>      rpcenv: &mut dyn RpcEnvironment,
> -) -> Result<Value, Error> {
> +) -> Result<BackupGroupDeleteStats, Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>  
>      check_ns_modification_privs(&store, &ns, &auth_id)?;
>  
>      let datastore = DataStore::lookup_datastore(&store, Some(Operation::Write))?;
>  
> -    if !datastore.remove_namespace_recursive(&ns, delete_groups)? {
> +    let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups)?;
> +    if !removed_all {
>          if delete_groups {
> -            bail!("group only partially deleted due to protected snapshots");
> +            log::warn!("group only partially deleted due to protected snapshots");
>          } else {
> -            bail!("only partially deleted due to existing groups but `delete-groups` not true ");
> +            log::warn!(
> +                "only partially deleted due to existing groups but `delete-groups` not true"
> +            );

same here

>          }
>      }
>  
> -    Ok(Value::Null)
> +    Ok(stats)
>  }
>  
>  pub const ROUTER: Router = Router::new()
> diff --git a/src/server/pull.rs b/src/server/pull.rs
> index 3117f7d2c..d7f5c42ea 100644
> --- a/src/server/pull.rs
> +++ b/src/server/pull.rs
> @@ -645,10 +645,12 @@ fn check_and_remove_ns(params: &PullParameters, local_ns: &BackupNamespace) -> R
>      check_ns_modification_privs(params.target.store.name(), local_ns, &params.owner)
>          .map_err(|err| format_err!("Removing {local_ns} not allowed - {err}"))?;
>  
> -    params
> +    let (removed_all, _delete_stats) = params
>          .target
>          .store
> -        .remove_namespace_recursive(local_ns, true)
> +        .remove_namespace_recursive(local_ns, true)?;
> +
> +    Ok(removed_all)
>  }
>  
>  fn check_and_remove_vanished_ns(
> -- 
> 2.39.2
> 
> 
> 
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter
  2024-10-10 14:48   ` Fabian Grünbichler
@ 2024-10-14  8:10     ` Christian Ebner
  2024-10-14  9:25       ` Fabian Grünbichler
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  8:10 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/10/24 16:48, Fabian Grünbichler wrote:
> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> @@ -44,6 +49,7 @@ use crate::{
>>   /// List all sync jobs
>>   pub fn list_sync_jobs(
>>       store: Option<String>,
>> +    sync_direction: Option<SyncDirection>,
> 
> would be much nicer if the default were already encoded in the API
> schema

As discussed off-list, this is not possible at the moment because the 
`api` macro does not allow setting a default value for external types, 
so these stay in place for the time being.

> 
>>       _param: Value,
>>       rpcenv: &mut dyn RpcEnvironment,
>>   ) -> Result<Vec<SyncJobStatus>, Error> {
>> @@ -51,9 +57,10 @@ pub fn list_sync_jobs(
>>       let user_info = CachedUserInfo::new()?;
>>   
>>       let (config, digest) = sync::config()?;
>> +    let sync_direction = sync_direction.unwrap_or_default();
> 
> instead of unwrapping here..
> 
>>   
>>       let job_config_iter = config
>> -        .convert_to_typed_array("sync")?
>> +        .convert_to_typed_array(sync_direction.as_config_type_str())?
>>           .into_iter()
>>           .filter(|job: &SyncJobConfig| {
>>               if let Some(store) = &store {
>> @@ -498,8 +499,21 @@ pub async fn delete_datastore(
>>           for job in list_verification_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>>               delete_verification_job(job.config.id, None, rpcenv)?
>>           }
>> -        for job in list_sync_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>> -            delete_sync_job(job.config.id, None, rpcenv)?
>> +        for job in list_sync_jobs(
>> +            Some(name.clone()),
>> +            Some(SyncDirection::Pull),
>> +            Value::Null,
>> +            rpcenv,
>> +        )? {
>> +            delete_sync_job(job.config.id, Some(SyncDirection::Pull), None, rpcenv)?
>> +        }
>> +        for job in list_sync_jobs(
>> +            Some(name.clone()),
>> +            Some(SyncDirection::Push),
>> +            Value::Null,
>> +            rpcenv,
>> +        )? {
>> +            delete_sync_job(job.config.id, Some(SyncDirection::Push), None, rpcenv)?
> 
> this looks a bit weird, but I guess it's a side-effect we have to live
> with if we want to separate both types of sync jobs somewhat.. could
> still be a nested loop though for brevity?
> 
> for direction in .. {
>      for job in list_sync_jobs(.. , direction, ..)? {
>          delete_sync_job(.. , direction, ..)?;
>      }
> }

Agreed, I went for the suggested nested loop here, which makes this a 
bit more compact.
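Concretely, something along these lines (just a sketch, assuming
`SyncDirection` stays `Clone`):

```
for direction in [SyncDirection::Pull, SyncDirection::Push] {
    for job in list_sync_jobs(Some(name.clone()), Some(direction.clone()), Value::Null, rpcenv)? {
        delete_sync_job(job.config.id, Some(direction.clone()), None, rpcenv)?;
    }
}
```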

> 
>>           }
>>           for job in list_prune_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>>               delete_prune_job(job.config.id, None, rpcenv)?
>> diff --git a/src/api2/config/notifications/mod.rs b/src/api2/config/notifications/mod.rs
>> index dfe82ed03..9622d43ee 100644
>> --- a/src/api2/config/notifications/mod.rs
>> +++ b/src/api2/config/notifications/mod.rs
>> @@ -9,7 +9,7 @@ use proxmox_schema::api;
>>   use proxmox_sortable_macro::sortable;
>>   
>>   use crate::api2::admin::datastore::get_datastore_list;
>> -use pbs_api_types::PRIV_SYS_AUDIT;
>> +use pbs_api_types::{SyncDirection, PRIV_SYS_AUDIT};
>>   
>>   use crate::api2::admin::prune::list_prune_jobs;
>>   use crate::api2::admin::sync::list_sync_jobs;
>> @@ -154,8 +154,16 @@ pub fn get_values(
>>           });
>>       }
>>   
>> -    let sync_jobs = list_sync_jobs(None, param.clone(), rpcenv)?;
>> -    for job in sync_jobs {
>> +    let sync_jobs_pull = list_sync_jobs(None, Some(SyncDirection::Pull), param.clone(), rpcenv)?;
>> +    for job in sync_jobs_pull {
>> +        values.push(MatchableValue {
>> +            field: "job-id".into(),
>> +            value: job.config.id,
>> +            comment: job.config.comment,
>> +        });
>> +    }
>> +    let sync_jobs_push = list_sync_jobs(None, Some(SyncDirection::Push), param.clone(), rpcenv)?;
>> +    for job in sync_jobs_push {
> 
> here as well? or alternatively, all a third SyncDirection variant Any,
> but not sure if it's worth it just for those two list_sync_jobs
> functions (btw, one of those might benefit from being renamed while we
> are at it..).

What do you mean by being renamed here?

I think a sync direction variant `Any` is not the right approach, as 
that could lead to issues with clashing IDs, since these are only 
unique per job config type?

So again, I went with the suggested loop over the enum variants here.
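i.e. roughly (sketch):

```
for direction in [SyncDirection::Pull, SyncDirection::Push] {
    let sync_jobs = list_sync_jobs(None, Some(direction.clone()), param.clone(), rpcenv)?;
    for job in sync_jobs {
        values.push(MatchableValue {
            field: "job-id".into(),
            value: job.config.id,
            comment: job.config.comment,
        });
    }
}
```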

> 
>>           values.push(MatchableValue {
>>               field: "job-id".into(),
>>               value: job.config.id,
>> diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
>> index 4409234b2..2b6f1c133 100644
>> --- a/src/bin/proxmox-backup-proxy.rs
>> +++ b/src/bin/proxmox-backup-proxy.rs
>> @@ -608,7 +608,15 @@ async fn schedule_datastore_sync_jobs() {
>>           Ok((config, _digest)) => config,
>>       };
>>   
>> -    for (job_id, (_, job_config)) in config.sections {
>> +    for (job_id, (job_type, job_config)) in config.sections {
>> +        let sync_direction = match job_type.as_str() {
>> +            "sync" => SyncDirection::Pull,
>> +            "sync-push" => SyncDirection::Push,
>> +            _ => {
>> +                eprintln!("unexpected config type in sync job config - {job_type}");
>> +                continue;
>> +            }
>> +        };
> 
> can this even happen? we don't allow unknown section types in the
> SyncJobConfig.. arguably, this should have used the `FromStr`
> implementation, and might be an argument for keeping it around instead
> of dropping it ;)

Using the `FromStr` impl of the `SyncDirection` enum does not work here, 
as these are the config type keys for the job config, not the sync 
direction itself.
Given that, I opted for implementing a `from_config_type_str` for 
`SyncDirection` as counterpart to the `as_config_type_str` 
implementation, and for using that to get the sync direction based on 
the config type. Error handling is still required, as all match cases 
must be covered (even if some are logically impossible because they are 
already checked somewhere else).
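A rough sketch of that counterpart (naming as described above, the
posted version may differ slightly):

```
impl SyncDirection {
    /// Map a sync job section config type back to the direction it encodes
    pub fn from_config_type_str(config_type: &str) -> Result<Self, anyhow::Error> {
        match config_type {
            "sync" => Ok(SyncDirection::Pull),
            "sync-push" => Ok(SyncDirection::Push),
            _ => anyhow::bail!("invalid config type for sync job"),
        }
    }
}
```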

> 
>>           let job_config: SyncJobConfig = match serde_json::from_value(job_config) {
>>               Ok(c) => c,
>>               Err(err) => {
>> @@ -635,7 +643,7 @@ async fn schedule_datastore_sync_jobs() {
>>                   job_config,
>>                   &auth_id,
>>                   Some(event_str),
>> -                SyncDirection::Pull,
>> +                sync_direction,
>>                   false,
>>               ) {
>>                   eprintln!("unable to start datastore sync job {job_id} - {err}");
>> -- 
>> 2.39.2
>>
>>
>>
>>
>>
>>
> 
> 
> 
> 



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs
  2024-10-10 14:48   ` Fabian Grünbichler
@ 2024-10-14  8:16     ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  8:16 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/10/24 16:48, Fabian Grünbichler wrote:
> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> In order for sync jobs to be either pull or push jobs, allow to
>> configure the direction of the job.
>>
>> Adds an additional config type `sync-push` to the sync job config, to
>> clearly distinguish sync jobs configured in pull and in push
>> direction.
>>
>> This approach was chosen in order to limit possible misconfiguration,
>> as unintentionally switching the sync direction could potentially
>> delete still required snapshots.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 2:
>> - no changes
>>
>>   pbs-api-types/src/jobs.rs | 52 +++++++++++++++++++++++++++++++++++++++
>>   pbs-config/src/sync.rs    | 11 +++++++--
>>   2 files changed, 61 insertions(+), 2 deletions(-)
>>
>> diff --git a/pbs-api-types/src/jobs.rs b/pbs-api-types/src/jobs.rs
>> index 868702bc0..12b39782c 100644
>> --- a/pbs-api-types/src/jobs.rs
>> +++ b/pbs-api-types/src/jobs.rs
>> @@ -20,6 +20,8 @@ const_regex! {
>>       pub VERIFICATION_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"):");
>>       /// Regex for sync jobs '(REMOTE|\-):REMOTE_DATASTORE:LOCAL_DATASTORE:(?:LOCAL_NS_ANCHOR:)ACTUAL_JOB_ID'
>>       pub SYNC_JOB_WORKER_ID_REGEX = concatcp!(r"^(", PROXMOX_SAFE_ID_REGEX_STR, r"|\-):(", PROXMOX_SAFE_ID_REGEX_STR, r"):(", PROXMOX_SAFE_ID_REGEX_STR, r")(?::(", BACKUP_NS_RE, r"))?:");
>> +    /// Regex for sync direction'(pull|push)'
>> +    pub SYNC_DIRECTION_REGEX = r"^(pull|push)$";
>>   }
>>   
>>   pub const JOB_ID_SCHEMA: Schema = StringSchema::new("Job ID.")
>> @@ -498,6 +500,56 @@ pub const TRANSFER_LAST_SCHEMA: Schema =
>>           .minimum(1)
>>           .schema();
>>   
>> +pub const SYNC_DIRECTION_SCHEMA: Schema = StringSchema::new("Sync job direction (pull|push)")
>> +    .format(&ApiStringFormat::Pattern(&SYNC_DIRECTION_REGEX))
>> +    .schema();
>> +
>> +/// Direction of the sync job, push or pull
>> +#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, UpdaterType)]
>> +pub enum SyncDirection {
>> +    #[default]
>> +    Pull,
>> +    Push,
>> +}
>> +
>> +impl ApiType for SyncDirection {
>> +    const API_SCHEMA: Schema = SYNC_DIRECTION_SCHEMA;
>> +}
>> +
>> +// used for serialization using `proxmox_serde::forward_serialize_to_display` macro
>> +impl std::fmt::Display for SyncDirection {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
>> +        match self {
>> +            SyncDirection::Pull => f.write_str("pull"),
>> +            SyncDirection::Push => f.write_str("push"),
>> +        }
>> +    }
>> +}
>> +
>> +impl std::str::FromStr for SyncDirection {
>> +    type Err = anyhow::Error;
>> +
>> +    fn from_str(s: &str) -> Result<Self, Self::Err> {
>> +        match s {
>> +            "pull" => Ok(SyncDirection::Pull),
>> +            "push" => Ok(SyncDirection::Push),
>> +            _ => bail!("invalid sync direction"),
>> +        }
>> +    }
>> +}
>> +
>> +proxmox_serde::forward_deserialize_to_from_str!(SyncDirection);
>> +proxmox_serde::forward_serialize_to_display!(SyncDirection);
> 
> wouldn't just implementing Display, and deriving Serialize/Deserialize
> be enough?

I opted for using the `api` macro on the `SyncDirection` enum, so all 
of this is auto-generated and neither the schema nor the regex has to 
be implemented explicitly, which reduces the amount of code for this 
significantly.
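Roughly along these lines (sketch; the exact derives and serde
attributes in the next version may differ):

```
#[api()]
#[derive(Clone, Debug, Default, Eq, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
/// Direction of the sync job, push or pull
pub enum SyncDirection {
    /// Sync direction pull
    #[default]
    Pull,
    /// Sync direction push
    Push,
}
```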

> 
>> +
>> +impl SyncDirection {
>> +    pub fn as_config_type_str(&self) -> &'static str {
>> +        match self {
>> +            SyncDirection::Pull => "sync",
>> +            SyncDirection::Push => "sync-push",
>> +        }
>> +    }
>> +}
>> +
>>   #[api(
>>       properties: {
>>           id: {
>> diff --git a/pbs-config/src/sync.rs b/pbs-config/src/sync.rs
>> index 45453abb1..143b73e78 100644
>> --- a/pbs-config/src/sync.rs
>> +++ b/pbs-config/src/sync.rs
>> @@ -18,9 +18,16 @@ fn init() -> SectionConfig {
>>           _ => unreachable!(),
>>       };
>>   
>> -    let plugin = SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
>> +    let pull_plugin =
>> +        SectionConfigPlugin::new("sync".to_string(), Some(String::from("id")), obj_schema);
>> +    let push_plugin = SectionConfigPlugin::new(
>> +        "sync-push".to_string(),
> 
> these here should probably use SyncDirection's as_config_type_str to
> make the connection obvious when adapting that at some point in the
> future?

Agreed, that will be fixed in the next version of the patches.

> 
>> +        Some(String::from("id")),
>> +        obj_schema,
>> +    );
>>       let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
>> -    config.register_plugin(plugin);
>> +    config.register_plugin(pull_plugin);
>> +    config.register_plugin(push_plugin);
>>   
>>       config
>>   }
>> -- 
>> 2.39.2
>>
>>
>>
>>
>>
>>
> 
> 
> 
> 



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores
  2024-10-10 14:49   ` Fabian Grünbichler
@ 2024-10-14  8:18     ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  8:18 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/10/24 16:49, Fabian Grünbichler wrote:
> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> Extend the component limit for ACL paths of `remote` to include
>> possible namespace components.
>>
>> This allows to limit the permissions for sync jobs in push direction
>> to a namespace subset on the remote datastore.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 2:
>> - not present in previous version
>>
>>   pbs-config/src/acl.rs | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-config/src/acl.rs b/pbs-config/src/acl.rs
>> index 6b6500f34..5177e22f0 100644
>> --- a/pbs-config/src/acl.rs
>> +++ b/pbs-config/src/acl.rs
>> @@ -89,10 +89,13 @@ pub fn check_acl_path(path: &str) -> Result<(), Error> {
>>               }
>>           }
>>           "remote" => {
>> -            // /remote/{remote}/{store}
>> +            // /remote/{remote}/{store}/{namespace}
>>               if components_len <= 3 {
>>                   return Ok(());
>>               }
>> +            if components_len > 3 && components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH {
>> +                return Ok(());
>> +            }
> 
> these two ifs can just be combined into a single one with
> 
> components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH
> 
> as condition. the same applies to the corresponding variant shifted by 1
> for local datastores/namespaces.

Ack, will combine these and do the same for the datastore as well.
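i.e. the two match arms in `check_acl_path` would then look roughly
like this (sketch):

```
"datastore" => {
    // /datastore/{store}/{namespace}
    if components_len <= 2 + pbs_api_types::MAX_NAMESPACE_DEPTH {
        return Ok(());
    }
}
"remote" => {
    // /remote/{remote}/{store}/{namespace}
    if components_len <= 3 + pbs_api_types::MAX_NAMESPACE_DEPTH {
        return Ok(());
    }
}
```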

> 
>>           }
>>           "system" => {
>>               if components_len == 1 {
>> -- 
>> 2.39.2
>>
>>
>>
>>
>>
>>
> 
> 
> 
> 



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter
  2024-10-14  8:10     ` Christian Ebner
@ 2024-10-14  9:25       ` Fabian Grünbichler
  2024-10-14  9:36         ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-14  9:25 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion

On October 14, 2024 10:10 am, Christian Ebner wrote:
> On 10/10/24 16:48, Fabian Grünbichler wrote:
>> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>>>           }
>>>           for job in list_prune_jobs(Some(name.clone()), Value::Null, rpcenv)? {
>>>               delete_prune_job(job.config.id, None, rpcenv)?
>>> diff --git a/src/api2/config/notifications/mod.rs b/src/api2/config/notifications/mod.rs
>>> index dfe82ed03..9622d43ee 100644
>>> --- a/src/api2/config/notifications/mod.rs
>>> +++ b/src/api2/config/notifications/mod.rs
>>> @@ -9,7 +9,7 @@ use proxmox_schema::api;
>>>   use proxmox_sortable_macro::sortable;
>>>   
>>>   use crate::api2::admin::datastore::get_datastore_list;
>>> -use pbs_api_types::PRIV_SYS_AUDIT;
>>> +use pbs_api_types::{SyncDirection, PRIV_SYS_AUDIT};
>>>   
>>>   use crate::api2::admin::prune::list_prune_jobs;
>>>   use crate::api2::admin::sync::list_sync_jobs;
>>> @@ -154,8 +154,16 @@ pub fn get_values(
>>>           });
>>>       }
>>>   
>>> -    let sync_jobs = list_sync_jobs(None, param.clone(), rpcenv)?;
>>> -    for job in sync_jobs {
>>> +    let sync_jobs_pull = list_sync_jobs(None, Some(SyncDirection::Pull), param.clone(), rpcenv)?;
>>> +    for job in sync_jobs_pull {
>>> +        values.push(MatchableValue {
>>> +            field: "job-id".into(),
>>> +            value: job.config.id,
>>> +            comment: job.config.comment,
>>> +        });
>>> +    }
>>> +    let sync_jobs_push = list_sync_jobs(None, Some(SyncDirection::Push), param.clone(), rpcenv)?;
>>> +    for job in sync_jobs_push {
>> 
>> here as well? or alternatively, all a third SyncDirection variant Any,
>> but not sure if it's worth it just for those two list_sync_jobs
>> functions (btw, one of those might benefit from being renamed while we
>> are at it..).
> 
> What do you mean with being renamed here?

there are two "list_sync_jobs" fns, which can be confusing when reading
code/patches..

> I think a sync variant `Any` is not the right approach, as that could 
> lead to issues with clashing id's as these are unique on a job config 
> type level only?

yes, the Any variant would only be usable for querying the config, not
for defining a job or persisting it.. that's what I meant with "worth it
for just the two list functions" ;)

> So again, using the suggested loop over enum variants.

ack!

>> 
>>>           values.push(MatchableValue {
>>>               field: "job-id".into(),
>>>               value: job.config.id,
>>> diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
>>> index 4409234b2..2b6f1c133 100644
>>> --- a/src/bin/proxmox-backup-proxy.rs
>>> +++ b/src/bin/proxmox-backup-proxy.rs
>>> @@ -608,7 +608,15 @@ async fn schedule_datastore_sync_jobs() {
>>>           Ok((config, _digest)) => config,
>>>       };
>>>   
>>> -    for (job_id, (_, job_config)) in config.sections {
>>> +    for (job_id, (job_type, job_config)) in config.sections {
>>> +        let sync_direction = match job_type.as_str() {
>>> +            "sync" => SyncDirection::Pull,
>>> +            "sync-push" => SyncDirection::Push,
>>> +            _ => {
>>> +                eprintln!("unexpected config type in sync job config - {job_type}");
>>> +                continue;
>>> +            }
>>> +        };
>> 
>> can this even happen? we don't allow unknown section types in the
>> SyncJobConfig.. arguably, this should have used the `FromStr`
>> implementation, and might be an argument for keeping it around instead
>> of dropping it ;)
> 
> Using the `FromStr` impl of the `SyncDirection` enum does not work here, 
> as these are the config type keys for the job config, not the sync 
> direction itself.
> Given that, I opted for implementing a `from_config_type_str` for 
> `SyncDirection` as counterpart for the `as_config_type_str` 
> implementation and use that for getting the sync direction based on the 
> config type. Error handling still is required, as all match cases must 
> be covered (even if logically not possible because already checked 
> somewhere else).

right. if only we had given the original sync job entries a section type
of "pull" ;) in any case, IMHO it's a good idea to have
serializing/deserializing constructs like that in a single place.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-10-10 14:48   ` Fabian Grünbichler
@ 2024-10-14  9:32     ` Christian Ebner
  2024-10-14  9:41       ` Fabian Grünbichler
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  9:32 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/10/24 16:48, Fabian Grünbichler wrote:
> left some higher level comments on the cover letter as well that are
> relevant for this patch!

Okay, will be addressed there.

> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> Adds the functionality required to push datastore contents from a
>> source to a remote target.
>> This includes syncing of the namespaces, backup groups and snapshots
>> based on the provided filters as well as removing vanished contents
>> from the target when requested.
>>
>> While trying to mimic the pull direction of sync jobs, the
>> implementation is different as access to the remote must be performed
>> via the REST API, not needed for the pull job which can access the
>> local datastore via the filesystem directly.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 2:
>> - Implement additional permission checks limiting possible remote
>>    datastore operations.
>> - Rename `owner` to `local_user`, this is the user who's view of the
>>    local datastore is used for the push to the remote target. It can be
>>    different from the job user, executing the sync job and requiring the
>>    permissions to access the remote.
>>
>>   src/server/mod.rs  |   1 +
>>   src/server/push.rs | 892 +++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 893 insertions(+)
>>   create mode 100644 src/server/push.rs
>>
>> diff --git a/src/server/mod.rs b/src/server/mod.rs
>> index 468847c2e..882c5cc10 100644
>> --- a/src/server/mod.rs
>> +++ b/src/server/mod.rs
>> @@ -34,6 +34,7 @@ pub use report::*;
>>   pub mod auth;
>>   
>>   pub(crate) mod pull;
>> +pub(crate) mod push;
>>   pub(crate) mod sync;
>>   
>>   pub(crate) async fn reload_proxy_certificate() -> Result<(), Error> {
>> diff --git a/src/server/push.rs b/src/server/push.rs
>> new file mode 100644
>> index 000000000..cfbb88728
>> --- /dev/null
>> +++ b/src/server/push.rs
>> @@ -0,0 +1,892 @@
>> +//! Sync datastore by pushing contents to remote server
>> +
>> +use std::cmp::Ordering;
>> +use std::collections::HashSet;
>> +use std::sync::{Arc, Mutex};
>> +
>> +use anyhow::{bail, format_err, Error};
>> +use futures::stream::{self, StreamExt, TryStreamExt};
>> +use tokio::sync::mpsc;
>> +use tokio_stream::wrappers::ReceiverStream;
>> +use tracing::info;
>> +
>> +use pbs_api_types::{
>> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
>> +    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
>> +    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
>> +};
>> +use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
>> +use pbs_config::CachedUserInfo;
>> +use pbs_datastore::data_blob::ChunkInfo;
>> +use pbs_datastore::dynamic_index::DynamicIndexReader;
>> +use pbs_datastore::fixed_index::FixedIndexReader;
>> +use pbs_datastore::index::IndexFile;
>> +use pbs_datastore::manifest::{ArchiveType, CLIENT_LOG_BLOB_NAME, MANIFEST_BLOB_NAME};
>> +use pbs_datastore::read_chunk::AsyncReadChunk;
>> +use pbs_datastore::{BackupManifest, DataStore, StoreProgress};
>> +
>> +use super::sync::{
>> +    check_namespace_depth_limit, LocalSource, RemovedVanishedStats, SkipInfo, SkipReason,
>> +    SyncSource, SyncStats,
>> +};
>> +use crate::api2::config::remote;
>> +
>> +/// Target for backups to be pushed to
>> +pub(crate) struct PushTarget {
>> +    // Name of the remote as found in remote.cfg
>> +    remote: String,
>> +    // Target repository on remote
>> +    repo: BackupRepository,
>> +    // Target namespace on remote
>> +    ns: BackupNamespace,
>> +    // Http client to connect to remote
>> +    client: HttpClient,
>> +}
>> +
>> +/// Parameters for a push operation
>> +pub(crate) struct PushParameters {
>> +    /// Source of backups to be pushed to remote
>> +    source: Arc<LocalSource>,
>> +    /// Target for backups to be pushed to
>> +    target: PushTarget,
>> +    /// Local user limiting the accessible source contents, makes sure that the sync job sees the
>> +    /// same source content when executed by different users with different privileges
>> +    local_user: Authid,
>> +    /// User as which the job gets executed, requires the permissions on the remote
>> +    pub(crate) job_user: Option<Authid>,
>> +    /// Whether to remove groups which exist locally, but not on the remote end
>> +    remove_vanished: bool,
>> +    /// How many levels of sub-namespaces to push (0 == no recursion, None == maximum recursion)
>> +    max_depth: Option<usize>,
>> +    /// Filters for reducing the push scope
>> +    group_filter: Vec<GroupFilter>,
>> +    /// How many snapshots should be transferred at most (taking the newest N snapshots)
>> +    transfer_last: Option<usize>,
>> +}
>> +
>> +impl PushParameters {
>> +    /// Creates a new instance of `PushParameters`.
>> +    #[allow(clippy::too_many_arguments)]
>> +    pub(crate) fn new(
>> +        store: &str,
>> +        ns: BackupNamespace,
>> +        remote_id: &str,
>> +        remote_store: &str,
>> +        remote_ns: BackupNamespace,
>> +        local_user: Authid,
>> +        remove_vanished: Option<bool>,
>> +        max_depth: Option<usize>,
>> +        group_filter: Option<Vec<GroupFilter>>,
>> +        limit: RateLimitConfig,
>> +        transfer_last: Option<usize>,
>> +    ) -> Result<Self, Error> {
>> +        if let Some(max_depth) = max_depth {
>> +            ns.check_max_depth(max_depth)?;
>> +            remote_ns.check_max_depth(max_depth)?;
>> +        };
>> +        let remove_vanished = remove_vanished.unwrap_or(false);
>> +
>> +        let source = Arc::new(LocalSource {
>> +            store: DataStore::lookup_datastore(store, Some(Operation::Read))?,
>> +            ns,
>> +        });
>> +
>> +        let (remote_config, _digest) = pbs_config::remote::config()?;
>> +        let remote: Remote = remote_config.lookup("remote", remote_id)?;
>> +
>> +        let repo = BackupRepository::new(
>> +            Some(remote.config.auth_id.clone()),
>> +            Some(remote.config.host.clone()),
>> +            remote.config.port,
>> +            remote_store.to_string(),
>> +        );
>> +
>> +        let client = remote::remote_client_config(&remote, Some(limit))?;
>> +        let target = PushTarget {
>> +            remote: remote_id.to_string(),
>> +            repo,
>> +            ns: remote_ns,
>> +            client,
>> +        };
>> +        let group_filter = group_filter.unwrap_or_default();
>> +
>> +        Ok(Self {
>> +            source,
>> +            target,
>> +            local_user,
>> +            job_user: None,
>> +            remove_vanished,
>> +            max_depth,
>> +            group_filter,
>> +            transfer_last,
>> +        })
>> +    }
>> +}
>> +
>> +fn check_ns_remote_datastore_privs(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    privs: u64,
>> +) -> Result<(), Error> {
>> +    let auth_id = params
>> +        .job_user
>> +        .as_ref()
>> +        .ok_or_else(|| format_err!("missing job authid"))?;
>> +    let user_info = CachedUserInfo::new()?;
>> +    let mut acl_path: Vec<&str> = vec!["remote", &params.target.remote, params.target.repo.store()];
>> +
>> +    if !namespace.is_root() {
>> +        let ns_components: Vec<&str> = namespace.components().collect();
>> +        acl_path.extend(ns_components);
>> +    }
>> +
>> +    user_info.check_privs(auth_id, &acl_path, privs, false)?;
>> +
>> +    Ok(())
>> +}
>> +
>> +// Fetch the list of namespaces found on target
>> +async fn fetch_target_namespaces(params: &PushParameters) -> Result<Vec<BackupNamespace>, Error> {
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/namespace",
>> +        store = params.target.repo.store(),
>> +    );
>> +    let mut result = params.target.client.get(&api_path, None).await?;
>> +    let namespaces: Vec<NamespaceListItem> = serde_json::from_value(result["data"].take())?;
>> +    let mut namespaces: Vec<BackupNamespace> = namespaces
>> +        .into_iter()
>> +        .map(|namespace| namespace.ns)
>> +        .collect();
>> +    namespaces.sort_unstable_by_key(|a| a.name_len());
>> +
>> +    Ok(namespaces)
>> +}
>> +
>> +// Remove the provided namespace from the target
>> +async fn remove_target_namespace(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +) -> Result<(), Error> {
>> +    if namespace.is_root() {
>> +        bail!("cannot remove root namespace from target");
>> +    }
>> +
>> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
> 
> this should be MODIFY, not PRUNE to mimic pull-based syncing, see cover letter

Ack, changed this to `PRIV_REMOTE_DATASTORE_MODIFY` for the upcoming 
version of the patches.
> 
>> +
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/namespace",
>> +        store = params.target.repo.store(),
>> +    );
>> +
>> +    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +    let target_ns = params.map_namespace(namespace)?;
> 
> would it make sense to make this less verbose *and more readable* by
> implementing a `map_namespace` on params? 7 call sites ;)

Yes, that indeed makes it more readable. However, I decided to call it 
`map_to_target`, which also leaves the option to make this generic over 
a (currently not required) type, should that ever be needed in the 
future.
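Something like this (sketch, the exact signature is still subject to
change in the next revision):

```
impl PushParameters {
    /// Map the given namespace from the sync source to the push target
    fn map_to_target(&self, namespace: &BackupNamespace) -> Result<BackupNamespace, Error> {
        namespace.map_prefix(&self.source.ns, &self.target.ns)
    }
}
```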

> 
>> +    let args = serde_json::json!({
>> +        "ns": target_ns.name(),
>> +        "delete-groups": true,
>> +    });
>> +
>> +    params.target.client.delete(&api_path, Some(args)).await?;
>> +
>> +    Ok(())
>> +}
>> +
>> +// Fetch the list of groups found on target in given namespace
>> +async fn fetch_target_groups(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +) -> Result<Vec<BackupGroup>, Error> {
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/groups",
>> +        store = params.target.repo.store(),
>> +    );
>> +
>> +    let args = if !namespace.is_root() {
>> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +        Some(serde_json::json!({ "ns": target_ns.name() }))
>> +    } else {
>> +        None
>> +    };
>> +
>> +    let mut result = params.target.client.get(&api_path, args).await?;
>> +    let groups: Vec<GroupListItem> = serde_json::from_value(result["data"].take())?;
>> +    let mut groups: Vec<BackupGroup> = groups.into_iter().map(|group| group.backup).collect();
>> +
>> +    groups.sort_unstable_by(|a, b| {
>> +        let type_order = a.ty.cmp(&b.ty);
>> +        if type_order == Ordering::Equal {
>> +            a.id.cmp(&b.id)
>> +        } else {
>> +            type_order
>> +        }
>> +    });
>> +
>> +    Ok(groups)
>> +}
>> +
>> +// Remove the provided backup group in given namespace from the target
>> +async fn remove_target_group(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    backup_group: &BackupGroup,
>> +) -> Result<(), Error> {
>> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
>> +
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/groups",
>> +        store = params.target.repo.store(),
>> +    );
>> +
>> +    let mut args = serde_json::json!({
>> +        "backup-id": backup_group.id,
>> +        "backup-type": backup_group.ty,
>> +    });
>> +    if !namespace.is_root() {
>> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +        args["ns"] = serde_json::to_value(target_ns.name())?;
>> +    }
>> +
>> +    params.target.client.delete(&api_path, Some(args)).await?;
>> +
>> +    Ok(())
>> +}
>> +
>> +// Check if the namespace is already present on the target, create it otherwise
>> +async fn check_or_create_target_namespace(
>> +    params: &PushParameters,
>> +    target_namespaces: &[BackupNamespace],
>> +    namespace: &BackupNamespace,
>> +) -> Result<bool, Error> {
>> +    let mut created = false;
>> +
>> +    if !namespace.is_root() && !target_namespaces.contains(namespace) {
>> +        // Namespace not present on target, create namespace.
>> +        // Sub-namespaces have to be created by creating parent components first.
>> +
>> +        check_ns_remote_datastore_privs(&params, namespace, PRIV_REMOTE_DATASTORE_MODIFY)
>> +            .map_err(|err| format_err!("Creating namespace not allowed - {err}"))?;
>> +
>> +        let mut parent = BackupNamespace::root();
>> +        for namespace_component in namespace.components() {
>> +            let namespace = BackupNamespace::new(namespace_component)?;
>> +            let api_path = format!(
>> +                "api2/json/admin/datastore/{store}/namespace",
>> +                store = params.target.repo.store(),
>> +            );
>> +            let mut args = serde_json::json!({ "name": namespace.name() });
>> +            if !parent.is_root() {
>> +                args["parent"] = serde_json::to_value(parent.clone())?;
>> +            }
>> +            if let Err(err) = params.target.client.post(&api_path, Some(args)).await {
>> +                let target_store_and_ns =
>> +                    print_store_and_ns(params.target.repo.store(), &namespace);
>> +                bail!("sync into {target_store_and_ns} failed - namespace creation failed: {err}");
>> +            }
>> +            parent.push(namespace.name())?;
> 
> this tries to create every prefix of the missing namespace, instead of
> just the missing lower end of the hierarchy.. which is currently fine,
> since the create_namespace API endpoint doesn't fail if the namespace
> already exists, but since we already have a list of existing namespaces
> here, we could make it more future proof (and a bit faster ;)) by
> skipping those..

Added the additional checks so that the API calls are skipped for 
namespace components that already exist on the target.
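Roughly (sketch; `create_target_namespace_component` is just a
placeholder for the POST request shown in the hunk above):

```
let mut parent = BackupNamespace::root();
for component in namespace.components() {
    let mut current = parent.clone();
    current.push(component.to_string())?;
    // only issue the create call for components not yet present on the target
    if !target_namespaces.contains(&current) {
        create_target_namespace_component(params, &parent, component).await?;
    }
    parent = current;
}
```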

>> +        }
>> +
>> +        created = true;
>> +    }
>> +
>> +    Ok(created)
>> +}
>> +
>> +/// Push contents of source datastore matched by given push parameters to target.
>> +pub(crate) async fn push_store(mut params: PushParameters) -> Result<SyncStats, Error> {
>> +    let mut errors = false;
>> +
>> +    // Generate list of source namespaces to push to target, limited by max-depth
>> +    let mut namespaces = params.source.list_namespaces(&mut params.max_depth).await?;
>> +
>> +    check_namespace_depth_limit(&params.source.get_ns(), &params.target.ns, &namespaces)?;
>> +
>> +    namespaces.sort_unstable_by_key(|a| a.name_len());
>> +
>> +    // Fetch all accessible namespaces already present on the target
>> +    let target_namespaces = fetch_target_namespaces(&params).await?;
>> +    // Remember synced namespaces, removing non-synced ones when remove vanished flag is set
>> +    let mut synced_namespaces = HashSet::with_capacity(namespaces.len());
>> +
>> +    let (mut groups, mut snapshots) = (0, 0);
>> +    let mut stats = SyncStats::default();
>> +    for namespace in namespaces {
>> +        let source_store_and_ns = print_store_and_ns(params.source.store.name(), &namespace);
>> +        let target_namespace = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +        let target_store_and_ns = print_store_and_ns(params.target.repo.store(), &target_namespace);
>> +
>> +        info!("----");
>> +        info!("Syncing {source_store_and_ns} into {target_store_and_ns}");
>> +
>> +        synced_namespaces.insert(target_namespace.clone());
>> +
>> +        match check_or_create_target_namespace(&params, &target_namespaces, &target_namespace).await
>> +        {
>> +            Ok(true) => info!("Created namespace {target_namespace}"),
>> +            Ok(false) => {}
>> +            Err(err) => {
>> +                info!("Cannot sync {source_store_and_ns} into {target_store_and_ns} - {err}");
>> +                errors = true;
>> +                continue;
>> +            }
>> +        }
>> +
>> +        match push_namespace(&namespace, &params).await {
>> +            Ok((sync_progress, sync_stats, sync_errors)) => {
>> +                errors |= sync_errors;
>> +                stats.add(sync_stats);
>> +
>> +                if params.max_depth != Some(0) {
>> +                    groups += sync_progress.done_groups;
>> +                    snapshots += sync_progress.done_snapshots;
>> +
>> +                    let ns = if namespace.is_root() {
>> +                        "root namespace".into()
>> +                    } else {
>> +                        format!("namespace {namespace}")
>> +                    };
>> +                    info!(
>> +                        "Finished syncing {ns}, current progress: {groups} groups, {snapshots} snapshots"
>> +                    );
>> +                }
>> +            }
>> +            Err(err) => {
>> +                errors = true;
>> +                info!("Encountered errors while syncing namespace {namespace} - {err}");
>> +            }
>> +        }
>> +    }
>> +
>> +    if params.remove_vanished {
>> +        for target_namespace in target_namespaces {
>> +            if synced_namespaces.contains(&target_namespace) {
>> +                continue;
>> +            }
>> +            if let Err(err) = remove_target_namespace(&params, &target_namespace).await {
>> +                info!("failed to remove vanished namespace {target_namespace} - {err}");
>> +                continue;
>> +            }
>> +            info!("removed vanished namespace {target_namespace}");
>> +        }
>> +    }
>> +
>> +    if errors {
>> +        bail!("sync failed with some errors.");
>> +    }
>> +
>> +    Ok(stats)
>> +}
>> +
>> +/// Push namespace including all backup groups to target
>> +///
>> +/// Iterate over all backup groups in the namespace and push them to the target.
>> +pub(crate) async fn push_namespace(
>> +    namespace: &BackupNamespace,
>> +    params: &PushParameters,
>> +) -> Result<(StoreProgress, SyncStats, bool), Error> {
>> +    // Check if user is allowed to perform backups on remote datastore
>> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_BACKUP)
>> +        .map_err(|err| format_err!("Pushing to remote not allowed - {err}"))?;
>> +
>> +    let mut list: Vec<BackupGroup> = params
>> +        .source
>> +        .list_groups(namespace, &params.local_user)
>> +        .await?;
>> +
>> +    list.sort_unstable_by(|a, b| {
>> +        let type_order = a.ty.cmp(&b.ty);
>> +        if type_order == Ordering::Equal {
>> +            a.id.cmp(&b.id)
>> +        } else {
>> +            type_order
>> +        }
>> +    });
>> +
>> +    let total = list.len();
>> +    let list: Vec<BackupGroup> = list
>> +        .into_iter()
>> +        .filter(|group| group.apply_filters(&params.group_filter))
>> +        .collect();
>> +
>> +    info!(
>> +        "found {filtered} groups to sync (out of {total} total)",
>> +        filtered = list.len()
>> +    );
>> +
>> +    let target_groups = if params.remove_vanished {
>> +        fetch_target_groups(params, namespace).await?
>> +    } else {
>> +        // avoid fetching of groups, not required if remove vanished not set
>> +        Vec::new()
>> +    };
> 
> should we then fetch them below in the if remove_vanished branch, like
> we do when handling snapshots?

Yes, I do not recall exactly why I fetched the groups already here; it 
might be a leftover from an earlier iteration of the implementation.

As this is only ever used for the remove-vanished case, I moved the 
call there.
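i.e. roughly (sketch):

```
if params.remove_vanished {
    // fetch the groups present on the target only when they are actually needed
    let target_groups = fetch_target_groups(params, namespace).await?;
    for target_group in target_groups {
        if synced_groups.contains(&target_group)
            || !target_group.apply_filters(&params.group_filter)
        {
            continue;
        }
        // ... group removal and stats accounting as in the hunk above
    }
}
```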

>> +
>> +    let mut errors = false;
>> +    // Remember synced groups, remove others when the remove vanished flag is set
>> +    let mut synced_groups = HashSet::new();
>> +    let mut progress = StoreProgress::new(list.len() as u64);
>> +    let mut stats = SyncStats::default();
>> +
>> +    for (done, group) in list.into_iter().enumerate() {
>> +        progress.done_groups = done as u64;
>> +        progress.done_snapshots = 0;
>> +        progress.group_snapshots = 0;
>> +        synced_groups.insert(group.clone());
>> +
>> +        match push_group(params, namespace, &group, &mut progress).await {
>> +            Ok(sync_stats) => stats.add(sync_stats),
>> +            Err(err) => {
>> +                info!("sync group '{group}' failed  - {err}");
>> +                errors = true;
>> +            }
>> +        }
>> +    }
>> +
>> +    if params.remove_vanished {
>> +        for target_group in target_groups {
>> +            if synced_groups.contains(&target_group) {
>> +                continue;
>> +            }
>> +            if !target_group.apply_filters(&params.group_filter) {
>> +                continue;
>> +            }
>> +
>> +            info!("delete vanished group '{target_group}'");
>> +
>> +            let count_before = match fetch_target_groups(params, namespace).await {
>> +                Ok(snapshots) => snapshots.len(),
>> +                Err(_err) => 0, // ignore errors
>> +            };
>> +
>> +            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
>> +                info!("{err}");
>> +                errors = true;
>> +                continue;
>> +            }
>> +
>> +            let mut count_after = match fetch_target_groups(params, namespace).await {
>> +                Ok(snapshots) => snapshots.len(),
>> +                Err(_err) => 0, // ignore errors
>> +            };
>> +
>> +            let deleted_groups = if count_after > 0 {
>> +                info!("kept some protected snapshots of group '{target_group}'");
>> +                0
>> +            } else {
>> +                1
>> +            };
>> +
>> +            if count_after > count_before {
>> +                count_after = count_before;
>> +            }
>> +
>> +            stats.add(SyncStats::from(RemovedVanishedStats {
>> +                snapshots: count_before - count_after,
>> +                groups: deleted_groups,
>> +                namespaces: 0,
>> +            }));
>> +        }
>> +    }
>> +
>> +    Ok((progress, stats, errors))
>> +}
>> +
>> +async fn fetch_target_snapshots(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    group: &BackupGroup,
>> +) -> Result<Vec<SnapshotListItem>, Error> {
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/snapshots",
>> +        store = params.target.repo.store(),
>> +    );
>> +    let mut args = serde_json::to_value(group)?;
>> +    if !namespace.is_root() {
>> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +        args["ns"] = serde_json::to_value(target_ns)?;
>> +    }
>> +    let mut result = params.target.client.get(&api_path, Some(args)).await?;
>> +    let snapshots: Vec<SnapshotListItem> = serde_json::from_value(result["data"].take())?;
>> +
>> +    Ok(snapshots)
>> +}
>> +
>> +async fn fetch_previous_backup_time(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    group: &BackupGroup,
>> +) -> Result<Option<i64>, Error> {
>> +    let mut snapshots = fetch_target_snapshots(params, namespace, group).await?;
>> +    snapshots.sort_unstable_by(|a, b| a.backup.time.cmp(&b.backup.time));
>> +    Ok(snapshots.last().map(|snapshot| snapshot.backup.time))
>> +}
>> +
>> +async fn forget_target_snapshot(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    snapshot: &BackupDir,
>> +) -> Result<(), Error> {
>> +    check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>> +        .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
>> +
>> +    let api_path = format!(
>> +        "api2/json/admin/datastore/{store}/snapshots",
>> +        store = params.target.repo.store(),
>> +    );
>> +    let mut args = serde_json::to_value(snapshot)?;
>> +    if !namespace.is_root() {
>> +        let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +        args["ns"] = serde_json::to_value(target_ns)?;
>> +    }
>> +    params.target.client.delete(&api_path, Some(args)).await?;
>> +
>> +    Ok(())
>> +}
>> +
>> +/// Push group including all snaphshots to target
>> +///
>> +/// Iterate over all snapshots in the group and push them to the target.
>> +/// The group sync operation consists of the following steps:
>> +/// - Query snapshots of given group from the source
>> +/// - Sort snapshots by time
>> +/// - Apply transfer last cutoff and filters to list
>> +/// - Iterate the snapshot list and push each snapshot individually
>> +/// - (Optional): Remove vanished snapshots of the group if `remove_vanished` flag is set
>> +pub(crate) async fn push_group(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    group: &BackupGroup,
>> +    progress: &mut StoreProgress,
>> +) -> Result<SyncStats, Error> {
>> +    let mut already_synced_skip_info = SkipInfo::new(SkipReason::AlreadySynced);
>> +    let mut transfer_last_skip_info = SkipInfo::new(SkipReason::TransferLast);
>> +
>> +    let mut snapshots: Vec<BackupDir> = params.source.list_backup_dirs(namespace, group).await?;
>> +    snapshots.sort_unstable_by(|a, b| a.time.cmp(&b.time));
>> +
>> +    let total_snapshots = snapshots.len();
>> +    let cutoff = params
>> +        .transfer_last
>> +        .map(|count| total_snapshots.saturating_sub(count))
>> +        .unwrap_or_default();
>> +
>> +    let last_snapshot_time = fetch_previous_backup_time(params, namespace, group)
>> +        .await?
>> +        .unwrap_or(i64::MIN);
>> +
>> +    let mut source_snapshots = HashSet::new();
>> +    let snapshots: Vec<BackupDir> = snapshots
>> +        .into_iter()
>> +        .enumerate()
>> +        .filter(|&(pos, ref snapshot)| {
>> +            source_snapshots.insert(snapshot.time);
>> +            if last_snapshot_time > snapshot.time {
>> +                already_synced_skip_info.update(snapshot.time);
>> +                return false;
>> +            } else if already_synced_skip_info.count > 0 {
>> +                info!("{already_synced_skip_info}");
>> +                already_synced_skip_info.reset();
>> +                return true;
>> +            }
>> +
>> +            if pos < cutoff && last_snapshot_time != snapshot.time {
>> +                transfer_last_skip_info.update(snapshot.time);
>> +                return false;
>> +            } else if transfer_last_skip_info.count > 0 {
>> +                info!("{transfer_last_skip_info}");
>> +                transfer_last_skip_info.reset();
>> +            }
>> +            true
>> +        })
>> +        .map(|(_, dir)| dir)
>> +        .collect();
>> +
>> +    progress.group_snapshots = snapshots.len() as u64;
>> +
>> +    let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
>> +    let target_snapshots: Vec<BackupDir> = target_snapshots
>> +        .into_iter()
>> +        .map(|snapshot| snapshot.backup)
>> +        .collect();
>> +
>> +    let mut stats = SyncStats::default();
>> +    for (pos, source_snapshot) in snapshots.into_iter().enumerate() {
>> +        if target_snapshots.contains(&source_snapshot) {
>> +            progress.done_snapshots = pos as u64 + 1;
>> +            info!("percentage done: {progress}");
>> +            continue;
>> +        }
>> +        let result = push_snapshot(params, namespace, &source_snapshot).await;
>> +
>> +        progress.done_snapshots = pos as u64 + 1;
>> +        info!("percentage done: {progress}");
>> +
>> +        // stop on error
>> +        let sync_stats = result?;
>> +        stats.add(sync_stats);
>> +    }
>> +
>> +    if params.remove_vanished {
>> +        let target_snapshots = fetch_target_snapshots(params, namespace, group).await?;
>> +        for snapshot in target_snapshots {
>> +            if source_snapshots.contains(&snapshot.backup.time) {
>> +                continue;
>> +            }
>> +            if snapshot.protected {
>> +                info!(
>> +                    "don't delete vanished snapshot {name} (protected)",
>> +                    name = snapshot.backup
>> +                );
>> +                continue;
>> +            }
>> +            if let Err(err) = forget_target_snapshot(params, namespace, &snapshot.backup).await {
>> +                info!(
>> +                    "could not delete vanished snapshot {name} - {err}",
>> +                    name = snapshot.backup
>> +                );
>> +                continue;
>> +            }
>> +            info!("delete vanished snapshot {name}", name = snapshot.backup);
>> +            stats.add(SyncStats::from(RemovedVanishedStats {
>> +                snapshots: 1,
>> +                groups: 0,
>> +                namespaces: 0,
>> +            }));
>> +        }
>> +    }
>> +
>> +    Ok(stats)
>> +}
>> +
>> +/// Push snapshot to target
>> +///
>> +/// Creates a new snapshot on the target and pushes the content of the source snapshot to the
>> +/// target by creating a new manifest file and connecting to the remote as backup writer client.
>> +/// Chunks are written by recreating the index by uploading the chunk stream as read from the
>> +/// source. Data blobs are uploaded as such.
>> +pub(crate) async fn push_snapshot(
>> +    params: &PushParameters,
>> +    namespace: &BackupNamespace,
>> +    snapshot: &BackupDir,
>> +) -> Result<SyncStats, Error> {
>> +    let mut stats = SyncStats::default();
>> +    let target_ns = namespace.map_prefix(&params.source.ns, &params.target.ns)?;
>> +    let backup_dir = params
>> +        .source
>> +        .store
>> +        .backup_dir(params.source.ns.clone(), snapshot.clone())?;
>> +
>> +    let reader = params.source.reader(namespace, snapshot).await?;
>> +
>> +    // Load the source manifest, needed to find crypt mode for files
>> +    let mut tmp_source_manifest_path = backup_dir.full_path();
>> +    tmp_source_manifest_path.push(MANIFEST_BLOB_NAME);
>> +    tmp_source_manifest_path.set_extension("tmp");
>> +    let source_manifest = if let Some(manifest_blob) = reader
>> +        .load_file_into(MANIFEST_BLOB_NAME, &tmp_source_manifest_path)
>> +        .await?
>> +    {
>> +        BackupManifest::try_from(manifest_blob)?
> 
> why do we copy the manifest into a .tmp file path here instead of just
> reading it? and if we need the copy, who's cleaning it up?

Switched this over to reading the manifest directly, as you are right 
and the temp file is not required at all. I was primed by the pull 
implementation, where it is required since it reads from the remote. 
This makes the code quite a bit cleaner and more concise.
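
For reference, a rough sketch of the direct read (assuming a 
`load_manifest()`-style helper on the snapshot; not necessarily the 
exact code of the follow-up):

```
// sketch only: load the manifest straight from the source snapshot
// instead of writing a temporary .tmp copy first; skip the snapshot if
// it has none.
let source_manifest = match backup_dir.load_manifest() {
    Ok((manifest, _size)) => manifest,
    Err(err) => {
        info!("skipping snapshot without manifest - {err}");
        return Ok(stats);
    }
};
```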

> 
>> +    } else {
>> +        // no manifest in snapshot, skip
>> +        return Ok(stats);
>> +    };
>> +
>> +    // Manifest to be created on target, referencing all the source archives after upload.
>> +    let mut manifest = BackupManifest::new(snapshot.clone());
>> +
>> +    // writer instance locks the snapshot on the remote side
>> +    let backup_writer = BackupWriter::start(
>> +        &params.target.client,
>> +        None,
>> +        params.target.repo.store(),
>> +        &target_ns,
>> +        snapshot,
>> +        false,
>> +        false,
>> +    )
>> +    .await?;
>> +
>> +    // Use manifest of previous snapshots in group on target for chunk upload deduplication
>> +    let previous_manifest = match backup_writer.download_previous_manifest().await {
>> +        Ok(manifest) => Some(Arc::new(manifest)),
>> +        Err(err) => {
>> +            log::info!("Could not download previous manifest - {err}");
>> +            None
>> +        }
>> +    };
> 
> this should not be attempted for the first snapshot in a group, else it
> does requests we already know will fail and spams the log as a result as
> well..

Agreed, added a check so that the previous manifest is only fetched if 
there are snapshots present for this backup group on the target.
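
A minimal sketch of what such a check could look like, reusing helpers 
already present in this patch (the actual follow-up may differ):

```
// sketch: only ask the target for a previous manifest if the group
// already has snapshots there, avoiding requests known to fail.
let target_snapshots = fetch_target_snapshots(params, namespace, &snapshot.group).await?;
let previous_manifest = if target_snapshots.is_empty() {
    None
} else {
    match backup_writer.download_previous_manifest().await {
        Ok(manifest) => Some(Arc::new(manifest)),
        Err(err) => {
            log::info!("Could not download previous manifest - {err}");
            None
        }
    }
};
```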

>> +
>> +    let upload_options = UploadOptions {
>> +        compress: true,
>> +        encrypt: false,
> 
> this might warrant a comment why it's okay to do that (I hope it is? ;))

Ack, added some details on why the upload options are set up like this. 
Basically this is possible since the backup writer stream code 
performing compression and encryption is bypassed by using the 
`upload_index_chunk_info` method.
Therefore, even for encrypted and compressed backups, the chunks can be 
uploaded exactly as read from the source. The relevant parts need to be 
included in the manifest, however, so the snapshot remains restorable.

Compression is set to true so that the blob upload of e.g. the backup 
log file (if present) will be compressed.
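
To illustrate, the options from the hunk above could be annotated like 
this (same values, just with the reasoning spelled out as comments):

```
let upload_options = UploadOptions {
    // compress blob uploads such as the client log; index chunks bypass
    // this code path via `upload_index_chunk_info` and are uploaded as
    // stored on the source.
    compress: true,
    // chunks are pushed exactly as read from the source, keeping their
    // original (possibly encrypted) form; the writer must not re-encrypt.
    encrypt: false,
    previous_manifest,
    ..UploadOptions::default()
};
```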

>> +        previous_manifest,
>> +        ..UploadOptions::default()
>> +    };
>> +
>> +    // Avoid double upload penalty by remembering already seen chunks
>> +    let known_chunks = Arc::new(Mutex::new(HashSet::with_capacity(1024 * 1024)));
>> +
>> +    for entry in source_manifest.files() {
>> +        let mut path = backup_dir.full_path();
>> +        path.push(&entry.filename);
>> +        if path.try_exists()? {
>> +            match ArchiveType::from_path(&entry.filename)? {
>> +                ArchiveType::Blob => {
>> +                    let file = std::fs::File::open(path.clone())?;
>> +                    let backup_stats = backup_writer.upload_blob(file, &entry.filename).await?;
>> +                    manifest.add_file(
>> +                        entry.filename.to_string(),
>> +                        backup_stats.size,
>> +                        backup_stats.csum,
>> +                        entry.chunk_crypt_mode(),
>> +                    )?;
>> +                    stats.add(SyncStats {
>> +                        chunk_count: backup_stats.chunk_count as usize,
>> +                        bytes: backup_stats.size as usize,
>> +                        elapsed: backup_stats.duration,
>> +                        removed: None,
>> +                    });
>> +                }
>> +                ArchiveType::DynamicIndex => {
>> +                    let index = DynamicIndexReader::open(&path)?;
>> +                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
>> +                    let sync_stats = push_index(
>> +                        &entry.filename,
>> +                        index,
>> +                        chunk_reader,
>> +                        &backup_writer,
>> +                        &mut manifest,
>> +                        entry.chunk_crypt_mode(),
>> +                        None,
>> +                        &known_chunks,
>> +                    )
>> +                    .await?;
>> +                    stats.add(sync_stats);
>> +                }
>> +                ArchiveType::FixedIndex => {
>> +                    let index = FixedIndexReader::open(&path)?;
>> +                    let chunk_reader = reader.chunk_reader(entry.chunk_crypt_mode());
>> +                    let size = index.index_bytes();
>> +                    let sync_stats = push_index(
>> +                        &entry.filename,
>> +                        index,
>> +                        chunk_reader,
>> +                        &backup_writer,
>> +                        &mut manifest,
>> +                        entry.chunk_crypt_mode(),
>> +                        Some(size),
>> +                        &known_chunks,
>> +                    )
>> +                    .await?;
>> +                    stats.add(sync_stats);
>> +                }
>> +            }
>> +        } else {
>> +            info!("{path:?} does not exist, skipped.");
>> +        }
>> +    }
>> +
>> +    // Fetch client log from source and push to target
>> +    // this has to be handled individually since the log is never part of the manifest
>> +    let mut client_log_path = backup_dir.full_path();
>> +    client_log_path.push(CLIENT_LOG_BLOB_NAME);
>> +    if client_log_path.is_file() {
>> +        backup_writer
>> +            .upload_blob_from_file(
>> +                &client_log_path,
>> +                CLIENT_LOG_BLOB_NAME,
>> +                upload_options.clone(),
>> +            )
>> +            .await?;
>> +    } else {
>> +        info!("Client log at {client_log_path:?} does not exist or is not a file, skipped.");
>> +    }
> 
> I am not sure this warrants a log line.. the client log is optional
> after all, so this can happen quite a lot in practice (e.g., if you do
> host backups without bothering to upload logs..)
> 
> I think we should follow the logic of pull based syncing here - add a
> log to the last previously synced snapshot if it exists and is missing
> on the other end, otherwise only attempt to upload a log if it exists
> without logging its absence.

Dropped the log line for now, as fetching the info from the previous 
snapshot on the target just to display this log line seemed rather 
inefficient.
>> +
>> +    // Rewrite manifest for pushed snapshot, re-adding the existing fingerprint and signature
>> +    let mut manifest_json = serde_json::to_value(manifest)?;
>> +    manifest_json["unprotected"] = source_manifest.unprotected;
>> +    if let Some(signature) = source_manifest.signature {
>> +        manifest_json["signature"] = serde_json::to_value(signature)?;
>> +    }
>> +    let manifest_string = serde_json::to_string_pretty(&manifest_json).unwrap();
> 
> couldn't we just upload the original manifest here?

Yes, this will be done instead. It works even though the backup finish 
call (over-)writes some values (`chunk_upload_stats`), which is why I 
had implemented this differently at first.
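
Not necessarily the exact follow-up code, but following the same pattern 
as the blob uploads above, that could look roughly like:

```
// sketch: upload the source manifest blob unchanged, so its signature
// and `unprotected` data are preserved on the target.
let mut manifest_path = backup_dir.full_path();
manifest_path.push(MANIFEST_BLOB_NAME);
let manifest_file = std::fs::File::open(&manifest_path)?;
backup_writer
    .upload_blob(manifest_file, MANIFEST_BLOB_NAME)
    .await?;
```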

>> +    let backup_stats = backup_writer
>> +        .upload_blob_from_data(
>> +            manifest_string.into_bytes(),
>> +            MANIFEST_BLOB_NAME,
>> +            upload_options,
>> +        )
>> +        .await?;
>> +    backup_writer.finish().await?;
>> +
>> +    stats.add(SyncStats {
>> +        chunk_count: backup_stats.chunk_count as usize,
>> +        bytes: backup_stats.size as usize,
>> +        elapsed: backup_stats.duration,
>> +        removed: None,
>> +    });
>> +
>> +    Ok(stats)
>> +}
>> +
>> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
>> +//
>> +// For fixed indexes, the size must be provided as given by the index reader.
>> +#[allow(clippy::too_many_arguments)]
>> +async fn push_index<'a>(
>> +    filename: &'a str,
>> +    index: impl IndexFile + Send + 'static,
>> +    chunk_reader: Arc<dyn AsyncReadChunk>,
>> +    backup_writer: &BackupWriter,
>> +    manifest: &mut BackupManifest,
>> +    crypt_mode: CryptMode,
>> +    size: Option<u64>,
>> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
>> +) -> Result<SyncStats, Error> {
>> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
>> +    let mut chunk_infos =
>> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
> 
> so this iterates over all the chunks in the index..
> 
>> +
>> +    tokio::spawn(async move {
>> +        while let Some(chunk_info) = chunk_infos.next().await {
>> +            let chunk_info = chunk_reader
>> +                .read_raw_chunk(&chunk_info.digest)
> 
> and this reads them
> 
>> +                .await
>> +                .map(|chunk| ChunkInfo {
>> +                    chunk,
>> +                    digest: chunk_info.digest,
>> +                    chunk_len: chunk_info.size(),
>> +                    offset: chunk_info.range.start,
>> +                });
>> +            let _ = upload_channel_tx.send(chunk_info).await;
> 
> and sends them further along to the upload code.. which will then (in
> many cases) throw away all that data we just read because it's already
> on the target and we know that because of the previous manifest..
> 
> wouldn't it be better to deduplicate here already, and instead of
> reading known chunks over and over again, just tell the server to
> re-register them? or am I missing something here? :)

Good catch, this is indeed a possible huge performance bottleneck!

Fixed this by moving the known chunks check here (as suggested) and 
streaming a `MergedChunkInfo` instead of a `ChunkInfo`, which allows 
sending only the chunk's digest and size over to the backup writer. 
This way, known chunks are never read.
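
Roughly, the dedup in the reader task could look like this (a simplified 
sketch using the `MergedChunkInfo` type of the backup writer, not the 
literal follow-up patch):

```
// sketch: check the shared known-chunk set before reading; for known
// chunks only digest and offset are forwarded so the server can
// re-register them without re-uploading the data.
while let Some(entry) = chunk_infos.next().await {
    let is_known = {
        let mut known = known_chunks.lock().unwrap();
        !known.insert(entry.digest)
    };
    let merged = if is_known {
        Ok(MergedChunkInfo::Known(vec![(entry.range.start, entry.digest)]))
    } else {
        chunk_reader.read_raw_chunk(&entry.digest).await.map(|chunk| {
            MergedChunkInfo::New(ChunkInfo {
                chunk,
                digest: entry.digest,
                chunk_len: entry.size(),
                offset: entry.range.start,
            })
        })
    };
    let _ = upload_channel_tx.send(merged).await;
}
```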

> 
>> +        }
>> +    });
>> +
>> +    let chunk_info_stream = ReceiverStream::new(upload_channel_rx).map_err(Error::from);
>> +
>> +    let upload_options = UploadOptions {
>> +        compress: true,
>> +        encrypt: false,
>> +        fixed_size: size,
>> +        ..UploadOptions::default()
>> +    };
>> +
>> +    let upload_stats = backup_writer
>> +        .upload_index_chunk_info(
>> +            filename,
>> +            chunk_info_stream,
>> +            upload_options,
>> +            known_chunks.clone(),
>> +        )
>> +        .await?;
>> +
>> +    manifest.add_file(
>> +        filename.to_string(),
>> +        upload_stats.size,
>> +        upload_stats.csum,
>> +        crypt_mode,
>> +    )?;
>> +
>> +    Ok(SyncStats {
>> +        chunk_count: upload_stats.chunk_count as usize,
>> +        bytes: upload_stats.size as usize,
>> +        elapsed: upload_stats.duration,
>> +        removed: None,
>> +    })
>> +}
>> -- 
>> 2.39.2


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter
  2024-10-14  9:25       ` Fabian Grünbichler
@ 2024-10-14  9:36         ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  9:36 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox Backup Server development discussion

On 10/14/24 11:25, Fabian Grünbichler wrote:
> On October 14, 2024 10:10 am, Christian Ebner wrote:
>> On 10/10/24 16:48, Fabian Grünbichler wrote:
>>
>> What do you mean by renamed here?
> 
> there's two "list_sync_jobs" fns, which can be confusing when reading
> code/patches..

Okay, will adapt that as well then, thanks for clarification!


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-10-14  9:32     ` Christian Ebner
@ 2024-10-14  9:41       ` Fabian Grünbichler
  2024-10-14  9:53         ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-14  9:41 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion


> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
> On 10/10/24 16:48, Fabian Grünbichler wrote:
> >> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
> >> +//
> >> +// For fixed indexes, the size must be provided as given by the index reader.
> >> +#[allow(clippy::too_many_arguments)]
> >> +async fn push_index<'a>(
> >> +    filename: &'a str,
> >> +    index: impl IndexFile + Send + 'static,
> >> +    chunk_reader: Arc<dyn AsyncReadChunk>,
> >> +    backup_writer: &BackupWriter,
> >> +    manifest: &mut BackupManifest,
> >> +    crypt_mode: CryptMode,
> >> +    size: Option<u64>,
> >> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
> >> +) -> Result<SyncStats, Error> {
> >> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
> >> +    let mut chunk_infos =
> >> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
> > 
> > so this iterates over all the chunks in the index..
> > 
> >> +
> >> +    tokio::spawn(async move {
> >> +        while let Some(chunk_info) = chunk_infos.next().await {
> >> +            let chunk_info = chunk_reader
> >> +                .read_raw_chunk(&chunk_info.digest)
> > 
> > and this reads them
> > 
> >> +                .await
> >> +                .map(|chunk| ChunkInfo {
> >> +                    chunk,
> >> +                    digest: chunk_info.digest,
> >> +                    chunk_len: chunk_info.size(),
> >> +                    offset: chunk_info.range.start,
> >> +                });
> >> +            let _ = upload_channel_tx.send(chunk_info).await;
> > 
> > and sends them further along to the upload code.. which will then (in
> > many cases) throw away all that data we just read because it's already
> > on the target and we know that because of the previous manifest..
> > 
> > wouldn't it be better to deduplicate here already, and instead of
> > reading known chunks over and over again, just tell the server to
> > re-register them? or am I missing something here? :)
> 
> Good catch, this is indeed a possible huge performance bottleneck!
> 
> Did fix this by moving the known chunks check here (as suggested) and 
> stream a `MergedChunkInfo` instead of `ChunkInfo`, which allows to only 
> send the chunk's digest and size over to the backup writer. By this 
> known chunk are never read.

this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-10-14  9:41       ` Fabian Grünbichler
@ 2024-10-14  9:53         ` Christian Ebner
  2024-10-14 10:01           ` Fabian Grünbichler
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-10-14  9:53 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox Backup Server development discussion

On 10/14/24 11:41, Fabian Grünbichler wrote:
> 
>> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
>> On 10/10/24 16:48, Fabian Grünbichler wrote:
>>>> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
>>>> +//
>>>> +// For fixed indexes, the size must be provided as given by the index reader.
>>>> +#[allow(clippy::too_many_arguments)]
>>>> +async fn push_index<'a>(
>>>> +    filename: &'a str,
>>>> +    index: impl IndexFile + Send + 'static,
>>>> +    chunk_reader: Arc<dyn AsyncReadChunk>,
>>>> +    backup_writer: &BackupWriter,
>>>> +    manifest: &mut BackupManifest,
>>>> +    crypt_mode: CryptMode,
>>>> +    size: Option<u64>,
>>>> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
>>>> +) -> Result<SyncStats, Error> {
>>>> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
>>>> +    let mut chunk_infos =
>>>> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
>>>
>>> so this iterates over all the chunks in the index..
>>>
>>>> +
>>>> +    tokio::spawn(async move {
>>>> +        while let Some(chunk_info) = chunk_infos.next().await {
>>>> +            let chunk_info = chunk_reader
>>>> +                .read_raw_chunk(&chunk_info.digest)
>>>
>>> and this reads them
>>>
>>>> +                .await
>>>> +                .map(|chunk| ChunkInfo {
>>>> +                    chunk,
>>>> +                    digest: chunk_info.digest,
>>>> +                    chunk_len: chunk_info.size(),
>>>> +                    offset: chunk_info.range.start,
>>>> +                });
>>>> +            let _ = upload_channel_tx.send(chunk_info).await;
>>>
>>> and sends them further along to the upload code.. which will then (in
>>> many cases) throw away all that data we just read because it's already
>>> on the target and we know that because of the previous manifest..
>>>
>>> wouldn't it be better to deduplicate here already, and instead of
>>> reading known chunks over and over again, just tell the server to
>>> re-register them? or am I missing something here? :)
>>
>> Good catch, this is indeed a possible huge performance bottleneck!
>>
>> Did fix this by moving the known chunks check here (as suggested) and
>> stream a `MergedChunkInfo` instead of `ChunkInfo`, which allows to only
>> send the chunk's digest and size over to the backup writer. By this
>> known chunk are never read.
> 
> this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)

Well, unfortunately not directly. Here the chunk is only sent via the 
channel to the backup writer, which itself further processes the stream 
and transforms it into futures for the upload requests. So the actual 
failure will happen there...

In order to upload corrupt/missing chunks I think a different, 
completely decoupled approach might be better (at least for the sync).

E.g. the backup writer could get access to a chunk buffer which keeps 
the chunks in memory until the upload was successful. This could also 
include lookup mechanisms into the local/remote chunk store, or 
re-generating required chunks by reading from the block device in case 
of dirty bitmap tracking..

The re-upload in case of reused payload chunks for `ppxar` archives is 
trickier...


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-10-14  9:53         ` Christian Ebner
@ 2024-10-14 10:01           ` Fabian Grünbichler
  2024-10-14 10:15             ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-14 10:01 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion


> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:53 CEST geschrieben:
> 
>  
> On 10/14/24 11:41, Fabian Grünbichler wrote:
> > 
> >> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
> >> On 10/10/24 16:48, Fabian Grünbichler wrote:
> >>>> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
> >>>> +//
> >>>> +// For fixed indexes, the size must be provided as given by the index reader.
> >>>> +#[allow(clippy::too_many_arguments)]
> >>>> +async fn push_index<'a>(
> >>>> +    filename: &'a str,
> >>>> +    index: impl IndexFile + Send + 'static,
> >>>> +    chunk_reader: Arc<dyn AsyncReadChunk>,
> >>>> +    backup_writer: &BackupWriter,
> >>>> +    manifest: &mut BackupManifest,
> >>>> +    crypt_mode: CryptMode,
> >>>> +    size: Option<u64>,
> >>>> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
> >>>> +) -> Result<SyncStats, Error> {
> >>>> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
> >>>> +    let mut chunk_infos =
> >>>> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
> >>>
> >>> so this iterates over all the chunks in the index..
> >>>
> >>>> +
> >>>> +    tokio::spawn(async move {
> >>>> +        while let Some(chunk_info) = chunk_infos.next().await {
> >>>> +            let chunk_info = chunk_reader
> >>>> +                .read_raw_chunk(&chunk_info.digest)
> >>>
> >>> and this reads them
> >>>
> >>>> +                .await
> >>>> +                .map(|chunk| ChunkInfo {
> >>>> +                    chunk,
> >>>> +                    digest: chunk_info.digest,
> >>>> +                    chunk_len: chunk_info.size(),
> >>>> +                    offset: chunk_info.range.start,
> >>>> +                });
> >>>> +            let _ = upload_channel_tx.send(chunk_info).await;
> >>>
> >>> and sends them further along to the upload code.. which will then (in
> >>> many cases) throw away all that data we just read because it's already
> >>> on the target and we know that because of the previous manifest..
> >>>
> >>> wouldn't it be better to deduplicate here already, and instead of
> >>> reading known chunks over and over again, just tell the server to
> >>> re-register them? or am I missing something here? :)
> >>
> >> Good catch, this is indeed a possible huge performance bottleneck!
> >>
> >> Did fix this by moving the known chunks check here (as suggested) and
> >> stream a `MergedChunkInfo` instead of `ChunkInfo`, which allows to only
> >> send the chunk's digest and size over to the backup writer. By this
> >> known chunk are never read.
> > 
> > this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)
> 
> Well, unfortunately not directly. Here the chunk is only send via the 
> channel to the backup writer, which itself further processes the stream 
> and transforms it into futures for the upload requests. So the actual 
> failure will be there...
> 
> In order to upload corrupt/missing chunks I think a different, 
> completely decoupled approach might be better (at least for the sync).
> 
> E.g. the backup writer could get access to a chunk buffer, which keeps 
> the chunks in memory until the upload was successful. This could also 
> include possible lookup mechanisms to the local/remote chunk store or 
> re-generate required chunks by reading from the block device in case of 
> dirty bitmap tracking..
> 
> The re-upload in case of reused payload chunks for `ppxar` archives is 
> more tricky...

I mostly meant it in the sense that the source chunk doesn't disappear while we are uploading (as opposed to a regular backup, where the input data might have changed in the meantime), so we can go back and retrieve it without the need to always read it and keep it in memory. It would definitely require propagating the information about which chunks need to be uploaded despite being "known" back to the client in some fashion.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
  2024-10-14 10:01           ` Fabian Grünbichler
@ 2024-10-14 10:15             ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-14 10:15 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox Backup Server development discussion

On 10/14/24 12:01, Fabian Grünbichler wrote:
> 
>> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:53 CEST geschrieben:
>>
>>   
>> On 10/14/24 11:41, Fabian Grünbichler wrote:
>>>
>>>> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
>>>> On 10/10/24 16:48, Fabian Grünbichler wrote:
>>>>>> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
>>>>>> +//
>>>>>> +// For fixed indexes, the size must be provided as given by the index reader.
>>>>>> +#[allow(clippy::too_many_arguments)]
>>>>>> +async fn push_index<'a>(
>>>>>> +    filename: &'a str,
>>>>>> +    index: impl IndexFile + Send + 'static,
>>>>>> +    chunk_reader: Arc<dyn AsyncReadChunk>,
>>>>>> +    backup_writer: &BackupWriter,
>>>>>> +    manifest: &mut BackupManifest,
>>>>>> +    crypt_mode: CryptMode,
>>>>>> +    size: Option<u64>,
>>>>>> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
>>>>>> +) -> Result<SyncStats, Error> {
>>>>>> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
>>>>>> +    let mut chunk_infos =
>>>>>> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
>>>>>
>>>>> so this iterates over all the chunks in the index..
>>>>>
>>>>>> +
>>>>>> +    tokio::spawn(async move {
>>>>>> +        while let Some(chunk_info) = chunk_infos.next().await {
>>>>>> +            let chunk_info = chunk_reader
>>>>>> +                .read_raw_chunk(&chunk_info.digest)
>>>>>
>>>>> and this reads them
>>>>>
>>>>>> +                .await
>>>>>> +                .map(|chunk| ChunkInfo {
>>>>>> +                    chunk,
>>>>>> +                    digest: chunk_info.digest,
>>>>>> +                    chunk_len: chunk_info.size(),
>>>>>> +                    offset: chunk_info.range.start,
>>>>>> +                });
>>>>>> +            let _ = upload_channel_tx.send(chunk_info).await;
>>>>>
>>>>> and sends them further along to the upload code.. which will then (in
>>>>> many cases) throw away all that data we just read because it's already
>>>>> on the target and we know that because of the previous manifest..
>>>>>
>>>>> wouldn't it be better to deduplicate here already, and instead of
>>>>> reading known chunks over and over again, just tell the server to
>>>>> re-register them? or am I missing something here? :)
>>>>
>>>> Good catch, this is indeed a possible huge performance bottleneck!
>>>>
>>>> Did fix this by moving the known chunks check here (as suggested) and
>>>> stream a `MergedChunkInfo` instead of `ChunkInfo`, which allows to only
>>>> send the chunk's digest and size over to the backup writer. By this
>>>> known chunk are never read.
>>>
>>> this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)
>>
>> Well, unfortunately not directly. Here the chunk is only send via the
>> channel to the backup writer, which itself further processes the stream
>> and transforms it into futures for the upload requests. So the actual
>> failure will be there...
>>
>> In order to upload corrupt/missing chunks I think a different,
>> completely decoupled approach might be better (at least for the sync).
>>
>> E.g. the backup writer could get access to a chunk buffer, which keeps
>> the chunks in memory until the upload was successful. This could also
>> include possible lookup mechanisms to the local/remote chunk store or
>> re-generate required chunks by reading from the block device in case of
>> dirty bitmap tracking..
>>
>> The re-upload in case of reused payload chunks for `ppxar` archives is
>> more tricky...
> 
> I mostly meant in the sense of - the source chunk doesn't disappear while we are uploading (as opposed to a regular backup, where the input data might have changed in the meantime), so we can go back and retrieve it without the need to always read it and keep it in memory. it would definitely require propagating the information which chunks need to be uploaded despite being "known" back to the client in some fashion.

Yeah, I think ideally there would be a generic interface which allows 
the client to re-generate a missing/corrupt chunk as reported by the 
server (on known chunk upload).

The details of how to re-generate or buffer such chunks should then be 
handled differently based on what the chunk source is (backup of 
streams, backup of filesystems, backup of ...., buffered chunks, syncs, 
...).
If not possible, the backup should simply fail...

Something like a

```
// trait and method names are just a sketch
trait RegenerateChunk {
    fn regenerate_known_chunk(&self, digest: [u8; 32]) -> Result<ChunkInfo, Error>;
}
```

which can be implemented as required?

But definitely out of scope for this patch series :)



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove
  2024-10-11  9:32   ` Fabian Grünbichler
@ 2024-10-14 10:24     ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-14 10:24 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/11/24 11:32, Fabian Grünbichler wrote:
> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> @@ -299,10 +302,9 @@ pub async fn delete_group(
>>   
>>           let delete_stats = datastore.remove_backup_group(&ns, &group)?;
>>           if !delete_stats.all_removed() {
>> -            bail!("group only partially deleted due to protected snapshots");
>> +            warn!("group only partially deleted due to protected snapshots");
> 
> this not only changes the return type (from nothing to something
> actionable, which is okay!) but also the behaviour..
> 
> right now with this series applied, if I remove a group with protected
> snapshots, I get no indication on the UI that it failed to do so, and
> the log message only ends up in journal since there is no task context
> here..
> 
> I think this would at least warrant opt-in for the new behaviour? in any
> case, the warning/error could probably be adapted to contain the counts
> at least, now that we have them ;)

Yeah, right, opt-in seems the best way to go, as otherwise getting back 
the stats for the sync job will not work..

Same for the other case...





_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [pbs-devel] partially-applied: [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (33 preceding siblings ...)
  2024-10-10 14:48 ` [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Fabian Grünbichler
@ 2024-10-14 11:04 ` Fabian Grünbichler
  2024-10-17 13:31 ` [pbs-devel] " Christian Ebner
  35 siblings, 0 replies; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-14 11:04 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

applied the initial seven refactoring patches 1-4 & 9-11, leaving out
the backup writer ones for now since those might still see some changes.

On September 12, 2024 4:32 pm, Christian Ebner wrote:
> This patch series implements the functionality to extend the current
> sync jobs in pull direction by an additional push direction, allowing
> to push contents of a local source datastore to a remote target.
> 
> The series implements this by using the REST API of the remote target
> for fetching, creating and/or deleting namespaces, groups and backups,
> and reuses the clients backup writer functionality to create snapshots
> by writing a manifest on the remote target and sync the fixed index,
> dynamic index or blobs contained in the source manifest to the remote,
> preserving also encryption information.
> 
> Thanks to Fabian for further feedback to the previous version of the
> patches, especially regarding users and ACLs.
> 
> Most notable changes since version 2 of the patch series include:
> - Add checks and extend roles and privs to allow for restricting a local
>   users access to remote datastore operations. In order to perform a
>   full sync in push direction, including permissions for namespace
>   creation and deleting contents with remove vanished, an acl.cfg looks
>   like below:
>   ```
>   acl:1:/datastore/datastore:syncoperator@pbs:DatastoreAudit
>   acl:1:/remote:syncoperator@pbs:RemoteSyncOperator
>   acl:1:/remote/local/pushme:syncoperator@pbs:RemoteDatastoreModify,RemoteDatastorePrune,RemoteSyncPushOperator
>   ```
>   Based on further feedback, privs might get further grouped or an
>   additional role containing most of these can be created.
> - Drop patch introducing `no-timestamp-check` flag for backup client, as pointed
>   out by Fabian this is not needed, as only backups newer than the currently
>   last available will be pushed.
> - Fix read snapshots from source by using the correct namespace.
> - Rename PullParameters `owner` to more fitting `local_user`.
> - Fix typos in remote sync push operator comment.
> - Fix comments not matching the functionality for the cli implementations.
> 
> The patch series is structured as follows in this version:
> - patch 1 is a cleanup patch fixing typos in api documentation.
> - patches 2 to 7 are patches restructuring the current code so that
>   functionality of the current pull implementation can be reused for
>   the push implementation as well.
> - patch 8 extends the backup writer's functionality to be able to push
>   snapshots to the target.
> - patches 9 to 11 are once again preparatory patches for shared
>   implementation of sync jobs in pull and push direction.
> - patches 12 to 14 define the required permission acls and roles.
> - patch 15 implements almost all of the logic required for the push,
>   including pushing of the datastore, namespace, groups and snapshots,
>   taking into account also filters and additional sync flags.
> - patch 16 extends the current sync job configuration by a new config
>   type `sync-push` allowing to configure sync jobs in push direction
>   while limiting possible misconfiguration errors.
> - patches 17 to 28 expose the new sync job direction via the API, CLI
>   and WebUI.
> - patches 29 to 33 finally are followup patches, changing the return
>   type for the backup group and namespace delete REST API endpoints
>   to return statistics on the deleted snapshots, groups and namespaces,
>   which are then used to include this information in the task log.
>   As this is an API breaking change, the patches are kept independent
>   from the other patches.
> 
> Link to issue on bugtracker:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3044
> 
> Christian Ebner (33):
>   api: datastore: add missing whitespace in description
>   server: sync: move sync related stats to common module
>   server: sync: move reader trait to common sync module
>   server: sync: move source to common sync module
>   client: backup writer: bundle upload stats counters
>   client: backup writer: factor out merged chunk stream upload
>   client: backup writer: add chunk count and duration stats
>   client: backup writer: allow push uploading index and chunks
>   server: sync: move skip info/reason to common sync module
>   server: sync: make skip reason message more genenric
>   server: sync: factor out namespace depth check into sync module
>   config: acl: mention optional namespace acl path component
>   config: acl: allow namespace components for remote datastores
>   api types: define remote permissions and roles for push sync
>   fix #3044: server: implement push support for sync operations
>   config: jobs: add `sync-push` config type for push sync jobs
>   api: push: implement endpoint for sync in push direction
>   api: sync: move sync job invocation to server sync module
>   api: sync jobs: expose optional `sync-direction` parameter
>   api: sync: add permission checks for push sync jobs
>   bin: manager: add datastore push cli command
>   ui: group filter: allow to set namespace for local datastore
>   ui: sync edit: source group filters based on sync direction
>   ui: add view with separate grids for pull and push sync jobs
>   ui: sync job: adapt edit window to be used for pull and push
>   ui: sync: pass sync-direction to allow removing push jobs
>   ui: sync view: do not use data model proxy for store
>   ui: sync view: set sync direction when invoking run task via api
>   datastore: move `BackupGroupDeleteStats` to api types
>   api types: implement api type for `BackupGroupDeleteStats`
>   datastore: increment deleted group counter when removing group
>   api: datastore/namespace: return backup groups delete stats on remove
>   server: sync job: use delete stats provided by the api
> 
>  pbs-api-types/src/acl.rs             |  32 +
>  pbs-api-types/src/datastore.rs       |  64 ++
>  pbs-api-types/src/jobs.rs            |  52 ++
>  pbs-client/src/backup_writer.rs      | 228 +++++--
>  pbs-config/src/acl.rs                |   7 +-
>  pbs-config/src/sync.rs               |  11 +-
>  pbs-datastore/src/backup_info.rs     |  34 +-
>  pbs-datastore/src/datastore.rs       |  27 +-
>  src/api2/admin/datastore.rs          |  24 +-
>  src/api2/admin/namespace.rs          |  20 +-
>  src/api2/admin/sync.rs               |  45 +-
>  src/api2/config/datastore.rs         |  22 +-
>  src/api2/config/notifications/mod.rs |  15 +-
>  src/api2/config/sync.rs              |  84 ++-
>  src/api2/mod.rs                      |   2 +
>  src/api2/pull.rs                     | 108 ----
>  src/api2/push.rs                     | 182 ++++++
>  src/bin/proxmox-backup-manager.rs    | 216 +++++--
>  src/bin/proxmox-backup-proxy.rs      |  25 +-
>  src/server/mod.rs                    |   3 +
>  src/server/pull.rs                   | 658 ++------------------
>  src/server/push.rs                   | 883 +++++++++++++++++++++++++++
>  src/server/sync.rs                   | 700 +++++++++++++++++++++
>  www/Makefile                         |   1 +
>  www/config/SyncPullPushView.js       |  60 ++
>  www/config/SyncView.js               |  47 +-
>  www/datastore/DataStoreList.js       |   2 +-
>  www/datastore/Panel.js               |   2 +-
>  www/form/GroupFilter.js              |  18 +-
>  www/window/SyncJobEdit.js            |  45 +-
>  30 files changed, 2706 insertions(+), 911 deletions(-)
>  create mode 100644 src/api2/push.rs
>  create mode 100644 src/server/push.rs
>  create mode 100644 src/server/sync.rs
>  create mode 100644 www/config/SyncPullPushView.js
> 
> -- 
> 2.39.2


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api
  2024-10-11  9:32   ` Fabian Grünbichler
@ 2024-10-15  7:30     ` Christian Ebner
  2024-10-15  7:44       ` Fabian Grünbichler
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Ebner @ 2024-10-15  7:30 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Fabian Grünbichler

On 10/11/24 11:32, Fabian Grünbichler wrote:
> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>> Use the API exposed additional delete statistics to generate the
>> task log output for sync jobs in push direction instead of fetching the
>> contents before and after deleting.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 2:
>> - no changes
>>
>>   src/server/push.rs | 65 ++++++++++++++++++++--------------------------
>>   1 file changed, 28 insertions(+), 37 deletions(-)
>>
>> diff --git a/src/server/push.rs b/src/server/push.rs
>> index cfbb88728..dbface907 100644
>> --- a/src/server/push.rs
>> +++ b/src/server/push.rs
>> @@ -11,9 +11,10 @@ use tokio_stream::wrappers::ReceiverStream;
>>   use tracing::info;
>>   
>>   use pbs_api_types::{
>> -    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
>> -    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
>> -    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
>> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupGroupDeleteStats, BackupNamespace,
>> +    CryptMode, GroupFilter, GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote,
>> +    SnapshotListItem, PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY,
>> +    PRIV_REMOTE_DATASTORE_PRUNE,
>>   };
>>   use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
>>   use pbs_config::CachedUserInfo;
>> @@ -228,7 +229,7 @@ async fn remove_target_group(
>>       params: &PushParameters,
>>       namespace: &BackupNamespace,
>>       backup_group: &BackupGroup,
>> -) -> Result<(), Error> {
>> +) -> Result<BackupGroupDeleteStats, Error> {
>>       check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>>           .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
>>   
>> @@ -246,9 +247,11 @@ async fn remove_target_group(
>>           args["ns"] = serde_json::to_value(target_ns.name())?;
>>       }
>>   
>> -    params.target.client.delete(&api_path, Some(args)).await?;
>> +    let mut result = params.target.client.delete(&api_path, Some(args)).await?;
>> +    let data = result["data"].take();
>> +    let delete_stats: BackupGroupDeleteStats = serde_json::from_value(data)?;
> 
> what about older target servers that return Value::Null here? from a
> quick glance, nothing else requires upgrading the target server to
> "enable" push support, so this should probably gracefully handle that
> combination as well..

Since this now requires setting an additional `ignore-protected` flag 
on the API endpoint in order to opt in (see response to the previous 
patch) to not returning an error when deletion failed, this cannot be 
handled gracefully but will rather fail.

It would be possible to retry without the opt-in flag on failure, but 
that might then again fail on protected snapshots. Further, without the 
response body, the statistics will be messed up (since empty).

So I'm not sure: should remove vanished simply be considered 
incompatible with older versions, or retried until we fail/succeed for 
good and live with the missing stats, possible leftover snapshots, etc.?

>>   
>> -    Ok(())
>> +    Ok(delete_stats)
>>   }
>>   
>>   // Check if the namespace is already present on the target, create it otherwise
>> @@ -451,38 +454,26 @@ pub(crate) async fn push_namespace(
>>   
>>               info!("delete vanished group '{target_group}'");
>>   
>> -            let count_before = match fetch_target_groups(params, namespace).await {
>> -                Ok(snapshots) => snapshots.len(),
>> -                Err(_err) => 0, // ignore errors
>> -            };
>> -
>> -            if let Err(err) = remove_target_group(params, namespace, &target_group).await {
>> -                info!("{err}");
>> -                errors = true;
>> -                continue;
>> -            }
>> -
>> -            let mut count_after = match fetch_target_groups(params, namespace).await {
>> -                Ok(snapshots) => snapshots.len(),
>> -                Err(_err) => 0, // ignore errors
>> -            };
>> -
>> -            let deleted_groups = if count_after > 0 {
>> -                info!("kept some protected snapshots of group '{target_group}'");
>> -                0
>> -            } else {
>> -                1
>> -            };
>> -
>> -            if count_after > count_before {
>> -                count_after = count_before;
>> +            match remove_target_group(params, namespace, &target_group).await {
>> +                Ok(delete_stats) => {
>> +                    if delete_stats.protected_snapshots() > 0 {
>> +                        info!(
>> +                            "kept {protected_count} protected snapshots of group '{target_group}'",
>> +                            protected_count = delete_stats.protected_snapshots(),
>> +                        );
> 
> should this be a warning? this kind of breaks the expectations of
> syncing after all..

OK, will be a warning in the next version of the patches.

> and wouldn't we also need a similar change for removing namespaces?

Right, that was indeed missing and will be added as well.

>> +                    }
>> +                    stats.add(SyncStats::from(RemovedVanishedStats {
>> +                        snapshots: delete_stats.removed_snapshots(),
>> +                        groups: delete_stats.removed_groups(),
>> +                        namespaces: 0,
>> +                    }));
>> +                }
>> +                Err(err) => {
>> +                    info!("failed to delete vanished group - {err}");
>> +                    errors = true;
>> +                    continue;
>> +                }
>>               }
>> -
>> -            stats.add(SyncStats::from(RemovedVanishedStats {
>> -                snapshots: count_before - count_after,
>> -                groups: deleted_groups,
>> -                namespaces: 0,
>> -            }));
>>           }
>>       }
>>   
>> -- 
>> 2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api
  2024-10-15  7:30     ` Christian Ebner
@ 2024-10-15  7:44       ` Fabian Grünbichler
  2024-10-15  8:04         ` Christian Ebner
  0 siblings, 1 reply; 60+ messages in thread
From: Fabian Grünbichler @ 2024-10-15  7:44 UTC (permalink / raw)
  To: Christian Ebner, Proxmox Backup Server development discussion


> Christian Ebner <c.ebner@proxmox.com> hat am 15.10.2024 09:30 CEST geschrieben:
> 
>  
> On 10/11/24 11:32, Fabian Grünbichler wrote:
> > On September 12, 2024 4:33 pm, Christian Ebner wrote:
> >> Use the API exposed additional delete statistics to generate the
> >> task log output for sync jobs in push direction instead of fetching the
> >> contents before and after deleting.
> >>
> >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> >> ---
> >> changes since version 2:
> >> - no changes
> >>
> >>   src/server/push.rs | 65 ++++++++++++++++++++--------------------------
> >>   1 file changed, 28 insertions(+), 37 deletions(-)
> >>
> >> diff --git a/src/server/push.rs b/src/server/push.rs
> >> index cfbb88728..dbface907 100644
> >> --- a/src/server/push.rs
> >> +++ b/src/server/push.rs
> >> @@ -11,9 +11,10 @@ use tokio_stream::wrappers::ReceiverStream;
> >>   use tracing::info;
> >>   
> >>   use pbs_api_types::{
> >> -    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
> >> -    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
> >> -    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
> >> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupGroupDeleteStats, BackupNamespace,
> >> +    CryptMode, GroupFilter, GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote,
> >> +    SnapshotListItem, PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY,
> >> +    PRIV_REMOTE_DATASTORE_PRUNE,
> >>   };
> >>   use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
> >>   use pbs_config::CachedUserInfo;
> >> @@ -228,7 +229,7 @@ async fn remove_target_group(
> >>       params: &PushParameters,
> >>       namespace: &BackupNamespace,
> >>       backup_group: &BackupGroup,
> >> -) -> Result<(), Error> {
> >> +) -> Result<BackupGroupDeleteStats, Error> {
> >>       check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
> >>           .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
> >>   
> >> @@ -246,9 +247,11 @@ async fn remove_target_group(
> >>           args["ns"] = serde_json::to_value(target_ns.name())?;
> >>       }
> >>   
> >> -    params.target.client.delete(&api_path, Some(args)).await?;
> >> +    let mut result = params.target.client.delete(&api_path, Some(args)).await?;
> >> +    let data = result["data"].take();
> >> +    let delete_stats: BackupGroupDeleteStats = serde_json::from_value(data)?;
> > 
> > what about older target servers that return Value::Null here? from a
> > quick glance, nothing else requires upgrading the target server to
> > "enable" push support, so this should probably gracefully handle that
> > combination as well..
> 
> Since this now requires setting an additional `ignore-protected` flag 
> on the API endpoint in order to opt in (see response to the previous 
> patch) to not returning an error when deletion failed, this cannot be 
> handled gracefully but will rather fail.
> 
> It would be possible to retry without the opt-in flag on failure, but 
> that might then again fail on protected snapshots. Further, without the 
> response body, the statistics will be messed up (since empty).
> 
> So I'm not sure: should remove vanished simply be considered 
> incompatible with older versions, or retried until we fail/succeed for 
> good and live with the missing stats, possible leftover snapshots, etc.?

hmm, this is a bit tricky.

for pull, we just log and ignore protected snapshots when removing a vanished group. we also log and ignore any other errors at that point though (just without accounting for removed vs. protected).

but for push, we require the opt-in feature to know that removing the group failed because of a protected snapshot. I guess we could try with the new parameter; if that fails (with a parameter error?), retry without; and if that fails as well, just log whatever error the server returned and hope it's meaningful? or we could make it more robust by querying the remote version at the start of the sync and base the decision on that...
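
Something along those lines, as a rough and completely untested sketch (the helper name `delete_target_group` is made up, the `ignore-protected` flag is the one floated for the previous patch and may still change, and telling a parameter error apart from other failures would need proper handling):

```rust
use anyhow::Error;
use serde_json::{json, Value};
use tracing::warn;

use pbs_api_types::BackupGroupDeleteStats;
use pbs_client::HttpClient;

// Try the group delete with the new opt-in flag first; if the target rejects
// the request (e.g. an older server not knowing the parameter), retry without
// it and accept that no delete statistics are available for this group.
async fn delete_target_group(
    client: &HttpClient,
    api_path: &str,
    mut args: Value,
) -> Result<Option<BackupGroupDeleteStats>, Error> {
    args["ignore-protected"] = json!(true);

    match client.delete(api_path, Some(args.clone())).await {
        Ok(mut result) => {
            let data = result["data"].take();
            if data.is_null() {
                // target did not send any stats in the response body
                return Ok(None);
            }
            Ok(Some(serde_json::from_value(data)?))
        }
        Err(err) => {
            // most likely an old target choking on the unknown parameter,
            // but this cannot really be told apart from other failures here
            warn!("group delete with 'ignore-protected' failed ({err}), retrying without");
            if let Some(obj) = args.as_object_mut() {
                obj.remove("ignore-protected");
            }
            client.delete(api_path, Some(args)).await?;
            Ok(None)
        }
    }
}
```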



* Re: [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api
  2024-10-15  7:44       ` Fabian Grünbichler
@ 2024-10-15  8:04         ` Christian Ebner
  0 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-15  8:04 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox Backup Server development discussion

On 10/15/24 09:44, Fabian Grünbichler wrote:
> 
>> Christian Ebner <c.ebner@proxmox.com> wrote on 15.10.2024 09:30 CEST:
>>
>>   
>> On 10/11/24 11:32, Fabian Grünbichler wrote:
>>> On September 12, 2024 4:33 pm, Christian Ebner wrote:
>>>> Use the API exposed additional delete statistics to generate the
>>>> task log output for sync jobs in push direction instead of fetching the
>>>> contents before and after deleting.
>>>>
>>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>>>> ---
>>>> changes since version 2:
>>>> - no changes
>>>>
>>>>    src/server/push.rs | 65 ++++++++++++++++++++--------------------------
>>>>    1 file changed, 28 insertions(+), 37 deletions(-)
>>>>
>>>> diff --git a/src/server/push.rs b/src/server/push.rs
>>>> index cfbb88728..dbface907 100644
>>>> --- a/src/server/push.rs
>>>> +++ b/src/server/push.rs
>>>> @@ -11,9 +11,10 @@ use tokio_stream::wrappers::ReceiverStream;
>>>>    use tracing::info;
>>>>    
>>>>    use pbs_api_types::{
>>>> -    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupNamespace, CryptMode, GroupFilter,
>>>> -    GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote, SnapshotListItem,
>>>> -    PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY, PRIV_REMOTE_DATASTORE_PRUNE,
>>>> +    print_store_and_ns, Authid, BackupDir, BackupGroup, BackupGroupDeleteStats, BackupNamespace,
>>>> +    CryptMode, GroupFilter, GroupListItem, NamespaceListItem, Operation, RateLimitConfig, Remote,
>>>> +    SnapshotListItem, PRIV_REMOTE_DATASTORE_BACKUP, PRIV_REMOTE_DATASTORE_MODIFY,
>>>> +    PRIV_REMOTE_DATASTORE_PRUNE,
>>>>    };
>>>>    use pbs_client::{BackupRepository, BackupWriter, HttpClient, UploadOptions};
>>>>    use pbs_config::CachedUserInfo;
>>>> @@ -228,7 +229,7 @@ async fn remove_target_group(
>>>>        params: &PushParameters,
>>>>        namespace: &BackupNamespace,
>>>>        backup_group: &BackupGroup,
>>>> -) -> Result<(), Error> {
>>>> +) -> Result<BackupGroupDeleteStats, Error> {
>>>>        check_ns_remote_datastore_privs(params, namespace, PRIV_REMOTE_DATASTORE_PRUNE)
>>>>            .map_err(|err| format_err!("Pruning remote datastore contents not allowed - {err}"))?;
>>>>    
>>>> @@ -246,9 +247,11 @@ async fn remove_target_group(
>>>>            args["ns"] = serde_json::to_value(target_ns.name())?;
>>>>        }
>>>>    
>>>> -    params.target.client.delete(&api_path, Some(args)).await?;
>>>> +    let mut result = params.target.client.delete(&api_path, Some(args)).await?;
>>>> +    let data = result["data"].take();
>>>> +    let delete_stats: BackupGroupDeleteStats = serde_json::from_value(data)?;
>>>
>>> what about older target servers that return Value::Null here? from a
>>> quick glance, nothing else requires upgrading the target server to
>>> "enable" push support, so this should probably gracefully handle that
>>> combination as well..
>>
>> Since this now requires setting an additional `ignore-protected` flag
>> on the api endpoint to opt in (see response to the previous patch) to
>> not returning an error when deletion fails, this case cannot be
>> handled gracefully but will rather fail.
>>
>> It would be possible to retry without the opt-in flag on failure, but
>> that might then again fail on protected snapshots. Further, without
>> the response body the statistics will be messed up (since empty).
>>
>> So I am not sure: should remove vanished simply be considered
>> incompatible with older versions, or retried until we fail/succeed
>> for good, and live with the missing stats, possible leftover
>> snapshots, etc.?
> 
> hmm, this is a bit tricky.
> 
> for pull, we just log and ignore protected snapshots when removing a vanished group. we also log and ignore any other errors at that point though (just without accounting for removed vs. protected).
> 
> but for push, we require the opt-in feature to know that removing the group failed because of a protected snapshot. I guess we could try with the new parameter; if that fails (with a parameter error?), retry without; and if that fails as well, just log whatever error the server returned and hope it's meaningful? or we could make it more robust by querying the remote version at the start of the sync and base the decision on that...

Ah, yes! Fetching the version and using that sounds like a reasonable
way to go here. There can still be leftovers on error, but those can at
least be meaningfully logged.
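
To illustrate what I have in mind (completely untested; the helper name, the exact shape of the version endpoint response and the version cut-off are placeholders), the check at the start of the sync could look roughly like this:

```rust
use anyhow::Error;

use pbs_client::HttpClient;

// Query the remote version once when setting up the push parameters and only
// send the new opt-in flag / expect delete statistics if the target is new
// enough. The cut-off used here is a placeholder and would have to match the
// release that actually ships the changed delete endpoint behaviour.
async fn target_supports_delete_stats(client: &HttpClient) -> Result<bool, Error> {
    let result = client.get("api2/json/version", None).await?;
    // note: the exact fields/format of the version response would need checking
    let version = result["data"]["version"].as_str().unwrap_or("");
    Ok(version_at_least(version, (3, 2, 8)))
}

// Very naive "x.y.z" comparison, good enough for a sketch.
fn version_at_least(version: &str, min: (u64, u64, u64)) -> bool {
    let mut parts = version.split('.').map(|part| part.parse::<u64>().unwrap_or(0));
    let current = (
        parts.next().unwrap_or(0),
        parts.next().unwrap_or(0),
        parts.next().unwrap_or(0),
    );
    current >= min
}
```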



* Re: [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target
  2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
                   ` (34 preceding siblings ...)
  2024-10-14 11:04 ` [pbs-devel] partially-applied: " Fabian Grünbichler
@ 2024-10-17 13:31 ` Christian Ebner
  35 siblings, 0 replies; 60+ messages in thread
From: Christian Ebner @ 2024-10-17 13:31 UTC (permalink / raw)
  To: pbs-devel

superseded-by version 4:
https://lore.proxmox.com/pbs-devel/20241017132716.385234-1-c.ebner@proxmox.com/T/



end of thread, other threads:[~2024-10-17 13:30 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 02/33] server: sync: move sync related stats to common module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 03/33] server: sync: move reader trait to common sync module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 04/33] server: sync: move source " Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters Christian Ebner
2024-10-10 14:49   ` Fabian Grünbichler
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 06/33] client: backup writer: factor out merged chunk stream upload Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 07/33] client: backup writer: add chunk count and duration stats Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 08/33] client: backup writer: allow push uploading index and chunks Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 09/33] server: sync: move skip info/reason to common sync module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 10/33] server: sync: make skip reason message more genenric Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 11/33] server: sync: factor out namespace depth check into sync module Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 12/33] config: acl: mention optional namespace acl path component Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores Christian Ebner
2024-10-10 14:49   ` Fabian Grünbichler
2024-10-14  8:18     ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 14/33] api types: define remote permissions and roles for push sync Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations Christian Ebner
2024-10-10 14:48   ` Fabian Grünbichler
2024-10-14  9:32     ` Christian Ebner
2024-10-14  9:41       ` Fabian Grünbichler
2024-10-14  9:53         ` Christian Ebner
2024-10-14 10:01           ` Fabian Grünbichler
2024-10-14 10:15             ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs Christian Ebner
2024-10-10 14:48   ` Fabian Grünbichler
2024-10-14  8:16     ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 17/33] api: push: implement endpoint for sync in push direction Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 18/33] api: sync: move sync job invocation to server sync module Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter Christian Ebner
2024-10-10 14:48   ` Fabian Grünbichler
2024-10-14  8:10     ` Christian Ebner
2024-10-14  9:25       ` Fabian Grünbichler
2024-10-14  9:36         ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 20/33] api: sync: add permission checks for push sync jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 21/33] bin: manager: add datastore push cli command Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 22/33] ui: group filter: allow to set namespace for local datastore Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 23/33] ui: sync edit: source group filters based on sync direction Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 24/33] ui: add view with separate grids for pull and push sync jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 25/33] ui: sync job: adapt edit window to be used for pull and push Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 26/33] ui: sync: pass sync-direction to allow removing push jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 27/33] ui: sync view: do not use data model proxy for store Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 28/33] ui: sync view: set sync direction when invoking run task via api Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 29/33] datastore: move `BackupGroupDeleteStats` to api types Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 30/33] api types: implement api type for `BackupGroupDeleteStats` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 31/33] datastore: increment deleted group counter when removing group Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove Christian Ebner
2024-10-11  9:32   ` Fabian Grünbichler
2024-10-14 10:24     ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api Christian Ebner
2024-10-11  9:32   ` Fabian Grünbichler
2024-10-15  7:30     ` Christian Ebner
2024-10-15  7:44       ` Fabian Grünbichler
2024-10-15  8:04         ` Christian Ebner
2024-10-10 14:48 ` [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Fabian Grünbichler
2024-10-11  7:12   ` Christian Ebner
2024-10-11  7:51     ` Fabian Grünbichler
2024-10-14 11:04 ` [pbs-devel] partially-applied: " Fabian Grünbichler
2024-10-17 13:31 ` [pbs-devel] " Christian Ebner
