From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v7 3/9] datastore: add move-namespace
Date: Mon, 20 Apr 2026 16:49:48 +0200
Message-ID: <1776694266.6hgr1v4evp.astroid@yuna.none> (raw)
In-Reply-To: <20260416171830.266553-4-h.laimer@proxmox.com>

On April 16, 2026 7:18 pm, Hannes Laimer wrote:
> Relocate an entire namespace subtree (the given namespace, all child
> namespaces, and their groups) to a new location within the same
> datastore.
> 
> Groups that cannot be locked because another task is running are
> deferred and retried once after all other groups have been processed.
> Groups that still fail are reported and remain at the source so they
> can be retried individually.
> 
> When merging is enabled and a source group already exists in the
> target, the snapshots are merged provided ownership matches and
> source snapshots are strictly newer.
> 
> An optional max-depth parameter limits how many levels of child
> namespaces are included. When delete-source is true (the default),
> empty source namespaces are removed after the move.
> 
> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 245 ++++++++++++++++++++++++++++++++-
>  1 file changed, 244 insertions(+), 1 deletion(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index f88c356e..536824c9 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -34,7 +34,7 @@ use pbs_api_types::{
>      ArchiveType, Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, ChunkOrder,
>      DataStoreConfig, DatastoreBackendConfig, DatastoreBackendType, DatastoreFSyncLevel,
>      DatastoreTuning, GarbageCollectionCacheStats, GarbageCollectionStatus, MaintenanceMode,
> -    MaintenanceType, Operation, S3Statistics, UPID,
> +    MaintenanceType, Operation, S3Statistics, MAX_NAMESPACE_DEPTH, UPID,
>  };
>  use pbs_config::s3::S3_CFG_TYPE_ID;
>  use pbs_config::{BackupLockGuard, ConfigVersionCache};
> @@ -1124,6 +1124,249 @@ impl DataStore {
>          Ok((removed_all_requested, stats))
>      }
>  
> +    /// Move a backup namespace (including all child namespaces and groups) to a new location.
> +    ///
> +    /// Groups are moved one at a time. For each group, exclusive locks on the group and all
> +    /// its snapshots (locked and moved in batches) ensure no concurrent operations are active. Groups
> +    /// that cannot be locked (because a backup, verify, or other task is running) are deferred
> +    /// and retried once, then reported in the error.
> +    ///
> +    /// If the target namespace already exists, groups are moved into it. Groups that fail
> +    /// (lock conflict, merge invariant violation, or I/O error) are skipped and reported at
> +    /// the end - they remain at the source and can be retried individually.
> +    ///
> +    /// `max_depth` limits how many levels of child namespaces below `source_ns` are included.
> +    /// `None` means unlimited (move the entire subtree). `Some(0)` moves only the groups
> +    /// directly in `source_ns` (no child namespaces).
> +    ///
> +    /// When `delete_source` is true the source namespace directories are removed after all
> +    /// groups have been moved. When false the (now empty) source namespaces are kept.
> +    ///
> +    /// When `merge_groups` is true and a source group already exists in the target namespace,
> +    /// the snapshots are merged into the existing target group (provided ownership matches
> +    /// and source snapshots are strictly newer). When false the operation fails for that
> +    /// group.
> +    pub fn move_namespace(
> +        self: &Arc<Self>,
> +        source_ns: &BackupNamespace,
> +        target_ns: &BackupNamespace,
> +        max_depth: Option<usize>,
> +        delete_source: bool,
> +        merge_groups: bool,
> +    ) -> Result<(), Error> {
> +        if *OLD_LOCKING {
> +            bail!("move operations require new-style file-based locking");
> +        }
> +        if source_ns.is_root() {
> +            bail!("cannot move root namespace");
> +        }
> +        if source_ns == target_ns {
> +            bail!("source and target namespace must be different");
> +        }
> +
> +        if !self.namespace_exists(source_ns) {
> +            bail!("source namespace '{source_ns}' does not exist");
> +        }
> +        let target_parent = target_ns.parent();
> +        if !target_ns.is_root() && !self.namespace_exists(&target_parent) {
> +            bail!("target namespace parent '{target_parent}' does not exist");
> +        }
> +        if source_ns.contains(target_ns).is_some() {
> +            bail!(
> +                "cannot move namespace '{source_ns}' into its own subtree (target: '{target_ns}')"
> +            );
> +        }

all of these checks are already done right before calling this, so they
should either be moved into a small helper to avoid the two call sites
deviating, or not be done twice IMHO. we don't do anything in between
other than forking a worker, after all..
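e.g., something along these lines (untested sketch, with a minimal
stand-in `Ns` type instead of the real `BackupNamespace` - the actual
helper would take `&BackupNamespace` and also cover the existence checks
against the datastore):

```rust
// Untested sketch: shared sanity checks for the move, so the API
// handler and move_namespace itself can't drift apart. `Ns` is a
// minimal stand-in for BackupNamespace (a list of path components).
#[derive(PartialEq)]
struct Ns(Vec<String>);

impl Ns {
    // true if `other` lies inside this namespace's subtree
    fn contains(&self, other: &Ns) -> bool {
        other.0.len() >= self.0.len() && other.0[..self.0.len()] == self.0[..]
    }
}

fn check_move_namespace(source: &Ns, target: &Ns) -> Result<(), &'static str> {
    if source.0.is_empty() {
        return Err("cannot move root namespace");
    }
    if source == target {
        return Err("source and target namespace must be different");
    }
    if source.contains(target) {
        return Err("cannot move namespace into its own subtree");
    }
    Ok(())
}
```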

> +
> +        let all_source_ns: Vec<BackupNamespace> = if let Some(depth) = max_depth {
> +            ListNamespacesRecursive::new_max_depth(Arc::clone(self), source_ns.clone(), depth)?
> +                .collect::<Result<Vec<_>, Error>>()?
> +        } else {
> +            self.recursive_iter_backup_ns(source_ns.clone())?
> +                .collect::<Result<Vec<_>, Error>>()?
> +        };

this could just be a single ListNamespacesRecursive::new_max_depth call
with the depth unwrapped to the max value if not set?
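i.e., roughly (sketch - I'm assuming MAX_NAMESPACE_DEPTH here is the
usual pbs-api-types constant with value 7):

```rust
// Sketch: collapse the two branches by defaulting the depth to the
// maximum. The constant's value (7) is assumed from pbs-api-types.
const MAX_NAMESPACE_DEPTH: usize = 7;

fn effective_depth(max_depth: Option<usize>) -> usize {
    max_depth.unwrap_or(MAX_NAMESPACE_DEPTH)
}
```

then a single `ListNamespacesRecursive::new_max_depth(..)` call with
`effective_depth(max_depth)` would cover both cases.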

> +
> +        let all_source_groups: Vec<BackupGroup> = all_source_ns
> +            .iter()
> +            .map(|ns| self.iter_backup_groups(ns.clone()))
> +            .collect::<Result<Vec<_>, Error>>()?
> +            .into_iter()
> +            .flatten()
> +            .collect::<Result<Vec<_>, Error>>()?;

the only benefit of collecting all groups up front is a single log line
below, and the chances of lock conflicts might be lower if we iterate
each namespace on its own instead?
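something like this (untested sketch; the `groups_of` closure stands in
for `DataStore::iter_backup_groups`):

```rust
// Sketch: process each namespace's groups right when we reach it,
// instead of snapshotting every group up front - this shrinks the
// window between listing a group and trying to lock it.
fn move_all(namespaces: &[&str], groups_of: impl Fn(&str) -> Vec<String>) -> usize {
    let mut moved = 0;
    for ns in namespaces {
        for group in groups_of(ns) {
            // lock_and_move_group(&group, ...) would go here
            let _ = group;
            moved += 1;
        }
    }
    moved
}
```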

> +
> +        let subtree_depth = all_source_ns
> +            .iter()
> +            .map(BackupNamespace::depth)
> +            .max()
> +            .map_or(0, |d| d - source_ns.depth());
> +        if subtree_depth + target_ns.depth() > MAX_NAMESPACE_DEPTH {
> +            bail!(
> +                "move would exceed maximum namespace depth \
> +                ({subtree_depth}+{} > {MAX_NAMESPACE_DEPTH})",
> +                target_ns.depth(),
> +            );
> +        }

this is basically crate::sync::check_namespace_depth_limit - should it
move to BackupNamespace or some other shared place?
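for reference, the shared check would boil down to roughly this
(sketch, again assuming MAX_NAMESPACE_DEPTH is 7):

```rust
const MAX_NAMESPACE_DEPTH: usize = 7; // assumed value of the pbs-api-types constant

// Sketch of a shared depth check, mirroring what both this code and
// crate::sync::check_namespace_depth_limit effectively compute.
fn check_namespace_depth_limit(subtree_depth: usize, target_depth: usize) -> Result<(), String> {
    if subtree_depth + target_depth > MAX_NAMESPACE_DEPTH {
        return Err(format!(
            "would exceed maximum namespace depth \
            ({subtree_depth}+{target_depth} > {MAX_NAMESPACE_DEPTH})"
        ));
    }
    Ok(())
}
```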

> +
> +        let backend = self.backend()?;
> +
> +        log::info!(
> +            "moving namespace '{source_ns}' -> '{target_ns}': {} namespaces, {} groups",
> +            all_source_ns.len(),
> +            all_source_groups.len(),
> +        );
> +
> +        // Create target local namespace directories upfront (covers empty namespaces).
> +        for ns in &all_source_ns {
> +            let target_child = ns.map_prefix(source_ns, target_ns)?;
> +            std::fs::create_dir_all(self.namespace_path(&target_child)).with_context(|| {
> +                format!("failed to create local dir for target namespace '{target_child}'")
> +            })?;

this could use Datastore::create_namespace, which does some extra
checks and

> +        }
> +
> +        // For S3: create namespace markers for all target namespaces.
> +        if let DatastoreBackend::S3(s3_client) = &backend {
> +            for ns in &all_source_ns {
> +                let target_child = ns.map_prefix(source_ns, target_ns)?;
> +                let object_key = crate::s3::object_key_from_path(
> +                    &target_child.path(),
> +                    NAMESPACE_MARKER_FILENAME,
> +                )
> +                .context("invalid namespace marker object key")?;
> +                log::debug!("creating S3 namespace marker for '{target_child}': {object_key:?}");
> +                proxmox_async::runtime::block_on(
> +                    s3_client.upload_no_replace_with_retry(object_key, Bytes::from("")),
> +                )
> +                .context("failed to create namespace marker on S3 backend")?;

also does this for us

> +            }
> +        }
> +
> +        // Move each group with per-group and per-snapshot locking. Groups that cannot be
> +        // locked are deferred and retried once after all other groups have been processed.
> +        let mut failed_groups: Vec<(BackupNamespace, String)> = Vec::new();
> +        let mut failed_ns: HashSet<BackupNamespace> = HashSet::new();
> +        let mut deferred: Vec<&BackupGroup> = Vec::new();
> +
> +        for group in &all_source_groups {
> +            if let Err(err) =
> +                self.lock_and_move_group(group, source_ns, target_ns, &backend, merge_groups)
> +            {
> +                match err {
> +                    MoveGroupError::Soft(err) => {
> +                        log::info!(
> +                            "deferring group '{}' in '{}'  - lock conflict, will retry: {err:#}",
> +                            group.group(),
> +                            group.backup_ns(),
> +                        );
> +                        deferred.push(group);
> +                    }
> +                    MoveGroupError::Hard(err) => {
> +                        warn!(
> +                            "failed to move group '{}' in '{}': {err:#}",
> +                            group.group(),
> +                            group.backup_ns(),
> +                        );
> +                        failed_groups.push((group.backup_ns().clone(), group.group().to_string()));
> +                        failed_ns.insert(group.backup_ns().clone());
> +                    }
> +                }
> +            }
> +        }
> +
> +        // Retry deferred groups once.
> +        if !deferred.is_empty() {
> +            log::info!("retrying {} deferred group(s)...", deferred.len());
> +            for group in deferred {
> +                if let Err(err) =
> +                    self.lock_and_move_group(group, source_ns, target_ns, &backend, merge_groups)
> +                {
> +                    let err = match err {
> +                        MoveGroupError::Soft(err) | MoveGroupError::Hard(err) => err,
> +                    };
> +                    warn!(
> +                        "failed to move group '{}' in '{}' on retry: {err:#}",
> +                        group.group(),
> +                        group.backup_ns(),
> +                    );
> +                    failed_groups.push((group.backup_ns().clone(), group.group().to_string()));
> +                    failed_ns.insert(group.backup_ns().clone());
> +                }
> +            }
> +        }
> +
> +        // Clean up source namespaces when requested. Process deepest-first so child
> +        // directories are removed before their parents.
> +        if delete_source {
> +            let moved_ns: HashSet<&BackupNamespace> = all_source_ns.iter().collect();

this is confusingly named, these are the namespaces we *tried to* move,
not necessarily the ones we actually *moved*.

> +
> +            for ns in all_source_ns.iter().rev() {
> +                // Skip namespaces that still have groups from failed moves or child
> +                // namespaces that were excluded by the depth limit.
> +                if failed_ns
> +                    .iter()
> +                    .any(|fns| fns == ns || ns.contains(fns).is_some())

nit: fns is a bit opaque, maybe `failed` or `failed_ns`?

should this log a warning that this NS was not cleaned up, and why?

> +                {
> +                    continue;
> +                }
> +                let has_excluded_children = self
> +                    .iter_backup_ns(ns.clone())
> +                    .ok()
> +                    .into_iter()
> +                    .flatten()
> +                    .filter_map(|child| child.ok())

this could use iter_backup_ns_ok instead?

> +                    .any(|child| !moved_ns.contains(&child));

this does a lot of nested iteration now.. what we actually want to check
is whether there are any namespaces left below, because if there are, we
can't remove the directory, either because
- we didn't try to move all children in the first place (depth check)
- we didn't manage to move all children we wanted to move (failed_ns,
  handled above)
- new children were created in the meantime

this check here handles the first and third cases?
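i.e., the removability condition boils down to something like this
(sketch with stand-in `&str` namespaces instead of `BackupNamespace`):

```rust
use std::collections::HashSet;

// Sketch: a source namespace may only be removed if every direct child
// was part of the (attempted) move set - anything left over means the
// depth limit excluded it, or it was created concurrently.
fn removable(children: &[&str], moved: &HashSet<&str>) -> bool {
    children.iter().all(|c| moved.contains(c))
}
```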

> +                if has_excluded_children {
> +                    continue;
> +                }
> +
> +                // Remove type subdirectories first (should be empty after per-group renames),
> +                // then the namespace directory itself. Uses remove_dir which fails on
> +                // non-empty directories, handling concurrent group creation gracefully.
> +                let ns_path = self.namespace_path(ns);
> +                if let Ok(entries) = std::fs::read_dir(&ns_path) {
> +                    for entry in entries.flatten() {
> +                        if let Err(err) = std::fs::remove_dir(entry.path()) {
> +                            warn!(
> +                                "failed to remove source directory {:?}: {err}",
> +                                entry.path(),
> +                            );
> +                        }
> +                    }
> +                }
> +                if let Err(err) = std::fs::remove_dir(&ns_path) {
> +                    warn!("failed to remove source namespace directory {ns_path:?}: {err}");
> +                    continue;
> +                }
> +
> +                // Local directory removed  - delete the S3 namespace marker too.
> +                if let DatastoreBackend::S3(s3_client) = &backend {
> +                    let object_key =
> +                        crate::s3::object_key_from_path(&ns.path(), NAMESPACE_MARKER_FILENAME)
> +                            .context("invalid namespace marker object key")?;
> +                    log::debug!("deleting source S3 namespace marker for '{ns}': {object_key:?}");
> +                    if let Err(err) =
> +                        proxmox_async::runtime::block_on(s3_client.delete_object(object_key))
> +                    {
> +                        warn!("failed to delete source S3 namespace marker for '{ns}': {err:#}");
> +                    }
> +                }

we do have remove_namespace_recursive already, which is what is used by
the delete_namespace API endpoint, and do not offer a way to just delete
a single namespace with protection against removing empty child
namespaces.. should we just use that here? maybe extending it with a
delete_child_namespaces parameter, and setting that to false here and in
pull, but true in the API endpoint?
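the extended signature could look roughly like this (hypothetical
sketch, names and shape assumed - the real method lives on DataStore
and lists children itself rather than taking them as a parameter):

```rust
// Hypothetical sketch: remove_namespace_recursive with an opt-out for
// recursing into child namespaces. Callers like move_namespace and
// pull would pass delete_child_namespaces = false, while the
// delete_namespace API endpoint would pass true.
fn remove_namespace_recursive(
    ns: &str,
    delete_child_namespaces: bool,
    children: &[&str], // stand-in for listing the direct child namespaces
) -> Result<(), String> {
    if !delete_child_namespaces && !children.is_empty() {
        return Err(format!("refusing to remove '{ns}': child namespaces present"));
    }
    // ... remove groups, S3 markers, and directories here ...
    Ok(())
}
```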

> +            }
> +        }
> +
> +        if !failed_groups.is_empty() {
> +            let group_list: Vec<String> = failed_groups
> +                .iter()
> +                .map(|(ns, group)| format!("'{group}' in '{ns}'"))
> +                .collect();
> +            bail!(
> +                "namespace move partially completed; {} group(s) could not be moved \
> +                and remain at source: {}",
> +                failed_groups.len(),
> +                group_list.join(", "),
> +            );
> +        }
> +
> +        Ok(())
> +    }
> +
>      /// Remove a complete backup group including all snapshots.
>      ///
>      /// Returns `BackupGroupDeleteStats`, containing the number of deleted snapshots
> -- 
> 2.47.3
> 