From: Hannes Laimer <h.laimer@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Cc: Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 1/3] fix #6195: api: datastore: add endpoint for moving namespaces
Date: Mon, 15 Sep 2025 12:32:53 +0200
Message-ID: <8bac3dce-5dad-4b69-a8e2-d369b0df0725@proxmox.com>
In-Reply-To: <7ed6e435-788e-449e-bb5d-057793052920@proxmox.com>
On 15.09.25 12:01, Christian Ebner wrote:
> On 9/15/25 11:19 AM, Hannes Laimer wrote:
>> On 15.09.25 10:56, Christian Ebner wrote:
>>> On 9/15/25 10:27 AM, Hannes Laimer wrote:
>>>> On 15.09.25 10:15, Christian Ebner wrote:
>>>>> Thanks for having a go at this issue. I did not yet have an in-depth
>>>>> look at this, but unfortunately I'm afraid the current implementation
>>>>> approach will not work for the S3 backend (and might also have issues
>>>>> for local datastores).
>>>>>
>>>>> Copying the S3 objects is not an atomic operation and will take some
>>>>> time, so it leaves you open to race conditions. E.g. while you copy
>>>>> contents, a new backup snapshot might be created in one of the already
>>>>> copied backup groups, which would then however be deleted afterwards.
>>>>> The same is true for pruning and other metadata-editing operations
>>>>> such as adding notes, backup task logs, etc.
>>>>>
>>>>
>>>> Yes, but not really. We lock the `active_operations` tracking file, so
>>>> no new read/write operations can be started after we start the moving
>>>> process. There's a short comment in the API endpoint function.
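For context, the gating above is conceptually just an exclusive lock on
the tracking file; a simplified sketch of the idea (plain flock(2) via
the libc crate), not the actual implementation:

use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::io::AsRawFd;

/// Hold an exclusive flock(2) on the per-datastore `active_operations`
/// tracking file for the duration of the move. New operations have to
/// touch this file to register themselves, so none can start until the
/// returned handle is dropped. Path handling and semantics are simplified.
fn lock_active_operations(path: &str) -> io::Result<File> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    if unsafe { libc::flock(file.as_raw_fd(), libc::LOCK_EX) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(file)
}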
>>>
>>> Ah yes, I did miss that part. But by doing that you will basically
>>> block any datastore operation, not just the ones on the source or
>>> target namespace, which is not ideal IMO. Further, you cannot move a
>>> NS while any other operation is ongoing on the datastore, even one
>>> that is completely unrelated to the source and target namespace,
>>> e.g. a backup to another namespace.
>>
>> Yes. But I don't think this is something we can (easily) check for;
>> maybe there is a good way, but I can't think of a feasible one.
>> We could lock all affected groups in advance, but I'm not super sure
>> we can just move a locked dir, at least with the old locking.
>
> No, not lock all in advance, but we can lock on a per-backup-group basis
> (source and target) and consider that the basic operation, so this is
> mostly a local sync job on the same datastore, from one namespace to
> another. That is why I suggested considering the move of a namespace as
> a batch operation of moving backup groups. While not as performant, this
> should eliminate possible races and make error handling/rollback much
> easier.
>
>> Given that for both local and S3 datastores this is, I'd argue, a
>> rather fast operation, just saying 'nobody does anything while we move
>> stuff' is reasonable.
>
> Well, for an S3 object store with several sub-namespaces, containing
> hundreds of backup groups and thousands of snapshots (with notes, backup
> task logs and other metadata), this might take some time. After all,
> there is a copy request for each of the objects involved. Do you have
> some hard numbers on this?
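For illustration of the cost on S3: a move is one server-side copy
request per object plus the deletes, roughly like the sketch below.
`ObjectStore` and its methods are made-up placeholders here, not the
actual proxmox S3 client API:

use std::io;

/// Illustrative stand-in for an S3 client; NOT a real client API.
trait ObjectStore {
    fn list_objects(&self, prefix: &str) -> io::Result<Vec<String>>;
    fn copy_object(&self, src: &str, dst: &str) -> io::Result<()>;
    fn delete_object(&self, key: &str) -> io::Result<()>;
}

/// Moving a prefix (namespace) is O(number of objects): one CopyObject
/// per object, then one DeleteObject per source object once all copies
/// succeeded, so a mid-way failure can be rolled back by deleting the
/// already copied objects instead.
fn move_prefix<S: ObjectStore>(s3: &S, src: &str, dst: &str) -> io::Result<()> {
    let keys = s3.list_objects(src)?;
    for key in &keys {
        s3.copy_object(key, &key.replacen(src, dst, 1))?;
    }
    for key in &keys {
        s3.delete_object(key)?;
    }
    Ok(())
}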
>
>>
>> What we could think about adding is maybe a checkbox for updating jobs
>> that reference the NS, but I'm not sure we want that.
>>
>>>
>>>> I'm not sure there is much value in more granular locking. I mean, is
>>>> half a successful move worth much? Unless we add some kind of
>>>> rollback, but tbh, I feel like that would not be worth the effort.
>>>
>>> Well, it could be just like what we do for sync jobs: skipping the
>>> move for groups that cannot be locked or that fail for some other
>>> reason?
>>>
>>
>> Hmm, but then we'd have it in two places, and moving again later won't
>> work because we can't distinguish between a same-named NS already
>> existing and a new attempt to complete an earlier move. And we also
>> can't allow that in general, because what happens if the same VMID
>> exists twice?
>
> Not if the failed/skipped group is cleaned up correctly when it was not
> preexisting? And skipped if it is preexisting... disallowing any group
> name collisions.
>
>>
>>> I think having the more granular backup group as the unit instead of
>>> the namespace makes this more flexible: what if I only want to move
>>> one backup group from one namespace to another, as in the initial
>>> request in the bug report?
>>>
>>
>> That is not possible currently, and, at least with this series, not
>> intended. We could support that eventually, but that should be rather
>> orthogonal to this series, I think.
>
> But then this does not really fix the issue ;)
>
>>
>>> For example: I had a VM which was backed up to a given namespace but
>>> has since been destroyed, and I want to keep the backups by moving the
>>> group with all its snapshots to a different namespace, freeing the
>>> backup type and ID in the current namespace.
>>>
>>
>> I see the use case for this, but I think these are two different
>> things: moving a NS and moving a single group.
>
> This is a design decision we should make now.
> To me it seems to make more sense to treat the namespace move as a
> batch operation of moving groups.
>
Hmm, I think you are right. I tried to keep this really simple, given
we're just moving a directory, but this is probably not how we want to
do it. I'll rethink the approach. Still, some other opinions would be
good before I start on that.
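To make that concrete for myself, the per-group variant for a local
datastore could look roughly like the sketch below: just the control
flow with the sync-job-like skip semantics, using plain flock(2) via the
libc crate rather than the actual datastore API (and whether renaming a
directory while holding its lock plays nice with the old locking scheme
is exactly the open question from above):

use std::fs;
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::Path;

/// Sketch: move a namespace as a batch of per-group moves. Groups that
/// cannot be locked, already exist in the target, or fail to move are
/// skipped and reported instead of failing the whole batch.
fn move_namespace_groups(source_ns: &Path, target_ns: &Path) -> io::Result<Vec<String>> {
    let mut skipped = Vec::new();
    fs::create_dir_all(target_ns)?;
    for entry in fs::read_dir(source_ns)? {
        let entry = entry?;
        if !entry.file_type()?.is_dir() {
            continue;
        }
        let name = entry.file_name();
        let target = target_ns.join(&name);
        // disallow group name collisions outright
        if target.exists() {
            skipped.push(format!("{:?}: already exists in target", name));
            continue;
        }
        // exclusive, non-blocking lock on the group dir; skip the group
        // if something else currently holds it
        let dir = fs::File::open(entry.path())?;
        if unsafe { libc::flock(dir.as_raw_fd(), libc::LOCK_EX | libc::LOCK_NB) } != 0 {
            skipped.push(format!("{:?}: could not lock, skipped", name));
            continue;
        }
        // locally the move itself is an atomic rename; on S3 this would
        // be copy + delete per object, with rollback of the partially
        // copied group on error
        if let Err(err) = fs::rename(entry.path(), &target) {
            skipped.push(format!("{:?}: {}", name, err));
        }
        // lock released when `dir` goes out of scope
    }
    Ok(skipped)
}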
> Alternatively, IMO we must implement locking for namespaces analogous to
> the locking for backup groups, to be able to keep a consistent state,
> especially for the S3 backend, where there are a lot of failure modes.
> Locking out all operations on the datastore and requiring that none
> (even unrelated ones) be active before trying the move is not ideal.
>
> Other opinions here? CC'ing Thomas and Fabian...
>
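If we go the namespace-lock route, I would picture something like a
per-namespace lock file that normal operations take shared and a move
takes exclusive on both source and target. Again purely hypothetical,
names and layout made up:

use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::Path;

/// Hypothetical namespace-level lock, analogous to the per-group locking:
/// regular operations inside the namespace take it shared, a namespace
/// move takes it exclusive, so the move cannot interleave with snapshot
/// creation, pruning, note edits, etc.
fn lock_namespace(ns_dir: &Path, exclusive: bool) -> io::Result<File> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(ns_dir.join(".ns-lock"))?;
    let op = if exclusive { libc::LOCK_EX } else { libc::LOCK_SH };
    if unsafe { libc::flock(file.as_raw_fd(), op) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(file) // lock is held as long as the returned handle lives
}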
>>
>>>>
>>>>> So IMO this must be tackled on a group level, making sure to get an
>>>>> exclusive lock for each group (on the source as well as the target
>>>>> of the move operation) before doing any manipulation. Only then is
>>>>> it okay to do any non-atomic operations.
>>>>>
>>>>> The moving of the namespace must then be implemented as batch
>>>>> operations on the groups and sub-namespaces.
>>>>>
>>>>> This should be handled the same way for regular datastores as well,
>>>>> to avoid any races there too.
>>>>
>>>
>>
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel