From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <265cba9a-65b8-42d2-a3dd-681550b46a31@proxmox.com>
Date: Wed, 22 Apr 2026 12:23:50 +0200
Subject: Re: [PATCH proxmox-backup v7 2/9] datastore: add move-group
From: Christian Ebner <c.ebner@proxmox.com>
To: Hannes Laimer, Fabian Grünbichler, pbs-devel@lists.proxmox.com
References: <20260416171830.266553-1-h.laimer@proxmox.com>
 <20260416171830.266553-3-h.laimer@proxmox.com>
 <84cfe249-23bf-4498-90e1-90b44dd944b2@proxmox.com>
 <1776847977.nipqfzc6ef.astroid@yuna.none>
 <8fa35401-0228-4f09-a9fc-1c66de829d48@proxmox.com>
 <1776851956.4wbeejt3wr.astroid@yuna.none>
List-Id: Proxmox Backup Server development discussion

On 4/22/26 12:16 PM, Hannes Laimer wrote:
> On 2026-04-22 12:03, Fabian Grünbichler wrote:
>> On April 22, 2026 11:30 am, Christian Ebner wrote:
>>> On 4/22/26 11:23 AM, Hannes Laimer wrote:
>>>> On 2026-04-22 11:06, Fabian Grünbichler wrote:
>>>>> On April 22, 2026 10:40 am, Christian Ebner wrote:
>>>>>> On 4/16/26 7:18 PM, Hannes Laimer wrote:
>>>>>>> Add support for moving a single backup group to a different
>>>>>>> namespace within the same datastore.
>>>>>>>
>>>>>>> For the filesystem backend each snapshot directory is renamed
>>>>>>> individually. For S3 all objects are copied to the target prefix
>>>>>>> before deleting the source, per snapshot.
>>>>>>>
>>>>>>> Exclusive locks on the group and all its snapshots are acquired
>>>>>>> before the move to ensure no concurrent operations are active.
>>>>>>> Snapshots are locked and moved in batches to avoid exhausting
>>>>>>> file descriptors on groups with many snapshots.
>>>>>>
>>>>>> Unless I overlooked it, there currently is still one major issue
>>>>>> which can lead to data loss with this:
>>>>>>
>>>>>> Garbage collection uses the Datastore's list_index_files() method
>>>>>> to collect all index files at the start of phase 1, in order to
>>>>>> know which chunks need atime updates to mark them as in use.
>>>>>> Snapshots which disappear in the meantime can be ignored, as their
>>>>>> chunks may then no longer be in use. Snapshots created in the
>>>>>> meantime are safe, as the cutoff time protects newly written
>>>>>> chunks which are not referenced by any of the listed index files.
>>>>>>
>>>>>> But if the move happens after GC has started and collected the
>>>>>> index files, but before it reaches them, the moved index file
>>>>>> might still reference chunks which are in use, but which now never
>>>>>> get an atime update.
>>>>>>
>>>>>> Locking unfortunately does not protect against this.
>>>>>>
>>>>>> So if there is an ongoing garbage collection phase 1, there needs
>>>>>> to be some mechanism to re-inject the moved index files into the
>>>>>> list of indices, and therefore chunks, to process.
>>>>>> This might require writing the moved indices to a file, so they
>>>>>> can be read and processed at the end of GC phase 1 even if GC is
>>>>>> running in a different process. And it requires flocking that file
>>>>>> and waiting for it to become available before continuing.
>>>>>
>>>>> or moving could obtain the GC lock, and you simply cannot move
>>>>> while a GC is running or start a GC while a move is in progress?
>>>>> though the latter might be problematic.. it is already possible to
>>>>> block GC in practice if you have a writer that never finishes
>>>>> (assuming the proxy is reloaded every once in a while, which
>>>>> happens at least once per day).
>>>>>
>>>>> I guess your approach is similar to the trash feature we discussed
>>>>> a while back (just without restoring from trash and all the
>>>>> associated complexity ;)).. it would only require blocking moves
>>>>> during this "phase 1.5" instead of during the whole GC, which would
>>>>> of course be nice.. but it also increases the amount of work move
>>>>> needs to do by quite a bit..
>>>>
>>>> is it that much though? it would just be appending a line to a file
>>>> for every moved index; compared to the actual moving itself, this
>>>> seems rather minor, no?
>>
>> it is another readdir + parsing for each moved snapshot, which is
>> definitely not nothing.. but it also isn't that bad.
>>
>> we'd also need a mechanism to support multiple moves, so we'd need a
>> shared lock (obtained by moves) that allows creating and writing such
>> files, and an exclusive lock (obtained by GC) that allows processing
>> and clearing them?
>>
>
> yes, at least that's what I had in mind. line appends are atomic, so
> concurrent moves writing lines to the same file wouldn't be a problem.
> and a second move would have to wait until GC is done draining (if it
> started while GC is draining), at which point GC has already touched
> everything that might be moved by the second move.
>
> hope I didn't miss anything, but this should work.
> (technically continuous moves could starve GC, because flock queues
> shared locks before exclusive ones... but that seems to be a rather
> contrived problem, since moves are user-triggered...)

Why not use only exclusive locks then? The operation of appending a new
line should be rather limited in duration, and I think it would be okay
to have a short timeout when trying to acquire the flock.
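
For illustration, the append itself really is tiny; a stdlib-only sketch
(record_moved_index and the list-file name are made up here, and the
exclusive flock with timeout around it is left out, since std::fs has no
portable flock API — the real code would use our proxmox-sys helpers):

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Sketch: append one moved index file path to a shared "moved indices"
/// list so a concurrently running GC can re-read it at the end of
/// phase 1. Caller is assumed to hold the (exclusive, short-timeout)
/// flock discussed above.
fn record_moved_index(list: &Path, index_path: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true) // O_APPEND: every write lands at the current EOF
        .open(list)?;
    // one short write per line: O_APPEND writes of a small buffer are
    // atomic on POSIX, so lines from concurrent moves cannot interleave
    file.write_all(format!("{index_path}\n").as_bytes())
}
```

GC would then, at the end of phase 1, take the exclusive lock, read the
file line by line, mark those indices' chunks in use, and truncate it.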