From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christian Ebner <c.ebner@proxmox.com>
Date: Wed, 22 Apr 2026 11:24:46 +0200
Subject: Re: [PATCH proxmox-backup v7 2/9] datastore: add move-group
To: Fabian Grünbichler, Hannes Laimer, pbs-devel@lists.proxmox.com
References: <20260416171830.266553-1-h.laimer@proxmox.com> <20260416171830.266553-3-h.laimer@proxmox.com> <84cfe249-23bf-4498-90e1-90b44dd944b2@proxmox.com> <1776847977.nipqfzc6ef.astroid@yuna.none>
In-Reply-To: <1776847977.nipqfzc6ef.astroid@yuna.none>
List-Id: Proxmox Backup Server development discussion
On 4/22/26 11:06 AM, Fabian Grünbichler wrote:
> On April 22, 2026 10:40 am, Christian Ebner wrote:
>> On 4/16/26 7:18 PM, Hannes Laimer wrote:
>>> Add support for moving a single backup group to a different namespace
>>> within the same datastore.
>>>
>>> For the filesystem backend each snapshot directory is renamed
>>> individually. For S3 all objects are copied to the target prefix
>>> before deleting the source, per snapshot.
>>>
>>> Exclusive locks on the group and all its snapshots are acquired
>>> before the move to ensure no concurrent operations are active.
>>> Snapshots are locked and moved in batches to avoid exhausting file
>>> descriptors on groups with many snapshots.
>>
>> Unless I overlooked it, there currently is still one major issue here
>> which can lead to data loss:
>>
>> Garbage collection uses the datastore's list_index_files() method to
>> collect all index files at the start of phase 1, to know which chunks
>> need atime updates to mark them as in use. Snapshots which disappear
>> in the meantime can be ignored, as their chunks may then no longer be
>> in use. Snapshots created in the meantime are safe as well, since
>> there the cutoff time protects newly written chunks which are not
>> referenced by any of the index files in the list.
>>
>> But if the move happens after GC has started and collected the index
>> files, but before it reaches those index files, the moved index files
>> might still reference chunks which are in use, but these chunks now
>> never get an atime update.
>>
>> Locking unfortunately does not protect against this.
>>
>> So if there is an ongoing garbage collection phase 1, some mechanism
>> is needed to re-inject the moved index files into the list of indices,
>> and therefore chunks, to process.
>> This might require writing the moved indices to a file, so they can be
>> read and processed at the end of GC phase 1 even if GC is running in a
>> different process. And it requires flocking that file and waiting for
>> the lock to become available before continuing.
>
> or moving could obtain the GC lock, and you simply cannot move while a
> GC is running or start a GC while a move is in progress? though the
> latter might be problematic.. it is already possible to block GC in
> practice if you have a writer that never finishes (assuming the proxy
> is reloaded every once in a while, which happens once per day at
> least).
>
> I guess your approach is similar to the trash feature we've discussed a
> while back (just without restoring from trash and all the associated
> complexity ;)).. it would only require blocking moves during this
> "phase 1.5" instead of the whole GC, which would of course be nice..
> but it also increases the amount of work move needs to do by quite a
> bit..

Yes, that would work as well, with the actual snapshot cleanup then
deferred to garbage collection, so along the lines of [0]. But this
really complicates things even more, especially for S3-backed
datastores.

I would rather suggest creating a file which GC phase 1 locks and
clears before starting the cleanup. The move then simply appends all
index paths to this file before the rename that invalidates the source.
Not sure if there could even be a mechanism to detect whether GC is
currently running, and whether the move actually needs to write the
file at all.

[0] https://lore.proxmox.com/pbs-devel/20250513135247.644260-4-c.ebner@proxmox.com/
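For illustration, the tracking-file mechanism suggested above could look roughly like the following sketch. This is not the actual PBS implementation: the function names (record_moved_indices, drain_moved_indices) and the file layout (one index path per line) are made up for the example, and the required flock on the tracking file (e.g. via nix::fcntl::flock) is elided since std has no portable file-locking API; only the append-before-rename and read-and-clear halves of the idea are shown.

```rust
use std::fs::{self, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::path::{Path, PathBuf};

/// Called by move-group right before the rename that invalidates the
/// source: append the paths of all moved index files, so a concurrently
/// running GC phase 1 can still mark their chunks as in use.
/// (Hypothetical helper; real code must hold an flock on `tracking`.)
fn record_moved_indices(tracking: &Path, indices: &[PathBuf]) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(tracking)?;
    for idx in indices {
        writeln!(file, "{}", idx.display())?;
    }
    // Make sure the entries hit disk before the source becomes invalid.
    file.sync_all()
}

/// Called at the end of GC phase 1: read and clear the tracking file,
/// returning index paths that were moved while phase 1 ran, so their
/// chunks can be re-processed before phase 2 starts sweeping.
fn drain_moved_indices(tracking: &Path) -> std::io::Result<Vec<PathBuf>> {
    if !tracking.exists() {
        return Ok(Vec::new());
    }
    let reader = BufReader::new(fs::File::open(tracking)?);
    let paths = reader
        .lines()
        .collect::<Result<Vec<String>, _>>()?
        .into_iter()
        .map(PathBuf::from)
        .collect();
    // Clear the file so the next GC run starts from a clean slate.
    fs::write(tracking, b"")?;
    Ok(paths)
}

fn main() -> std::io::Result<()> {
    let tracking = std::env::temp_dir().join("pbs-moved-indices.demo");
    let _ = fs::remove_file(&tracking);

    // A move records its indices, a later GC drains them exactly once.
    record_moved_indices(&tracking, &[PathBuf::from("vm/100/drive-scsi0.img.fidx")])?;
    let moved = drain_moved_indices(&tracking)?;
    assert_eq!(moved, vec![PathBuf::from("vm/100/drive-scsi0.img.fidx")]);
    assert!(drain_moved_indices(&tracking)?.is_empty());
    println!("drained {} moved index file(s)", moved.len());
    Ok(())
}
```

With an flock held across record_moved_indices and across the whole of phase 1's final drain, a move either lands its entries before GC drains the file (and gets re-processed) or blocks until GC is done with it, which is the ordering guarantee the proposal relies on.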