From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <265cba9a-65b8-42d2-a3dd-681550b46a31@proxmox.com>
Date: Wed, 22 Apr 2026 12:23:50 +0200
Subject: Re: [PATCH proxmox-backup v7 2/9] datastore: add move-group
From: Christian Ebner <c.ebner@proxmox.com>
To: Hannes Laimer, Fabian Grünbichler, pbs-devel@lists.proxmox.com
References: <20260416171830.266553-1-h.laimer@proxmox.com>
 <20260416171830.266553-3-h.laimer@proxmox.com>
 <84cfe249-23bf-4498-90e1-90b44dd944b2@proxmox.com>
 <1776847977.nipqfzc6ef.astroid@yuna.none>
 <8fa35401-0228-4f09-a9fc-1c66de829d48@proxmox.com>
 <1776851956.4wbeejt3wr.astroid@yuna.none>
List-Id: Proxmox Backup Server development discussion

On 4/22/26 12:16 PM, Hannes Laimer wrote:
> On 2026-04-22 12:03, Fabian Grünbichler wrote:
>> On April 22, 2026 11:30 am, Christian Ebner wrote:
>>> On 4/22/26 11:23 AM, Hannes Laimer wrote:
>>>> On 2026-04-22 11:06, Fabian Grünbichler wrote:
>>>>> On April 22, 2026 10:40 am, Christian Ebner wrote:
>>>>>> On 4/16/26 7:18 PM, Hannes Laimer wrote:
>>>>>>> Add support for moving a single backup group to a different
>>>>>>> namespace within the same datastore.
>>>>>>>
>>>>>>> For the filesystem backend each snapshot directory is renamed
>>>>>>> individually. For S3 all objects are copied to the target prefix
>>>>>>> before deleting the source, per snapshot.
>>>>>>>
>>>>>>> Exclusive locks on the group and all its snapshots are acquired
>>>>>>> before the move to ensure no concurrent operations are active.
>>>>>>> Snapshots are locked and moved in batches to avoid exhausting
>>>>>>> file descriptors on groups with many snapshots.
>>>>>>
>>>>>> Unless I overlooked it, there currently is still one major issue
>>>>>> which can lead to data loss with this:
>>>>>>
>>>>>> Garbage collection uses the Datastore's list_index_files() method
>>>>>> to collect all index files at the start of phase 1, in order to
>>>>>> know which chunks need atime updates to mark them as in use.
>>>>>> Snapshots which disappear in the meantime can be ignored, as their
>>>>>> chunks may then no longer be in use. Snapshots created in the
>>>>>> meantime are safe, as the cutoff time protects newly written
>>>>>> chunks which are not referenced by any of the listed index files.
>>>>>>
>>>>>> But if the move happens after GC has started and collected the
>>>>>> index files, but before it reaches them, the moved index file
>>>>>> might still reference chunks which are in use, but which now never
>>>>>> get an atime update.
>>>>>>
>>>>>> Locking unfortunately does not protect against this.
>>>>>>
>>>>>> So if there is an ongoing garbage collection phase 1, there needs
>>>>>> to be some mechanism to re-inject the moved index files into the
>>>>>> list of indices, and therefore chunks, to process.
>>>>>> This might require writing the moved indices to a file, so they
>>>>>> can be read and processed at the end of GC phase 1 even if GC is
>>>>>> running in a different process. And it requires flocking that file
>>>>>> and waiting for it to become available before continuing.
>>>>>
>>>>> or moving could obtain the GC lock, and you simply cannot move
>>>>> while a GC is running or start a GC while a move is in progress?
>>>>> though the latter might be problematic.. it is already possible to
>>>>> block GC in practice if you have a writer that never finishes
>>>>> (assuming the proxy is reloaded every once in a while, which
>>>>> happens at least once per day).
>>>>>
>>>>> I guess your approach is similar to the trash feature we discussed
>>>>> a while back (just without restoring from trash and all the
>>>>> associated complexity ;)).. it would only require blocking moves
>>>>> during this "phase 1.5" instead of during the whole GC, which would
>>>>> of course be nice.. but it also increases the amount of work move
>>>>> needs to do by quite a bit..
>>>>
>>>> is it that much though? it would just be appending a line to a file
>>>> for every moved index; compared to the actual moving itself, this
>>>> seems rather minor, no?
>>
>> it is another readdir + parsing for each moved snapshot, which is
>> definitely not nothing.. but it also isn't that bad.
>>
>> we'd also need a mechanism to support multiple moves, so we'd need a
>> shared lock (obtained by moves) that allows creating and writing such
>> files, and an exclusive lock (obtained by GC) that allows processing
>> and clearing them?
>>
>
> yes, at least that's what I had in mind. line appends are atomic, so
> concurrent moves writing lines to the same file wouldn't be a problem.
> and a second move would have to wait until GC is done draining (if it
> started while GC is draining), at which point GC has already touched
> everything that might be moved by the second move.
>
> hope I didn't miss anything, but this should work.
> (technically continuous moves could starve GC, because flock queues
> shared locks before exclusive ones... but that seems to be a rather
> contrived problem, since moves are user-triggered...)

Why not use only exclusive locks then? The operation of appending a new
line should be rather limited in duration, and I think it would be okay
to have a short timeout when trying to acquire the flock.
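
For illustration, the append itself really is tiny; a stdlib-only sketch
(record_moved_index and the list-file name are made up here, and the
exclusive flock with timeout around it is left out, since std::fs has no
portable flock API — the real code would use our proxmox-sys helpers):

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Sketch: append one moved index file path to a shared "moved indices"
/// list so a concurrently running GC can re-read it at the end of
/// phase 1. Caller is assumed to hold the (exclusive, short-timeout)
/// flock discussed above.
fn record_moved_index(list: &Path, index_path: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true) // O_APPEND: every write lands at the current EOF
        .open(list)?;
    // one short write per line: O_APPEND writes of a small buffer are
    // atomic on POSIX, so lines from concurrent moves cannot interleave
    file.write_all(format!("{index_path}\n").as_bytes())
}
```

GC would then, at the end of phase 1, take the exclusive lock, read the
file line by line, mark those indices' chunks in use, and truncate it.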