From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabian Ebner
To: pve-devel@lists.proxmox.com, Fabian Grünbichler
Subject: Re: [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate
 guest on rollback if there are replication jobs for it
Date: Thu, 12 Aug 2021 11:12:55 +0200
Message-ID: <68158e94-0cb0-623b-696d-3b3c8f1bbff4@proxmox.com>
In-Reply-To: <1624346775.lqzf5lnz7c.astroid@nora.none>
References: <20210609091858.27219-1-f.ebner@proxmox.com>
 <20210609091858.27219-2-f.ebner@proxmox.com>
 <1624346775.lqzf5lnz7c.astroid@nora.none>
List-Id: Proxmox VE development discussion

I'll likely send the next version of the series today, but wanted to
address some points from here first (so I don't have to quote everything
there).

On 22.06.21 at 09:41, Fabian Grünbichler wrote:
> On June 9, 2021 11:18 am, Fabian Ebner wrote:
>> so that there will be a valid replication snapshot again.
>>
>> Otherwise, replication will be broken after a rollback if the last
>> (non-replication) snapshot is removed before replication can run again.
>
> I still see issues with these two patches applied..
>
> A: snapshot 'test2', successful replication afterwards, then rollback to
> 'test2':
> ----snip----
> zfs error: could not find any snapshots to destroy; check snapshot names.
> end replication job
>
> two misleading errors from attempting to delete already-cleaned up
> snapshots (this is just confusing, likely caused by the state file being
> outdated after prepare(), replication is still working as expected
> afterwards)

Those errors are triggered by $cleanup_local_snapshots in Replication.pm's
replicate(), and there is currently no easy way to tell at that point
whether we are in an after-rollback situation (or another situation where
snapshots might already be deleted). We could just ignore this specific
error, but then we would no longer detect when an actually wrong snapshot
name was passed in.
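For illustration, the "ignore this specific error" variant would be
something along these lines (just a sketch, the variable names are made
up and it is not the actual Replication.pm code):

    # Sketch: tolerate the ZFS "nothing to destroy" case during cleanup
    # and keep dying on every other error. $storecfg, $volid, $snapname
    # and $logfunc are illustrative names only.
    eval { PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snapname); };
    if (my $err = $@) {
        die $err if $err !~ m/could not find any snapshots to destroy/;
        $logfunc->("snapshot '$snapname' on '$volid' is already gone, skipping");
    }
    # Downside: an actually wrong snapshot name produces the very same
    # ZFS message, so it would be skipped silently as well.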
>
> B: successful replication, snapshot 'test3', rollback to 'test3'
> ----snip----
>
> now the replication is broken and requires manual intervention to be
> fixed:
>
> source: no replication snapshots, regular snapshots test, test2, test3
>
> target: one replication snapshot, regular snapshots test, test2 (test3
> is missing)
>

Will be fixed in the next version of the series.

----snip----

>
> I) maybe prepare or at least this calling context should hold the
> replication lock so that it can't race with concurrent replication runs?

Ideally, all snapshot operations would need to hold the lock, wouldn't
they? Otherwise, it might happen that some volumes are replicated before
the snapshot operation was done with them, and some after. I'll look at
that in detail some time and send it as its own series.

> II) maybe prepare should update the state file in general (if the last
> snapshot referenced there is stale and gets removed, the next
> replication run gets confused) - might/should fix A

The problem is that the state is not aware of the individual volumes, and
rollback might not remove replication snapshots from all replicated
volumes. It currently does remove them from all of them, but that is
wrong and is what causes this bug in the first place.

> III) maybe prepare and/or run_replication need to learn how to deal with
> "only regular snapshots, but not the last one match" (we could match on
> name+guid to ensure the common base is actually a common, previously
> replicated base/the same snapshot and not just the same name with
> different content) and construct a send stream from that shared snapshot
> instead of attempting a full sync where the target already (partially)
> exists.. that would fix B, and improve replication robustness in
> general (allowing resuming from a partial shared state instead of having
> to remove, start over from scratch..)
>
> implementing III would also avoid the need for doing a replication after
> rollback - the next replication would handle the situation just fine
> unless ALL previously shared snapshots are removed - we could check for
> that in the remove snapshot code path though.. or we could just schedule
> a replication here instead of directly doing it. rollback is an almost
> instant action (as long as no vmstate is involved), and a replication
> can take a long time so it seems a bit strange to conflate the two..

I went with a similar approach to the one also discussed off-list, sans
the guid matching, as that information is not really possible to get from
the current volume_snapshot_list while staying backwards compatible (a
rough sketch of the name-only matching follows below). And I'm not sure
it's even possible to trigger a mismatch with the new approach, because
of the "prevent removal after rollback until replication is run again"
restriction.

----snip----
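For illustration, the name-only matching boils down to something like
this (just a sketch with made-up names, using the snapshot names from
scenario B above and assuming the lists are ordered oldest to newest;
this is not the actual code from the series):

    my @source_snapshots = ('test', 'test2', 'test3'); # oldest to newest
    my @target_snapshots = ('test', 'test2');

    # pick the most recent snapshot name that exists on both sides as the
    # incremental base
    my %on_target = map { $_ => 1 } @target_snapshots;
    my ($common_base) = grep { $on_target{$_} } reverse @source_snapshots;

    die "no common snapshot found, full sync needed\n" if !defined($common_base);
    # $common_base is 'test2' here; replication can then send an
    # incremental stream from that base instead of starting over from
    # scratch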