From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Mon, 12 Apr 2021 13:37:16 +0200
Message-Id: <20210412113717.25356-2-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210412113717.25356-1-f.ebner@proxmox.com>
References: <20210412113717.25356-1-f.ebner@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH guest-common 2/3] partially fix 3111: snapshot
 rollback: fix removing replication snapshots

Get the replicatable volumes from the snapshot config rather than the
current config, and further filter them to those that will actually be
rolled back.
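
In short (a simplified sketch of the change, not the exact code; see the
diff below):

    # before: all replicatable volumes from the *current* config
    my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $conf, 1);

    # after: only replicatable volumes that are part of the snapshot,
    # i.e. those that will actually be rolled back
    my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);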

Previously, a volume that only had replication snapshots (e.g. a volume
added after the non-replication snapshot was taken, or the vmstate volume)
would end up without any snapshots whatsoever. On the next replication run,
such a volume would then lead to an error, because replication would
attempt a full sync even though the target volume still existed.
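
To illustrate with a (hypothetical) example:

    # VM 100 is replicated; disk-1 was attached after snap1 was taken:
    #   current config:  disk-0 (part of snap1), disk-1 (replication
    #                    snapshots only)
    #   snapshot config: disk-0
    #
    # The old code passed the current config to get_replicatable_volumes(),
    # so prepare() also removed the replication snapshots of disk-1, leaving
    # it without any snapshot at all. The next replication run then attempted
    # a full sync for disk-1 and failed, because the target volume still
    # existed.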

This should be enough for many real-world scenarios, but it is not a
complete fix: it is still possible to run into the problem by removing the
last (non-replication) snapshot after a rollback, before replication has
run once.
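
For example (hypothetical command sequence):

    qm rollback 100 snap1     # removes replication snapshots on the
                              # rolled-back volumes
    qm delsnapshot 100 snap1  # a volume whose only remaining snapshot was
                              # snap1 is now left without any snapshot
    # -> the next replication run attempts a full sync and fails, because
    #    the target volume still exists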

prepare() does not require the list of volids to be sorted by volid. (The
list is now ordered by how foreach_volume() iterates.)

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
 PVE/AbstractConfig.pm | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/PVE/AbstractConfig.pm b/PVE/AbstractConfig.pm
index 3348d8a..c4d1d6c 100644
--- a/PVE/AbstractConfig.pm
+++ b/PVE/AbstractConfig.pm
@@ -974,13 +974,22 @@ sub snapshot_rollback {
 	if ($prepare) {
 	    my $repl_conf = PVE::ReplicationConfig->new();
 	    if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
-		# remove all replication snapshots
-		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $conf, 1);
-		my $sorted_volids = [ sort keys %$volumes ];
+		# remove replication snapshots on volumes affected by rollback *only*!
+		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
+
+		my $volids = [];
+		$class->foreach_volume($snap, sub {
+		    my ($vs, $volume) = @_;
+
+		    my $volid_key = $class->volid_key();
+		    my $volid = $volume->{$volid_key};
+
+		    push @{$volids}, $volid if $volumes->{$volid};
+		});
 
 		# remove all local replication snapshots (jobid => undef)
 		my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
-		PVE::Replication::prepare($storecfg, $sorted_volids, undef, 1, undef, $logfunc);
+		PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);
 	    }
 
 	    $class->foreach_volume($snap, sub {
-- 
2.20.1