From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 45D0870353
 for <pve-devel@lists.proxmox.com>; Mon, 13 Jun 2022 12:30:35 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 365409E46
 for <pve-devel@lists.proxmox.com>; Mon, 13 Jun 2022 12:30:05 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 6B8CE9E12
 for <pve-devel@lists.proxmox.com>; Mon, 13 Jun 2022 12:30:03 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 3A1024372E
 for <pve-devel@lists.proxmox.com>; Mon, 13 Jun 2022 12:30:03 +0200 (CEST)
From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Mon, 13 Jun 2022 12:29:55 +0200
Message-Id: <20220613102959.36556-3-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220613102959.36556-1-f.ebner@proxmox.com>
References: <20220613102959.36556-1-f.ebner@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.050 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: [pve-devel] [PATCH v2 manager 2/2] pvesr: prepare local job: remove
 stale replicated volumes immediately
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Mon, 13 Jun 2022 10:30:35 -0000

Commit 0433b86df6dfdf1d64ee09322719a02a91690707 introduced a
regression where only stale replicated volumes with an older timestamp
would be cleaned up. This meant that after a volume was removed from
the guest config, it would only be cleaned up on the second
replication run afterwards. And the volume could become completely
orphaned if the relevant storage was no longer used by the job.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Dependency bump for libpve-guest-common >= 4.0-3 needed for
is_replication_snapshot().
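
Not part of the patch, just for context: a minimal sketch of the kind of
check is_replication_snapshot() is expected to perform, assuming the usual
__replicate_<jobid>_<timestamp>__ snapshot naming scheme (the real helper
ships with libpve-guest-common):

    # Hedged sketch only, not the actual implementation from
    # libpve-guest-common: decide whether a local snapshot was created by
    # replication, optionally restricted to a single job, assuming the
    # __replicate_<jobid>_<timestamp>__ naming scheme.
    sub is_replication_snapshot_sketch {
        my ($snapname, $jobid) = @_;

        # restrict the match to the given job
        return $snapname =~ m/^__replicate_\Q$jobid\E_/ ? 1 : 0 if defined($jobid);

        # otherwise accept any replication snapshot
        return $snapname =~ m/^__replicate_/ ? 1 : 0;
    }

    # e.g. '__replicate_100-0_1655116800__' counts for job '100-0', while a
    # manually created snapshot like 'before-upgrade' does not.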

Changes from v1:
    * Adapt to changed behavior of prepare() so we still only catch
      volumes that had replication snapshots belonging to the job.

 PVE/CLI/pvesr.pm | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/PVE/CLI/pvesr.pm b/PVE/CLI/pvesr.pm
index a1be88af..95dad64e 100644
--- a/PVE/CLI/pvesr.pm
+++ b/PVE/CLI/pvesr.pm
@@ -137,8 +137,18 @@ __PACKAGE__->register_method ({
 	    push @$volids, map { $_->{volid} } @$images;
 	}
 	my ($local_snapshots, $cleaned_replicated_volumes) = PVE::Replication::prepare($storecfg, $volids, $jobid, $last_sync, $parent_snapname, $logfunc);
-	foreach my $volid (keys %$cleaned_replicated_volumes) {
-	    if (!$wanted_volids->{$volid}) {
+	for my $volid ($volids->@*) {
+	    next if $wanted_volids->{$volid};
+
+	    my $stale = $cleaned_replicated_volumes->{$volid};
+	    # prepare() will not remove the last_sync snapshot, but if the volume was used by the
+	    # job and is not wanted anymore, it is stale too. And not removing it now might cause
+	    # it to be missed later, because the relevant storage might not get scanned anymore.
+	    $stale ||= grep {
+		PVE::Replication::is_replication_snapshot($_, $jobid)
+	    } keys %{$local_snapshots->{$volid} // {}};
+
+	    if ($stale) {
 		$logfunc->("$jobid: delete stale volume '$volid'");
 		PVE::Storage::vdisk_free($storecfg, $volid);
 		delete $local_snapshots->{$volid};
-- 
2.30.2