From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Date: Fri, 3 Jun 2022 09:16:29 +0200
Message-Id: <20220603071630.374408-1-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH guest-common v2 1/2] ReplicationState: purge state from non local vms
List-Id: Proxmox VE development discussion

When running replication, we don't want to keep replication states for
non-local vms. Normally this would not be a problem, since on migration
we transfer the states anyway, but when the ha-manager steals a vm, it
cannot do that. In that case, having an old state lying around is
harmful, since the code does not expect the state to be out-of-sync
with the actual snapshots on disk.

One such problem is the following:

Replicate vm 100 from node A to nodes B and C, and activate HA. When
node A dies, the vm will be relocated to e.g. node B and start
replicating from there. If node B still has an old state lying around
for its sync to node C, it might delete the common base snapshots of B
and C and then cannot sync again.

Deleting the state for all non-local guests fixes that issue, since
replication then always starts fresh, and a potentially existing old
state cannot be valid anyway since we just relocated the vm here (from
a dead node).

Signed-off-by: Dominik Csapak
Reviewed-by: Fabian Grünbichler
---
 src/PVE/ReplicationState.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/PVE/ReplicationState.pm b/src/PVE/ReplicationState.pm
index 0a5e410..8eebb42 100644
--- a/src/PVE/ReplicationState.pm
+++ b/src/PVE/ReplicationState.pm
@@ -215,7 +215,7 @@ sub purge_old_states {
         my $tid = $plugin->get_unique_target_id($jobcfg);
         my $vmid = $jobcfg->{guest};
         $used_tids->{$vmid}->{$tid} = 1
-            if defined($vms->{ids}->{$vmid}); # && $vms->{ids}->{$vmid}->{node} eq $local_node;
+            if defined($vms->{ids}->{$vmid}) && $vms->{ids}->{$vmid}->{node} eq $local_node;
     }
 
     my $purge_state = sub {
-- 
2.30.2