From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id BE4A91FF173
	for <inbox@lore.proxmox.com>; Mon, 24 Mar 2025 12:16:33 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id DEE0919264;
	Mon, 24 Mar 2025 12:16:17 +0100 (CET)
To: pve-devel@lists.proxmox.com
Date: Mon, 24 Mar 2025 12:15:29 +0100
In-Reply-To: <20250324111529.338025-1-alexandre.derumier@groupe-cyllene.com>
References: <20250324111529.338025-1-alexandre.derumier@groupe-cyllene.com>
MIME-Version: 1.0
Message-ID: <mailman.127.1742814976.359.pve-devel@lists.proxmox.com>
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
From: Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com>
Precedence: list
Cc: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
X-Mailman-Version: 2.1.29
X-BeenThere: pve-devel@lists.proxmox.com
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
Subject: [pve-devel] [PATCH qemu-server 1/1] qemu: add offline migration
 from dead node
Content-Type: multipart/mixed; boundary="===============7178606903814310017=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

--===============7178606903814310017==
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: <root@formationkvm1.odiso.net>
X-Original-To: pve-devel@lists.proxmox.com
Delivered-To: pve-devel@lists.proxmox.com
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by lists.proxmox.com (Postfix) with ESMTPS id 5378FC9A28
	for <pve-devel@lists.proxmox.com>; Mon, 24 Mar 2025 12:16:15 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 3498F1919B
	for <pve-devel@lists.proxmox.com>; Mon, 24 Mar 2025 12:15:45 +0100 (CET)
Received: from bastiontest.odiso.net (unknown [IPv6:2a0a:1580:2000:6700::14])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by firstgate.proxmox.com (Proxmox) with ESMTPS
	for <pve-devel@lists.proxmox.com>; Mon, 24 Mar 2025 12:15:43 +0100 (CET)
Received: from formationkvm1.odiso.net (unknown [10.11.201.57])
	by bastiontest.odiso.net (Postfix) with ESMTP id CED40860943;
	Mon, 24 Mar 2025 12:15:30 +0100 (CET)
Received: by formationkvm1.odiso.net (Postfix, from userid 0)
	id 60A0E111939C; Mon, 24 Mar 2025 12:15:31 +0100 (CET)
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH qemu-server 1/1] qemu: add offline migration from dead node
Date: Mon, 24 Mar 2025 12:15:29 +0100
Message-Id: <20250324111529.338025-3-alexandre.derumier@groupe-cyllene.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250324111529.338025-1-alexandre.derumier@groupe-cyllene.com>
References: <20250324111529.338025-1-alexandre.derumier@groupe-cyllene.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
	AWL                     0.080 Adjusted score from AWL reputation of From: address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DMARC_NONE                0.1 DMARC none policy
	HEADER_FROM_DIFFERENT_DOMAINS  0.062 From and EnvelopeFrom 2nd level mail domains are different
	KAM_DMARC_NONE           0.25 DKIM has Failed or SPF has failed on the message and the domain has no DMARC policy
	KAM_DMARC_STATUS         0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
	KAM_LAZY_DOMAIN_SECURITY      1 Sending domain does not have any anti-forgery methods
	RDNS_NONE               0.793 Delivered to internal network by a host with no rDNS
	SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
	SPF_NONE                0.001 SPF: sender does not publish an SPF Record
	URIBL_BLOCKED           0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked.  See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [qemu.pm]

Verify that the source node is dead (corosync membership, ping and ssh),
then move the VM config file directly within /etc/pve.

Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
---
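Note for reviewers: a minimal standalone sketch of the order of the liveness
checks in this patch (corosync membership first, then ping, then ssh). It is
not the patch code. The name node_alive_reason and the probe callbacks are
hypothetical and used only for illustration; the probes are injected as code
refs that die on failure (as PVE::Tools::run_command does in the patch), so
the logic can be run without a cluster.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of qemu-server.
# Returns the name of the first check that found the node alive,
# or nothing (undef in scalar context) if every check says it is dead.
# $members mimics the hash the patch reads from PVE::Cluster::get_members():
#   node name => { online => 0|1, ip => '...' }
# $probes is an ordered list of [ name, code ref ]; a probe dies on failure.
sub node_alive_reason {
    my ($members, $node, $probes) = @_;

    my $entry = $members->{$node};
    return 'corosync' if $entry && $entry->{online};

    for my $probe (@$probes) {
        my ($name, $code) = @$probe;
        # A probe that returns without dying means the node answered.
        my $answered = eval { $code->($entry); 1 };
        return $name if $answered;
    }
    return;
}

# Example with stub probes that both fail; 192.0.2.10 is a documentation address.
my $members = { node1 => { online => 0, ip => '192.0.2.10' } };
my $probes = [
    [ ping => sub { die "no reply\n" } ],
    [ ssh  => sub { die "connection timed out\n" } ],
];
my $reason = node_alive_reason($members, 'node1', $probes);
print defined($reason) ? "alive ($reason)\n" : "dead\n";   # prints "dead"
```

Only when all three checks fail does the migration go on to move the config
file; any single positive answer aborts it.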
 PVE/API2/Qemu.pm | 56 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 52 insertions(+), 4 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 156b1c7b..58c454a6 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4764,6 +4764,9 @@ __PACKAGE__->register_method({
 		description => "Target node.",
 		completion =>  \&PVE::Cluster::complete_migration_target,
             }),
+            deadnode => get_standard_option('pve-node', {
+                description => "Dead source node.", optional => 1,
+            }),
 	    online => {
 		type => 'boolean',
 		description => "Use online/live migration if VM is running. Ignored if VM is stopped.",
@@ -4813,8 +4816,9 @@ __PACKAGE__->register_method({
 	my $authuser = $rpcenv->get_user();
 
 	my $target = extract_param($param, 'target');
+	my $deadnode = extract_param($param, 'deadnode');
 
-	my $localnode = PVE::INotify::nodename();
+	my $localnode = $deadnode ? $deadnode : PVE::INotify::nodename();
 	raise_param_exc({ target => "target is local node."}) if $target eq $localnode;
 
 	PVE::Cluster::check_cfs_quorum();
@@ -4835,14 +4839,43 @@ __PACKAGE__->register_method({
 	raise_param_exc({ migration_network => "Only root may use this option." })
 	    if $param->{migration_network} && $authuser ne 'root@pam';
 
+	raise_param_exc({ deadnode => "Only root may use this option." })
+	    if $deadnode && $authuser ne 'root@pam';
+
 	# test if VM exists
-	my $conf = PVE::QemuConfig->load_config($vmid);
+	my $conf = $deadnode ? PVE::QemuConfig->load_config($vmid, $deadnode) : PVE::QemuConfig->load_config($vmid);
 
 	# try to detect errors early
 
 	PVE::QemuConfig->check_lock($conf);
 
-	if (PVE::QemuServer::check_running($vmid)) {
+	if ($deadnode) {
+	    die "Can't do online migration from a dead node.\n" if $param->{online};
+	    my $members = PVE::Cluster::get_members();
+	    die "The deadnode $deadnode seems to be alive.\n" if $members->{$deadnode} && $members->{$deadnode}->{online};
+
+	    print "test if deadnode $deadnode responds to ping\n";
+	    eval {
+		PVE::Tools::run_command(['/usr/bin/ping', '-c', '1', $members->{$deadnode}->{ip}]);
+	    };
+	    if (!$@) {
+		die "error: ping to deadnode $deadnode is still working. Node seems to be alive.\n";
+	    }
+
+	    # make an extra ssh connection to double check that it's not just a corosync crash
+	    my $sshinfo = PVE::SSHInfo::get_ssh_info($deadnode);
+	    my $sshcmd = PVE::SSHInfo::ssh_info_to_command($sshinfo);
+	    push @$sshcmd, 'hostname';
+	    print "test if deadnode $deadnode responds to ssh\n";
+	    eval {
+		PVE::Tools::run_command($sshcmd, timeout => 1);
+	    };
+	    if (!$@) {
+		die "error: ssh connection to deadnode $deadnode is still working. Node seems to be alive.\n";
+	    }
+
+
+	} elsif (PVE::QemuServer::check_running($vmid)) {
 	    die "can't migrate running VM without --online\n" if !$param->{online};
 
 	    my $repl_conf = PVE::ReplicationConfig->new();
@@ -4881,7 +4914,22 @@ __PACKAGE__->register_method({
 	    PVE::QemuServer::check_storage_availability($storecfg, $conf, $target);
 	}
 
-	if (PVE::HA::Config::vm_is_ha_managed($vmid) && $rpcenv->{type} ne 'ha') {
+	if ($deadnode) {
+	    my $realcmd = sub {
+		my $config_fn = PVE::QemuConfig->config_file($vmid, $deadnode);
+		my $new_config_fn = PVE::QemuConfig->config_file($vmid, $target);
+
+		rename($config_fn, $new_config_fn)
+		    or die "failed to move config file to node '$target': $!\n";
+	    };
+
+	    my $worker = sub {
+		return PVE::GuestHelpers::guest_migration_lock($vmid, 10, $realcmd);
+	    };
+
+	    return $rpcenv->fork_worker('qmigrate', $vmid, $authuser, $worker);
+
+	} elsif (PVE::HA::Config::vm_is_ha_managed($vmid) && $rpcenv->{type} ne 'ha') {
 
 	    my $hacmd = sub {
 		my $upid = shift;
-- 
2.39.5



--===============7178606903814310017==--