From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Wed, 10 Dec 2025 13:57:08 +0100
Message-ID: <20251210125724.121834-3-f.ebner@proxmox.com>
In-Reply-To: <20251210125724.121834-1-f.ebner@proxmox.com>
References: <20251210125724.121834-1-f.ebner@proxmox.com>
Subject: [pve-devel] [PATCH v3 qemu-server 2/2] dbus-vmstate: fix method call on dbus object resolving to wrong instance

As reported in the community forum [0] and then later by Thomas, who
provided the relevant system logs, parallel migration with
'--with-conntrack-state' of multiple VMs may currently lead to a crash
upon handover:

> kvm: Unknown savevm section or instance 'dbus-vmstate/dbus-vmstate' 0.
> Make sure that your current VM setup matches your saved VM setup,
> including any hotplugged devices
> kvm: load of migration failed: Invalid argument

In particular, the following sequence (on my test node)

pvesh create /nodes/pve9a1/qemu/104/dbus-vmstate --action start
pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action start
pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action stop

results in the wrong service being shut down (note the unexpected ID in
the last line!):

Dec 10 10:07:40 pve9a1 pvesh[30453]: starting dbus-vmstate helper for VM 104
Dec 10 10:07:40 pve9a1 systemd[1]: Starting pve-dbus-vmstate@104.service - PVE DBus VMState Helper (VM 104)...
Dec 10 10:07:41 pve9a1 dbus-vmstate[30456]: pve-vmstate-104 listening on :1.55
Dec 10 10:07:41 pve9a1 systemd[1]: Started pve-dbus-vmstate@104.service - PVE DBus VMState Helper (VM 104).
Dec 10 10:07:44 pve9a1 pvesh[30511]: starting dbus-vmstate helper for VM 105
Dec 10 10:07:44 pve9a1 systemd[1]: Starting pve-dbus-vmstate@105.service - PVE DBus VMState Helper (VM 105)...
Dec 10 10:07:45 pve9a1 dbus-vmstate[30573]: pve-vmstate-105 listening on :1.58
Dec 10 10:07:45 pve9a1 systemd[1]: Started pve-dbus-vmstate@105.service - PVE DBus VMState Helper (VM 105).
Dec 10 10:07:48 pve9a1 pvesh[30595]: stopping dbus-vmstate helper for VM 105
Dec 10 10:07:48 pve9a1 dbus-vmstate[30456]: shutting down gracefully ..
Dec 10 10:07:48 pve9a1 systemd[1]: pve-dbus-vmstate@104.service: Deactivated successfully.

So the dbus-vmstate object is removed from the wrong VM before loading
the migration state. Note that the crash is still racy, because if the
dbus-vmstate is removed on the source side for the same wrong VM before
the migration handover, the QEMU objects for both instances will still
match.

To fix the issue, use the dbus_call_method() helper. Like this, the
owner is respected even if there are multiple (queued) owners on the
DBus.

[0]: https://forum.proxmox.com/threads/176821/post-820775

Fixes: dc76a590 ("fix #5180: migrate: integrate helper for live-migrating conntrack info")
Reported-by: Thomas Lamprecht
Signed-off-by: Fiona Ebner
---
Changes in v3:
* split patch.

 src/PVE/QemuServer/DBusVMState.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/PVE/QemuServer/DBusVMState.pm b/src/PVE/QemuServer/DBusVMState.pm
index 05dd1bcd..480d9f70 100644
--- a/src/PVE/QemuServer/DBusVMState.pm
+++ b/src/PVE/QemuServer/DBusVMState.pm
@@ -127,7 +127,8 @@ sub qemu_del_dbus_vmstate {
     $num_entries = eval {
         dbus_get_property($object, 'com.proxmox.VMStateHelper', 'NumMigratedEntries');
     };
-    eval { $object->Quit() };
+    # Quit() does QMP object-del which has a timeout of 60 seconds
+    eval { dbus_call_method($object, 'com.proxmox.VMStateHelper', 'Quit', [], 70); };
     if (my $err = $@) {
         syslog('warn', "failed to call quit on dbus-vmstate for VM $vmid: $err\n")
             if !$params{quiet};
-- 
2.47.3
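
Side note for readers unfamiliar with D-Bus name ownership: the class of
problem described in the commit message can be illustrated with a toy
model. Per the D-Bus specification, several connections may request the
same well-known name; one becomes the primary owner and the others are
queued. A message addressed to the well-known name goes to the primary
owner, while a unique name such as :1.58 always addresses exactly one
connection. The Python sketch below is not Net::DBus or qemu-server
code. The ToyBus class and the name org.example.VMState are
hypothetical, and that the old $object->Quit() call was routed this way
is an assumption based on the log above.

```python
from collections import defaultdict


class ToyBus:
    """Toy model of D-Bus name ownership (not a real D-Bus implementation)."""

    def __init__(self):
        # well-known name -> unique names, primary owner first, then queued owners
        self.queues = defaultdict(list)
        # unique name -> callable that handles a method name
        self.handlers = {}

    def connect(self, unique_name, handler):
        self.handlers[unique_name] = handler

    def request_name(self, well_known, unique_name):
        # The first requester becomes primary owner, later ones are queued.
        self.queues[well_known].append(unique_name)

    def call(self, destination, method):
        # Unique names start with ':' and address exactly one connection.
        # Anything else is a well-known name and resolves to its primary owner.
        if not destination.startswith(":"):
            destination = self.queues[destination][0]
        return self.handlers[destination](method)


bus = ToyBus()
# Unique names taken from the log; the well-known name is hypothetical.
for vmid, unique in ((104, ":1.55"), (105, ":1.58")):
    bus.connect(unique, lambda method, vmid=vmid: f"helper for VM {vmid} handled {method}")
    bus.request_name("org.example.VMState", unique)

# The intended target in both calls is the helper for VM 105.
by_well_known = bus.call("org.example.VMState", "Quit")
by_unique = bus.call(":1.58", "Quit")
print(by_well_known)
print(by_unique)
```

In this model, only the call addressed to the unique name is guaranteed
to reach the helper for VM 105. The call addressed to the shared name
lands on whichever helper registered first.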