* [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration
@ 2025-07-03 11:54 Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-ve-rs v3 01/14] config: guest: allow access to raw Vmid value Christoph Heiss
` (13 more replies)
0 siblings, 14 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Fixes #5180 [0].
This implements migration of per-VM conntrack state on live-migration.
The core of the implementation are in patch #7 & #8. See there for more
details.
Patch #1 - #3 implement CONNMARK'ing any VM traffic with their unique
VMID. This is needed later on to filter conntrack entries for the
migration. These three patches can be applied independently,
CONNMARK'ing traffic does not have any visible impact.
Regarding the use of connlabels instead, as was discussed in v2:
Connlabels are (unfortunately) pure bitmasks and cannot be used for
storing arbitrary values such as VMIDs. Thus we are basically stuck with
using connmarks.
Currently, remote/inter-cluster migration is not supported and indicated
to the user with a warning. See also patch #8 for a bit more in-depth
explanation.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5180
Dependencies
============
proxmox-firewall depends on the proxmox-ve-rs changes.
qemu-server depends on the pve-firewall/proxmox-firewall changes.
pve-manager only soft-depends on the other, as it will detect whether
conntrack migration is supported.
Testing
=======
I've primarily tested intra-cluster live-migrations, with both the
iptables-based and nftables-based firewall), using the reproducer as
described in #5180. I further verified that the D-Bus services get
started as expected and are _always_ stopped, even in the case of some
migration error.
Finally, I also checked using `conntrack -L -m <vmid>` tool that the
conntrack entries are
a) added/updated on the target node and
b) removed from the source node afterwards
Also tested was the migration from/to an "old" (unpatched) node, which
results in the issue as per #5180 & appropriate warnings in the UI.
For remote migrations, tested that the warning is logged as expected.
History
=======
v1: https://lore.proxmox.com/pve-devel/20250317141152.1247324-1-c.heiss@proxmox.com/
v2: https://lore.proxmox.com/pve-devel/20250424111941.730528-1-c.heiss@proxmox.com/
Changes v1 -> v2:
* rebased as necessary
* "un-rfc'd" firewall conntrack flushing patches
* use an instanced systemd service instead of fork+exec for the
pve-dbus-vmstate helper
Changes v2 -> v3:
* rebased on trixie/latest masters
* added documentation patch
* moved node capability module to
PVE::API2::NodeCapabilities::Qemu::Migration, based on Fiona's
suggestion
Diffstat
========
proxmox-ve-rs:
Christoph Heiss (1):
config: guest: allow access to raw Vmid value
proxmox-ve-config/src/guest/types.rs | 4 ++++
1 file changed, 4 insertions(+)
proxmox-firewall:
Christoph Heiss (1):
firewall: add connmark rule with VMID to all guest chains
proxmox-firewall/src/firewall.rs | 14 +++-
.../integration_tests__firewall.snap | 84 +++++++++++++++++++
proxmox-nftables/src/expression.rs | 9 ++
proxmox-nftables/src/statement.rs | 10 ++-
4 files changed, 114 insertions(+), 3 deletions(-)
pve-firewall:
Christoph Heiss (2):
firewall: add connmark rule with VMID to all guest chains
firewall: helpers: add sub for flushing conntrack entries by mark
debian/control | 3 ++-
src/PVE/Firewall.pm | 14 ++++++++++++--
src/PVE/Firewall/Helpers.pm | 12 ++++++++++++
3 files changed, 26 insertions(+), 3 deletions(-)
qemu-server:
Christoph Heiss (5):
qmp helpers: allow passing structured args via qemu_objectadd()
api2: qemu: add module exposing node migration capabilities
fix #5180: dbus-vmstate: add daemon for QEMUs dbus-vmstate interface
fix #5180: migrate: integrate helper for live-migrating conntrack info
migrate: flush old VM conntrack entries after successful migration
Makefile | 4 +-
debian/control | 7 +-
src/Makefile | 1 +
src/PVE/API2/Makefile | 1 +
src/PVE/API2/NodeCapabilities/Makefile | 9 +
.../API2/NodeCapabilities/Qemu/Migration.pm | 48 +++++
src/PVE/API2/Qemu.pm | 75 ++++++++
src/PVE/CLI/qm.pm | 5 +
src/PVE/QemuMigrate.pm | 78 ++++++++
src/PVE/QemuServer.pm | 6 +
src/PVE/QemuServer/DBusVMState.pm | 125 +++++++++++++
src/PVE/QemuServer/Makefile | 1 +
src/PVE/QemuServer/QMPHelpers.pm | 4 +-
src/dbus-vmstate/Makefile | 11 ++
src/dbus-vmstate/dbus-vmstate | 168 ++++++++++++++++++
src/dbus-vmstate/org.qemu.VMState1.conf | 11 ++
src/dbus-vmstate/pve-dbus-vmstate@.service | 10 ++
17 files changed, 560 insertions(+), 4 deletions(-)
create mode 100644 src/PVE/API2/NodeCapabilities/Makefile
create mode 100644 src/PVE/API2/NodeCapabilities/Qemu/Migration.pm
create mode 100644 src/PVE/QemuServer/DBusVMState.pm
create mode 100644 src/dbus-vmstate/Makefile
create mode 100755 src/dbus-vmstate/dbus-vmstate
create mode 100644 src/dbus-vmstate/org.qemu.VMState1.conf
create mode 100644 src/dbus-vmstate/pve-dbus-vmstate@.service
pve-manager:
Christoph Heiss (4):
api2: capabilities: explicitly import CPU capabilities module
api2: capabilities: proxy index endpoints to respective nodes
api2: capabilities: expose new qemu/migration endpoint
ui: window: Migrate: add checkbox for migrating VM conntrack state
PVE/API2/Capabilities.pm | 11 ++++-
www/manager6/window/Migrate.js | 82 ++++++++++++++++++++++++++++++++--
2 files changed, 89 insertions(+), 4 deletions(-)
pve-docs:
Christoph Heiss (1):
qm: document conntrack state migration for live migrations
qm.adoc | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--
2.47.1
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH proxmox-ve-rs v3 01/14] config: guest: allow access to raw Vmid value
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-firewall v3 02/14] firewall: add connmark rule with VMID to all guest chains Christoph Heiss
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Needed in proxmox-nftables/-firewall to generate rules depending on the
numeric vmid.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* no changes
proxmox-ve-config/src/guest/types.rs | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/proxmox-ve-config/src/guest/types.rs b/proxmox-ve-config/src/guest/types.rs
index ed6a48c..a0fb67d 100644
--- a/proxmox-ve-config/src/guest/types.rs
+++ b/proxmox-ve-config/src/guest/types.rs
@@ -13,6 +13,10 @@ impl Vmid {
pub fn new(id: u32) -> Self {
Vmid(id)
}
+
+ pub fn raw_value(&self) -> u32 {
+ self.0
+ }
}
impl From<u32> for Vmid {
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH proxmox-firewall v3 02/14] firewall: add connmark rule with VMID to all guest chains
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-ve-rs v3 01/14] config: guest: allow access to raw Vmid value Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 03/14] " Christoph Heiss
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Adds a conntrack attribute with the VMID inside to anything flowing
in/out the guest.
This enables filtering/differentiating conntrack entries between VMs for
live-migration.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* rebased on latest master
Changes v2 -> v3:
* no changes
proxmox-firewall/src/firewall.rs | 14 +++-
.../integration_tests__firewall.snap | 84 +++++++++++++++++++
proxmox-nftables/src/expression.rs | 9 ++
proxmox-nftables/src/statement.rs | 10 ++-
4 files changed, 114 insertions(+), 3 deletions(-)
diff --git a/proxmox-firewall/src/firewall.rs b/proxmox-firewall/src/firewall.rs
index 086b96c..d782fcb 100644
--- a/proxmox-firewall/src/firewall.rs
+++ b/proxmox-firewall/src/firewall.rs
@@ -6,7 +6,9 @@ use anyhow::{bail, Error};
use proxmox_nftables::command::{Add, Commands, Delete, Flush};
use proxmox_nftables::expression::{Meta, Payload};
use proxmox_nftables::helper::NfVec;
-use proxmox_nftables::statement::{AnonymousLimit, Log, LogLevel, Match, Set, SetOperation};
+use proxmox_nftables::statement::{
+ AnonymousLimit, Log, LogLevel, Mangle, Match, Set, SetOperation,
+};
use proxmox_nftables::types::{
AddElement, AddRule, ChainPart, MapValue, RateTimescale, SetName, TableFamily, TableName,
TablePart, Verdict,
@@ -944,7 +946,15 @@ impl Firewall {
vmid: Some(vmid),
};
- commands.reserve(config.rules().len());
+ commands.reserve(config.rules().len() + 1);
+
+ // Add a CONNMARK to anything in/out the guest, to be able to later
+ // track/filter traffic per guest, e.g. in pve-dbus-vmstate.
+ // Need to be first, such that it is always applied.
+ commands.push(Add::rule(AddRule::from_statement(
+ chain.clone(),
+ Mangle::ct_mark(vmid),
+ )));
for config_rule in config.rules() {
for rule in NftRule::from_config_rule(config_rule, &env)? {
diff --git a/proxmox-firewall/tests/snapshots/integration_tests__firewall.snap b/proxmox-firewall/tests/snapshots/integration_tests__firewall.snap
index ad54ad0..e3db8ae 100644
--- a/proxmox-firewall/tests/snapshots/integration_tests__firewall.snap
+++ b/proxmox-firewall/tests/snapshots/integration_tests__firewall.snap
@@ -4489,6 +4489,27 @@ expression: "firewall.full_host_fw().expect(\"firewall can be generated\")"
}
}
},
+ {
+ "add": {
+ "rule": {
+ "family": "bridge",
+ "table": "proxmox-firewall-guests",
+ "chain": "guest-100-in",
+ "expr": [
+ {
+ "mangle": {
+ "key": {
+ "ct": {
+ "key": "mark"
+ }
+ },
+ "value": 100
+ }
+ }
+ ]
+ }
+ }
+ },
{
"add": {
"rule": {
@@ -4764,6 +4785,27 @@ expression: "firewall.full_host_fw().expect(\"firewall can be generated\")"
}
}
},
+ {
+ "add": {
+ "rule": {
+ "family": "bridge",
+ "table": "proxmox-firewall-guests",
+ "chain": "guest-100-out",
+ "expr": [
+ {
+ "mangle": {
+ "key": {
+ "ct": {
+ "key": "mark"
+ }
+ },
+ "value": 100
+ }
+ }
+ ]
+ }
+ }
+ },
{
"add": {
"rule": {
@@ -5150,6 +5192,27 @@ expression: "firewall.full_host_fw().expect(\"firewall can be generated\")"
}
}
},
+ {
+ "add": {
+ "rule": {
+ "family": "bridge",
+ "table": "proxmox-firewall-guests",
+ "chain": "guest-101-in",
+ "expr": [
+ {
+ "mangle": {
+ "key": {
+ "ct": {
+ "key": "mark"
+ }
+ },
+ "value": 101
+ }
+ }
+ ]
+ }
+ }
+ },
{
"add": {
"rule": {
@@ -5212,6 +5275,27 @@ expression: "firewall.full_host_fw().expect(\"firewall can be generated\")"
}
}
},
+ {
+ "add": {
+ "rule": {
+ "family": "bridge",
+ "table": "proxmox-firewall-guests",
+ "chain": "guest-101-out",
+ "expr": [
+ {
+ "mangle": {
+ "key": {
+ "ct": {
+ "key": "mark"
+ }
+ },
+ "value": 101
+ }
+ }
+ ]
+ }
+ }
+ },
{
"add": {
"rule": {
diff --git a/proxmox-nftables/src/expression.rs b/proxmox-nftables/src/expression.rs
index e9ef94f..cbafe85 100644
--- a/proxmox-nftables/src/expression.rs
+++ b/proxmox-nftables/src/expression.rs
@@ -12,6 +12,8 @@ use proxmox_ve_config::firewall::types::port::{PortEntry, PortList};
use proxmox_ve_config::firewall::types::rule_match::{IcmpCode, IcmpType, Icmpv6Code, Icmpv6Type};
#[cfg(feature = "config-ext")]
use proxmox_ve_config::firewall::types::Cidr;
+#[cfg(feature = "config-ext")]
+use proxmox_ve_config::guest::types::Vmid;
#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(rename_all = "lowercase")]
@@ -267,6 +269,13 @@ impl From<&BridgeName> for Expression {
}
}
+#[cfg(feature = "config-ext")]
+impl From<Vmid> for Expression {
+ fn from(value: Vmid) -> Self {
+ Expression::Number(value.raw_value().into())
+ }
+}
+
#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Meta {
key: String,
diff --git a/proxmox-nftables/src/statement.rs b/proxmox-nftables/src/statement.rs
index 5483368..3264e6c 100644
--- a/proxmox-nftables/src/statement.rs
+++ b/proxmox-nftables/src/statement.rs
@@ -10,6 +10,7 @@ use proxmox_ve_config::firewall::types::rule::Verdict as ConfigVerdict;
#[cfg(feature = "config-ext")]
use proxmox_ve_config::guest::types::Vmid;
+use crate::expression::Ct;
use crate::expression::Meta;
use crate::helper::{NfVec, Null};
use crate::types::{RateTimescale, RateUnit, Verdict};
@@ -370,12 +371,19 @@ pub struct Mangle {
}
impl Mangle {
- pub fn set_mark(value: impl Into<Expression>) -> Self {
+ pub fn meta_mark(value: impl Into<Expression>) -> Self {
Self {
key: Meta::new("mark").into(),
value: value.into(),
}
}
+
+ pub fn ct_mark(value: impl Into<Expression>) -> Self {
+ Self {
+ key: Ct::new("mark", None).into(),
+ value: value.into(),
+ }
+ }
}
#[derive(Clone, Copy, Debug, Deserialize, Serialize)]
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH firewall v3 03/14] firewall: add connmark rule with VMID to all guest chains
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-ve-rs v3 01/14] config: guest: allow access to raw Vmid value Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-firewall v3 02/14] firewall: add connmark rule with VMID to all guest chains Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 04/14] firewall: helpers: add sub for flushing conntrack entries by mark Christoph Heiss
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Adds a connmark attribute with the VMID inside to anything flowing
in/out the guest, which are also carried over to all conntrack entries.
This enables differentiating conntrack entries between VMs for
live-migration.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* rebased on trixie
src/PVE/Firewall.pm | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/PVE/Firewall.pm b/src/PVE/Firewall.pm
index 41b740a..e9996ef 100644
--- a/src/PVE/Firewall.pm
+++ b/src/PVE/Firewall.pm
@@ -2551,11 +2551,14 @@ sub ruleset_chain_add_input_filters {
}
sub ruleset_create_vm_chain {
- my ($ruleset, $chain, $ipversion, $options, $macaddr, $ipfilter_ipset, $direction) = @_;
+ my ($ruleset, $chain, $ipversion, $options, $macaddr, $ipfilter_ipset, $direction, $vmid) = @_;
ruleset_create_chain($ruleset, $chain);
my $accept = generate_nfqueue($options);
+ # needs to be first, to ensure that it gets always applied
+ ruleset_addrule($ruleset, $chain, "", "-j CONNMARK --set-mark $vmid");
+
if (!(defined($options->{dhcp}) && $options->{dhcp} == 0)) {
if ($ipversion == 4) {
if ($direction eq 'OUT') {
@@ -2796,7 +2799,14 @@ sub generate_tap_rules_direction {
if ($options->{enable}) {
# create chain with mac and ip filter
ruleset_create_vm_chain(
- $ruleset, $tapchain, $ipversion, $options, $macaddr, $ipfilter_ipset, $direction,
+ $ruleset,
+ $tapchain,
+ $ipversion,
+ $options,
+ $macaddr,
+ $ipfilter_ipset,
+ $direction,
+ $vmid,
);
ruleset_generate_vm_rules(
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH firewall v3 04/14] firewall: helpers: add sub for flushing conntrack entries by mark
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (2 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 03/14] " Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 05/14] qmp helpers: allow passing structured args via qemu_objectadd() Christoph Heiss
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
A small helper routine for flushing all conntrack table entries which
are marked with a specific value.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* rebased on trixie
debian/control | 3 ++-
src/PVE/Firewall/Helpers.pm | 12 ++++++++++++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/debian/control b/debian/control
index 2e8e528..59c45af 100644
--- a/debian/control
+++ b/debian/control
@@ -17,7 +17,8 @@ Standards-Version: 4.6.2
Package: pve-firewall
Architecture: any
Conflicts: ulogd,
-Depends: ebtables,
+Depends: conntrack,
+ ebtables,
ipset,
iptables,
libpve-access-control,
diff --git a/src/PVE/Firewall/Helpers.pm b/src/PVE/Firewall/Helpers.pm
index 251ee22..0abfd5d 100644
--- a/src/PVE/Firewall/Helpers.pm
+++ b/src/PVE/Firewall/Helpers.pm
@@ -16,6 +16,7 @@ our @EXPORT_OK = qw(
remove_vmfw_conf
clone_vmfw_conf
collect_refs
+ flush_fw_ct_entries_by_mark
);
my $pvefw_conf_dir = "/etc/pve/firewall";
@@ -188,4 +189,15 @@ sub collect_refs {
return $res;
}
+# Flushes all conntrack table entries which are CONNMARK'd with the specified value.
+sub flush_fw_ct_entries_by_mark {
+ my ($mark) = @_;
+
+ PVE::Tools::run_command(
+ ['conntrack', '--delete', '--mark', $mark],
+ noerr => 1,
+ quiet => 1,
+ );
+}
+
1;
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 05/14] qmp helpers: allow passing structured args via qemu_objectadd()
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (3 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 04/14] firewall: helpers: add sub for flushing conntrack entries by mark Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 06/14] api2: qemu: add module exposing node migration capabilities Christoph Heiss
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
No functional changes for existing code.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* no changes
src/PVE/QemuServer/QMPHelpers.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/PVE/QemuServer/QMPHelpers.pm b/src/PVE/QemuServer/QMPHelpers.pm
index c3be2b8e..9e0996cc 100644
--- a/src/PVE/QemuServer/QMPHelpers.pm
+++ b/src/PVE/QemuServer/QMPHelpers.pm
@@ -36,9 +36,9 @@ sub qemu_devicedel {
}
sub qemu_objectadd {
- my ($vmid, $objectid, $qomtype) = @_;
+ my ($vmid, $objectid, $qomtype, %args) = @_;
- mon_cmd($vmid, "object-add", id => $objectid, "qom-type" => $qomtype);
+ mon_cmd($vmid, "object-add", id => $objectid, "qom-type" => $qomtype, %args);
return 1;
}
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 06/14] api2: qemu: add module exposing node migration capabilities
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (4 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 05/14] qmp helpers: allow passing structured args via qemu_objectadd() Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 07/14] fix #5180: dbus-vmstate: add daemon for QEMUs dbus-vmstate interface Christoph Heiss
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Similar to the already existing ones for CPU and QEMU machine support.
Very simple for now, only provides one property for now:
'has-dbus-vmstate' - Whether the dbus-vmstate is available/installed
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* moved module from PVE::API2::Qemu::Migration to
PVE::API2::NodeCapabilities::Qemu::Migration, based on Fiona's
suggestion
* formatted using perltidy
src/PVE/API2/Makefile | 1 +
src/PVE/API2/NodeCapabilities/Makefile | 9 ++++
.../API2/NodeCapabilities/Qemu/Migration.pm | 48 +++++++++++++++++++
3 files changed, 58 insertions(+)
create mode 100644 src/PVE/API2/NodeCapabilities/Makefile
create mode 100644 src/PVE/API2/NodeCapabilities/Qemu/Migration.pm
diff --git a/src/PVE/API2/Makefile b/src/PVE/API2/Makefile
index f8fce567..4620703c 100644
--- a/src/PVE/API2/Makefile
+++ b/src/PVE/API2/Makefile
@@ -7,3 +7,4 @@ install:
install -d -m 0755 $(DESTDIR)$(PERLDIR)/PVE/API2
install -D -m 0644 Qemu.pm $(DESTDIR)$(PERLDIR)/PVE/API2/Qemu.pm
$(MAKE) -C Qemu install
+ $(MAKE) -C NodeCapabilities install
diff --git a/src/PVE/API2/NodeCapabilities/Makefile b/src/PVE/API2/NodeCapabilities/Makefile
new file mode 100644
index 00000000..b35f5529
--- /dev/null
+++ b/src/PVE/API2/NodeCapabilities/Makefile
@@ -0,0 +1,9 @@
+DESTDIR=
+PREFIX=/usr
+PERLDIR=$(PREFIX)/share/perl5
+
+SOURCES := Qemu/Migration.pm
+
+.PHONY: install
+install: $(SOURCES)
+ for i in $(SOURCES); do install -D -m 0644 $$i $(DESTDIR)$(PERLDIR)/PVE/API2/NodeCapabilities/$$i; done
diff --git a/src/PVE/API2/NodeCapabilities/Qemu/Migration.pm b/src/PVE/API2/NodeCapabilities/Qemu/Migration.pm
new file mode 100644
index 00000000..98d683c3
--- /dev/null
+++ b/src/PVE/API2/NodeCapabilities/Qemu/Migration.pm
@@ -0,0 +1,48 @@
+package PVE::API2::NodeCapabilities::Qemu::Migration;
+
+use strict;
+use warnings;
+
+use JSON;
+use PVE::JSONSchema qw(get_standard_option);
+use PVE::RESTHandler;
+
+use base qw(PVE::RESTHandler);
+
+__PACKAGE__->register_method({
+ name => 'capabilities',
+ path => '',
+ method => 'GET',
+ proxyto => 'node',
+ description => 'Get node-specific QEMU migration capabilities of the node.'
+ . " Requires the 'Sys.Audit' permission on '/nodes/<node>'.",
+ permissions => {
+ check => ['perm', '/nodes/{node}', ['Sys.Audit']],
+ },
+ parameters => {
+ additionalProperties => 0,
+ properties => {
+ node => get_standard_option('pve-node'),
+ },
+ },
+ returns => {
+ type => 'object',
+ additionalProperties => 0,
+ properties => {
+ 'dbus-vmstate' => {
+ type => 'boolean',
+ description => 'Whether the host supports live-migrating additional'
+ . ' VM state via the dbus-vmstate helper.',
+ },
+ },
+ },
+ code => sub {
+ return {
+ 'has-dbus-vmstate' => -f '/usr/libexec/qemu-server/dbus-vmstate'
+ ? JSON::true
+ : JSON::false,
+ };
+ },
+});
+
+1;
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 07/14] fix #5180: dbus-vmstate: add daemon for QEMUs dbus-vmstate interface
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (5 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 06/14] api2: qemu: add module exposing node migration capabilities Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 08/14] fix #5180: migrate: integrate helper for live-migrating conntrack info Christoph Heiss
` (6 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
First part to fixing #5180 [0].
Adds a simple D-Bus server which implements the `org.qemu.VMState1`
interface as specified in the QEMU documentation [1].
Using the built-in QEMU VMState machinery saves us from having to worry
about transfer and convergence of the data and letl QEMU take care of
it.
Any object on the D-Bus path `/org/qemu/VMState1` implementing that
interface will be called by QEMU during live-migration, iif the `Id`
property is registered within the `dbus-vmstate` QEMU object for a
specific VM.
The actual state loading/restoring is done via the conntrack(8) tool, a
small tool which already implements hard parts of interacting with the
conntrack subsystem via netlink.
Filtering is done on CONNMARK, which is set to the specific VMID for all
packets by the firewall.
Additionally, a custom `com.proxmox.VMStateHelper` interface is
implemented by the object, adding a small `Quit` method for cleanly
shutting down the daemon via the D-Bus API.
For all to work, D-Bus needs a policy describing who is allowed to
access the interface. [2]
Currently, there is a hard-limit of 1 MiB of state enforced by QEMU.
Typical conntrack state entries as dumped by conntrack(8) in the `save`
output format are just plaintext, ASCII lines and mostly around
150-200 characters. That translates then to about ~5200 entries that can
be migrated.
Such a typical line looks like:
-A -t 431974 -u SEEN_REPLY,ASSURED -s 10.1.0.1 -d 10.1.1.20 \
-r 10.1.1.20 -q 10.1.0.1 -p tcp --sport 48550 --dport 22 \
--reply-port-src 22 --reply-port-dst 48550 --state ESTABLISHED
In the future, compression could be implemented for these before sending
them to QEMU, which should increase the above number quite a bit - since
these entries are nicely compressible.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5180
[1] https://www.qemu.org/docs/master/interop/dbus-vmstate.html
[2] https://dbus.freedesktop.org/doc/dbus-daemon.1.html#configuration_file
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* convert dbus-vmstate to instanced systemd service
* fix plural for zero entries in migration log
Changes v2 -> v3:
* formatted using perltidy
Makefile | 4 +-
debian/control | 7 +-
src/Makefile | 1 +
src/dbus-vmstate/Makefile | 11 ++
src/dbus-vmstate/dbus-vmstate | 168 +++++++++++++++++++++
src/dbus-vmstate/org.qemu.VMState1.conf | 11 ++
src/dbus-vmstate/pve-dbus-vmstate@.service | 10 ++
7 files changed, 210 insertions(+), 2 deletions(-)
create mode 100644 src/dbus-vmstate/Makefile
create mode 100755 src/dbus-vmstate/dbus-vmstate
create mode 100644 src/dbus-vmstate/org.qemu.VMState1.conf
create mode 100644 src/dbus-vmstate/pve-dbus-vmstate@.service
diff --git a/Makefile b/Makefile
index aa9b6754..58d38a06 100644
--- a/Makefile
+++ b/Makefile
@@ -1,8 +1,10 @@
include /usr/share/dpkg/default.mk
-PACKAGE=qemu-server
+export PACKAGE=qemu-server
BUILDDIR ?= $(PACKAGE)-$(DEB_VERSION_UPSTREAM)
+export DESTDIR ?=
+
GITVERSION:=$(shell git rev-parse HEAD)
DEB=$(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
diff --git a/debian/control b/debian/control
index 2b188d6a..b9231903 100644
--- a/debian/control
+++ b/debian/control
@@ -3,9 +3,11 @@ Section: admin
Priority: optional
Maintainer: Proxmox Support Team <support@proxmox.com>
Build-Depends: debhelper-compat (= 13),
+ libclass-methodmaker-perl,
libglib2.0-dev,
libio-multiplex-perl,
libjson-c-dev,
+ libnet-dbus-perl,
libpve-apiclient-perl,
libpve-cluster-perl,
libpve-common-perl (>= 9.0.1),
@@ -29,11 +31,14 @@ Homepage: https://www.proxmox.com
Package: qemu-server
Architecture: any
-Depends: dbus,
+Depends: conntrack,
+ dbus,
genisoimage,
+ libclass-methodmaker-perl,
libio-multiplex-perl,
libjson-perl,
libjson-xs-perl,
+ libnet-dbus-perl,
libnet-ssleay-perl,
libpve-access-control (>= 8.0.0~),
libpve-apiclient-perl,
diff --git a/src/Makefile b/src/Makefile
index f8e9a2d9..3f9b8c6f 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -6,6 +6,7 @@ install:
$(MAKE) -C bin install
$(MAKE) -C qmeventd install
$(MAKE) -C query-machine-capabilities install
+ $(MAKE) -C dbus-vmstate install
$(MAKE) -C usr install
.PHONY: test
diff --git a/src/dbus-vmstate/Makefile b/src/dbus-vmstate/Makefile
new file mode 100644
index 00000000..ced00264
--- /dev/null
+++ b/src/dbus-vmstate/Makefile
@@ -0,0 +1,11 @@
+export LIBEXECDIR := $(DESTDIR)/usr/libexec/$(PACKAGE)
+export LIBSYSTEMDDIR := $(DESTDIR)/usr/lib/systemd
+export DBUSDIR := $(DESTDIR)/usr/share/dbus-1
+
+all:
+
+.PHONY: install
+install:
+ install -D -m 0755 dbus-vmstate $(LIBEXECDIR)/dbus-vmstate
+ install -D -m 0644 pve-dbus-vmstate@.service $(LIBSYSTEMDDIR)/system/pve-dbus-vmstate@.service
+ install -D -m 0644 org.qemu.VMState1.conf $(DBUSDIR)/system.d/org.qemu.VMState1.conf
diff --git a/src/dbus-vmstate/dbus-vmstate b/src/dbus-vmstate/dbus-vmstate
new file mode 100755
index 00000000..ac6f8cfb
--- /dev/null
+++ b/src/dbus-vmstate/dbus-vmstate
@@ -0,0 +1,168 @@
+#!/usr/bin/perl
+
+# Exports an DBus object implementing
+# https://www.qemu.org/docs/master/interop/dbus-vmstate.html
+
+package PVE::QemuServer::DBusVMState;
+
+use warnings;
+use strict;
+
+use Carp;
+use Net::DBus;
+use Net::DBus::Exporter qw(org.qemu.VMState1);
+use Net::DBus::Reactor;
+use PVE::QemuServer::Helpers;
+use PVE::QemuServer::QMPHelpers qw(qemu_objectadd qemu_objectdel);
+use PVE::SafeSyslog;
+use PVE::Tools;
+
+use base qw(Net::DBus::Object);
+
+use Class::MethodMaker [ scalar => [ qw(Id NumMigratedEntries) ]];
+dbus_property('Id', 'string', 'read');
+dbus_property('NumMigratedEntries', 'uint32', 'read', 'com.proxmox.VMStateHelper');
+
+sub new {
+ my ($class, $service, $vmid) = @_;
+
+ my $self = $class->SUPER::new($service, '/org/qemu/VMState1');
+ $self->{vmid} = $vmid;
+ $self->Id("pve-vmstate-$vmid");
+ $self->NumMigratedEntries(0);
+
+ bless $self, $class;
+ return $self;
+}
+
+sub Load {
+ my ($self, $bytes) = @_;
+
+ my $len = scalar(@$bytes);
+ return if $len <= 1; # see also the `Save` method
+
+ my $text = pack('c*', @$bytes);
+
+ eval {
+ PVE::Tools::run_command(
+ ['conntrack', '--load-file', '-'],
+ input => $text,
+ );
+ };
+ if (my $err = $@) {
+ syslog('warn', "failed to restore conntrack state: $err\n");
+ } else {
+ syslog('info', "restored $len bytes of conntrack state\n");
+ }
+}
+dbus_method('Load', [['array', 'byte']], []);
+
+use constant {
+ # From the documentation:
+ # https://www.qemu.org/docs/master/interop/dbus-vmstate.html),
+ # > For now, the data amount to be transferred is arbitrarily limited to 1Mb.
+ #
+ # See also qemu/backends/dbus-vmstate.c:DBUS_VMSTATE_SIZE_LIMIT
+ DBUS_VMSTATE_SIZE_LIMIT => 1024 * 1024,
+};
+
+sub Save {
+ my ($self) = @_;
+
+ my $text = '';
+ my $truncated = 0;
+ my $num_entries = 0;
+ eval {
+ PVE::Tools::run_command(
+ ['conntrack', '--dump', '--mark', $self->{vmid}, '--output', 'save'],
+ outfunc => sub {
+ my ($line) = @_;
+ return if $truncated;
+
+ if ((length($text) + length($line)) > DBUS_VMSTATE_SIZE_LIMIT) {
+ syslog('warn', 'conntrack state too large, ignoring further entries');
+ $truncated = 1;
+ return;
+ }
+
+ # conntrack(8) apparently does not preserve the `--mark` option,
+ # add it back ourselves
+ $text .= "$line --mark $self->{vmid}\n";
+ },
+ errfunc => sub {
+ my ($line) = @_;
+
+ if ($line =~ /(\d) flow entries/) {
+ syslog('info', "received $1 conntrack entries");
+ # conntrack reports the number of displayed entries on stderr,
+ # which shouldn't be considered an error.
+ $self->NumMigratedEntries($1);
+ return;
+ }
+ syslog('err', $line);
+ }
+ );
+ };
+ if (my $err = $@) {
+ syslog('warn', "failed to save conntrack state: $err\n");
+
+ # Apparently either Net::DBus does not correctly zero-sized (byte)
+ # arrays correctly - returning [] yields QEMU failing with
+ #
+ # "kvm: dbus_save_state_proxy: Failed to Save: not a byte array"
+ #
+ # Thus, just return an array with a single element and detect that
+ # appropriately in the `Load`. A valid conntrack state can *never* be
+ # just a single byte, so it is safe to rely on that.
+ return [0];
+ }
+
+ my @bytes = unpack('c*', $text);
+ my $len = scalar(@bytes);
+
+ syslog('info', "transferring $len bytes of conntrack state\n");
+
+ # Same as above w.r.t. returning as single-element array.
+ return $len == 0 ? [0] : \@bytes;
+}
+dbus_method('Save', [], [['array', 'byte']]);
+
+# Additional method for cleanly shutting down the service.
+sub Quit {
+ my ($self) = @_;
+
+ syslog('info', "shutting down gracefully ..\n");
+
+ # On the source side, the VM won't exist anymore, so no need to remove
+ # anything.
+ if (PVE::QemuServer::Helpers::vm_running_locally($self->{vmid})) {
+ eval { qemu_objectdel($self->{vmid}, 'pve-vmstate') };
+ if (my $err = $@) {
+ syslog('warn', "failed to remove object: $err\n");
+ }
+ }
+
+ Net::DBus::Reactor->main()->shutdown();
+}
+dbus_method('Quit', [], [], 'com.proxmox.VMStateHelper', { no_return => 1 });
+
+my $vmid = shift;
+
+my $dbus = Net::DBus->system();
+my $service = $dbus->export_service('org.qemu.VMState1');
+my $obj = PVE::QemuServer::DBusVMState->new($service, $vmid);
+
+$SIG{TERM} = sub {
+ $obj->Quit();
+};
+
+my $addr = $dbus->get_unique_name();
+syslog('info', "pve-vmstate-$vmid listening on $addr\n");
+
+# Inform QEMU about our running dbus-vmstate helper
+qemu_objectadd($vmid, 'pve-vmstate', 'dbus-vmstate',
+ addr => 'unix:path=/run/dbus/system_bus_socket',
+ 'id-list' => "pve-vmstate-$vmid",
+);
+
+Net::DBus::Reactor->main()->run();
diff --git a/src/dbus-vmstate/org.qemu.VMState1.conf b/src/dbus-vmstate/org.qemu.VMState1.conf
new file mode 100644
index 00000000..cfedcae4
--- /dev/null
+++ b/src/dbus-vmstate/org.qemu.VMState1.conf
@@ -0,0 +1,11 @@
+<?xml version="1.0"?>
+<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
+ "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
+<busconfig>
+ <policy user="root">
+ <allow own="org.qemu.VMState1" />
+ <allow send_destination="org.qemu.VMState1" />
+ <allow receive_sender="org.qemu.VMState1" />
+ <allow send_destination="com.proxmox.VMStateHelper" />
+ </policy>
+</busconfig>
diff --git a/src/dbus-vmstate/pve-dbus-vmstate@.service b/src/dbus-vmstate/pve-dbus-vmstate@.service
new file mode 100644
index 00000000..56b4e285
--- /dev/null
+++ b/src/dbus-vmstate/pve-dbus-vmstate@.service
@@ -0,0 +1,10 @@
+[Unit]
+Description=PVE DBus VMState Helper (VM %i)
+Requires=dbus.socket
+After=dbus.socket
+PartOf=%i.scope
+
+[Service]
+Slice=qemu.slice
+Type=simple
+ExecStart=/usr/libexec/qemu-server/dbus-vmstate %i
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 08/14] fix #5180: migrate: integrate helper for live-migrating conntrack info
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (6 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 07/14] fix #5180: dbus-vmstate: add daemon for QEMUs dbus-vmstate interface Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 09/14] migrate: flush old VM conntrack entries after successful migration Christoph Heiss
` (5 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Fixes #5180 [0].
This implements for live-migration:
a) the dbus-vmstate is started on the target side, together with the VM
b) the dbus-vmstate helper is started on the source side
c) everything is cleaned up properly, in any case
It is currently off-by-default and must be enabled via the optional
`with-conntrack-state` migration parameter.
The conntrack entry migration is done in such a way that it can
soft-fail, w/o impacting the actual migration, i.e. considering it
"best-effort".
A failed conntrack entry migration does not have any real impact on
functionality, other than it might exhibit the problems as lined out in
the issue [0].
For remote migrations, only a warning is thrown for now. Cross-cluster
migration has stricter requirements and is not a "one-size-fits-it-all".
E.g. the most promentient issue if the network segmentation is
different, which would make the conntrack entries useless or require
careful rewriting.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5180
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* rebased on latest master
* change to systemctl start/stop for starting/stopping the
dbus-vmstate service
Changes v2 -> v3:
* formatted using perltidy
* slightly adjusted warning message for disabled case for better
readability
src/PVE/API2/Qemu.pm | 75 ++++++++++++++++++
src/PVE/CLI/qm.pm | 5 ++
src/PVE/QemuMigrate.pm | 73 +++++++++++++++++
src/PVE/QemuServer.pm | 6 ++
src/PVE/QemuServer/DBusVMState.pm | 125 ++++++++++++++++++++++++++++++
src/PVE/QemuServer/Makefile | 1 +
6 files changed, 285 insertions(+)
create mode 100644 src/PVE/QemuServer/DBusVMState.pm
diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
index 6565ce71..020f1815 100644
--- a/src/PVE/API2/Qemu.pm
+++ b/src/PVE/API2/Qemu.pm
@@ -46,6 +46,7 @@ use PVE::QemuServer::RNG;
use PVE::QemuServer::RunState;
use PVE::QemuServer::USB;
use PVE::QemuServer::Virtiofs qw(max_virtiofs);
+use PVE::QemuServer::DBusVMState;
use PVE::QemuMigrate;
use PVE::QemuMigrate::Helpers;
use PVE::RPCEnvironment;
@@ -3371,6 +3372,12 @@ __PACKAGE__->register_method({
default => 'max(30, vm memory in GiB)',
optional => 1,
},
+ 'with-conntrack-state' => {
+ type => 'boolean',
+ optional => 1,
+ default => 0,
+ description => 'Whether to migrate conntrack entries for running VMs.',
+ },
},
},
returns => {
@@ -3401,6 +3408,7 @@ __PACKAGE__->register_method({
my $migration_network = $get_root_param->('migration_network');
my $targetstorage = $get_root_param->('targetstorage');
my $force_cpu = $get_root_param->('force-cpu');
+ my $with_conntrack_state = $get_root_param->('with-conntrack-state');
my $storagemap;
@@ -3479,6 +3487,7 @@ __PACKAGE__->register_method({
nbd_proto_version => $nbd_protocol_version,
replicated_volumes => $replicated_volumes,
offline_volumes => $offline_volumes,
+ with_conntrack_state => $with_conntrack_state,
};
my $params = {
@@ -5143,6 +5152,11 @@ __PACKAGE__->register_method({
description =>
"Object of mapped resources with additional information such if they're live migratable.",
},
+ 'has-dbus-vmstate' => {
+ type => 'boolean',
+ description => 'Whether the VM host supports migrating additional VM state, '
+ . 'such as conntrack entries.',
+ },
},
},
code => sub {
@@ -5211,6 +5225,7 @@ __PACKAGE__->register_method({
$res->{local_resources} = $local_resources;
$res->{'mapped-resources'} = [sort keys $mapped_resources->%*];
$res->{'mapped-resource-info'} = $mapped_resources;
+ $res->{'has-dbus-vmstate'} = 1;
return $res;
@@ -5283,6 +5298,12 @@ __PACKAGE__->register_method({
minimum => '0',
default => 'migrate limit from datacenter or storage config',
},
+ 'with-conntrack-state' => {
+ type => 'boolean',
+ optional => 1,
+ default => 0,
+ description => 'Whether to migrate conntrack entries for running VMs.',
+ },
},
},
returns => {
@@ -5339,6 +5360,7 @@ __PACKAGE__->register_method({
} else {
warn "VM isn't running. Doing offline migration instead.\n" if $param->{online};
$param->{online} = 0;
+ $param->{'with-conntrack-state'} = 0;
}
my $storecfg = PVE::Storage::config();
@@ -6682,6 +6704,7 @@ __PACKAGE__->register_method({
warn $@ if $@;
}
+ PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($state->{vmid});
PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
}
@@ -6865,4 +6888,56 @@ __PACKAGE__->register_method({
},
});
+__PACKAGE__->register_method({
+ name => 'dbus_vmstate',
+ path => '{vmid}/dbus-vmstate',
+ method => 'POST',
+ proxyto => 'node',
+ description => 'Stop the dbus-vmstate helper for the given VM if running.',
+ permissions => {
+ check => ['perm', '/vms/{vmid}', ['VM.Migrate']],
+ },
+ parameters => {
+ additionalProperties => 0,
+ properties => {
+ node => get_standard_option('pve-node'),
+ vmid =>
+ get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
+ action => {
+ type => 'string',
+ enum => [qw(start stop)],
+ description => 'Action to perform on the DBus VMState helper.',
+ optional => 0,
+ },
+ },
+ },
+ returns => {
+ type => 'null',
+ },
+ code => sub {
+ my ($param) = @_;
+ my ($node, $vmid, $action) = $param->@{qw(node vmid action)};
+
+ my $nodename = PVE::INotify::nodename();
+ if ($node ne 'localhost' && $node ne $nodename) {
+ raise_param_exc(
+ { node => "node needs to be 'localhost' or local hostname '$nodename'" });
+ }
+
+ if (!PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+ raise_param_exc({ node => "VM $vmid not running locally on node '$nodename'" });
+ }
+
+ if ($action eq 'start') {
+ syslog('info', "starting dbus-vmstate helper for VM $vmid\n");
+ PVE::QemuServer::DBusVMState::qemu_add_dbus_vmstate($vmid);
+ } elsif ($action eq 'stop') {
+ syslog('info', "stopping dbus-vmstate helper for VM $vmid\n");
+ PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($vmid);
+ } else {
+ die "unknown action $action\n";
+ }
+ },
+});
+
1;
diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index f3e9a702..3e778ad1 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -37,6 +37,7 @@ use PVE::QemuServer::ImportDisk;
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::QMPHelpers;
use PVE::QemuServer::RunState;
+use PVE::QemuServer::DBusVMState;
use PVE::QemuServer;
use PVE::CLIHandler;
@@ -1054,6 +1055,10 @@ __PACKAGE__->register_method({
# vm was shutdown from inside the guest or crashed, doing api cleanup
PVE::QemuServer::vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0, 1);
}
+
+ # ensure that no dbus-vmstate helper is left running in any case
+ PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($vmid);
+
PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'post-stop');
$restart = eval { PVE::QemuServer::clear_reboot_request($vmid) };
diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
index f46bdf40..a6922198 100644
--- a/src/PVE/QemuMigrate.pm
+++ b/src/PVE/QemuMigrate.pm
@@ -35,6 +35,7 @@ use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::Memory qw(get_current_memory);
use PVE::QemuServer::Network;
use PVE::QemuServer::QMPHelpers;
+use PVE::QemuServer::DBusVMState;
use PVE::QemuServer;
use PVE::AbstractMigrate;
@@ -238,6 +239,27 @@ sub prepare {
# Do not treat a suspended VM as paused, as it might wake up
# during migration and remain paused after migration finishes.
$self->{vm_was_paused} = 1 if PVE::QemuServer::vm_is_paused($vmid, 0);
+
+ if ($self->{opts}->{'with-conntrack-state'}) {
+ if ($self->{opts}->{remote}) {
+ # shouldn't be reached in normal circumstances anyway, as we prevent it on
+ # an API level
+ $self->log(
+ 'warn',
+ 'conntrack state migration not supported for remote migrations, '
+ . 'active connections might get dropped',
+ );
+ $self->{opts}->{'with-conntrack-state'} = 0;
+ } else {
+ PVE::QemuServer::DBusVMState::qemu_add_dbus_vmstate($vmid);
+ }
+ } else {
+ $self->log(
+ 'warn',
+ 'conntrack state migration not supported or disabled, '
+ . 'active connections might get dropped',
+ );
+ }
}
my ($loc_res, $mapped_res, $missing_mappings_by_node) =
@@ -920,6 +942,14 @@ sub phase1_cleanup {
if (my $err = $@) {
$self->log('err', $err);
}
+
+ if ($self->{running} && $self->{opts}->{'with-conntrack-state'}) {
+ # if the VM is running, that means we also tried to migrate additional
+ # state via our dbus-vmstate helper
+ # only need to locally stop it, on the target the VM cleanup will
+ # handle it
+ PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($vmid);
+ }
}
sub phase2_start_local_cluster {
@@ -966,6 +996,10 @@ sub phase2_start_local_cluster {
push @$cmd, '--targetstorage', ($self->{opts}->{targetstorage} // '1');
}
+ if ($self->{opts}->{'with-conntrack-state'}) {
+ push @$cmd, '--with-conntrack-state';
+ }
+
my $spice_port;
my $input = "nbd_protocol_version: $migrate->{nbd_proto_version}\n";
@@ -1560,6 +1594,14 @@ sub phase2_cleanup {
$self->log('err', $err);
}
+ if ($self->{running} && $self->{opts}->{'with-conntrack-state'}) {
+ # if the VM is running, that means we also tried to migrate additional
+ # state via our dbus-vmstate helper
+ # only need to locally stop it, on the target the VM cleanup will
+ # handle it
+ PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($vmid);
+ }
+
if ($self->{tunnel}) {
eval { PVE::Tunnel::finish_tunnel($self->{tunnel}); };
if (my $err = $@) {
@@ -1683,6 +1725,37 @@ sub phase3_cleanup {
$self->log('info', "skipping guest fstrim, because VM is paused");
}
}
+
+ if ($self->{running} && $self->{opts}->{'with-conntrack-state'}) {
+ # if the VM is running, that means we also migrated additional
+ # state via our dbus-vmstate helper
+ $self->log('info', 'stopping migration dbus-vmstate helpers');
+
+ # first locally
+ my $num = PVE::QemuServer::DBusVMState::qemu_del_dbus_vmstate($vmid);
+ if (defined($num)) {
+ my $plural = $num != 1 ? "entries" : "entry";
+ $self->log('info', "migrated $num conntrack state $plural");
+ }
+
+ # .. and then remote
+ my $targetnode = $self->{node};
+ eval {
+ # FIXME: introduce proper way to call API methods on another node?
+ # See also e.g. pve-network/src/PVE/API2/Network/SDN.pm, which
+ # does something similar.
+ PVE::Tools::run_command([
+ 'pvesh',
+ 'create',
+ "/nodes/$targetnode/qemu/$vmid/dbus-vmstate",
+ '--action',
+ 'stop',
+ ]);
+ };
+ if (my $err = $@) {
+ $self->log('warn', "failed to stop dbus-vmstate on $targetnode: $err\n");
+ }
+ }
}
# close tunnel on successful migration, on error phase2_cleanup closed it
diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index e7c98520..9f1219a9 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -86,6 +86,7 @@ use PVE::QemuServer::RunState;
use PVE::QemuServer::StateFile;
use PVE::QemuServer::USB;
use PVE::QemuServer::Virtiofs qw(max_virtiofs start_all_virtiofsd);
+use PVE::QemuServer::DBusVMState;
my $have_ha_config;
eval {
@@ -5413,6 +5414,7 @@ sub vm_start {
# replicated_volumes => which volids should be re-used with bitmaps for nbd migration
# offline_volumes => new volids of offline migrated disks like tpmstate and cloudinit, not yet
# contained in config
+# with_conntrack_state => whether to start the dbus-vmstate helper for conntrack state migration
sub vm_start_nolock {
my ($storecfg, $vmid, $conf, $params, $migrate_opts) = @_;
@@ -5783,6 +5785,10 @@ sub vm_start_nolock {
}
}
+ # conntrack migration is only supported for intra-cluster migrations
+ if ($migrate_opts->{with_conntrack_state} && !$migrate_opts->{remote_node}) {
+ PVE::QemuServer::DBusVMState::qemu_add_dbus_vmstate($vmid);
+ }
} else {
mon_cmd($vmid, "balloon", value => $conf->{balloon} * 1024 * 1024)
if !$statefile && $conf->{balloon};
diff --git a/src/PVE/QemuServer/DBusVMState.pm b/src/PVE/QemuServer/DBusVMState.pm
new file mode 100644
index 00000000..36ff8168
--- /dev/null
+++ b/src/PVE/QemuServer/DBusVMState.pm
@@ -0,0 +1,125 @@
+package PVE::QemuServer::DBusVMState;
+
+use strict;
+use warnings;
+
+use PVE::SafeSyslog;
+use PVE::Systemd;
+use PVE::Tools;
+
+use constant {
+ DBUS_VMSTATE_EXE => '/usr/libexec/qemu-server/dbus-vmstate',
+};
+
+# Retrieves a property from an object from a specific interface name.
+# In contrast to accessing the property directly by using $obj->Property, this
+# actually respects the owner of the object and thus can be used for interfaces
+# with might have multiple (queued) owners on the DBus.
+my sub dbus_get_property {
+ my ($obj, $interface, $name) = @_;
+
+ my $con = $obj->{service}->get_bus()->get_connection();
+
+ my $call = $con->make_method_call_message(
+ $obj->{service}->get_service_name(),
+ $obj->{object_path},
+ 'org.freedesktop.DBus.Properties',
+ 'Get',
+ );
+
+ $call->set_destination($obj->get_service()->get_owner_name());
+ $call->append_args_list($interface, $name);
+
+ my @reply = $con->send_with_reply_and_block($call, 10 * 1000)->get_args_list();
+ return $reply[0];
+}
+
+# Starts the dbus-vmstate helper D-Bus service daemon and adds the needed
+# object to the appropriate QEMU instance for the specified VM.
+sub qemu_add_dbus_vmstate {
+ my ($vmid) = @_;
+
+ if (!PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+ die "VM $vmid must be running locally\n";
+ }
+
+ # In case some leftover, previous instance is running, stop it. Otherwise
+ # we run into errors, as a systemd service instance is unique.
+ if (defined(qemu_del_dbus_vmstate($vmid, quiet => 1))) {
+ warn "stopped previously running dbus-vmstate helper for VM $vmid\n";
+ }
+
+ # Start the actual service, which will then register itself with QEMU.
+ eval { PVE::Tools::run_command(['systemctl', 'start', "pve-dbus-vmstate\@$vmid"]) };
+ if (my $err = $@) {
+ die "failed to start DBus VMState service for VM $vmid: $err\n";
+ }
+}
+
+# Stops the dbus-vmstate helper D-Bus service daemon and removes the associated
+# object from QEMU for the specified VM.
+#
+# Returns the number of migrated conntrack entries, or undef in case of error.
+sub qemu_del_dbus_vmstate {
+ my ($vmid, %params) = @_;
+
+ my $num_entries = undef;
+ my $dbus = Net::DBus->system();
+ my $dbus_obj = $dbus->get_bus_object();
+
+ my $owners = eval { $dbus_obj->ListQueuedOwners('org.qemu.VMState1') };
+ if (my $err = $@) {
+ syslog('warn', "failed to retrieve org.qemu.VMState1 owners: $err\n")
+ if !$params{quiet};
+ return undef;
+ }
+
+ # Iterate through all name owners for 'org.qemu.VMState1' and compare
+ # the ID. If we found the corresponding one for $vmid, retrieve the
+ # `NumMigratedEntries` property and call the `Quit()` method on it.
+ # Any D-Bus interaction might die/croak, so try to be careful here and
+ # swallow any hard errors.
+ foreach my $owner (@$owners) {
+ my $service = eval { Net::DBus::RemoteService->new($dbus, $owner, 'org.qemu.VMState1') };
+ if (my $err = $@) {
+ syslog('warn', "failed to get org.qemu.VMState1 service from D-Bus $owner: $err\n")
+ if !$params{quiet};
+ next;
+ }
+
+ my $object = eval { $service->get_object('/org/qemu/VMState1') };
+ if (my $err = $@) {
+ syslog('warn', "failed to get /org/qemu/VMState1 object from D-Bus $owner: $err\n")
+ if !$params{quiet};
+ next;
+ }
+
+ my $id = eval { dbus_get_property($object, 'org.qemu.VMState1', 'Id') };
+ if (defined($id) && $id eq "pve-vmstate-$vmid") {
+ my $helperobj =
+ eval { $service->get_object('/org/qemu/VMState1', 'com.proxmox.VMStateHelper') };
+ if (my $err = $@) {
+ syslog(
+ 'warn',
+ "found dbus-vmstate helper, but does not implement com.proxmox.VMStateHelper? ($err)\n",
+ ) if !$params{quiet};
+ last;
+ }
+
+ $num_entries = eval {
+ dbus_get_property($object, 'com.proxmox.VMStateHelper', 'NumMigratedEntries');
+ };
+ eval { $object->Quit() };
+ if (my $err = $@) {
+ syslog('warn', "failed to call quit on dbus-vmstate for VM $vmid: $err\n")
+ if !$params{quiet};
+ }
+
+ last;
+ }
+ }
+
+ return $num_entries;
+}
+
+1;
diff --git a/src/PVE/QemuServer/Makefile b/src/PVE/QemuServer/Makefile
index ca30a0ad..23c136bc 100644
--- a/src/PVE/QemuServer/Makefile
+++ b/src/PVE/QemuServer/Makefile
@@ -8,6 +8,7 @@ SOURCES=Agent.pm \
CGroup.pm \
Cloudinit.pm \
CPUConfig.pm \
+ DBusVMState.pm \
Drive.pm \
Helpers.pm \
ImportDisk.pm \
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 09/14] migrate: flush old VM conntrack entries after successful migration
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (7 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 08/14] fix #5180: migrate: integrate helper for live-migrating conntrack info Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 10/14] api2: capabilities: explicitly import CPU capabilities module Christoph Heiss
` (4 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
After a successful live-migration, the old VM-specific conntrack entries
are not needed anymore on the source node and can thus be flushed.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* formatted using perltidy
* adjusted info message by adding `[on source] node` at end for
better context
src/PVE/QemuMigrate.pm | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
index a6922198..22e7bb80 100644
--- a/src/PVE/QemuMigrate.pm
+++ b/src/PVE/QemuMigrate.pm
@@ -11,6 +11,7 @@ use Time::HiRes qw( usleep );
use PVE::AccessControl;
use PVE::Cluster;
use PVE::Format qw(render_bytes);
+use PVE::Firewall::Helpers;
use PVE::GuestHelpers qw(safe_boolean_ne safe_string_ne);
use PVE::INotify;
use PVE::JSONSchema;
@@ -1755,6 +1756,10 @@ sub phase3_cleanup {
if (my $err = $@) {
$self->log('warn', "failed to stop dbus-vmstate on $targetnode: $err\n");
}
+
+ # also flush now-old local conntrack entries for the migrated VM
+ $self->log('info', 'flushing conntrack state for guest on source node');
+ PVE::Firewall::Helpers::flush_fw_ct_entries_by_mark($vmid);
}
}
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH manager v3 10/14] api2: capabilities: explicitly import CPU capabilities module
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (8 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 09/14] migrate: flush old VM conntrack entries after successful migration Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 11/14] api2: capabilities: proxy index endpoints to respective nodes Christoph Heiss
` (3 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
This currently works only by pure chance, as it seems to be already
imported somewhere else.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* no changes
PVE/API2/Capabilities.pm | 1 +
1 file changed, 1 insertion(+)
diff --git a/PVE/API2/Capabilities.pm b/PVE/API2/Capabilities.pm
index a58e7b1ac..06b4e638c 100644
--- a/PVE/API2/Capabilities.pm
+++ b/PVE/API2/Capabilities.pm
@@ -6,6 +6,7 @@ use warnings;
use PVE::JSONSchema qw(get_standard_option);
use PVE::RESTHandler;
+use PVE::API2::Qemu::CPU;
use PVE::API2::Qemu::Machine;
use base qw(PVE::RESTHandler);
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH manager v3 11/14] api2: capabilities: proxy index endpoints to respective nodes
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (9 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 10/14] api2: capabilities: explicitly import CPU capabilities module Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 12/14] api2: capabilities: expose new qemu/migration endpoint Christoph Heiss
` (2 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Nodes might have different capabilities, depending on their version.
This ensures that always the requested is actually queryied.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* no changes
PVE/API2/Capabilities.pm | 2 ++
1 file changed, 2 insertions(+)
diff --git a/PVE/API2/Capabilities.pm b/PVE/API2/Capabilities.pm
index 06b4e638c..62c3016df 100644
--- a/PVE/API2/Capabilities.pm
+++ b/PVE/API2/Capabilities.pm
@@ -26,6 +26,7 @@ __PACKAGE__->register_method({
path => '',
method => 'GET',
permissions => { user => 'all' },
+ proxyto => 'node',
description => "Node capabilities index.",
parameters => {
additionalProperties => 0,
@@ -57,6 +58,7 @@ __PACKAGE__->register_method({
path => 'qemu',
method => 'GET',
permissions => { user => 'all' },
+ proxyto => 'node',
description => "QEMU capabilities index.",
parameters => {
additionalProperties => 0,
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH manager v3 12/14] api2: capabilities: expose new qemu/migration endpoint
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (10 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 11/14] api2: capabilities: proxy index endpoints to respective nodes Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 13/14] ui: window: Migrate: add checkbox for migrating VM conntrack state Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH docs v3 14/14] qm: document conntrack state migration for live migrations Christoph Heiss
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
This endpoint provides information about migration capabilities of the
node. Currently, only support for dbus-vmstate is indicated.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* no changes
Changes v2 -> v3:
* updated to new module path
PVE/API2/Capabilities.pm | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/PVE/API2/Capabilities.pm b/PVE/API2/Capabilities.pm
index 62c3016df..d45c548e1 100644
--- a/PVE/API2/Capabilities.pm
+++ b/PVE/API2/Capabilities.pm
@@ -8,6 +8,7 @@ use PVE::RESTHandler;
use PVE::API2::Qemu::CPU;
use PVE::API2::Qemu::Machine;
+use PVE::API2::NodeCapabilities::Qemu::Migration;
use base qw(PVE::RESTHandler);
@@ -21,6 +22,11 @@ __PACKAGE__->register_method({
path => 'qemu/machines',
});
+__PACKAGE__->register_method({
+ subclass => 'PVE::API2::NodeCapabilities::Qemu::Migration',
+ path => 'qemu/migration',
+});
+
__PACKAGE__->register_method({
name => 'index',
path => '',
@@ -78,7 +84,7 @@ __PACKAGE__->register_method({
my ($param) = @_;
my $result = [
- { name => 'cpu' }, { name => 'machines' },
+ { name => 'cpu' }, { name => 'machines' }, { name => 'migration' },
];
return $result;
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH manager v3 13/14] ui: window: Migrate: add checkbox for migrating VM conntrack state
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (11 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 12/14] api2: capabilities: expose new qemu/migration endpoint Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH docs v3 14/14] qm: document conntrack state migration for live migrations Christoph Heiss
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Adds a new checkbox to the migration dialog, if it is a
live/online-migration and both the source and target nodes have support
for our dbus-vmstate helper.
If the checkbox is active, it passes along the `with-conntrack-state`
parameter to the migrate API call.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v1 -> v2:
* rebased on latest master
Changes v2 -> v3:
* reformatted using biome
* log warning in case /capabilities/qemu/migration is not (yet)
supported by the target node
www/manager6/window/Migrate.js | 82 ++++++++++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 3 deletions(-)
diff --git a/www/manager6/window/Migrate.js b/www/manager6/window/Migrate.js
index 4103ad135..8b52f9e90 100644
--- a/www/manager6/window/Migrate.js
+++ b/www/manager6/window/Migrate.js
@@ -29,9 +29,10 @@ Ext.define('PVE.window.Migrate', {
allowedNodes: undefined,
overwriteLocalResourceCheck: false,
hasLocalResources: false,
+ withConntrackState: true,
+ bothHaveDbusVmstate: false,
},
},
-
formulas: {
setMigrationMode: function (get) {
if (get('running')) {
@@ -62,6 +63,10 @@ Ext.define('PVE.window.Migrate', {
return false;
}
},
+ conntrackStateCheckboxHidden: (get) =>
+ !get('running') ||
+ get('vmtype') !== 'qemu' ||
+ !get('migration.bothHaveDbusVmstate'),
},
},
@@ -141,6 +146,10 @@ Ext.define('PVE.window.Migrate', {
params.force = 1;
}
+ if (vm.get('migration.bothHaveDbusVmstate') && vm.get('migration.withConntrackState')) {
+ params['with-conntrack-state'] = 1;
+ }
+
Proxmox.Utils.API2Request({
params: params,
url:
@@ -227,12 +236,29 @@ Ext.define('PVE.window.Migrate', {
method: 'GET',
});
migrateStats = result.data;
- me.fetchingNodeMigrateInfo = false;
} catch (error) {
Ext.Msg.alert(Proxmox.Utils.errorText, error.htmlStatus);
+ me.fetchingNodeMigrateInfo = false;
return;
}
+ const target = me.lookup('pveNodeSelector').value;
+ let targetCapabilities = {};
+
+ try {
+ const { result } = await Proxmox.Async.api2({
+ url: `/nodes/${target}/capabilities/qemu/migration`,
+ method: 'GET',
+ });
+ targetCapabilities = result.data;
+ } catch (err) {
+ // Only emit a warning in the case the target node does not (yet) support the
+ // `capabilites/qemu/migration` endpoint and simply treat all features as unsupported.
+ console.warn(`failed to query /capabilites/qemu/migration on '${target}': ${err}`);
+ }
+
+ me.fetchingNodeMigrateInfo = false;
+
if (migrateStats.running) {
vm.set('running', true);
}
@@ -242,7 +268,6 @@ Ext.define('PVE.window.Migrate', {
migration.possible = true;
}
migration.preconditions = [];
- let target = me.lookup('pveNodeSelector').value;
let disallowed = migrateStats.not_allowed_nodes?.[target] ?? {};
if (migrateStats.allowed_nodes && !vm.get('running')) {
@@ -361,6 +386,36 @@ Ext.define('PVE.window.Migrate', {
});
}
+ migration.bothHaveDbusVmstate =
+ migrateStats['has-dbus-vmstate'] && targetCapabilities['has-dbus-vmstate'];
+ if (vm.get('running')) {
+ if (migration.withConntrackState && !migrateStats['has-dbus-vmstate']) {
+ migration.preconditions.push({
+ text: gettext(
+ 'Cannot migrate conntrack state, source node is lacking support. Active network connections might get dropped.',
+ ),
+ severity: 'warning',
+ });
+ }
+ if (migration.withConntrackState && !targetCapabilities['has-dbus-vmstate']) {
+ migration.preconditions.push({
+ text: gettext(
+ 'Cannot migrate conntrack state, target node is lacking support. Active network connections might get dropped.',
+ ),
+ severity: 'warning',
+ });
+ }
+
+ if (migration.bothHaveDbusVmstate && !migration.withConntrackState) {
+ migration.preconditions.push({
+ text: gettext(
+ 'Conntrack state migration disabled. Active network connections might get dropped.',
+ ),
+ severity: 'warning',
+ });
+ }
+ }
+
vm.set('migration', migration);
},
checkLxcPreconditions: function (resetMigrationPossible) {
@@ -456,6 +511,27 @@ Ext.define('PVE.window.Migrate', {
},
},
},
+ {
+ xtype: 'proxmoxcheckbox',
+ name: 'withConntrackState',
+ fieldLabel: gettext('Conntrack state'),
+ autoEl: {
+ tag: 'div',
+ 'data-qtip': gettext(
+ 'Enables live migration of conntrack entries for this VM.',
+ ),
+ },
+ bind: {
+ hidden: '{conntrackStateCheckboxHidden}',
+ value: '{migration.withConntrackState}',
+ },
+ listeners: {
+ change: {
+ fn: 'checkMigratePreconditions',
+ extraArg: true,
+ },
+ },
+ },
],
},
],
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH docs v3 14/14] qm: document conntrack state migration for live migrations
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
` (12 preceding siblings ...)
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 13/14] ui: window: Migrate: add checkbox for migrating VM conntrack state Christoph Heiss
@ 2025-07-03 11:54 ` Christoph Heiss
13 siblings, 0 replies; 15+ messages in thread
From: Christoph Heiss @ 2025-07-03 11:54 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes v2 -> v3:
* new patch
qm.adoc | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/qm.adoc b/qm.adoc
index 8b9e096..823e818 100644
--- a/qm.adoc
+++ b/qm.adoc
@@ -1664,6 +1664,22 @@ For Live Migration to work, there are some things required:
configured, but it cannot be guaranteed - so please test before deploying
such a setup in production.
+Conntrack State Migration
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+NOTE: Conntrack state migration is considered best-effort only and might not
+work, as it heavily depends on the network setup.
+
+Conntrack is a Linux kernel mechanism to enable a stateful firewall by tracking
+individual connection. When live migrating running VMs, active in- and/or
+outbound connections *might* get interrupted as soon as the VM starts running on
+the target host, as the new host node does not have the same conntrack entries
+and thus the firewall can drop packets.
+
+Conntrack state migration copies all conntrack entries on the host for the
+live-migrated VM to the target node and afterwards flushes the migrated entries
+from the conntrack table on the source node.
+
Offline Migration
~~~~~~~~~~~~~~~~~
--
2.49.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-07-03 11:57 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-03 11:54 [pve-devel] [PATCH ve-rs/firewall/qemu-server/manager/docs v3 00/14] fix #5180: migrate conntrack state on live migration Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-ve-rs v3 01/14] config: guest: allow access to raw Vmid value Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH proxmox-firewall v3 02/14] firewall: add connmark rule with VMID to all guest chains Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 03/14] " Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH firewall v3 04/14] firewall: helpers: add sub for flushing conntrack entries by mark Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 05/14] qmp helpers: allow passing structured args via qemu_objectadd() Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 06/14] api2: qemu: add module exposing node migration capabilities Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 07/14] fix #5180: dbus-vmstate: add daemon for QEMUs dbus-vmstate interface Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 08/14] fix #5180: migrate: integrate helper for live-migrating conntrack info Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH qemu-server v3 09/14] migrate: flush old VM conntrack entries after successful migration Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 10/14] api2: capabilities: explicitly import CPU capabilities module Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 11/14] api2: capabilities: proxy index endpoints to respective nodes Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 12/14] api2: capabilities: expose new qemu/migration endpoint Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH manager v3 13/14] ui: window: Migrate: add checkbox for migrating VM conntrack state Christoph Heiss
2025-07-03 11:54 ` [pve-devel] [PATCH docs v3 14/14] qm: document conntrack state migration for live migrations Christoph Heiss
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox