From: Fiona Ebner <f.ebner@proxmox.com>
To: Daniel Kral <d.kral@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH qemu-server v2 06/14] migration: intra-cluster: check config can be parsed on target node
Date: Mon, 2 Mar 2026 15:52:37 +0100
Message-ID: <f8452cf1-6178-4c2c-82f2-5c03987f611a@proxmox.com>
In-Reply-To: <DGPOB3UZ8LZC.12LVZ8JX89AQQ@proxmox.com>

On 27.02.26 at 11:30 AM, Daniel Kral wrote:
> On Wed Feb 25, 2026 at 4:18 PM CET, Fiona Ebner wrote:
>> diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
>> index 1f0864f5..47466513 100644
>> --- a/src/PVE/API2/Qemu.pm
>> +++ b/src/PVE/API2/Qemu.pm
>> @@ -5399,7 +5399,9 @@ __PACKAGE__->register_method({
>> force => {
>> type => 'boolean',
>> description =>
>> - "Allow to migrate VMs which use local devices. Only root may use this option.",
>> + "Allow to migrate VMs which use local devices and for intra-cluster migration,"
>> + . " configuration options not understood by the target. Only root may use this"
>> + . " option.",
>
> HA-managed VMs are always migrated with force set as it was assumed to
> be only used for local devices at the time [0]. This might need some
> adaptation so that LRM-initiated migrations won't cause problems for those
> VMs that this patch series wants to fix.
>
> [0] https://git.proxmox.com/?p=pve-ha-manager.git;a=blob;f=src/PVE/HA/Resources/PVEVM.pm;h=7586da84b7f19686b680d4e1434a17ffe1633d6d;hb=1a8d8bcef1934a43d37344caf965c082e55d451c#l116

Hmm, yes, it might be good to add another parameter like
'skip-config-check' rather than re-using 'force'. Then we need a way to
pass that along to HA migrations, but I see you have posted [0] recently :)
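
To illustrate the split, here is a minimal sketch, not part of the patch. The helper migration_checks() and its return keys are hypothetical, and 'skip-config-check' is only the name floated above, not an existing option:

```perl
use strict;
use warnings;

# Hypothetical helper: decide which pre-migration checks apply.
# 'skip-config-check' is an assumed parameter name, not an existing option.
sub migration_checks {
    my ($opts) = @_;

    return {
        # 'force' keeps its original meaning: allow local devices
        'allow-local-devices' => $opts->{force} ? 1 : 0,
        # the config check is only skipped when explicitly requested
        'check-config' => $opts->{'skip-config-check'} ? 0 : 1,
    };
}

# An HA-initiated migration always sets 'force', but would still get the config check.
my $ha = migration_checks({ force => 1 });
print "HA: check-config=$ha->{'check-config'}\n";
```

With separate parameters, the LRM could keep passing 'force' without silently disabling the strict config check.
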
> As we might want to know which guests can be moved to which nodes in the
> future quickly, e.g. for the load balancer to know which target nodes to
> consider, I briefly considered whether it could also make sense to have
> some config versioning, which is negotiated between the source and
> target node (e.g. qemu-server on the source node is lower than the
> target node, so the VM can be migrated), but that might be too strict,
> especially for guests that don't even use the new config properties of
> the more recent qemu-server version.
>
> But maybe these load-balancing decisions can also be more coarse-grained
> than this more fine-grained check for config compatibility and
> implemented at a later time when it actually is needed.
>
> What do you think?

If we need the information for all nodes, there should be a cheap way to
get it, ideally not even an API call per node. One idea is to broadcast
the config schema and check whether parsing with the schema from the other
node works, but the schema is rather large, even when setting 'description'
and 'verbose_description' to empty strings:

[I] febner@dev9 ~/repos/pve/qemu-server (master)> ls -lh schema*.json
-rw-rw-r-- 1 febner febner 611K Mar 2 15:20 schema.json
-rw-rw-r-- 1 febner febner 365K Mar 2 15:25 schema-no-desc.json

And it has the limitation that changes in the parsing logic itself would
not be detected. For example, introducing special sections support was a
change in the parsing logic.
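
As a rough sketch of the schema idea, assuming a heavily simplified "schema" that is just a list of patterns for known option names (the real schema also describes value formats, and unknown_keys() is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical, simplified schema: patterns for option names the target knows.
sub unknown_keys {
    my ($conf, $known_patterns) = @_;

    my @unknown;
    for my $key (sort keys %$conf) {
        # keep the key if no pattern of the target schema matches it
        push @unknown, $key if !grep { $key =~ $_ } @$known_patterns;
    }
    return \@unknown;
}

my $target_schema = [qr/^memory$/, qr/^cores$/, qr/^net\d+$/];
my $conf = { memory => 4096, net0 => 'virtio,bridge=vmbr0', 'new-option' => 1 };

my $unknown = unknown_keys($conf, $target_schema);
print "not understood by target: @$unknown\n" if @$unknown;
```

Such a check only sees option names and formats, so it shares the limitation mentioned above: a change in the parsing logic itself would go unnoticed.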

Coming back to the general (not-only-HA) situation, not having it in the
preconditions means that users cannot yet select the force (or
skip-config-check) checkbox from the UI, which is also rather bad.
There, we could consider doing it with one API call per node I suppose
(do it for the target upon selection change).

Or maybe we want to flip it around? Proactively do parsing of configs
from other nodes and broadcast the information which configs could and
couldn't be parsed to other nodes? Only needs to be updated when configs
change and might be relatively cheap.

Does anybody have opinions about that last idea?
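
A minimal sketch of how the flipped-around variant could avoid repeated work, assuming each node keeps a per-guest cache keyed by a digest of the raw config. The cache layout, update_parse_status() and the $parse callback are hypothetical stand-ins, and the broadcast itself is left out:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical cache layout: vmid => { digest => ..., ok => 0|1 }.
# $parse stands in for the node's strict config parser and must die on failure.
sub update_parse_status {
    my ($cache, $vmid, $raw_conf, $parse) = @_;

    my $digest = sha1_hex($raw_conf);
    my $entry = $cache->{$vmid};
    return 0 if $entry && $entry->{digest} eq $digest; # config unchanged, nothing to do

    my $ok = eval { $parse->($raw_conf); 1 } ? 1 : 0;
    $cache->{$vmid} = { digest => $digest, ok => $ok };
    return 1; # entry was (re)computed, the caller would broadcast it
}

my $cache = {};
my $strict = sub { die "unknown option\n" if $_[0] =~ m/^new-option:/m; };
update_parse_status($cache, 100, "memory: 4096\n", $strict);
update_parse_status($cache, 101, "new-option: 1\n", $strict);
print "$_: ", ($cache->{$_}->{ok} ? "ok" : "cannot parse"), "\n" for sort keys %$cache;
```

Only guests whose config digest changed are parsed again, which is what would keep the approach relatively cheap.
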
>> optional => 1,
>> },
>> migration_type => {
>> diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
>> index f7ec3227..901fe96d 100644
>> --- a/src/PVE/QemuMigrate.pm
>> +++ b/src/PVE/QemuMigrate.pm
>> @@ -355,6 +355,33 @@ sub prepare {
>> my $cmd = [@{ $self->{rem_ssh} }, '/bin/true'];
>> eval { $self->cmd_quiet($cmd); };
>> die "Can't connect to destination address using public key\n" if $@;
>> +
>> + if (!$self->{opts}->{force}) {
>> + # Fork a short-lived tunnel for checking the config. Later, the proper tunnel with SSH
>> + # forwarding info is forked.
>> + my $tunnel = $self->fork_tunnel();
>> + # Compared to remote migration, which also does volume activation, this only strictly
>> + # parses the config, so no large timeout is needed. Unfortunately, mtunnel did not
>> + # indicate that a command is unknown, but not reply at all, so the timeout must be very
>> + # low right now.
>> + # FIXME PVE 10 - bump timeout, the trade-off between delaying backwards migration and
>> + # giving config check more time should now be in favor of config checking
>> + eval {
>> + my $nodename = PVE::INotify::nodename();
>> + PVE::Tunnel::write_tunnel($tunnel, 3, "config $vmid $nodename");
>> + };
>> + if (my $err = $@) {
>> + chomp($err);
>> + # if there is no reply, assume target did not know the command yet
>> + if ($err =~ m/^no reply to command/) {
>> + $self->log('info', "skipping strict configuration check (target too old?)");
>> + } else {
>> + die "$err - use --force to migrate regardless\n";
>
> Though unlikely (I couldn't hit `systemctl stop sshd` on time on the
> target node with a few tries ^^), write_tunnel(...) might fail with an
> $err that doesn't really explain why the migration failed. It might be
> better to filter here or explicitly prepend that the strict config check
> failed and then add the full error message?

Good catch! Yes, in v3 I'll check that the error is actually the one for
a failed config check.
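
A minimal sketch of what such matching could look like. The 'config check failed' prefix is an assumed message format, not the actual wording of the mtunnel command, and classify_tunnel_error() is a hypothetical helper:

```perl
use strict;
use warnings;

# Sort a tunnel error into one of three cases.
# The 'config check failed' prefix is an assumption for illustration only.
sub classify_tunnel_error {
    my ($err) = @_;
    chomp($err);

    return 'skip' if $err =~ m/^no reply to command/;   # target too old, as in the patch
    return 'config' if $err =~ m/^config check failed/; # hypothetical prefix
    return 'other';                                     # e.g. SSH connection problems
}

my @examples = (
    "no reply to command 'config'\n",
    "config check failed: unknown option\n",
    "broken pipe\n",
);
print classify_tunnel_error($_), "\n" for @examples;
```

Only the 'config' case would then get the hint about overriding the check, while other errors are reported as they are.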

[0]: https://lore.proxmox.com/pve-devel/20260225143514.368884-1-d.kral@proxmox.com/

Thread overview: 18+ messages
2026-02-25 15:18 [PATCH-SERIES qemu-server/guest-common/container v2 00/14] migration: strict config check for intra-cluster migration Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 01/14] d/control: bump versioned build dependency for libpve-common-perl to 9.0.12 Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 02/14] tests: migration: get rid of mocking for removed PVE::QemuMigrate::read_tunnel() Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 03/14] qm: mtunnel: avoid using deprecated check_running() helper Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 04/14] mtunnel: add 'conf' command to do strict configuration parsing Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 05/14] qm: mtunnel: reply when a command is unknown Fiona Ebner
2026-02-25 15:18 ` [PATCH qemu-server v2 06/14] migration: intra-cluster: check config can be parsed on target node Fiona Ebner
2026-02-27 10:31 ` Daniel Kral
2026-03-02 14:52 ` Fiona Ebner [this message]
2026-02-25 15:18 ` [PATCH guest-common v2 07/14] tunnel: add missing IO::File module import Fiona Ebner
2026-02-25 15:18 ` [PATCH guest-common v2 08/14] tunnel: end module with true value as recommended by perlcritic Fiona Ebner
2026-02-25 15:18 ` [PATCH guest-common v2 09/14] tunnel: redirect stderr to log function Fiona Ebner
2026-02-25 15:18 ` [PATCH container v2 10/14] pct: add missing module imports and group according to style guide Fiona Ebner
2026-02-25 15:18 ` [PATCH container v2 11/14] migrate: add missing module imports Fiona Ebner
2026-02-25 15:18 ` [PATCH container v2 12/14] pct: introduce mtunnel command Fiona Ebner
2026-02-25 15:18 ` [PATCH container v2 13/14] d/control: bump versioned build dependency for libpve-common-perl to 9.0.12 Fiona Ebner
2026-02-25 15:18 ` [PATCH container v2 14/14] migration: intra-cluster: check config can be parsed on target node Fiona Ebner
2026-02-27 10:53 ` [PATCH-SERIES qemu-server/guest-common/container v2 00/14] migration: strict config check for intra-cluster migration Daniel Kral