public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together
@ 2023-12-19 13:44 Fiona Ebner
  2023-12-20 12:32 ` Thomas Lamprecht
  2024-01-03 10:32 ` [pve-devel] applied: " Wolfgang Bumiller
  0 siblings, 2 replies; 4+ messages in thread
From: Fiona Ebner @ 2023-12-19 13:44 UTC (permalink / raw)
  To: pve-devel

Currently, volume activation, PCI reservation and resetting the systemd
scope happen between the port reservation and its actual use, so the
5-second expiretime used for the reservation is not always enough.
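To illustrate the race (a standalone Python sketch, not PVE code): a reservation scheme that does not keep the socket open can lose the port to another process during the gap between reservation and use, which is exactly what the expiretime window is guarding against.

```python
# Illustrative sketch (not PVE code): why a reserved-but-unbound port can be
# lost if too much time passes between reservation and actual use.
import socket

# "Reserve" a free port without holding it: bind, note the port, close again.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
port = probe.getsockname()[1]
probe.close()

# Meanwhile (volume activation, PCI reservation, ... happen here), another
# process may grab the very same port:
intruder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
intruder.bind(("127.0.0.1", port))
intruder.listen(1)

# When the original reservation is finally used, binding fails:
late = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    late.bind(("127.0.0.1", port))
    raced = False
except OSError:  # EADDRINUSE: the port was taken during the gap
    raced = True
finally:
    late.close()
    intruder.close()

print(raced)  # True
```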

It's possible to defer telling QEMU where it should listen for
migration and do so after it has been started via QMP. Therefore, the
port reservation can be moved very close to the actual usage.

Mentioned here for completeness; it can still be done as an additional
change later if desired: next_migrate_port could be modified to
optionally return the open socket, and it should be possible to pass
the file descriptor directly to QEMU, but that would require accepting
the connection on the Perl side first (otherwise this fails with
ENOTCONN 107). While it would avoid any races, it's not the most
elegant solution, and the change at hand should be enough in all
practical situations.
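The FD-based variant described above could look roughly like this (an illustrative Python sketch with stand-in names, not a working QEMU handoff): the management side holds the listening socket open, so the port can never be lost, accepts the incoming connection itself, and only then has a connected fd that could be handed to QEMU.

```python
# Sketch (illustrative, hypothetical setup): reserve the port by actually
# holding the listening socket open, accept the connection on the management
# side (per the commit message, accepting first avoids ENOTCONN), and obtain
# a file descriptor that could then be handed to QEMU.
import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # stand-in for a migrate-range port
listener.listen(1)
port = listener.getsockname()[1]

# Stand-in for the source node connecting to the migration port:
source = socket.create_connection(("127.0.0.1", port))
conn, _addr = listener.accept()

fd = conn.fileno()                # this fd is what QEMU would receive
os.set_inheritable(fd, True)      # allow handing it to a child process

print(port, fd)
source.close()
conn.close()
listener.close()
```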

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Discussion for v1:
https://lists.proxmox.com/pipermail/pve-devel/2023-November/060149.html

Changes in v2:
    * move reservation+usage much closer together than was done in v1
      of the qemu-server patch
    * drop other partial fix attempts for pve-common

 PVE/QemuServer.pm | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 2063e663..3b1540b6 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5804,10 +5804,9 @@ sub vm_start_nolock {
 		$migrate->{addr} = "[$migrate->{addr}]" if Net::IP::ip_is_ipv6($migrate->{addr});
 	    }
 
-	    my $pfamily = PVE::Tools::get_host_address_family($nodename);
-	    $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
-	    $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
-	    push @$cmd, '-incoming', $migrate->{uri};
+	    # see #4501: port reservation should be done close to usage - tell QEMU where to listen
+	    # via QMP later
+	    push @$cmd, '-incoming', 'defer';
 	    push @$cmd, '-S';
 
 	} elsif ($statefile eq 'unix') {
@@ -5991,8 +5990,15 @@ sub vm_start_nolock {
     eval { PVE::QemuServer::PCI::reserve_pci_usage($pci_reserve_list, $vmid, undef, $pid) };
     warn $@ if $@;
 
-    if (defined($res->{migrate})) {
-	print "migration listens on $res->{migrate}->{uri}\n";
+    if (defined(my $migrate = $res->{migrate})) {
+	if ($migrate->{proto} eq 'tcp') {
+	    my $nodename = nodename();
+	    my $pfamily = PVE::Tools::get_host_address_family($nodename);
+	    $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
+	    $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
+	    mon_cmd($vmid, "migrate-incoming", uri => $migrate->{uri});
+	}
+	print "migration listens on $migrate->{uri}\n";
     } elsif ($statefile) {
 	eval { mon_cmd($vmid, "cont"); };
 	warn $@ if $@;
-- 
2.39.2






* Re: [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together
  2023-12-19 13:44 [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together Fiona Ebner
@ 2023-12-20 12:32 ` Thomas Lamprecht
  2023-12-27 17:07   ` Hannes Dürr
  2024-01-03 10:32 ` [pve-devel] applied: " Wolfgang Bumiller
  1 sibling, 1 reply; 4+ messages in thread
From: Thomas Lamprecht @ 2023-12-20 12:32 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

On 19/12/2023 14:44, Fiona Ebner wrote:
> Currently, volume activation, PCI reservation and resetting the systemd
> scope happen between the port reservation and its actual use, so the
> 5-second expiretime used for the reservation is not always enough.
> 
> It's possible to defer telling QEMU where it should listen for
> migration and do so after it has been started via QMP. Therefore, the
> port reservation can be moved very close to the actual usage.
> 
> Mentioned here for completeness; it can still be done as an additional
> change later if desired: next_migrate_port could be modified to
> optionally return the open socket, and it should be possible to pass
> the file descriptor directly to QEMU, but that would require accepting
> the connection on the Perl side first (otherwise this fails with
> ENOTCONN 107). While it would avoid any races, it's not the most
> elegant solution, and the change at hand should be enough in all
> practical situations.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Discussion for v1:
> https://lists.proxmox.com/pipermail/pve-devel/2023-November/060149.html
> 
> Changes in v2:
>     * move reservation+usage much closer together than was done in v1
>       of the qemu-server patch
>     * drop other partial fix attempts for pve-common

I find this approach more than just an OK'ish stop-gap; it should
fix most such issues we can hit in practice.

If you can get someone to additionally test this, it's fine to apply
as is IMO.

The one thing that might be worse (didn't check the reservation logic)
compared to FD passing is when no migration port is available, as we
would then have already spent slightly more time and resources by
having the VM already started. We could side-step this a bit by
retrying the port reservation in a loop for a few seconds.
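The retry idea could be sketched like this (a generic helper with hypothetical names, not actual pve-common code):

```python
# Sketch of the suggested side-step (hypothetical helper, not pve-common
# code): retry the port reservation for a few seconds before giving up.
import time

def reserve_port_with_retry(reserve, timeout=3.0, interval=0.5):
    """Call reserve() until it returns a port or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return reserve()
        except RuntimeError:          # stand-in for "no free migration port"
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)

# Toy reservation that fails twice before a port frees up:
attempts = {"n": 0}
def fake_reserve():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("no free migration port")
    return 60000

port = reserve_port_with_retry(fake_reserve, timeout=5.0, interval=0.01)
print(port, attempts["n"])
```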

But IMO it's not very likely to run out of such ports; most actions
that can spawn multiple migrations (like HA) are capped by default.

So once tested a few general migration situations, consider this:

Acked-by: Thomas Lamprecht <t.lamprecht@proxmox.com>





* Re: [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together
  2023-12-20 12:32 ` Thomas Lamprecht
@ 2023-12-27 17:07   ` Hannes Dürr
  0 siblings, 0 replies; 4+ messages in thread
From: Hannes Dürr @ 2023-12-27 17:07 UTC (permalink / raw)
  To: Proxmox VE development discussion, Thomas Lamprecht, Fiona Ebner

I live-migrated 300 VMs with:
migration: insecure
max_workers: 30
and 10 parallel workers
(as described here:
https://forum.proxmox.com/threads/live-migration.127355/#post-557181)

With the patch applied I had zero issues;
without the patch I had ~30 errors.

Tested-by: Hannes Duerr <h.duerr@proxmox.com>

On 12/20/23 13:32, Thomas Lamprecht wrote:
> On 19/12/2023 14:44, Fiona Ebner wrote:
>> Currently, volume activation, PCI reservation and resetting the systemd
>> scope happen between the port reservation and its actual use, so the
>> 5-second expiretime used for the reservation is not always enough.
>>
>> It's possible to defer telling QEMU where it should listen for
>> migration and do so after it has been started via QMP. Therefore, the
>> port reservation can be moved very close to the actual usage.
>>
>> Mentioned here for completeness; it can still be done as an additional
>> change later if desired: next_migrate_port could be modified to
>> optionally return the open socket, and it should be possible to pass
>> the file descriptor directly to QEMU, but that would require accepting
>> the connection on the Perl side first (otherwise this fails with
>> ENOTCONN 107). While it would avoid any races, it's not the most
>> elegant solution, and the change at hand should be enough in all
>> practical situations.
>>
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>>
>> Discussion for v1:
>> https://lists.proxmox.com/pipermail/pve-devel/2023-November/060149.html
>>
>> Changes in v2:
>>      * move reservation+usage much closer together than was done in v1
>>        of the qemu-server patch
>>      * drop other partial fix attempts for pve-common
> I find this approach more than just an OK'ish stop-gap; it should
> fix most such issues we can hit in practice.
>
> If you can get someone to additionally test this, it's fine to apply
> as is IMO.
>
> The one thing that might be worse (didn't check the reservation logic)
> compared to FD passing is when no migration port is available, as we
> would then have already spent slightly more time and resources by
> having the VM already started. We could side-step this a bit by
> retrying the port reservation in a loop for a few seconds.
>
> But IMO it's not very likely to run out of such ports; most actions
> that can spawn multiple migrations (like HA) are capped by default.
>
> So once tested a few general migration situations, consider this:
>
> Acked-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>





* [pve-devel] applied: [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together
  2023-12-19 13:44 [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together Fiona Ebner
  2023-12-20 12:32 ` Thomas Lamprecht
@ 2024-01-03 10:32 ` Wolfgang Bumiller
  1 sibling, 0 replies; 4+ messages in thread
From: Wolfgang Bumiller @ 2024-01-03 10:32 UTC (permalink / raw)
  To: Fiona Ebner; +Cc: pve-devel

applied, thanks





end of thread, other threads:[~2024-01-03 10:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-19 13:44 [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together Fiona Ebner
2023-12-20 12:32 ` Thomas Lamprecht
2023-12-27 17:07   ` Hannes Dürr
2024-01-03 10:32 ` [pve-devel] applied: " Wolfgang Bumiller
