public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH qemu-server] fix #4372: fix vm_resume migration callback
@ 2022-11-29 12:09 Fabian Grünbichler
  2022-11-29 18:24 ` Fabian Grünbichler
  2022-11-30 15:22 ` [pve-devel] applied: " Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fabian Grünbichler @ 2022-11-29 12:09 UTC (permalink / raw)
  To: pve-devel

the fix for the recently introduced requirement of loading the VM config while
migrating was incomplete, since the vmlist node value could already be out of
date by the time load_config is called.

extend the fallback behaviour even further, by doing the following sequence:
- try regular load_config (likely case, rename already fully processed)
- if it fails, get node from vmlist, and load_config using that
- if that fails, invalidate the PVE::Cluster cache, retry regular load_config

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    an alternative approach would be to do the fallback load first, and if that
    fails do the target node load. it has the downside of doing two loads in the
    "good"/likely case where the rename is processed before the resume call, while
    making the unlikely case (fallback needed) cheaper.
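
    The trade-off in these notes can be illustrated with a small standalone
    sketch. It is not part of the patch: `loads_needed` and the node names
    'target' and 'source' are hypothetical, the predicate merely stands in for
    a successful PVE::QemuConfig->load_config on that node, and the final
    cache-invalidation retry is deliberately left out.

```perl
use strict;
use warnings;

# Try each candidate node in order and return how many load attempts were
# needed. $has_config is a hypothetical predicate standing in for a
# successful config load on that node.
sub loads_needed {
    my ($order, $has_config) = @_;
    my $attempts = 0;
    for my $node (@$order) {
        $attempts++;
        return $attempts if $has_config->($node);
    }
    die "config not found on any candidate node\n";
}

my %orderings = (
    patch       => [ 'target', 'source' ],    # order used by the patch
    alternative => [ 'source', 'target' ],    # order described in the notes
);

# likely case:   rename already processed, config is on the target node
# unlikely case: rename still pending, config is still on the source node
for my $case ([ likely => 'target' ], [ unlikely => 'source' ]) {
    my ($name, $location) = @$case;
    for my $order (sort keys %orderings) {
        my $n = loads_needed($orderings{$order}, sub { $_[0] eq $location });
        print "$name case, $order ordering: $n load(s)\n";
    }
}
```

    Under these assumptions the patch ordering needs one load in the likely
    case and two in the unlikely one, and the alternative ordering is the
    mirror image.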

 PVE/QemuServer.pm | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a746b3dd..a52a883e 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6377,12 +6377,18 @@ sub vm_resume {
 	my $reset = 0;
 	my $conf;
 	if ($nocheck) {
-	    my $vmlist = PVE::Cluster::get_vmlist();
-	    my $node;
-	    if (exists($vmlist->{ids}->{$vmid})) {
-		$node = $vmlist->{ids}->{$vmid}->{node};
+	    $conf = eval { PVE::QemuConfig->load_config($vmid) }; # try on target node
+	    if ($@) {
+		my $vmlist = PVE::Cluster::get_vmlist();
+		if (exists($vmlist->{ids}->{$vmid})) {
+		    my $node = $vmlist->{ids}->{$vmid}->{node};
+		    $conf = eval { PVE::QemuConfig->load_config($vmid, $node) }; # try on source node
+		}
+		if (!$conf) {
+		    PVE::Cluster::cfs_update(); # vmlist was wrong, invalidate cache
+		    $conf = PVE::QemuConfig->load_config($vmid); # last try on target node again
+		}
 	    }
-	    $conf = PVE::QemuConfig->load_config($vmid, $node);
 	} else {
 	    $conf = PVE::QemuConfig->load_config($vmid);
 	}
-- 
2.30.2






* Re: [pve-devel] [PATCH qemu-server] fix #4372: fix vm_resume migration callback
  2022-11-29 12:09 [pve-devel] [PATCH qemu-server] fix #4372: fix vm_resume migration callback Fabian Grünbichler
@ 2022-11-29 18:24 ` Fabian Grünbichler
  2022-11-30 15:22 ` [pve-devel] applied: " Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Fabian Grünbichler @ 2022-11-29 18:24 UTC (permalink / raw)
  To: Proxmox VE development discussion

> On 29.11.2022 13:09 CET Fabian Grünbichler <f.gruenbichler@proxmox.com> wrote:
> 
>  
> the fix for the recently introduced requirement of loading the VM config while
> migrating was incomplete, since the vmlist node value could already be out of
> date by the time load_config is called.
> 
> extend the fallback behaviour even further, by doing the following sequence:
> - try regular load_config (likely case, rename already fully processed)
> - if it fails, get node from vmlist, and load_config using that
> - if that fails, invalidate the PVE::Cluster cache, retry regular load_config
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

FWIW, I managed to trigger both the fallback (load via the vmlist node) and the
fallback to the fallback (cache invalidation plus retry) a few times now with a
slightly modified test setup, and both cases (as well as the regular code path
;)) work as expected. the fallback triggers quite rarely for me (<1% of
migrations, with about a third to a half of those requiring the second
fallback).
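
For readers who want to step through the three code paths without a cluster, the load sequence from the patch can be modelled in isolation. This is only a sketch under stated assumptions: `resume_load_config`, `make_ops` and the node names 'local' and 'source' are hypothetical, and the coderefs are toy stand-ins for PVE::QemuConfig->load_config, the PVE::Cluster::get_vmlist() lookup and PVE::Cluster::cfs_update(). It does not reproduce the real race, only the control flow.

```perl
use strict;
use warnings;

# Control flow of the patched vm_resume config load, with the PVE calls
# replaced by hypothetical coderefs in $ops.
sub resume_load_config {
    my ($vmid, $ops) = @_;

    # 1) regular load on the local (target) node
    my $conf = eval { $ops->{load_config}->($vmid) };
    return ($conf, 'target') if !$@;

    # 2) fall back to the node recorded in the (possibly stale) vmlist
    my $node = $ops->{get_node}->($vmid);
    if (defined($node)) {
        $conf = eval { $ops->{load_config}->($vmid, $node) };
    }
    return ($conf, 'vmlist') if $conf;

    # 3) refresh the cache and retry locally; errors propagate from here
    $ops->{refresh_cache}->();
    return ($ops->{load_config}->($vmid), 'retry');
}

# Toy cluster view: $visible_on maps node names to true where the config can
# currently be loaded; refreshing the cache swaps in $after_refresh.
sub make_ops {
    my ($visible_on, $vmlist_node, $after_refresh) = @_;
    my $state = { %$visible_on };
    return {
        load_config => sub {
            my ($vmid, $node) = @_;
            $node //= 'local';
            die "no config for $vmid on $node\n" if !$state->{$node};
            return { vmid => $vmid, loaded_from => $node };
        },
        get_node      => sub { return $vmlist_node },
        refresh_cache => sub { %$state = %$after_refresh },
    };
}

my @cases = (
    [ 'regular',         { local  => 1 }, 'source', { local => 1 } ],
    [ 'fallback',        { source => 1 }, 'source', { local => 1 } ],
    [ 'double fallback', {},              'source', { local => 1 } ],
);
for my $case (@cases) {
    my ($name, $visible, $vmlist_node, $after) = @$case;
    my ($conf, $path) =
        resume_load_config(100, make_ops($visible, $vmlist_node, $after));
    print "$name: loaded via $path from $conf->{loaded_from}\n";
}
```

Each of the three cases takes a different branch, and a config that stays unavailable even after the refresh still dies in the last step, as in the patch.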





* [pve-devel] applied: [PATCH qemu-server] fix #4372: fix vm_resume migration callback
  2022-11-29 12:09 [pve-devel] [PATCH qemu-server] fix #4372: fix vm_resume migration callback Fabian Grünbichler
  2022-11-29 18:24 ` Fabian Grünbichler
@ 2022-11-30 15:22 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2022-11-30 15:22 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 29/11/2022 at 13:09, Fabian Grünbichler wrote:
> the fix for the recently introduced requirement of loading the VM config while
> migrating was incomplete, since the vmlist node value could already be out of
> date by the time load_config is called.
> 
> extend the fallback behaviour even further, by doing the following sequence:
> - try regular load_config (likely case, rename already fully processed)
> - if it fails, get node from vmlist, and load_config using that
> - if that fails, invalidate the PVE::Cluster cache, retry regular load_config
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>     an alternative approach would be to do the fallback load first, and if that
>     fails do the target node load. it has the downside of doing two loads in the
>     "good"/likely case where the rename is processed before the resume call, while
>     making the unlikely case (fallback needed) cheaper.
> 
>  PVE/QemuServer.pm | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
>

applied, thanks!




