public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH v2 qemu-server] resume: bump timeout for query-status
@ 2024-07-25 12:32 Fiona Ebner
  2024-07-29 17:16 ` [pve-devel] applied: " Thomas Lamprecht
From: Fiona Ebner @ 2024-07-25 12:32 UTC
  To: pve-devel

As reported in the community forum [0], after migration, the VM might
not immediately be able to respond to QMP commands, which means the VM
could fail to resume and stay in paused state on the target.

The reason is that activating the block drives in QEMU can take a bit
of time. For example, it might be necessary to invalidate the caches
(where for raw devices a flush might be needed) and the request
alignment and size of the block device need to be queried.

In [0], an external Ceph cluster with krbd is used, and the initial
read to the block device after migration, for probing the request
alignment, takes a bit over 10 seconds[1]. Use 60 seconds as the new
timeout to be on the safe side for the future.

All callers are inside workers or via the 'qm' CLI command, so bumping
beyond 30 seconds is fine.

[0]: https://forum.proxmox.com/threads/149610/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v2:
* improve commit message with new findings from the forum thread

 PVE/QemuServer.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index bf59b091..9e840912 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6461,7 +6461,9 @@ sub vm_resume {
     my ($vmid, $skiplock, $nocheck) = @_;
 
     PVE::QemuConfig->lock_config($vmid, sub {
-	my $res = mon_cmd($vmid, 'query-status');
+	# After migration, the VM might not immediately be able to respond to QMP commands, because
+	# activating the block devices might take a bit of time.
+	my $res = mon_cmd($vmid, 'query-status', timeout => 60);
 	my $resume_cmd = 'cont';
 	my $reset = 0;
 	my $conf;
-- 
2.39.2
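
To illustrate the effect of a per-call timeout like the one passed above, here is a minimal, self-contained sketch. It is not the qemu-server implementation: `mock_mon_cmd`, its `query` parameter, its default timeout, and its error text are all hypothetical stand-ins, and the delays are scaled down to keep the run short. A slow callback plays the role of a QEMU instance that is still activating its block devices.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for mon_cmd(); NOT the real qemu-server code.
# Runs the given callback and dies if it does not return within the timeout.
sub mock_mon_cmd {
    my ($vmid, $cmd, %params) = @_;
    my $timeout = delete $params{timeout} // 5; # default chosen for this sketch only
    my $query = delete $params{query} or die "no query callback given\n";

    my $res = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        my $r = $query->();
        alarm(0);
        $r;
    };
    if (my $err = $@) {
        alarm(0); # make sure no alarm is left pending
        die "VM $vmid command '$cmd' failed - $err";
    }
    return $res;
}

# Simulated monitor that needs 1.5 seconds before it can answer,
# e.g. because block device activation is still in progress.
my $slow_query = sub {
    select(undef, undef, undef, 1.5);
    return { status => 'paused' };
};

# A timeout shorter than the delay makes the call fail ...
my $short = eval {
    mock_mon_cmd(100, 'query-status', query => $slow_query, timeout => 1);
};
my $short_err = $@;
print "timeout 1: ", (defined($short) ? "ok\n" : "failed: $short_err");

# ... while a more generous timeout lets the same call succeed.
my $long = mock_mon_cmd(100, 'query-status', query => $slow_query, timeout => 3);
print "timeout 3: status=$long->{status}\n";
```

In this sketch the first call fails only because the caller gave up too early; the simulated monitor would have answered. That mirrors the situation described in the commit message, where the VM stays paused on the target although it could have been resumed.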



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] applied: [PATCH v2 qemu-server] resume: bump timeout for query-status
  2024-07-25 12:32 [pve-devel] [PATCH v2 qemu-server] resume: bump timeout for query-status Fiona Ebner
@ 2024-07-29 17:16 ` Thomas Lamprecht
From: Thomas Lamprecht @ 2024-07-29 17:16 UTC
  To: Proxmox VE development discussion, Fiona Ebner

On 25/07/2024 at 14:32, Fiona Ebner wrote:
> As reported in the community forum [0], after migration, the VM might
> not immediately be able to respond to QMP commands, which means the VM
> could fail to resume and stay in paused state on the target.
> 
> The reason is that activating the block drives in QEMU can take a bit
> of time. For example, it might be necessary to invalidate the caches
> (where for raw devices a flush might be needed) and the request
> alignment and size of the block device need to be queried.
> 
> In [0], an external Ceph cluster with krbd is used, and the initial
> read to the block device after migration, for probing the request
> alignment, takes a bit over 10 seconds[1]. Use 60 seconds as the new
> timeout to be on the safe side for the future.
> 
> All callers are inside workers or via the 'qm' CLI command, so bumping
> beyond 30 seconds is fine.
> 
> [0]: https://forum.proxmox.com/threads/149610/
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v2:
> * improve commit message with new findings from the forum thread
> 
>  PVE/QemuServer.pm | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
>

applied, thanks!

