* [pve-devel] [PATCH qemu-server v2] vm start: set higher timeout if using PCI passthrough
@ 2023-10-06 12:15 Friedrich Weber
From: Friedrich Weber @ 2023-10-06 12:15 UTC (permalink / raw)
To: pve-devel
The default VM startup timeout is `max(30, VM memory in GiB)` seconds.
Multiple reports in the forum [0] [1] and the bug tracker [2] suggest
this is too short when using PCI passthrough with a large amount of VM
memory, since QEMU needs to map the whole memory during startup (see
comment #2 in [2]). As a result, VM startup fails with "got timeout".
To work around this, set a larger default timeout if at least one PCI
device is passed through. The question remains how to choose an
appropriate timeout. Users reported the following startup times:

ref | RAM | time  | ratio (s/GiB)
----+-----+-------+--------------
[1] | 60G | 135s  | 2.25
[1] | 70G | 157s  | 2.24
[1] | 80G | 277s  | 3.46
[2] | 65G | 213s  | 3.28
[2] | 96G | >290s | >3.02

The data does not really indicate any simple (e.g. linear)
relationship between RAM and startup time (even data from the same
source). However, to keep the heuristic simple, assume linear growth
and multiply the default timeout by 4 if at least one `hostpci[n]`
option is present, obtaining `4 * max(30, VM memory in GiB)`. This
covers all cases above, and should still leave some headroom.
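
To make the arithmetic concrete, here is a minimal Python sketch (not the Perl code from the patch) that recomputes the ratios from the table and compares the old and new timeouts against the reported startup times. The function names `default_timeout` and `passthrough_timeout` are hypothetical and chosen only for illustration. The 96G report is a lower bound (">290s"), so the check can only confirm that the new timeout exceeds that bound.

```python
# Reported startup times from the table above: (ref, RAM in GiB, seconds).
# The 96G entry is only a lower bound (">290s") in the original report.
REPORTS = [
    ("[1]", 60, 135),
    ("[1]", 70, 157),
    ("[1]", 80, 277),
    ("[2]", 65, 213),
    ("[2]", 96, 290),
]


def default_timeout(mem_gib):
    """Default startup timeout in seconds: max(30, VM memory in GiB)."""
    return max(30, mem_gib)


def passthrough_timeout(mem_gib, factor=4):
    """Default timeout multiplied by the constant passthrough factor."""
    return factor * default_timeout(mem_gib)


for ref, mem_gib, seconds in REPORTS:
    ratio = seconds / mem_gib
    print(
        f"{ref} {mem_gib}G: reported {seconds}s ({ratio:.2f} s/GiB), "
        f"old timeout {default_timeout(mem_gib)}s, "
        f"new timeout {passthrough_timeout(mem_gib)}s"
    )
```

Every reported time exceeds the old timeout and stays below the new one; the smallest margin is the 80G report (277s against a 320s timeout).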
[0]: https://forum.proxmox.com/threads/83765/post-552071
[1]: https://forum.proxmox.com/threads/126398/post-592826
[2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502
Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
Notes:
changes since v1 (was called "vm start: set minimum timeout of 300s if
using PCI passthrough", 20230503133723.165739-1-f.weber@proxmox.com):
* Use a constant multiplier as suggested by Fiona (thx!)
Another workaround is offered by an unapplied patch series [3] of bug
3502 [2] that makes it possible to set VM-specific timeouts (also in
the GUI). Users could use this option to manually set a higher
timeout for VMs that use PCI passthrough. However, it is not
immediately obvious that a higher timeout is necessary when using
PCI passthrough. Since the problem seems to come up somewhat
frequently, I think it makes sense to have the heuristic choose a
higher timeout by default.
As discussed in v1, I'll also pick up the patch series to allow users
to set custom timeouts [3], also to offer a workaround for cases where
the new heuristic chooses a timeout that is still too short.
[2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502
[3]: https://lists.proxmox.com/pipermail/pve-devel/2023-January/055352.html
PVE/QemuServer/Helpers.pm | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index 8817427..0afb631 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -152,6 +152,13 @@ sub config_aware_timeout {
         $timeout = int($memory/1024);
     }
 
+    # When using PCI passthrough, users reported much higher startup times,
+    # growing with the amount of memory configured. Constant factor chosen
+    # based on user reports.
+    if (grep(/^hostpci[0-9]+$/, keys %$config)) {
+        $timeout *= 4;
+    }
+
     if ($is_suspended && $timeout < 300) {
         $timeout = 300;
     }
--
2.39.2
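
For readers who want to see how the checks in this hunk interact, the following is a rough Python rendering of the logic, not the actual Perl function. The hunk only shows part of `config_aware_timeout`, so the initial value of 30 and the `memory_mib > 30720` condition are assumptions inferred from the `max(30, VM memory in GiB)` description in the commit message; the parameter names are likewise illustrative.

```python
import re

# Same pattern as the grep in the patch: keys such as hostpci0, hostpci1, ...
HOSTPCI_KEY = re.compile(r"^hostpci[0-9]+$")


def config_aware_timeout(config, memory_mib, is_suspended=False):
    """Sketch of the startup timeout heuristic (seconds)."""
    # Assumption: base timeout of 30s, replaced by the memory size in GiB
    # once that exceeds 30 (memory is given in MiB, as in the Perl code).
    timeout = 30
    if memory_mib is not None and memory_mib > 30720:
        timeout = memory_mib // 1024

    # From the patch: multiply by 4 if any PCI device is passed through.
    if any(HOSTPCI_KEY.match(key) for key in config):
        timeout *= 4

    # From the hunk context: resuming a suspended VM gets at least 300s.
    if is_suspended and timeout < 300:
        timeout = 300

    return timeout


# Hypothetical VM config with one passed-through device and 64 GiB of RAM.
print(config_aware_timeout({"hostpci0": "0000:01:00.0"}, 64 * 1024))
```

Because the multiplier is applied before the suspended check, a small suspended VM with passthrough still ends up at the 300s minimum, while a large one keeps the multiplied value.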
* [pve-devel] applied: [PATCH qemu-server v2] vm start: set higher timeout if using PCI passthrough
From: Thomas Lamprecht @ 2023-10-06 16:13 UTC (permalink / raw)
To: Proxmox VE development discussion, Friedrich Weber
On 06/10/2023 at 14:15, Friedrich Weber wrote:
> The default VM startup timeout is `max(30, VM memory in GiB)` seconds.
> Multiple reports in the forum [0] [1] and the bug tracker [2] suggest
> this is too short when using PCI passthrough with a large amount of VM
> memory, since QEMU needs to map the whole memory during startup (see
> comment #2 in [2]). As a result, VM startup fails with "got timeout".
>
> To work around this, set a larger default timeout if at least one PCI
> device is passed through. The question remains how to choose an
> appropriate timeout. Users reported the following startup times:
>
> ref | RAM | time  | ratio (s/GiB)
> ----+-----+-------+--------------
> [1] | 60G | 135s  | 2.25
> [1] | 70G | 157s  | 2.24
> [1] | 80G | 277s  | 3.46
> [2] | 65G | 213s  | 3.28
> [2] | 96G | >290s | >3.02
>
> The data does not really indicate any simple (e.g. linear)
> relationship between RAM and startup time (even data from the same
> source). However, to keep the heuristic simple, assume linear growth
> and multiply the default timeout by 4 if at least one `hostpci[n]`
> option is present, obtaining `4 * max(30, VM memory in GiB)`. This
> covers all cases above, and should still leave some headroom.
>
> [0]: https://forum.proxmox.com/threads/83765/post-552071
> [1]: https://forum.proxmox.com/threads/126398/post-592826
> [2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502
>
> Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
> Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
> ---
>
> Notes:
> changes since v1 (was called "vm start: set minimum timeout of 300s if
> using PCI passthrough", 20230503133723.165739-1-f.weber@proxmox.com):
> * Use a constant multiplier as suggested by Fiona (thx!)
>
> Another workaround is offered by an unapplied patch series [3] of bug
> 3502 [2] that makes it possible to set VM-specific timeouts (also in
> the GUI). Users could use this option to manually set a higher
> timeout for VMs that use PCI passthrough. However, it is not
> immediately obvious that a higher timeout is necessary when using
> PCI passthrough. Since the problem seems to come up somewhat
> frequently, I think it makes sense to have the heuristic choose a
> higher timeout by default.
yeah, I think so too.
>
> As discussed in v1, I'll also pick up the patch series to allow users
> to set custom timeouts [3], also to offer a workaround for cases where
> the new heuristic chooses a timeout that is still too short.
>
> [2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502
> [3]: https://lists.proxmox.com/pipermail/pve-devel/2023-January/055352.html
>
> PVE/QemuServer/Helpers.pm | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>
applied, thanks!