From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 845451FF15C for ; Wed, 30 Oct 2024 10:52:44 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 04DD613B58; Wed, 30 Oct 2024 10:52:46 +0100 (CET) From: Fiona Ebner To: pve-devel@lists.proxmox.com Date: Wed, 30 Oct 2024 10:52:40 +0100 Message-Id: <20241030095240.11452-1-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.5 MIME-Version: 1.0 X-SPAM-LEVEL: Spam detection results: 0 AWL -0.058 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: [pve-devel] [PATCH qemu] async snapshot: stop vCPU throttling after finishing X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Proxmox VE development discussion Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" In the community forum, users reported issues about RCU stalls and sluggish VMs after taking a snapshot with RAM in Proxmox VE [0]. Mario was also experiencing similar issues from time to time and recently, obtained a GDB stacktrace. The stacktrace showed that, in his case, the vCPU threads were waiting in cpu_throttle_thread(). It is a good guess that the issues in the forum could also be because of that. >From searching in the source code, it seems that migration is the only user of the vCPU throttling functions in QEMU relevant for Proxmox VE (the only other place where it is used is the Cocoa UI). In particular, RAM migration will begin throttling vCPUs for auto-converge. In migration_iteration_finish() there is an unconditional call to cpu_throttle_stop(), so do the same in the async snapshot code specific to Proxmox VE. It's not clear why the issue began to surface more prominently only now, since the vCPU throttling was there since commit 070afca258 ("migration: Dynamic cpu throttling for auto-converge") in QEMU v2.10.0. However, there were a lot of changes in the migration code between v8.1.5 and v9.0.2 and a few of them might have affected the likelihood of cpu_throttle_set() being called, for example, 4e1871c450 ("migration: Don't serialize devices in qemu_savevm_state_iterate()") [0]: https://forum.proxmox.com/threads/153483 Reported-by: Mario Loderer Signed-off-by: Fiona Ebner Tested-by: Mario Loderer --- ...-async-for-background-state-snapshots.patch | 18 +++++++++++++----- ...-add-optional-buffer-size-to-QEMUFile.patch | 6 +++--- 2 files changed, 16 insertions(+), 8 deletions(-) diff --git a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch index 964caf3..b0e75e9 100644 --- a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch +++ b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch @@ -28,7 +28,8 @@ Signed-off-by: Stefan Reiter adapt to removal of QEMUFileOps improve condition for entering final stage adapt to QAPI and other changes for 8.2 - make sure to not call vm_start() from coroutine] + make sure to not call vm_start() from coroutine + stop CPU throttling after finishing] Signed-off-by: Fiona Ebner --- hmp-commands-info.hx | 13 + @@ -36,13 +37,13 @@ Signed-off-by: Fiona Ebner include/migration/snapshot.h | 2 + include/monitor/hmp.h | 3 + migration/meson.build | 1 + - migration/savevm-async.c | 538 +++++++++++++++++++++++++++++++++++ + migration/savevm-async.c | 545 +++++++++++++++++++++++++++++++++++ monitor/hmp-cmds.c | 38 +++ qapi/migration.json | 34 +++ qapi/misc.json | 18 ++ qemu-options.hx | 12 + system/vl.c | 10 + - 11 files changed, 686 insertions(+) + 11 files changed, 693 insertions(+) create mode 100644 migration/savevm-async.c diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx @@ -140,10 +141,10 @@ index 95d1cf2250..800f12a60d 100644 'threadinfo.c', diff --git a/migration/savevm-async.c b/migration/savevm-async.c new file mode 100644 -index 0000000000..72cf6588c2 +index 0000000000..1af32604c7 --- /dev/null +++ b/migration/savevm-async.c -@@ -0,0 +1,538 @@ +@@ -0,0 +1,545 @@ +#include "qemu/osdep.h" +#include "migration/channel-savevm-async.h" +#include "migration/migration.h" @@ -154,6 +155,7 @@ index 0000000000..72cf6588c2 +#include "migration/global_state.h" +#include "migration/ram.h" +#include "migration/qemu-file.h" ++#include "sysemu/cpu-throttle.h" +#include "sysemu/sysemu.h" +#include "sysemu/runstate.h" +#include "block/block.h" @@ -342,6 +344,12 @@ index 0000000000..72cf6588c2 + ret || aborted ? MIGRATION_STATUS_FAILED : MIGRATION_STATUS_COMPLETED); + ms->to_dst_file = NULL; + ++ /* ++ * Same as in migration_iteration_finish(): saving RAM might've turned on CPU throttling for ++ * auto-converge, make sure to disable it. ++ */ ++ cpu_throttle_stop(); ++ + qemu_savevm_state_cleanup(); + + ret = save_snapshot_cleanup(); diff --git a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch index 043b249..92bc9f2 100644 --- a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch +++ b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch @@ -193,10 +193,10 @@ index 32fd4a34fd..36a0cd8cc8 100644 /* diff --git a/migration/savevm-async.c b/migration/savevm-async.c -index 72cf6588c2..fb4e8ea689 100644 +index 1af32604c7..be2035cd2e 100644 --- a/migration/savevm-async.c +++ b/migration/savevm-async.c -@@ -379,7 +379,7 @@ void qmp_savevm_start(const char *statefile, Error **errp) +@@ -386,7 +386,7 @@ void qmp_savevm_start(const char *statefile, Error **errp) QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target, &snap_state.bs_pos)); @@ -205,7 +205,7 @@ index 72cf6588c2..fb4e8ea689 100644 if (!snap_state.file) { error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile); -@@ -503,7 +503,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp) +@@ -510,7 +510,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp) blk_op_block_all(be, blocker); /* restore the VM state */ -- 2.39.5 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel