* [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress
@ 2023-01-26 13:46 Fiona Ebner
2023-01-26 13:46 ` [pve-devel] [PATCH qemu 2/2] savevm-async: keep more free space when entering final stage Fiona Ebner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Fiona Ebner @ 2023-01-26 13:46 UTC (permalink / raw)
To: pve-devel
when pend_postcopy is large. By definition, pend_postcopy won't
decrease when iterating, so a value larger than the cutoff of 400000
would lead to essentially empty iterations, filling up the state file
until only 30 MiB + pending_size remain and the second half of the
check would trigger.
Avoid this, by not considering pend_postcopy for the cutoff to enter
the final phase.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Many thanks to Fabian for discussing the issue with me!
...vevm-async-for-background-state-snapshots.patch | 14 ++++++++------
...-PVE-add-optional-buffer-size-to-QEMUFile.patch | 6 +++---
...async-register-yank-before-migration_inco.patch | 4 ++--
3 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
index fcbe4bc..91d4710 100644
--- a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
+++ b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
@@ -24,7 +24,8 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[improve aborting]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FE: further improve aborting
- adapt to removal of QEMUFileOps]
+ adapt to removal of QEMUFileOps
+ improve condition for entering final stage]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
@@ -32,13 +33,13 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 5 +
migration/meson.build | 1 +
- migration/savevm-async.c | 531 +++++++++++++++++++++++++++++++++++
+ migration/savevm-async.c | 532 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 57 ++++
qapi/migration.json | 34 +++
qapi/misc.json | 32 +++
qemu-options.hx | 12 +
softmmu/vl.c | 10 +
- 11 files changed, 730 insertions(+)
+ 11 files changed, 731 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
@@ -154,10 +155,10 @@ index 8cac83c06c..0842d00cd2 100644
), gnutls)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
-index 0000000000..05d394c0e2
+index 0000000000..4a4e91a26d
--- /dev/null
+++ b/migration/savevm-async.c
-@@ -0,0 +1,531 @@
+@@ -0,0 +1,532 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
@@ -415,7 +416,8 @@ index 0000000000..05d394c0e2
+
+ maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
+
-+ if (pending_size > 400000 && snap_state.bs_pos + pending_size < maxlen) {
++ /* Note that there is no progress for pend_postcopy when iterating */
++ if (pending_size - pend_postcopy > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ ret = qemu_savevm_state_iterate(snap_state.file, false);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
diff --git a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
index 5cf894b..956a09d 100644
--- a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
+++ b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
@@ -192,10 +192,10 @@ index fa13d04d78..914f1a63a8 100644
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
-index 05d394c0e2..bafe6ae5eb 100644
+index 4a4e91a26d..20aae335ca 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
-@@ -367,7 +367,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
+@@ -368,7 +368,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
@@ -204,7 +204,7 @@ index 05d394c0e2..bafe6ae5eb 100644
if (!snap_state.file) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
-@@ -500,7 +500,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
+@@ -501,7 +501,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */
diff --git a/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch b/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
index b25c9fc..746b735 100644
--- a/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
+++ b/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 5 insertions(+)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
-index bafe6ae5eb..da3634048f 100644
+index 20aae335ca..94c5ae1c81 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -20,6 +20,7 @@
@@ -22,7 +22,7 @@ index bafe6ae5eb..da3634048f 100644
/* #define DEBUG_SAVEVM_STATE */
-@@ -514,6 +515,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
+@@ -515,6 +516,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
dirty_bitmap_mig_before_vm_start();
qemu_fclose(f);
--
2.30.2
^ permalink raw reply [flat|nested] 4+ messages in thread
* [pve-devel] [PATCH qemu 2/2] savevm-async: keep more free space when entering final stage
2023-01-26 13:46 [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
@ 2023-01-26 13:46 ` Fiona Ebner
2023-02-09 9:30 ` [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
2023-02-21 8:15 ` [pve-devel] applied-series: " Thomas Lamprecht
2 siblings, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2023-01-26 13:46 UTC (permalink / raw)
To: pve-devel
In qemu-server, we already allocate 2 * $mem_size + 500 MiB for driver
state (which was 32 MiB long ago according to git history). It seems
likely that the 30 MiB cutoff in the savevm-async implementation was
chosen based on that.
In bug #4476 [0], another issue caused the iteration to not make any
progress and the state file filled up all the way to the 30 MiB +
pending_size cutoff. Since the guest is not stopped immediately after
the check, it can still dirty some RAM and the current cutoff is not
enough for a reproducer VM (was done while bug #4476 still was not
fixed), dirtying memory with
> stress-ng -B 2 --bigheap-growth 64.0M'
After entering the final stage, savevm actually filled up the state
file completely, leading to an I/O error. It's probably the same
scenario as reported in the bug report, the error message was fixed in
commit a020815 ("savevm-async: fix function name in error message")
after the bug report.
If not for the bug, the cutoff will only be reached by a VM that's
dirtying RAM faster than can be written to the storage, so increase
the cutoff to 100 MiB to have a bigger chance to finish successfully,
while still trying to not increase downtime too much for
non-hibernation snapshots.
[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=4476
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
...vm-async-for-background-state-snapshots.patch | 16 +++++++++++-----
...VE-add-optional-buffer-size-to-QEMUFile.patch | 6 +++---
...ync-register-yank-before-migration_inco.patch | 4 ++--
3 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
index 91d4710..3898bd4 100644
--- a/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
+++ b/debian/patches/pve/0017-PVE-add-savevm-async-for-background-state-snapshots.patch
@@ -33,13 +33,13 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 5 +
migration/meson.build | 1 +
- migration/savevm-async.c | 532 +++++++++++++++++++++++++++++++++++
+ migration/savevm-async.c | 538 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 57 ++++
qapi/migration.json | 34 +++
qapi/misc.json | 32 +++
qemu-options.hx | 12 +
softmmu/vl.c | 10 +
- 11 files changed, 731 insertions(+)
+ 11 files changed, 737 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
@@ -155,10 +155,10 @@ index 8cac83c06c..0842d00cd2 100644
), gnutls)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
-index 0000000000..4a4e91a26d
+index 0000000000..dc30558713
--- /dev/null
+++ b/migration/savevm-async.c
-@@ -0,0 +1,532 @@
+@@ -0,0 +1,538 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
@@ -414,7 +414,13 @@ index 0000000000..4a4e91a26d
+
+ pending_size = pend_precopy + pend_compatible + pend_postcopy;
+
-+ maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
++ /*
++ * A guest reaching this cutoff is dirtying lots of RAM. It should be
++ * large enough so that the guest can't dirty this much between the
++ * check and the guest actually being stopped, but it should be small
++ * enough to avoid long downtimes for non-hibernation snapshots.
++ */
++ maxlen = blk_getlength(snap_state.target) - 100*1024*1024;
+
+ /* Note that there is no progress for pend_postcopy when iterating */
+ if (pending_size - pend_postcopy > 400000 && snap_state.bs_pos + pending_size < maxlen) {
diff --git a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
index 956a09d..3095417 100644
--- a/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
+++ b/debian/patches/pve/0018-PVE-add-optional-buffer-size-to-QEMUFile.patch
@@ -192,10 +192,10 @@ index fa13d04d78..914f1a63a8 100644
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
-index 4a4e91a26d..20aae335ca 100644
+index dc30558713..a38e7351c1 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
-@@ -368,7 +368,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
+@@ -374,7 +374,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
@@ -204,7 +204,7 @@ index 4a4e91a26d..20aae335ca 100644
if (!snap_state.file) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
-@@ -501,7 +501,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
+@@ -507,7 +507,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */
diff --git a/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch b/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
index 746b735..c4aae04 100644
--- a/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
+++ b/debian/patches/pve/0049-PVE-savevm-async-register-yank-before-migration_inco.patch
@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 5 insertions(+)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
-index 20aae335ca..94c5ae1c81 100644
+index a38e7351c1..0b1b60c6ae 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -20,6 +20,7 @@
@@ -22,7 +22,7 @@ index 20aae335ca..94c5ae1c81 100644
/* #define DEBUG_SAVEVM_STATE */
-@@ -515,6 +516,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
+@@ -521,6 +522,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
dirty_bitmap_mig_before_vm_start();
qemu_fclose(f);
--
2.30.2
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress
2023-01-26 13:46 [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
2023-01-26 13:46 ` [pve-devel] [PATCH qemu 2/2] savevm-async: keep more free space when entering final stage Fiona Ebner
@ 2023-02-09 9:30 ` Fiona Ebner
2023-02-21 8:15 ` [pve-devel] applied-series: " Thomas Lamprecht
2 siblings, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2023-02-09 9:30 UTC (permalink / raw)
To: pve-devel
Should we include this before rolling out QEMU 7.2 to public
repositories? Will always trigger with large disk(s), when doing a PBS
backup and at any later time a snapshot/hibernation. It's not a new
problem, but there were a few reports recently, and 7.2-rollout seems
like a good time.
Am 26.01.23 um 14:46 schrieb Fiona Ebner:
> when pend_postcopy is large. By definition, pend_postcopy won't
> decrease when iterating, so a value larger than the cutoff of 400000
> would lead to essentially empty iterations, filling up the state file
> until only 30 MiB + pending_size remain and the second half of the
> check would trigger.
>
> Avoid this, by not considering pend_postcopy for the cutoff to enter
> the final phase.
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
* [pve-devel] applied-series: [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress
2023-01-26 13:46 [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
2023-01-26 13:46 ` [pve-devel] [PATCH qemu 2/2] savevm-async: keep more free space when entering final stage Fiona Ebner
2023-02-09 9:30 ` [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
@ 2023-02-21 8:15 ` Thomas Lamprecht
2 siblings, 0 replies; 4+ messages in thread
From: Thomas Lamprecht @ 2023-02-21 8:15 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner
Am 26/01/2023 um 14:46 schrieb Fiona Ebner:
> when pend_postcopy is large. By definition, pend_postcopy won't
> decrease when iterating, so a value larger than the cutoff of 400000
> would lead to essentially empty iterations, filling up the state file
> until only 30 MiB + pending_size remain and the second half of the
> check would trigger.
>
> Avoid this, by not considering pend_postcopy for the cutoff to enter
> the final phase.
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Many thanks to Fabian for discussing the issue with me!
>
> ...vevm-async-for-background-state-snapshots.patch | 14 ++++++++------
> ...-PVE-add-optional-buffer-size-to-QEMUFile.patch | 6 +++---
> ...async-register-yank-before-migration_inco.patch | 4 ++--
> 3 files changed, 13 insertions(+), 11 deletions(-)
>
>
applied both patches, thanks!
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-02-21 8:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-26 13:46 [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
2023-01-26 13:46 ` [pve-devel] [PATCH qemu 2/2] savevm-async: keep more free space when entering final stage Fiona Ebner
2023-02-09 9:30 ` [pve-devel] [PATCH qemu 1/2] fix #4476: savevm-async: avoid looping without progress Fiona Ebner
2023-02-21 8:15 ` [pve-devel] applied-series: " Thomas Lamprecht
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal