public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [RFC qemu] savevm-async: improve check for blockers
@ 2024-05-17 11:39 Fiona Ebner
  2024-05-17 14:30 ` Fiona Ebner
  2024-05-17 15:00 ` Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fiona Ebner @ 2024-05-17 11:39 UTC (permalink / raw)
  To: pve-devel

Same rationale as with upstream QEMU commit 5aaac46793 ("migration:
savevm: consult migration blockers"), migration and (async) snapshot
are essentially the same operation and thus snapshot also needs to
check for migration blockers. For example, this catches passed-through
PCI devices, where the driver does not support migration.

However, the commit notes:

> There is really no difference between live migration and savevm, except
> that savevm does not require bdrv_invalidate_cache to be implemented
> by all disks.  However, it is unlikely that savevm is used with anything
> except qcow2 disks, so the penalty is small and worth the improvement
> in catching bad usage of savevm.

and for Proxmox VE, suspend-to-disk with VMDK does use savevm-async
and would be broken by simply using migration_is_blocked(). To keep
this working, introduce a new helper that filters blockers with the
prefix used by the VMDK migration blocker.

The function qemu_savevm_state_blocked() is called as part of
migration_is_blocked_allow_prefix() so no check is lost with this
patch.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

An alternative would be to mark the VMDK blocker as a
"live-migration-only" blocker and extending migration_is_blocked() or
using separate helpers to check for migration and snapshot blockers
differently. But that requires touching more machinery and probably
needs more adaptation going forward than the approach here.

 migration/migration.c    | 22 ++++++++++++++++++++++
 migration/migration.h    |  1 +
 migration/savevm-async.c |  7 ++++++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index b8d7e471a4..6235309a00 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1897,6 +1897,28 @@ void qmp_migrate_pause(Error **errp)
                "during postcopy-active or postcopy-recover state");
 }
 
+/*
+ * HACK to allow hibernation in Proxmox VE even when VMDK image is present.
+ */
+bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix)
+{
+    GSList *blockers = migration_blockers[migrate_mode()];
+
+    if (qemu_savevm_state_blocked(errp)) {
+        return true;
+    }
+
+    while (blockers) {
+        if (!g_str_has_prefix(error_get_pretty(blockers->data), prefix)) {
+            error_propagate(errp, error_copy(blockers->data));
+            return true;
+        }
+        blockers = g_slist_next(blockers);
+    }
+
+    return false;
+}
+
 bool migration_is_blocked(Error **errp)
 {
     GSList *blockers = migration_blockers[migrate_mode()];
diff --git a/migration/migration.h b/migration/migration.h
index 8045e39c26..575805a8e2 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -484,6 +484,7 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
                              Error **errp);
 
 int migrate_init(MigrationState *s, Error **errp);
+bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
 bool migration_in_postcopy(void);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index bf36fc06d2..33085446e1 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -363,7 +363,12 @@ void qmp_savevm_start(const char *statefile, Error **errp)
         return;
     }
 
-    if (qemu_savevm_state_blocked(errp)) {
+    /*
+     * The limitation for VMDK images only applies to live-migration, not
+     * snapshots, see commit 5aaac46793 ("migration: savevm: consult migration
+     * blockers").
+     */
+    if (migration_is_blocked_allow_prefix(errp, "The vmdk format used by node")) {
         return;
     }
 
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pve-devel] [RFC qemu] savevm-async: improve check for blockers
  2024-05-17 11:39 [pve-devel] [RFC qemu] savevm-async: improve check for blockers Fiona Ebner
@ 2024-05-17 14:30 ` Fiona Ebner
  2024-05-17 15:00 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2024-05-17 14:30 UTC (permalink / raw)
  To: pve-devel

Am 17.05.24 um 13:39 schrieb Fiona Ebner:
> Same rationale as with upstream QEMU commit 5aaac46793 ("migration:
> savevm: consult migration blockers"), migration and (async) snapshot
> are essentially the same operation and thus snapshot also needs to
> check for migration blockers. For example, this catches passed-through
> PCI devices, where the driver does not support migration.
> 
> However, the commit notes:
> 
>> There is really no difference between live migration and savevm, except
>> that savevm does not require bdrv_invalidate_cache to be implemented
>> by all disks.  However, it is unlikely that savevm is used with anything
>> except qcow2 disks, so the penalty is small and worth the improvement
>> in catching bad usage of savevm.
> 
> and for Proxmox VE, suspend-to-disk with VMDK does use savevm-async
> and would be broken by simply using migration_is_blocked(). To keep
> this working, introduce a new helper that filters blockers with the
> prefix used by the VMDK migration blocker.
> 
> The function qemu_savevm_state_blocked() is called as part of
> migration_is_blocked_allow_prefix() so no check is lost with this
> patch.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> An alternative would be to mark the VMDK blocker as a
> "live-migration-only" blocker and extending migration_is_blocked() or
> using separate helpers to check for migration and snapshot blockers
> differently. But that requires touching more machinery and probably
> needs more adaptation going forward than the approach here.
> 

So, this also "breaks" and at the same time fixes snapshot and hibernate
with VNC clipboard by preventing it. Currently, we do not have any
checks in the snapshot and hibernate API calls for the VNC clipboard and
they "work", but the clipboard will be broken after restore (and I mean
broken, not just the contents lost). So some users might consider adding
such checks (and this patch) a breaking change even if it's technically
correct to prevent snapshot and hibernate with VNC clipboard. And other
users might (rightly) complain about broken clipboard.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pve-devel] [RFC qemu] savevm-async: improve check for blockers
  2024-05-17 11:39 [pve-devel] [RFC qemu] savevm-async: improve check for blockers Fiona Ebner
  2024-05-17 14:30 ` Fiona Ebner
@ 2024-05-17 15:00 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2024-05-17 15:00 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

subject might be improved by being less general/ambiguous, something like:

savevm-async: improve coverage by also checking for migration blockers

or 

savevm-async: block snapshot also if migration would fail

or

savevm-async: reuse migration blocker check for snapshots

Would have helped me to have a better initial context for reading this commit
(message).

Am 17/05/2024 um 13:39 schrieb Fiona Ebner:
> Same rationale as with upstream QEMU commit 5aaac46793 ("migration:
> savevm: consult migration blockers"), migration and (async) snapshot
> are essentially the same operation and thus snapshot also needs to
> check for migration blockers. For example, this catches passed-through
> PCI devices, where the driver does not support migration.
> 
> However, the commit notes:
> 
>> There is really no difference between live migration and savevm, except
>> that savevm does not require bdrv_invalidate_cache to be implemented
>> by all disks.  However, it is unlikely that savevm is used with anything
>> except qcow2 disks, so the penalty is small and worth the improvement
>> in catching bad usage of savevm.
> 
> and for Proxmox VE, suspend-to-disk with VMDK does use savevm-async
> and would be broken by simply using migration_is_blocked(). To keep
> this working, introduce a new helper that filters blockers with the
> prefix used by the VMDK migration blocker.
> 
> The function qemu_savevm_state_blocked() is called as part of
> migration_is_blocked_allow_prefix() so no check is lost with this
> patch.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> An alternative would be to mark the VMDK blocker as a
> "live-migration-only" blocker and extending migration_is_blocked() or
> using separate helpers to check for migration and snapshot blockers
> differently. But that requires touching more machinery and probably
> needs more adaptation going forward than the approach here.
> 
>  migration/migration.c    | 22 ++++++++++++++++++++++
>  migration/migration.h    |  1 +
>  migration/savevm-async.c |  7 ++++++-
>  3 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index b8d7e471a4..6235309a00 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1897,6 +1897,28 @@ void qmp_migrate_pause(Error **errp)
>                 "during postcopy-active or postcopy-recover state");
>  }
>  
> +/*
> + * HACK to allow hibernation in Proxmox VE even when VMDK image is present.
> + */
> +bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix)
> +{
> +    GSList *blockers = migration_blockers[migrate_mode()];
> +
> +    if (qemu_savevm_state_blocked(errp)) {
> +        return true;
> +    }
> +
> +    while (blockers) {
> +        if (!g_str_has_prefix(error_get_pretty(blockers->data), prefix)) {
> +            error_propagate(errp, error_copy(blockers->data));
> +            return true;
> +        }
> +        blockers = g_slist_next(blockers);
> +    }
> +
> +    return false;
> +}
> +
>  bool migration_is_blocked(Error **errp)
>  {
>      GSList *blockers = migration_blockers[migrate_mode()];
> diff --git a/migration/migration.h b/migration/migration.h
> index 8045e39c26..575805a8e2 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -484,6 +484,7 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
>                               Error **errp);
>  
>  int migrate_init(MigrationState *s, Error **errp);
> +bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix);
>  bool migration_is_blocked(Error **errp);
>  /* True if outgoing migration has entered postcopy phase */
>  bool migration_in_postcopy(void);
> diff --git a/migration/savevm-async.c b/migration/savevm-async.c
> index bf36fc06d2..33085446e1 100644
> --- a/migration/savevm-async.c
> +++ b/migration/savevm-async.c
> @@ -363,7 +363,12 @@ void qmp_savevm_start(const char *statefile, Error **errp)
>          return;
>      }
>  
> -    if (qemu_savevm_state_blocked(errp)) {
> +    /*
> +     * The limitation for VMDK images only applies to live-migration, not
> +     * snapshots, see commit 5aaac46793 ("migration: savevm: consult migration
> +     * blockers").
> +     */
> +    if (migration_is_blocked_allow_prefix(errp, "The vmdk format used by node")) {

meh, not a big fan of matching strings here, especially as that is not
stable ABI, I mean I did not check, but I would be surprised if that's
the case – maybe we could factor out that string here and when its added
as blocker into a common constant so that we'd notice if it changes.

And if we only uses this here then why add a generic "ignore one specific
blocker" helper, might be better to at least contain that detail in a
"qemu_savevm_async_state_blocked" one that takes only the `errp` as
parameter, as hacks should IMO always be quite specific to avoid the
spread of them (I know you would check in detail before doing so, but
not everybody does).

>          return;
>      }
>  



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-05-17 15:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-17 11:39 [pve-devel] [RFC qemu] savevm-async: improve check for blockers Fiona Ebner
2024-05-17 14:30 ` Fiona Ebner
2024-05-17 15:00 ` Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal