From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [PATCH v2 qemu-server++ 0/15] remote migration
Date: Tue, 30 Nov 2021 15:06:14 +0100
Message-ID: <7a0968df-559b-81de-1df1-f912866b39d5@proxmox.com>
In-Reply-To: <20211111140721.3288364-1-f.gruenbichler@proxmox.com>

On 11.11.21 at 15:07, Fabian Grünbichler wrote:
> this series adds remote migration for VMs.
> 
> both live and offline migration including NBD and storage-migrated disks
> should work.
> 

Played around with it for a while. The biggest issue is that migration
fails if there is no 'meta' property in the config. Most of my other
wishes concern better error handling, but it seems to be in good shape
otherwise!


The error is "storage does not exist" even if the real issue is missing
access rights. The same error also appears if access to
/cluster/resources is missing or if the target node does not exist.


For the 'config' command, 'Sys.Modify' seems to be required:
     failed to handle 'config' command - 403 Permission check failed (/, Sys.Modify)
But the failed attempt still creates an empty configuration file,
leading to
     target_vmid: Guest with ID '5678' already exists on remote cluster
on the next attempt.
It also already allocates the disks but doesn't clean them up, because
it gets the wrong lock (since the config is empty) and aborts the 'quit'
command.
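
To make the cleanup wish concrete, here is a minimal Python sketch of the
idea: record everything created on the target and undo it in reverse order
if a later step fails. It is not the qemu-server code. `receive_guest`, the
`target` dictionary and the order of the steps are assumptions chosen only
to mimic the failure described above.

```python
def receive_guest(target, vmid, config, disks, privileges):
    """Create a guest on a simulated target, undoing partial work on failure.

    Hypothetical model: `target` is a dict with a "configs" dict and a
    "volumes" set; nothing here is a real Proxmox VE API.
    """
    undo = []  # (kind, identifier) pairs, in creation order
    try:
        if vmid in target["configs"]:
            raise ValueError(
                f"Guest with ID '{vmid}' already exists on remote cluster"
            )
        # Reserve the VMID with an empty config, as observed in the report.
        target["configs"][vmid] = {}
        undo.append(("config", vmid))
        for disk in disks:
            target["volumes"].add(disk)
            undo.append(("volume", disk))
        # The step that failed in the report: a late permission check.
        if "Sys.Modify" not in privileges:
            raise PermissionError("403 Permission check failed (/, Sys.Modify)")
        target["configs"][vmid] = dict(config)
    except Exception:
        # Roll back in reverse order so no empty config or disk is left over.
        for kind, ident in reversed(undo):
            if kind == "volume":
                target["volumes"].discard(ident)
            else:
                target["configs"].pop(ident, None)
        raise
```

With a rollback like this, a second attempt after fixing the permissions
would not run into the "already exists" error. Checking the privilege
before reserving anything would avoid the rollback altogether.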


If the config is not recent enough to have a 'meta' property:
     failed to handle 'config' command - unable to parse value of 'meta' - got undefined value
Same issue with disk+config cleanup as above.


The local VM stays locked with 'migrate'. Is that how it should be?
Also, the __migration__ snapshot stays around, resulting in an error
when trying to migrate again.


For live migration I always got a (cosmetic?) "WS closed unexpectedly"
error:
tunnel: -> sending command "quit" to remote
tunnel: <- got reply
tunnel: Tunnel to https://192.168.20.142:8006/api2/json/nodes/rob2/qemu/5678/mtunnelwebsocket?ticket=PVETUNNEL%3A<SNIP>&socket=%2Frun%2Fqemu-server%2F5678.mtunnel failed - WS closed unexpectedly
2021-11-30 13:49:39 migration finished successfully (duration 00:01:02)
UPID:pve701:0000D8AD:000CB782:61A61DA5:qmigrate:111:root@pam:


Fun fact: the identity storage mapping will be used for storages that 
don't appear in the explicit mapping. E.g. it's possible to migrate a VM 
that only has disks on storeA with --target-storage storeB:storeB (if 
storeA exists on the target of course). But the explicit identity 
mapping is prohibited.
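
The fallback behavior described above can be sketched in a few lines of
Python. This is only an illustration of the lookup rule (explicit entry
first, otherwise the storage maps to itself), not the actual Perl helper.
`parse_storage_map`, `map_storage` and the `strict` flag are hypothetical
names introduced here.

```python
def parse_storage_map(pairs):
    """Turn 'source:target' strings into a dict of explicit entries."""
    entries = {}
    for pair in pairs:
        source, sep, target = pair.partition(":")
        if not sep or not source or not target:
            raise ValueError(f"invalid mapping entry '{pair}'")
        entries[source] = target
    return entries


def map_storage(entries, storage, strict=False):
    """Resolve the target storage for `storage`.

    Explicit entries win. Without one, the storage maps to itself
    (the identity fallback), unless the hypothetical `strict` flag
    asks for an error instead.
    """
    if storage in entries:
        return entries[storage]
    if strict:
        raise KeyError(f"no mapping for storage '{storage}'")
    return storage


entries = parse_storage_map(["storeB:storeB"])
print(map_storage(entries, "storeA"))  # storeA is not listed, so it maps to itself
```

A strict mode like the one sketched would turn the silent fallback into an
error for unmapped storages. Whether that is desirable is a design question.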


When a target bridge is not present (should that be detected before
starting the migration?), and likely for any other startup failure, the
only error in the log is:
2021-11-30 14:43:10 ERROR: online migrate failure - error - tunnel command '{"cmd":"star<SNIP>
failed to handle 'start' command - start failed: QEMU exited with code 1
For non-remote migration we are more verbose in this case and log the
QEMU output.
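
As a rough idea of what an up-front check could look like, here is a
Python sketch that collects the bridges referenced by the `netN` entries
of a guest config and reports those missing on the target. It assumes the
usual `bridge=<name>` key inside the comma-separated net property string.
The function names and the `bridge_map` parameter are hypothetical, and how
the list of target bridges would be obtained is left open.

```python
def bridges_in_config(config):
    """Collect bridge names referenced by netN entries of a guest config."""
    bridges = set()
    for key, value in config.items():
        # Only consider net0, net1, ... and skip everything else.
        if not (key.startswith("net") and key[3:].isdigit()):
            continue
        for part in value.split(","):
            name, _, val = part.partition("=")
            if name == "bridge" and val:
                bridges.add(val)
    return bridges


def missing_bridges(config, available, bridge_map=None):
    """Return the sorted bridges needed on the target but not available there.

    `bridge_map` optionally renames source bridges to target bridges;
    unmapped bridges keep their name.
    """
    bridge_map = bridge_map or {}
    wanted = {bridge_map.get(b, b) for b in bridges_in_config(config)}
    return sorted(wanted - set(available))


config = {"name": "test", "net0": "virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr1"}
print(missing_bridges(config, {"vmbr0"}))
```

A non-empty result could abort the migration with a clear message before
any disks are allocated on the target.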


Can/should an interrupt be handled more gracefully, so that remote 
cleanup still happens?
^CCMD websocket tunnel died: command 'proxmox-websocket-tunnel' failed: interrupted by signal

2021-11-30 14:39:07 ERROR: interrupted by signal
2021-11-30 14:39:07 aborting phase 1 - cleanup resources
2021-11-30 14:39:08 ERROR: writing to tunnel failed: broken pipe
2021-11-30 14:39:08 ERROR: migration aborted (duration 00:00:10): interrupted by signal
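
What I have in mind is roughly the following, sketched in Python: catch the
interrupt, ask the remote side to clean up while the tunnel is still usable,
then re-raise. `Tunnel`, `migrate` and the `cleanup` parameter of the "quit"
command are hypothetical stand-ins. The sketch also assumes that the tunnel
helper survives the signal, which the log above suggests is currently not
the case.

```python
class Tunnel:
    """Hypothetical in-memory stand-in for the websocket tunnel."""

    def __init__(self):
        self.sent = []
        self.open = True

    def send(self, cmd, **params):
        if not self.open:
            raise BrokenPipeError("writing to tunnel failed: broken pipe")
        self.sent.append((cmd, params))

    def close(self):
        self.open = False


def migrate(tunnel, phases):
    """Run migration phases; on interrupt, request remote cleanup first."""
    try:
        for phase in phases:
            phase(tunnel)
    except KeyboardInterrupt:
        # The tunnel is still open here, so the remote side can be told
        # to remove the config and disks it created so far.
        tunnel.send("quit", cleanup=True)
        raise
    finally:
        # Close the tunnel only after the cleanup request went out.
        tunnel.close()
```

For this to work in practice, the helper process would have to be shielded
from the terminal's SIGINT, for example by running it in its own process
group, so that only the parent decides when the tunnel goes away.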


> besides lots of rebases, implemented todos and fixed issues the main
> difference to the previous RFC is that we no longer define remote
> entries in a config file, but just expect the caller/client to give us
> all the required information to connect to the remote cluster.
> 
> new in v2: dropped parts already applied, incorporated Fabian's and
> Dominik's feedback (thanks!)
> 
> overview over affected repos and changes, see individual patches for
> more details.
> 
> proxmox-websocket-tunnel:
> 
> new tunnel helper tool for forwarding commands and data over websocket
> connections, required by qemu-server on source side
> 
> pve-access-control:
> 
> new ticket type, required by qemu-server on target side
> 
> pve-guest-common:
> 
> handle remote migration (no SSH) in AbstractMigrate,
> required by qemu-server
> 
> pve-storage:
> 
> extend 'pvesm import' to allow import from UNIX socket, required on
> target node by qemu-server
> 
> qemu-server:
> 
> some refactoring, new mtunnel endpoints, new remote_migration endpoints
> TODO: handle pending changes and snapshots
> TODO: proper CLI for remote migration
> potential TODO: precond endpoint?
> 
> pve-http-server:
> 
> fix for handling unflushed proxy streams
> 
> as usual, some of the patches are best viewed with '-w', especially in
> qemu-server..
> 
> required dependencies are noted, qemu-server also requires a build-dep
> on patched pve-common since the required options/formats would be
> missing otherwise..
> proxmox-websocket-tunnel
> 
> Fabian Grünbichler (4):
>    initial commit
>    add tunnel implementation
>    add fingerprint validation
>    add packaging
> 
> pve-access-control
> 
> Fabian Grünbichler (2):
>    tickets: add tunnel ticket
>    ticket: normalize path for verification
> 
>   src/PVE/AccessControl.pm | 52 ++++++++++++++++++++++++++++++----------
>   1 file changed, 40 insertions(+), 12 deletions(-)
> 
> pve-http-server
> 
> Fabian Grünbichler (1):
>    webproxy: handle unflushed write buffer
> 
>   src/PVE/APIServer/AnyEvent.pm | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> qemu-server
> 
> Fabian Grünbichler (8):
>    refactor map_storage to map_id
>    schema: use pve-bridge-id
>    update_vm: allow simultaneous setting of boot-order and dev
>    nbd alloc helper: allow passing in explicit format
>    mtunnel: add API endpoints
>    migrate: refactor remote VM/tunnel start
>    migrate: add remote migration handling
>    api: add remote migrate endpoint
> 
>   PVE/API2/Qemu.pm   | 826 ++++++++++++++++++++++++++++++++++++++++++++-
>   PVE/QemuMigrate.pm | 813 ++++++++++++++++++++++++++++++++++++--------
>   PVE/QemuServer.pm  |  80 +++--
>   debian/control     |   2 +
>   4 files changed, 1539 insertions(+), 182 deletions(-)
> 



