Date: Thu, 02 Dec 2021 16:36:36 +0100
From: Fabian Grünbichler
To: Fabian Ebner, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 qemu-server++ 0/15] remote migration

On November 30, 2021 3:06 pm, Fabian Ebner wrote:
> On 11.11.21 at 15:07, Fabian Grünbichler wrote:
>> this series adds remote migration for VMs.
>>
>> both live and offline migration including NBD and storage-migrated disks
>> should work.
>>
>
> Played around with it for a while. Biggest issue is that migration fails
> if there is no 'meta' property in the config. Most other things I wish
> for are better error handling, but it seems to be in good shape otherwise!

thanks for testing and reviewing!

> Error "storage does not exist" if the real issue is missing access
> rights. But that error also appears if access to /cluster/resources is
> missing or if the target node does not exist.

hmm, yeah, those are from the "abort early" checks that I'd like to punt
to some pre-condition API endpoint anyway, or to the client itself.. but
"missing access" and "doesn't exist" can be rather similar from that
perspective, so probably a more generic error message is best :)
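e.g. something along these lines (untested sketch using the existing
storage/permission helpers; the privilege checked here is just a guess at
what the series actually requires):

```perl
use strict;
use warnings;

use PVE::RPCEnvironment;
use PVE::Storage;

# untested sketch: one generic error for both "doesn't exist / not enabled"
# and "no access", so the caller can't tell which of the two it was
my $check_storage_access = sub {
    my ($storeid) = @_;

    my $rpcenv = PVE::RPCEnvironment::get();
    my $authuser = $rpcenv->get_user();

    my $ok = eval {
	PVE::Storage::storage_check_enabled(PVE::Storage::config(), $storeid);
	$rpcenv->check($authuser, "/storage/$storeid", ['Datastore.AllocateSpace']);
	1;
    };

    die "storage '$storeid' is not available on this node\n" if !$ok;
};
```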
> For the 'config' command, 'Sys.Modify' seems to be required
> failed to handle 'config' command - 403 Permission check failed (/,
> Sys.Modify)

did you maybe have 'startup' set in the VM config? that requires
Sys.Modify on / to change, and we pass the full config (well, most of
it) to update_vm_api.. I don't think there is an intentional check
otherwise for that privilege, so it's likely from something in the
config..

> but it does create an empty configuration file, leading to
> target_vmid: Guest with ID '5678' already exists on remote cluster
> on the next attempt.
> It also already allocates the disks, but doesn't clean them up, because
> it gets the wrong lock (since the config is empty) and aborts the 'quit'
> command.

yeah, that is related:
- we remove the lock prior to calling update_vm_api
- if updating fails, the config is therefore unlocked (and empty, since
  it only contains the lock prior to this command)
- mtunnel still expects the create lock to be there and errors out

the fix is to not remove the lock, but to set skiplock for the
update_vm_api call (this is in a flocked context with the config lock
checked after entering the locked scope, so it is as safe as can be).
that way if update_vm_api fails, the config still has the 'create' lock,
and cleanup triggered by the client works.

> If the config is not recent enough to have a 'meta' property:
> failed to handle 'config' command - unable to parse value of 'meta'
> - got undefined value
> Same issue with disk+config cleanup as above.

yeah, that should be skipped if undef/false. the missing cleanup is
fixed by the changes above.

> The local VM stays locked with 'migrate'. Is that how it should be?

unless you pass `--delete`, yes. the idea was that this prevents
starting the VM on source and target cluster by accident, but using a
different lock value might be more appropriate to signify 'this guest
moved to another cluster'.

> Also the __migration__ snapshot will stay around, resulting in an error
> when trying to migrate again.

I'll take a look, that shouldn't happen and probably indicates some
missing cleanup somewhere ;)

> For live migration I always got a (cosmetic?) "WS closed
> unexpectedly" error:
> tunnel: -> sending command "quit" to remote
> tunnel: <- got reply
> tunnel: Tunnel to
> https://192.168.20.142:8006/api2/json/nodes/rob2/qemu/5678/mtunnelwebsocket?ticket=PVETUNNEL%3A&socket=%2Frun%2Fqemu-server%2F5678.mtunnel
> failed - WS closed unexpectedly
> 2021-11-30 13:49:39 migration finished successfully (duration 00:01:02)
> UPID:pve701:0000D8AD:000CB782:61A61DA5:qmigrate:111:root@pam:

yeah, this is cosmetic. we send the quit, the remote end closes the
socket, the tunnel doesn't know that the websocket closing is expected
and prints that error. I'll think about whether we can handle this
somehow with the existing commands, or whether we need a proper close
command from the client side as well (I dropped the start of something
like this in v2 ;))

-> quit
<- done (server expects WS to close)
-> close (ctrl cmd)
-> close WS

or

-> close (ctrl cmd, sets flag to expect WS to be closed)
-> quit
<- done
<- close WS

I'd like to avoid having the tunnel parse the quit command and having
that serve a double purpose, to keep the tunnel re-usable and agnostic.
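the first variant could look roughly like this from the client's side
(untested sketch; the helper closures and the 'control' flag are
assumptions for illustration, not the tunnel's actual interface):

```perl
# untested sketch of the first close variant, client side
sub finish_tunnel {
    my ($write_line, $read_line, $close_websocket) = @_;

    # forwarded command: the remote end runs its cleanup and replies 'done'
    $write_line->({ cmd => 'quit' });
    my $res = $read_line->();
    die "quit failed\n" if !$res->{success};

    # local-only control command: tell the tunnel binary that the websocket
    # is expected to go away now, so it doesn't report an error
    $write_line->({ control => 1, cmd => 'close' });

    # actually close the websocket / let the tunnel child exit
    $close_websocket->();
}
```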
> Fun fact: the identity storage mapping will be used for storages that
> don't appear in the explicit mapping. E.g. it's possible to migrate a VM
> that only has disks on storeA with --target-storage storeB:storeB (if
> storeA exists on the target of course). But the explicit identity
> mapping is prohibited.

thanks, that should be caught as well.

> When a target bridge is not present (should that be detected ahead of
> starting the migration?) and likely for any other startup failure, the
> only error in the log is:

yeah, the target bridge should probably be part of the initial checks
like the storages (and then be re-checked anyway for the actual start
command, but at least non-misbehaving clients get an early abort).

> 2021-11-30 14:43:10 ERROR: online migrate failure - error - tunnel
> command '{"cmd":"star
> failed to handle 'start' command - start failed: QEMU exited with code 1
> For non-remote migration we are more verbose in this case and log the
> QEMU output.

yeah, that should be easily improvable. the query-disk-import command
could similarly be extended with an optional 'msg' that we can use to
print progress if there is any..

> Can/should an interrupt be handled more gracefully, so that remote
> cleanup still happens?
> ^CCMD websocket tunnel died: command 'proxmox-websocket-tunnel' failed:
> interrupted by signal
>
> 2021-11-30 14:39:07 ERROR: interrupted by signal
> 2021-11-30 14:39:07 aborting phase 1 - cleanup resources
> 2021-11-30 14:39:08 ERROR: writing to tunnel failed: broken pipe
> 2021-11-30 14:39:08 ERROR: migration aborted (duration 00:00:10):
> interrupted by signal

yeah, we should probably leave it up to the regular error-handling and
cleanup logic to cleanly close the tunnel, instead of it being directly
terminated by the signal..
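i.e. roughly something like this (untested sketch; the three closures are
stand-ins for the existing phase/cleanup code, and the tunnel child would
additionally need to ignore or handle the signal itself so it stays around
long enough for the remote cleanup):

```perl
# untested sketch - $run_migration, $cleanup_remote and $finish_tunnel are
# made-up closures standing in for the existing code, not actual functions
sub run_migration_with_cleanup {
    my ($run_migration, $cleanup_remote, $finish_tunnel) = @_;

    # convert the signal into a regular error, so the normal abort path
    # below runs instead of everything being torn down mid-command
    local $SIG{INT} = sub { die "interrupted by signal\n" };
    local $SIG{TERM} = $SIG{INT};

    eval { $run_migration->() };
    if (my $err = $@) {
	warn "aborting phase 1 - cleanup resources\n";
	eval { $cleanup_remote->() };  # tunnel still needs to be usable here
	eval { $finish_tunnel->() };   # orderly 'quit' + websocket close
	die $err;
    }
}
```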