From: Thomas Lamprecht
To: Proxmox VE development discussion, Fabian Ebner
Date: Thu, 8 Apr 2021 18:44:17 +0200
Subject: Re: [pve-devel] [POC qemu-server] fix 3303: allow "live" upgrade of qemu version
In-Reply-To: <20210408103316.7619-1-f.ebner@proxmox.com>

On 08.04.21 12:33, Fabian Ebner wrote:
> The code is in a very early state, I'm just sending this to discuss the idea.
> I didn't do a whole lot of testing yet, but it does seem to work.
>
> The idea is rather simple:
> 1. save the state to ramfs
> 2. stop the VM
> 3. start the VM loading the state

For the record, as we (Dietmar, you and I) discussed this a bit off-list:

The issue we see here is that this temporarily requires a potentially big
chunk of free memory, namely a second time the amount the guest is assigned
(a sketch of where that copy comes from is at the end of this mail). So tens
to hundreds of GiB, which (educated guess) > 90 % of our users just do not
have available, at least not for their bigger VMs.

So it would be nicer if we could make this more QEMU-internal, e.g., save
just the device state out (as that may not be compatible 1:1 for reuse with
the new QEMU version) and re-use the guest memory directly: start the new
QEMU process, migrate the state over and map in the existing guest memory,
then pause the old one, cont the new one, and be done (very condensed).
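A rough sketch of how such a hand-over could look with stock QEMU, assuming
the guest RAM is backed by a shared memory object (all paths, sizes and
socket names below are made up for illustration, nothing of this is in the
POC):

    # old process: guest RAM backed by a shareable memory-backend-file
    qemu-system-x86_64 ... -m 8G \
        -object memory-backend-file,id=pc.ram,size=8G,mem-path=/dev/shm/vm100.ram,share=on \
        -machine q35,memory-backend=pc.ram

    # new process (new QEMU version): maps the same shared RAM, waits for
    # the device state
    qemu-system-x86_64 ... -m 8G \
        -object memory-backend-file,id=pc.ram,size=8G,mem-path=/dev/shm/vm100.ram,share=on \
        -machine q35,memory-backend=pc.ram \
        -incoming unix:/run/qemu-upgrade-100.sock

    # on both monitors: skip RAM pages that live in shared memory
    (qemu) migrate_set_capability x-ignore-shared on
    # on the old process' monitor: send the device state only
    (qemu) migrate unix:/run/qemu-upgrade-100.sock

With the x-ignore-shared migration capability only the device state travels
over the socket while the RAM pages stay in place; whether that capability
is already robust enough for this use case would need evaluation.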
That may have its own difficulties/edge cases, but it would not require
having so much extra memory freely available...

> This approach solves the problem that our stack is (currently) not designed to
> have multiple instances with the same VM ID running. To do so, we'd need to
> handle config locking, sockets, pid file, passthrough resources?, etc.
>
> Another nice feature of this approach is that it doesn't require touching the
> vm_start or migration code at all, avoiding further bloat.
>
> Thanks to Fabian G. and Stefan for inspiring this idea:
>
> Fabian G. suggested using the suspend-to-disk + start route if the required
> changes to our stack would turn out to be infeasible.
>
> Stefan suggested migrating to a dummy VM (outside our stack) which just holds
> the state and migrating back right away. It seems that dummy VM is in fact not
> even needed ;) If we really care about the smallest possible downtime, this
> approach might still be the best; we'd then need to start the dummy VM while
> the backwards migration runs (resulting in two times the migration downtime).
> But it does have more moving parts and requires some migration/startup changes.
>
> Fabian Ebner (6):
>   create vmstate_size helper
>   create savevm_monitor helper
>   draft of upgrade_qemu function
>   draft of qemuupgrade API call
>   add timing for testing
>   add usleep parameter to savevm_monitor
>
>  PVE/API2/Qemu.pm  |  60 ++++++++++++++++++++++
>  PVE/QemuConfig.pm |  10 +---
>  PVE/QemuServer.pm | 125 +++++++++++++++++++++++++++++++++++++++-------
>  3 files changed, 170 insertions(+), 25 deletions(-)
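For reference, the three quoted steps map to stock QEMU roughly like the
following (the POC drives this through its own monitor helpers instead; the
state file path is just an example):

    # 1. save the state to ramfs (/run is a tmpfs) via the migration stream
    (qemu) stop
    (qemu) migrate "exec:cat > /run/vmstate-100"
    # 2. stop the VM (quit the old process)
    (qemu) quit
    # 3. start the VM with the new QEMU binary, loading the saved state
    qemu-system-x86_64 [same options as before] -incoming "exec:cat < /run/vmstate-100"

This also shows where the doubled memory requirement comes from: the
migration stream written to the tmpfs contains a full copy of the guest RAM.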