From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Thu,  8 Apr 2021 12:33:10 +0200
Message-Id: <20210408103316.7619-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [POC qemu-server] fix 3303: allow "live" upgrade of
 qemu version
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Thu, 08 Apr 2021 10:33:22 -0000

The code is in a very early state; I'm just sending this to discuss the idea.
I haven't done much testing yet, but it does seem to work.

The idea is rather simple:
1. save the state to ramfs
2. stop the VM
3. start the VM loading the state
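
The three steps above can be sketched as follows. This is only an illustration
of the ordering, not the code from this series: FakeVM, its methods,
upgrade_qemu_sketch, and the state directory path are all hypothetical
stand-ins, and the sketch is in Python rather than the Perl of qemu-server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeVM:
    """Hypothetical stand-in for a VM; it only records the calls made on it."""
    vmid: int
    running: bool = True
    log: List[str] = field(default_factory=list)

    def save_state(self, path: str) -> None:
        # A real implementation would write the VM state to 'path' here.
        self.log.append(f"save {path}")

    def stop(self) -> None:
        self.running = False
        self.log.append("stop")

    def start(self, state_path: str) -> None:
        # A real implementation would start the new binary and load the state.
        self.running = True
        self.log.append(f"start {state_path}")


def upgrade_qemu_sketch(vm: FakeVM, state_dir: str = "/run/hypothetical-ramfs") -> str:
    """Run the save / stop / start sequence and return the state file path."""
    state_path = f"{state_dir}/vm-{vm.vmid}.state"
    vm.save_state(state_path)  # 1. save the state to ramfs
    vm.stop()                  # 2. stop the VM (only reached if saving succeeded)
    vm.start(state_path)       # 3. start the VM loading the state
    return state_path


vm = FakeVM(vmid=100)
path = upgrade_qemu_sketch(vm)
```

Because each step raises on failure in this sketch, the VM is never stopped
unless its state was saved first. Only one instance with a given VM ID exists
at any point in the sequence.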

This approach sidesteps the problem that our stack is (currently) not designed
to have multiple instances with the same VM ID running. Supporting that would
require handling config locking, sockets, the PID file, possibly passthrough
resources, etc.

Another nice feature of this approach is that it doesn't require touching the
vm_start or migration code at all, avoiding further bloating.


Thanks to Fabian G. and Stefan for inspiring this idea:

Fabian G. suggested using the suspend to disk + start route if the required
changes to our stack turned out to be infeasible.

Stefan suggested migrating to a dummy VM (outside our stack) which just holds
the state and migrating back right away. It seems that the dummy VM is in fact
not even needed ;) If we really care about the smallest possible downtime,
Stefan's approach might still be the best, and we'd need to start the dummy VM
while the backwards migration runs (resulting in two times the migration
downtime). But it does have more moving parts and requires some
migration/startup changes.


Fabian Ebner (6):
  create vmstate_size helper
  create savevm_monitor helper
  draft of upgrade_qemu function
  draft of qemuupgrade API call
  add timing for testing
  add usleep parameter to savevm_monitor

 PVE/API2/Qemu.pm  |  60 ++++++++++++++++++++++
 PVE/QemuConfig.pm |  10 +---
 PVE/QemuServer.pm | 125 +++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 170 insertions(+), 25 deletions(-)

-- 
2.20.1