From: Thomas Lamprecht
To: Proxmox VE development discussion, Stefan Reiter
Cc: w.bumiller@proxmox.com
Date: Thu, 5 Nov 2020 13:35:45 +0100
Subject: [pve-devel] applied-series: [PATCH v2 0/7] Handle guest shutdown during backups
In-Reply-To: <20201019121842.20277-1-s.reiter@proxmox.com>

On 19.10.20 14:18, Stefan Reiter wrote:
> Use QEMU's -no-shutdown argument so the QEMU instance stays alive even if the
> guest shuts down. This allows running backups to continue.
>
> To handle cleanup of QEMU processes, this series extends the qmeventd to handle
> SHUTDOWN events not just for detecting guest triggered shutdowns, but also to
> clean the QEMU process via SIGTERM (which quits it even with -no-shutdown
> enabled).
>
> A VZDump instance can then signal qmeventd (via the /var/run/qmeventd.sock) to
> keep alive certain VM processes if they're backing up, and once the backup is
> done, they close their connection to the socket, and qmeventd knows that it can
> now safely kill the VM (as long as the guest hasn't booted again, which is
> possible with some changes to the vm_start code also done in this series).
>
> This series requires a lot of testing, since there can be quite a few edge cases
> lounging around. So far it's been doing well for me, aside from the VNC GUI
> looking a bit confused when you do the 'shutdown during backup' motion (i.e. the
> last image from the framebuffer stays in the VNC window, looks more like the
> guest has crashed than shut down) - but I haven't found a solution for that.
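
For illustration only, the keep-alive behaviour described in the quoted text can be modelled roughly as below. This is a sketch in Python, not the qmeventd code from the series (qmeventd itself is written in C). The class and method names are hypothetical, and the `terminated` list merely stands in for sending SIGTERM to the QEMU process.

```python
from dataclasses import dataclass, field


@dataclass
class ShutdownTracker:
    """Hypothetical model of the decision logic, not qmeventd's real state."""

    backing_up: set = field(default_factory=set)    # VMIDs with an open backup connection
    pending: set = field(default_factory=set)       # VMIDs that shut down during a backup
    terminated: list = field(default_factory=list)  # stand-in for "SIGTERM was sent"

    def backup_connected(self, vmid: int) -> None:
        # A backup client asked to keep this VM's process alive.
        self.backing_up.add(vmid)

    def guest_shutdown(self, vmid: int) -> None:
        # SHUTDOWN event: defer cleanup while a backup holds the VM.
        if vmid in self.backing_up:
            self.pending.add(vmid)
        else:
            self.terminated.append(vmid)

    def guest_started(self, vmid: int) -> None:
        # The guest booted again, so the deferred cleanup must not happen.
        self.pending.discard(vmid)

    def backup_disconnected(self, vmid: int) -> None:
        # The backup client closed its connection; clean up if still pending.
        self.backing_up.discard(vmid)
        if vmid in self.pending:
            self.pending.discard(vmid)
            self.terminated.append(vmid)
```

The point of keeping `pending` separate from `backing_up` is the "guest hasn't booted again" condition: a restart during the backup cancels the deferred termination without touching the backup connection.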
>
>
> v2:
> * use a pidfd (see `man pidfd_open`, though the manpage does not seem to be
>   available on Debian atm - I suppose since they don't support kernel 5.3 yet?),
>   fall back to regular racy kill() included, for people running older kernels
> * initialize client->type with CLIENT_NONE instead of client->state
> * rebase on latest master
>
>
> qemu-server: Stefan Reiter (6):
>   qmeventd: add handling for -no-shutdown QEMU instances
>   qmeventd: add last-ditch effort SIGKILL cleanup
>   vzdump: connect to qmeventd for duration of backup
>   vzdump: use dirty bitmap for not running VMs too
>   config_to_command: use -no-shutdown option
>   fix vm_resume and allow vm_start with QMP status 'shutdown'

Applied series, thanks!

As discussed off-list, it may make sense to also disable the shutdown button
while a backup is running: the user cannot use it to stop the backup, and
should use the "stop" button of the backup worker task instead if that is
what they want.
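
For readers unfamiliar with the pidfd approach mentioned in the v2 notes, here is a minimal sketch of "signal via pidfd, fall back to plain kill()". It is written in Python for brevity and is not the C code from the series; the helper names `open_pidfd` and `send_signal` are made up for this example. It assumes Python 3.9+ on Linux, where `os.pidfd_open` and `signal.pidfd_send_signal` are available. The pidfd only helps if it is opened while the process is known to be the right one and then kept around; the fallback path remains subject to PID reuse.

```python
import errno
import os
import signal


def open_pidfd(pid: int):
    """Return a pidfd for `pid`, or None if pidfds cannot be used here."""
    opener = getattr(os, "pidfd_open", None)  # missing on older Python or non-Linux
    if opener is None:
        return None
    try:
        return opener(pid)
    except OSError as err:
        # ENOSYS: the kernel lacks pidfd_open (older than 5.3).
        # EPERM: some sandboxes block the syscall.
        if err.errno in (errno.ENOSYS, errno.EPERM):
            return None
        raise


def send_signal(pid: int, pidfd, sig: int = signal.SIGTERM) -> str:
    """Signal via the pidfd if there is one, else via racy kill()."""
    if pidfd is not None and hasattr(signal, "pidfd_send_signal"):
        # The pidfd refers to this exact process, even if its PID gets reused.
        signal.pidfd_send_signal(pidfd, sig)
        return "pidfd"
    # Fallback: `pid` may have been recycled by an unrelated process.
    os.kill(pid, sig)
    return "kill"
```

A caller would open the pidfd as soon as it learns about the process, keep it until cleanup time, pass it to `send_signal`, and close it with `os.close` afterwards.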