From: Thomas Lamprecht
Date: Wed, 20 Dec 2023 13:32:14 +0100
To: Proxmox VE development discussion, Fiona Ebner
Subject: Re: [pve-devel] [PATCH v2 qemu-server] fix #4501: TCP migration: start vm: move port reservation and usage closer together
References: <20231219134459.49187-1-f.ebner@proxmox.com>
In-Reply-To: <20231219134459.49187-1-f.ebner@proxmox.com>
List-Id: Proxmox VE development discussion

On 19/12/2023 14:44, Fiona Ebner wrote:
> Currently, volume activation, PCI reservation and resetting systemd
> scope happen in between, so the 5 second expiretime used for port
> reservation is not always enough.
>
> It's possible to defer telling QEMU where it should listen for
> migration and do so after it has been started via QMP. Therefore, the
> port reservation can be moved very close to the actual usage.
>
> Mentioned here for completeness and can still be done as an additional
> change later if desired: next_migrate_port could be modified to
> optionally return the open socket and it should be possible to pass
> the file descriptor directly to QEMU, but that would require accepting
> the connection before on the Perl side (otherwise leads to ENOTCONN
> 107). While it would avoid any races, it's not the most elegant
> and the change at hand should be enough in all practical situations.
>
> Signed-off-by: Fiona Ebner
> ---
>
> Discussion for v1:
> https://lists.proxmox.com/pipermail/pve-devel/2023-November/060149.html
>
> Changes in v2:
> * move reservation+usage much closer together than was done in v1
> of the qemu-server patch
> * drop other partial fix attempts for pve-common

I find this approach more than just an OK'ish stop-gap; it should fix
most such issues we can have in practice. If you can get someone to
additionally test this, it's fine to apply as is IMO.

The one thing that might be worse (didn't check reservation logic)
compared to FD passing is the case where no migration ports are
available, as then we would have already spent slightly more time and
resources by having the VM already started. We could side-step this a
bit by retrying the port reservation in a loop for a few seconds.
But IMO it's not highly likely to run out of such ports, as most
actions that can spawn multiple migrations (like HA) are capped by
default.

So once a few general migration scenarios have been tested, consider
this:

Acked-by: Thomas Lamprecht
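
For illustration only, the "loop for a few seconds" idea mentioned above
could look roughly like the sketch below. It is written in Python rather
than the Perl used in qemu-server, and every name in it (NoFreePort,
reserve_with_retry, the reserve callback, the timeout and interval
defaults) is hypothetical. It does not reflect how next_migrate_port
actually signals an exhausted port range, which I did not check.

```python
import time


class NoFreePort(Exception):
    """Hypothetical error: the reservation callback found no free port."""


def reserve_with_retry(reserve, timeout=5.0, interval=0.5,
                       clock=time.monotonic, sleep=time.sleep):
    """Call reserve() until it returns a port or `timeout` seconds pass.

    `reserve` is an assumed callback that returns a port number or raises
    NoFreePort. `clock` and `sleep` are injectable so the loop can be
    exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        try:
            return reserve()
        except NoFreePort:
            # Give up once the deadline has passed, otherwise wait and retry.
            if clock() >= deadline:
                raise
            sleep(interval)
```

With something like this, a short-lived shortage of ports would only
delay the start of the incoming migration instead of failing it right
after the VM has been started.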