From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id D7C099852D
 for <pve-devel@lists.proxmox.com>; Wed, 15 Nov 2023 11:17:17 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id B20A931C9
 for <pve-devel@lists.proxmox.com>; Wed, 15 Nov 2023 11:16:47 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Wed, 15 Nov 2023 11:16:47 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 00F3642F6E
 for <pve-devel@lists.proxmox.com>; Wed, 15 Nov 2023 11:16:47 +0100 (CET)
Message-ID: <cd0a5e37-7d64-4969-a319-dcd2e01c7b73@proxmox.com>
Date: Wed, 15 Nov 2023 11:16:46 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
References: <20231114140204.27679-1-f.ebner@proxmox.com>
 <20231114140204.27679-4-f.ebner@proxmox.com>
 <767911ec-7dee-443e-bb29-513d0c63a74a@proxmox.com>
 <1700038013.zqvp143ykl.astroid@yuna.none>
From: Fiona Ebner <f.ebner@proxmox.com>
In-Reply-To: <1700038013.zqvp143ykl.astroid@yuna.none>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.079 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: Re: [pve-devel] [RFC common 2/2] fix #4501: next unused port: work
 around issue with too short expiretime
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 15 Nov 2023 10:17:17 -0000

On 15.11.23 at 09:51, Fabian Grünbichler wrote:
> On November 14, 2023 3:13 pm, Fiona Ebner wrote:
>> On 14.11.23 at 15:02, Fiona Ebner wrote:
>>> For QEMU migration via TCP, there's a bit of time between port
>>> reservation and usage, because currently, the port needs to be
>>> reserved before doing a fork, where the systemd scope needs to be set
>>> up and swtpm might need to be started before the QEMU binary can be
>>> invoked and actually use the port.
>>>
>>> To improve the situation, get the latest port recorded in the
>>> reservation file and start trying from the next port, wrapping around
>>> when hitting the end. Drastically reduces the chances to run into a
>>> conflict, because after a given port reservation, all other ports are
>>> tried first before returning to that port.
>>
>> Sorry, this is not true. It can be that in the meantime, a port for a
>> different range is reserved and that will remove the reservation for the
>> port in the migration range if expired. So we'd need to change the code
>> to remove only reservations from the current range to not lose track of
>> the latest previously used migration port.
> 
> the whole thing would also still be racy anyway across processes, right?
> not sure it's worth the additional effort compared to the other patches
> then.. if those are not enough (i.e., we still get real-world reports)
> then the "increase expiry further + explicit release" part could still
> be implemented as follow-up..
> 

No, it's not racy. The reserved ports are recorded in a file while
holding a lock, so each process will see what the others have last used.
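
To illustrate the locking point, here is a minimal Python sketch (not
the actual Perl code). The helper name reserve_port, the
"port timestamp" line format and the parameters are assumptions made up
for the example; only the idea of recording reservations in a file
under a lock comes from the discussion.

```python
import fcntl
import os
import time


def reserve_port(path, first, last, expiretime=5.0, now=None):
    """Hypothetical helper: reserve the lowest unreserved port in [first, last]."""
    now = time.time() if now is None else now
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as fh:
        # Exclusive lock: other processes block here until we are done,
        # so everybody sees a consistent view of the reservations.
        fcntl.flock(fh, fcntl.LOCK_EX)

        reserved = {}
        for line in fh.read().splitlines():
            port, stamp = line.split()
            # Drop reservations that are older than expiretime.
            if now - float(stamp) < expiretime:
                reserved[int(port)] = float(stamp)

        chosen = next(
            (p for p in range(first, last + 1) if p not in reserved), None
        )
        if chosen is not None:
            reserved[chosen] = now

        # Rewrite the file while still holding the lock.
        fh.seek(0)
        fh.truncate()
        for port, stamp in sorted(reserved.items()):
            fh.write(f"{port} {stamp}\n")
    # Closing the file releases the lock.
    return chosen
```

The lock keeps the bookkeeping consistent across processes. What it
cannot do is guarantee that the caller binds the port before the
reservation expires; the sketch also leaves out any check that the port
is actually free on the system.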

My question is whether the explicit release isn't much more effort than
the round-robin-style approach here: it puts the burden on the callers,
and you need a good way to check that the port is now actually in use
(without creating new races!) as well as a new helper for removing the
reservation. (That said, with round-robin we would need to remember
which range a port was for if we ever want to support overlapping
ranges.)

As long as there is competition for the early ports, you just need one
instance where the time between reservation and usage is longer than the
expiretime and you're very likely to hit the issue (unless another,
earlier port has become free again). With round-robin, you need such an
instance and, in addition, all(!) other ports in the range to be
reserved/used in the meantime.
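
A small Python sketch of the difference, under simplifying assumptions:
the function names are hypothetical, "busy" stands for the set of ports
that are currently reserved or in use, and "latest" for the most
recently reserved port in the range (which the round-robin variant has
to be able to read back from the reservation file).

```python
def pick_lowest_first(busy, first, last):
    """Return the lowest port in [first, last] that is not busy."""
    for port in range(first, last + 1):
        if port not in busy:
            return port
    return None


def pick_round_robin(busy, first, last, latest):
    """Start after the latest reserved port and wrap around at the end."""
    size = last - first + 1
    start = first if latest is None else latest + 1
    for offset in range(size):
        port = first + (start - first + offset) % size
        if port not in busy:
            return port
    return None


# A slow caller reserved 60000, the reservation expired before the port
# was bound, so 60000 no longer shows up as busy.
busy = set()
print(pick_lowest_first(busy, 60000, 60004))        # hands out 60000 again
print(pick_round_robin(busy, 60000, 60004, 60000))  # moves on to 60001
```

With lowest-first, the very next request gets the same port as the slow
caller. With round-robin, that only happens once every other port in
the range is reserved or used.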