public inbox for pve-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Fiona Ebner <f.ebner@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC common 2/2] fix #4501: next unused port: work around issue with too short expiretime
Date: Wed, 15 Nov 2023 11:27:10 +0100	[thread overview]
Message-ID: <1700043709.1l8tqlkmok.astroid@yuna.none> (raw)
In-Reply-To: <cd0a5e37-7d64-4969-a319-dcd2e01c7b73@proxmox.com>

On November 15, 2023 11:16 am, Fiona Ebner wrote:
> Am 15.11.23 um 09:51 schrieb Fabian Grünbichler:
>> On November 14, 2023 3:13 pm, Fiona Ebner wrote:
>>> Am 14.11.23 um 15:02 schrieb Fiona Ebner:
>>>> For QEMU migration via TCP, there's a bit of time between port
>>>> reservation and usage, because currently, the port needs to be
>>>> reserved before doing a fork, where the systemd scope needs to be set
>>>> up and swtpm might need to be started before the QEMU binary can be
>>>> invoked and actually use the port.
>>>>
>>>> To improve the situation, get the latest port recorded in the
>>>> reservation file and start trying from the next port, wrapping around
>>>> when hitting the end. Drastically reduces the chances to run into a
>>>> conflict, because after a given port reservation, all other ports are
>>>> tried first before returning to that port.
>>>
>>> Sorry, this is not true. It can be that in the meantime, a port for a
>>> different range is reserved and that will remove the reservation for the
>>> port in the migration range if expired. So we'd need to change the code
>>> to remove only reservations from the current range to not lose track of
>>> the latest previously used migration port.
>> 
>> the whole thing would also still be racy anyway across processes, right?
>> not sure it's worth the additional effort compared to the other patches
>> then.. if those are not enough (i.e., we still get real-world reports)
>> then the "increase expiry further + explicit release" part could still
>> be implemented as follow-up..
>> 
> 
> No, it's not racy. The reserved ports are recorded in a file while
> taking a lock, so each process will see what the others have last used.

you are, of course, right - sorry for the noise. I misread the diff and
thought the new variables were local state instead of just helper
variables inside the sub..
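(as an aside, for anyone following along: the scheme described above -
recording reservations in a file while holding an exclusive lock, so each
process sees what the others last used - looks roughly like the following
illustrative Python sketch. the real helper is Perl code in pve-common;
the file path, file format and expiry value here are invented:)

```python
# Illustrative sketch only (the actual implementation is Perl in
# pve-common); path, format and expiry value are invented.
import fcntl
import json
import time

RESERVATION_FILE = "/tmp/port-reservations.json"  # hypothetical path
EXPIRETIME = 5  # seconds a reservation stays valid (invented value)

def reserve_port(port_range, path=RESERVATION_FILE):
    with open(path, "a+") as f:
        # exclusive lock: concurrent reservers serialize here, so each
        # process sees what the others last recorded
        fcntl.flock(f, fcntl.LOCK_EX)
        f.seek(0)
        try:
            reservations = json.load(f)
        except ValueError:  # empty or fresh file
            reservations = {}
        now = time.time()
        # drop expired reservations
        reservations = {p: t for p, t in reservations.items()
                        if now - t < EXPIRETIME}
        for port in port_range:
            if str(port) not in reservations:
                reservations[str(port)] = now
                f.seek(0)
                f.truncate()
                json.dump(reservations, f)
                return port
        raise RuntimeError("no free port in range")
```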

> My question is if the explicit release isn't much more effort than the
> round-robin-style approach here, because it puts the burden on the
> callers and you need a good way to actually check if the port is now
> used successfully (without creating new races!) and a new helper for
> removing the reservation. (That said, with round-robin we would need to
> remember which range a port was for if we ever want to support
> overlapping ranges).

yes, we'd need to convert callers to the following pattern:

reserve();

do_thing_that_binds();

clear_reservation();

possibly with the clearing repeated in the error-handling code path.
clearing the reservation could also just mean setting the expiry a few
seconds into the future, for example, to cover any "binding might happen
with a slight delay in a forked process" type of situation.
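to make that concrete, a minimal sketch of the explicit-release flow
(illustrative Python instead of our Perl, with an in-memory stand-in for
the reservation file and all names invented), including the "shorten the
expiry instead of deleting" variant:

```python
import time

# in-memory stand-in for the lock-protected reservation file; the real
# code would keep this state on disk, and all names here are invented
reservations = {}  # port -> expiry timestamp

def reserve(port):
    reservations[port] = time.time() + 30  # generous initial expiry

def clear_reservation(port, grace=3):
    # instead of deleting outright, shorten the expiry a little, to
    # cover binding that happens with a slight delay in a forked child
    reservations[port] = min(reservations.get(port, float("inf")),
                             time.time() + grace)

def do_thing_that_binds(port):
    pass  # placeholder for fork/exec of the process that binds the port

def run_with_reservation(port):
    reserve(port)
    try:
        do_thing_that_binds(port)
    finally:
        # finally: the clearing also runs on the error code path
        clear_reservation(port)
```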

> As long as you have competition for early ports, you just need one
> instance where the time between reservation and usage is longer than the
> expiretime and you're very likely to hit the issue (except another
> earlier port is free again). With round-robin, you need such an instance
> and have all(!) other ports reserved/used in the meantime.

true. the only way to really fix it would be to make the reservation
itself already do the binding, and pass around the open socket like
Wolfgang suggests. if that works for QEMU, we could at least make that
behaviour opt-in and convert this particular (and most likely to be
problematic) usage?
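for illustration, the "reservation is the binding" idea could look
something like this (Python sketch, names invented; whether QEMU can take
over such an inherited, already-bound socket for migration is exactly the
open question above):

```python
import socket

def reserve_by_binding(port_range, host="127.0.0.1"):
    # the reservation *is* the binding: once this returns, no other
    # process can grab the port, so there is no reserve-then-bind race
    for port in port_range:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((host, port))
        except OSError:  # port already in use, try the next one
            s.close()
            continue
        s.listen(1)
        # keep the fd inheritable so a forked+exec'd child (e.g. QEMU)
        # could take over the already-bound socket
        s.set_inheritable(True)
        return s, port
    raise RuntimeError("no free port in range")
```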





Thread overview: 13+ messages
2023-11-14 14:02 [pve-devel] [RFC qemu-server/common] fix #4501: improve port reservation for QEMU TCP migration Fiona Ebner
2023-11-14 14:02 ` [pve-devel] [RFC qemu-server 1/1] partially fix #4501: migration: start vm: move port reservation and usage closer together Fiona Ebner
2023-11-15  8:55   ` Fabian Grünbichler
2023-11-15 10:12     ` Wolfgang Bumiller
2023-11-15 10:22       ` Fiona Ebner
2023-11-15 11:21         ` Wolfgang Bumiller
2023-11-14 14:02 ` [pve-devel] [RFC common 1/2] partially fix #4501: next unused port: bump port reservation expiretime Fiona Ebner
2023-11-15  8:51   ` Fabian Grünbichler
2023-11-14 14:02 ` [pve-devel] [RFC common 2/2] fix #4501: next unused port: work around issue with too short expiretime Fiona Ebner
2023-11-14 14:13   ` Fiona Ebner
2023-11-15  8:51     ` Fabian Grünbichler
2023-11-15 10:16       ` Fiona Ebner
2023-11-15 10:27         ` Fabian Grünbichler [this message]
