From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
Stoiko Ivanov <s.ivanov@proxmox.com>
Subject: Re: [pve-devel] [PATCH http-server 1/3] accept-phase: fix conn_count "leak"
Date: Fri, 4 Dec 2020 08:41:35 +0100 [thread overview]
Message-ID: <9b796a6a-71ff-5e51-101f-15fce02240e7@proxmox.com> (raw)
In-Reply-To: <20201203184322.20253-2-s.ivanov@proxmox.com>
On 03.12.20 19:43, Stoiko Ivanov wrote:
> When handling new connections in 'accept_connections' the number of
> active connections got increased before the AnyEvent::Handle
> registered the callback which would decrement it on error.
>
> Any error/die beforehand would skip the decrement, and leave the
> process in an endless loop upon exiting in wait_end_loop.
>
> This can happen e.g. when the call to getpeername fails, or if the
> connection is denied by the ALLOW_FROM/DENY_FROM settings in
> '/etc/default/pveproxy' (which is also the simplest reproducer for
> that).
>
> Additionally it can cause a denial of service, by attempting to
> connect from a denied ip until the connection count exeeds the maximum
> connections of all child-processes.
>
> Reported via our community-forum:
> https://forum.proxmox.com/threads/pveproxy-eats-available-ram.79617/
>
> Co-Authored-by: Dominik Csapak <d.csapak@proxmox.com>
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
> PVE/APIServer/AnyEvent.pm | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/PVE/APIServer/AnyEvent.pm b/PVE/APIServer/AnyEvent.pm
> index c55da7f..c5f5fdc 100644
> --- a/PVE/APIServer/AnyEvent.pm
> +++ b/PVE/APIServer/AnyEvent.pm
> @@ -1479,8 +1479,6 @@ sub accept {
>
> fh_nonblocking $clientfh, 1;
>
> - $self->{conn_count}++;
> -
> return $clientfh;
> }
>
> @@ -1561,7 +1559,7 @@ sub accept_connections {
> my $reqstate = { keep_alive => $self->{keep_alive} };
>
> # stop keep-alive when there are many open connections
> - if ($self->{conn_count} >= $self->{max_conn_soft_limit}) {
> + if ($self->{conn_count}+1 >= $self->{max_conn_soft_limit}) {
style nit: don't glue operators together `self->{conn_count} + 1`
> $reqstate->{keep_alive} = 0;
> }
>
> @@ -1600,6 +1598,9 @@ sub accept_connections {
> },
> ($self->{tls_ctx} ? (tls => "accept", tls_ctx => $self->{tls_ctx}) : ()));
>
> + $self->{conn_count}++;
> +
But isn't this wrong too? The FH could already get a EOF here, and thus get reduced
before increased - one could maybe argue "well it should get increased again after,
here, so brought in sync again", i.e.:
1. Get's registered
2. clientfh EOF
-> $self->client_do_disconnect($reqstate);
-> $self->{conn_count}--;
! Wrong counter here, could lead to possible wrong decisions now already (not checked
for sure) or when adding/changing something (as this is non-obvious, not even a comment
hinting it!)
3. resume here, brought in sync again, reminds me of a short comic strip I recently
run into [0].
So between 2. and 3. we are in limbo, while short it still matters, every race triggers
sooner or later, computers are just to fast and scheduling to nondeterministic for that
to not happen.
Why not move the $self->{conn_count}++; before AnyEvent Handle instance is created,
i.e., where we do $early_err = 0; as this effectively is the barrier for the connection
being valid or not. We could also add handling for when the handle creation itself fails,
setting a flag afterwards and checking both, that flag and $early_err in the existing
error handling branch outside of the eval, and decrement in that case.
Or, do you have some documented behavior, not stated here in the commit, that this all
just cannot happen at all?
[0]: https://i.redd.it/m4zbw3u7rbk21.jpg
> +
> print "$$: ACCEPT FH" . $clientfh->fileno() . " CONN$self->{conn_count}\n" if $self->{debug};
>
> $self->push_request_header($reqstate);
>
next prev parent reply other threads:[~2020-12-04 7:41 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-03 18:43 [pve-devel] [PATCH http-server 0/3] improve error handling in accept_connections Stoiko Ivanov
2020-12-03 18:43 ` [pve-devel] [PATCH http-server 1/3] accept-phase: fix conn_count "leak" Stoiko Ivanov
2020-12-04 7:41 ` Thomas Lamprecht [this message]
2020-12-03 18:43 ` [pve-devel] [PATCH http-server 2/3] accept-phase: shutdown socket on early error Stoiko Ivanov
2020-12-03 18:43 ` [pve-devel] [PATCH http-server 3/3] add debug log for problems during accept Stoiko Ivanov
2020-12-04 6:37 ` Thomas Lamprecht
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9b796a6a-71ff-5e51-101f-15fce02240e7@proxmox.com \
--to=t.lamprecht@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
--cc=s.ivanov@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox