public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections
@ 2020-12-10 14:02 Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 1/4] add debug print helper Stoiko Ivanov
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Stoiko Ivanov @ 2020-12-10 14:02 UTC (permalink / raw)
  To: pve-devel

v2->v3:
* incorporated Thomas patient and excellent feedback (Thanks!!):
* replaced the two error-flags by a single handle_creation flag,  based
  on the observation that the code after AnyEvent::Handle->new() does not die
  (and that we'd probably still shutdown the socket if it did)
* replaced the calls to shutdown($fh, SHUT_RD) with calls to close - in case
  accept() fails, we don't want to sent anything + it sents one tcp-packet less
  (only rst+ack, instead of fin+ack,rst+ack)
* replaced syslog+dprint by a single warn (in our AnyEvent daemons that results
  in the line being printed to STDERR+to syslog with level warning)
* squashed 1/5 (introduce dprint sub) and 5/5 (use it)

tested on my workstation (the cleanup by having an deny-acl in
/etc/default/pveproxy and connecting 1000 times with openssl s_client)

original cover-letter for v2:
v1->v2:
* increment of connection count now happens right before the AnyEvent::Handle
  is created
* the handle-creation is guarded by an error-flag, and if it fails the
  connection count is decremented (bounded to 0) again
* as suggested by Thomas - added a debug print sub which includes the
  package name, linenumber and function name where the printing happens
* refactored all active debug-prints to use it.

original cover-letter for v1:
This patchset is the result of investigating a report in our community forum:
https://forum.proxmox.com/threads/pveproxy-eats-available-ram.79617/

The first patch fixes an issue where pveproxy worker processes would never
exit (and eat quite a bit of ram+cpu) when 'getpeername' returned an error.

The second seemed to me like a sensible further cleanup, and the third patch
will hopefully provide the needed information when debugging such things in
the future.

Huge thanks to Dominik, who analyzed this issue with me!


Stoiko Ivanov (4):
  add debug print helper
  accept-phase: fix conn_count "leak"
  accept-phase: shutdown socket on early error
  add debug log for problems during accept

 PVE/APIServer/AnyEvent.pm | 68 ++++++++++++++++++++++++++++++---------
 1 file changed, 53 insertions(+), 15 deletions(-)

-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH http-server v3 1/4] add debug print helper
  2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
@ 2020-12-10 14:02 ` Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 2/4] accept-phase: fix conn_count "leak" Stoiko Ivanov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Stoiko Ivanov @ 2020-12-10 14:02 UTC (permalink / raw)
  To: pve-devel; +Cc: Thomas Lamprecht

and uniformly use it

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 PVE/APIServer/AnyEvent.pm | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/PVE/APIServer/AnyEvent.pm b/PVE/APIServer/AnyEvent.pm
index c55da7f..b8c28ce 100644
--- a/PVE/APIServer/AnyEvent.pm
+++ b/PVE/APIServer/AnyEvent.pm
@@ -66,6 +66,16 @@ my $split_abs_uri = sub {
     return wantarray ? ($rel_uri, $format) : $rel_uri;
 };
 
+sub dprint {
+    my ($self, $message) = @_;
+
+    return if !$self->{debug};
+
+    my ($pkg, $pkgfile, $line, $sub) = caller(1);
+    $sub =~ s/^(?:.+::)+//;
+    print "worker[$$]: $pkg +$line: $sub: $message\n";
+}
+
 sub log_request {
     my ($self, $reqstate) = @_;
 
@@ -143,13 +153,13 @@ sub client_do_disconnect {
 	return;
     }
 
-    print "close connection $hdl\n" if $self->{debug};
+    $self->dprint("close connection $hdl");
 
     &$shutdown_hdl($hdl);
 
     $self->{conn_count}--;
 
-    print "$$: CLOSE FH" .  $hdl->{fh}->fileno() . " CONN$self->{conn_count}\n" if $self->{debug};
+    $self->dprint("CLOSE FH" .  $hdl->{fh}->fileno() . " CONN$self->{conn_count}");
 }
 
 sub finish_response {
@@ -412,7 +422,7 @@ sub websocket_proxy {
 	    my ($fh) = @_
 		or die "connect to '$remhost:$remport' failed: $!";
 
-	    print "$$: CONNECTed to '$remhost:$remport'\n" if $self->{debug};
+	    $self->dprint("CONNECTed to '$remhost:$remport'");
 
 	    $reqstate->{proxyhdl} = AnyEvent::Handle->new(
 		fh => $fh,
@@ -507,7 +517,7 @@ sub websocket_proxy {
 			$reqstate->{proxyhdl}->push_write($payload) if $reqstate->{proxyhdl};
 		    } elsif ($opcode == 8) {
 			my $statuscode = unpack ("n", $payload);
-			print "websocket received close. status code: '$statuscode'\n" if $self->{debug};
+			$self->dprint("websocket received close. status code: '$statuscode'");
 			if ($reqstate->{proxyhdl}) {
 			    $reqstate->{proxyhdl}->push_shutdown();
 			}
@@ -538,7 +548,7 @@ sub websocket_proxy {
 		"Sec-WebSocket-Protocol: $wsproto\015\012" .
 		"\015\012";
 
-	    print $res if $self->{debug};
+	    $self->dprint($res);
 
 	    $reqstate->{hdl}->push_write($res);
 
@@ -840,9 +850,9 @@ sub handle_spice_proxy_request {
 
         if ($node ne 'localhost' && PVE::INotify::nodename() !~ m/^$node$/i) {
             $remip = $self->remote_node_ip($node);
-	    print "REMOTE CONNECT $vmid, $remip, $connect_str\n" if $self->{debug};
+	    $self->dprint("REMOTE CONNECT $vmid, $remip, $connect_str");
         } else {
-	    print "$$: CONNECT $vmid, $node, $spiceport\n" if $self->{debug};
+	    $self->dprint("CONNECT $vmid, $node, $spiceport");
 	}
 
 	if ($remip && $r->header('PVEDisableProxy')) {
@@ -860,7 +870,7 @@ sub handle_spice_proxy_request {
 	    my ($fh) = @_
 		or die "connect to '$remhost:$remport' failed: $!";
 
-	    print "$$: CONNECTed to '$remhost:$remport'\n" if $self->{debug};
+	    $self->dprint("CONNECTed to '$remhost:$remport'");
 	    $reqstate->{proxyhdl} = AnyEvent::Handle->new(
 		fh => $fh,
 		rbuf_max => 64*1024,
@@ -1333,7 +1343,7 @@ sub unshift_read_header {
 
 			die "upload without content length header not supported" if !$len;
 
-			print "start upload $path $ct $boundary\n" if $self->{debug};
+			$self->dprint("start upload $path $ct $boundary");
 
 			my $tmpfilename = get_upload_filename();
 			my $outfh = IO::File->new($tmpfilename, O_RDWR|O_CREAT|O_EXCL, 0600) ||
@@ -1571,7 +1581,7 @@ sub accept_connections {
 	    }
 
 	    if (!$self->{trusted_env} && !$self->check_host_access($reqstate->{peer_host})) {
-		print "$$: ABORT request from $reqstate->{peer_host} - access denied\n" if $self->{debug};
+		$self->dprint("ABORT request from $reqstate->{peer_host} - access denied");
 		$reqstate->{log}->{code} = 403;
 		$self->log_request($reqstate);
 		next;
@@ -1600,7 +1610,7 @@ sub accept_connections {
 		},
 		($self->{tls_ctx} ? (tls => "accept", tls_ctx => $self->{tls_ctx}) : ()));
 
-	    print "$$: ACCEPT FH" .  $clientfh->fileno() . " CONN$self->{conn_count}\n" if $self->{debug};
+	    $self->dprint("ACCEPT FH" .  $clientfh->fileno() . " CONN$self->{conn_count}");
 
 	    $self->push_request_header($reqstate);
 	}
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH http-server v3 2/4] accept-phase: fix conn_count "leak"
  2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 1/4] add debug print helper Stoiko Ivanov
@ 2020-12-10 14:02 ` Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 3/4] accept-phase: shutdown socket on early error Stoiko Ivanov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Stoiko Ivanov @ 2020-12-10 14:02 UTC (permalink / raw)
  To: pve-devel

When handling new connections in 'accept_connections' the number of
active connections (conn_count) got increased before the callback, which
would eventually decrease it got registered in AnyEvent::Handle->new.

Any error/die before registering the callback would skip the
decrement, and leave the process in an endless loop upon exiting in
wait_end_loop.

This can happen e.g. when the call to getpeername fails, or if the
connection is denied by the ALLOW_FROM/DENY_FROM settings in
'/etc/default/pveproxy' (which is also a simple reproducer for that).

Additionally it can cause a denial of service, by attempting to
connect from a denied ip until the connection count exeeds the maximum
connections of all child-processes.

This patch addresses the issue by incrementing the connection count
before attempting to create the handle, and decrementing it again, if
handle creation fails.

A warning is logged if 'conn_count' turns negative when decrementing
during cleanup on error/eof. In case creating a new handle during
initial accept_connection fails, a warning is logged as well, but
'conn_count' is not decremented.

Reported via our community-forum:
https://forum.proxmox.com/threads/pveproxy-eats-available-ram.79617/

Co-Authored-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 PVE/APIServer/AnyEvent.pm | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/PVE/APIServer/AnyEvent.pm b/PVE/APIServer/AnyEvent.pm
index b8c28ce..be60f2e 100644
--- a/PVE/APIServer/AnyEvent.pm
+++ b/PVE/APIServer/AnyEvent.pm
@@ -157,6 +157,8 @@ sub client_do_disconnect {
 
     &$shutdown_hdl($hdl);
 
+    warn "connection count <= 0!\n" if $self->{conn_count} <= 0;
+
     $self->{conn_count}--;
 
     $self->dprint("CLOSE FH" .  $hdl->{fh}->fileno() . " CONN$self->{conn_count}");
@@ -1489,8 +1491,6 @@ sub accept {
 
     fh_nonblocking $clientfh, 1;
 
-    $self->{conn_count}++;
-
     return $clientfh;
 }
 
@@ -1564,6 +1564,7 @@ sub check_host_access {
 sub accept_connections {
     my ($self) = @_;
 
+    my $handle_creation;
     eval {
 
 	while (my $clientfh = $self->accept()) {
@@ -1571,7 +1572,7 @@ sub accept_connections {
 	    my $reqstate = { keep_alive => $self->{keep_alive} };
 
 	    # stop keep-alive when there are many open connections
-	    if ($self->{conn_count} >= $self->{max_conn_soft_limit}) {
+	    if ($self->{conn_count} + 1 >= $self->{max_conn_soft_limit}) {
 		$reqstate->{keep_alive} = 0;
 	    }
 
@@ -1587,6 +1588,11 @@ sub accept_connections {
 		next;
 	    }
 
+	    # Increment conn_count before creating new handle, since creation
+	    # triggers callbacks, which can potentialy decrement (e.g.
+	    # on_error) conn_count before AnyEvent::Handle->new() returns.
+	    $handle_creation = 1;
+	    $self->{conn_count}++;
 	    $reqstate->{hdl} = AnyEvent::Handle->new(
 		fh => $clientfh,
 		rbuf_max => 64*1024,
@@ -1609,6 +1615,7 @@ sub accept_connections {
 		    if (my $err = $@) { syslog('err', "$err"); }
 		},
 		($self->{tls_ctx} ? (tls => "accept", tls_ctx => $self->{tls_ctx}) : ()));
+	    $handle_creation = 0;
 
 	    $self->dprint("ACCEPT FH" .  $clientfh->fileno() . " CONN$self->{conn_count}");
 
@@ -1618,6 +1625,13 @@ sub accept_connections {
 
     if (my $err = $@) {
 	syslog('err', $err);
+	if ($handle_creation) {
+	    if ($self->{conn_count} <= 0) {
+		warn "connection count <= 0 not decrementing!\n";
+	    } else {
+		$self->{conn_count}--;
+	    }
+	}
 	$self->{end_loop} = 1;
     }
 
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH http-server v3 3/4] accept-phase: shutdown socket on early error
  2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 1/4] add debug print helper Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 2/4] accept-phase: fix conn_count "leak" Stoiko Ivanov
@ 2020-12-10 14:02 ` Stoiko Ivanov
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 4/4] add debug log for problems during accept Stoiko Ivanov
  2020-12-10 19:25 ` [pve-devel] applied-series: [PATCH http-server v3 0/4] improve error handling in accept_connections Thomas Lamprecht
  4 siblings, 0 replies; 6+ messages in thread
From: Stoiko Ivanov @ 2020-12-10 14:02 UTC (permalink / raw)
  To: pve-devel

if an error happens before AnyEvent::Handle registers the cleanup
callback, we should shutdown/close the socket, when handling it.

Using close, instead of shutdown($sock, SHUT_WR) here, since we are in
an error-state, and would not read from the socket anyways.
(Additionally close sends just on packet (RST,ACK), vs shutdown
(FIN,ACK+RST,ACK) in its use here).

Co-Authored-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 PVE/APIServer/AnyEvent.pm | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/PVE/APIServer/AnyEvent.pm b/PVE/APIServer/AnyEvent.pm
index be60f2e..d3b48b7 100644
--- a/PVE/APIServer/AnyEvent.pm
+++ b/PVE/APIServer/AnyEvent.pm
@@ -1532,6 +1532,11 @@ sub check_host_access {
 
     my $cip = Net::IP->new($clientip);
 
+    if (!$cip) {
+	$self->dprint("client IP not parsable: $@");
+	return 0;
+    }
+
     my $match_allow = 0;
     my $match_deny = 0;
 
@@ -1564,10 +1569,10 @@ sub check_host_access {
 sub accept_connections {
     my ($self) = @_;
 
-    my $handle_creation;
+    my ($clientfh, $handle_creation);
     eval {
 
-	while (my $clientfh = $self->accept()) {
+	while ($clientfh = $self->accept()) {
 
 	    my $reqstate = { keep_alive => $self->{keep_alive} };
 
@@ -1579,12 +1584,16 @@ sub accept_connections {
 	    if (my $sin = getpeername($clientfh)) {
 		my ($pfamily, $pport, $phost) = PVE::Tools::unpack_sockaddr_in46($sin);
 		($reqstate->{peer_port}, $reqstate->{peer_host}) = ($pport,  Socket::inet_ntop($pfamily, $phost));
+	    } else {
+		close($clientfh);
+		next;
 	    }
 
 	    if (!$self->{trusted_env} && !$self->check_host_access($reqstate->{peer_host})) {
 		$self->dprint("ABORT request from $reqstate->{peer_host} - access denied");
 		$reqstate->{log}->{code} = 403;
 		$self->log_request($reqstate);
+		close($clientfh);
 		next;
 	    }
 
@@ -1625,6 +1634,7 @@ sub accept_connections {
 
     if (my $err = $@) {
 	syslog('err', $err);
+	close($clientfh);
 	if ($handle_creation) {
 	    if ($self->{conn_count} <= 0) {
 		warn "connection count <= 0 not decrementing!\n";
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH http-server v3 4/4] add debug log for problems during accept
  2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 3/4] accept-phase: shutdown socket on early error Stoiko Ivanov
@ 2020-12-10 14:02 ` Stoiko Ivanov
  2020-12-10 19:25 ` [pve-devel] applied-series: [PATCH http-server v3 0/4] improve error handling in accept_connections Thomas Lamprecht
  4 siblings, 0 replies; 6+ messages in thread
From: Stoiko Ivanov @ 2020-12-10 14:02 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 PVE/APIServer/AnyEvent.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/PVE/APIServer/AnyEvent.pm b/PVE/APIServer/AnyEvent.pm
index d3b48b7..8a1af54 100644
--- a/PVE/APIServer/AnyEvent.pm
+++ b/PVE/APIServer/AnyEvent.pm
@@ -1544,6 +1544,7 @@ sub check_host_access {
 	foreach my $t (@{$self->{allow_from}}) {
 	    if ($t->overlaps($cip)) {
 		$match_allow = 1;
+		$self->dprint("client IP allowed: ". $t->prefix());
 		last;
 	    }
 	}
@@ -1552,6 +1553,7 @@ sub check_host_access {
     if ($self->{deny_from}) {
 	foreach my $t (@{$self->{deny_from}}) {
 	    if ($t->overlaps($cip)) {
+		$self->dprint("client IP denied: ". $t->prefix());
 		$match_deny = 1;
 		last;
 	    }
@@ -1585,6 +1587,7 @@ sub accept_connections {
 		my ($pfamily, $pport, $phost) = PVE::Tools::unpack_sockaddr_in46($sin);
 		($reqstate->{peer_port}, $reqstate->{peer_host}) = ($pport,  Socket::inet_ntop($pfamily, $phost));
 	    } else {
+		$self->dprint("getpeername failed: $!");
 		close($clientfh);
 		next;
 	    }
@@ -1634,6 +1637,7 @@ sub accept_connections {
 
     if (my $err = $@) {
 	syslog('err', $err);
+	$self->dprint("connection accept error: $err");
 	close($clientfh);
 	if ($handle_creation) {
 	    if ($self->{conn_count} <= 0) {
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] applied-series: [PATCH http-server v3 0/4] improve error handling in accept_connections
  2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
                   ` (3 preceding siblings ...)
  2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 4/4] add debug log for problems during accept Stoiko Ivanov
@ 2020-12-10 19:25 ` Thomas Lamprecht
  4 siblings, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2020-12-10 19:25 UTC (permalink / raw)
  To: Proxmox VE development discussion, Stoiko Ivanov

On 10.12.20 15:02, Stoiko Ivanov wrote:
> v2->v3:
> * incorporated Thomas patient and excellent feedback (Thanks!!):
> * replaced the two error-flags by a single handle_creation flag,  based
>   on the observation that the code after AnyEvent::Handle->new() does not die
>   (and that we'd probably still shutdown the socket if it did)
> * replaced the calls to shutdown($fh, SHUT_RD) with calls to close - in case
>   accept() fails, we don't want to sent anything + it sents one tcp-packet less
>   (only rst+ack, instead of fin+ack,rst+ack)
> * replaced syslog+dprint by a single warn (in our AnyEvent daemons that results
>   in the line being printed to STDERR+to syslog with level warning)
> * squashed 1/5 (introduce dprint sub) and 5/5 (use it)
> 
> tested on my workstation (the cleanup by having an deny-acl in
> /etc/default/pveproxy and connecting 1000 times with openssl s_client)
> 
> original cover-letter for v2:
> v1->v2:
> * increment of connection count now happens right before the AnyEvent::Handle
>   is created
> * the handle-creation is guarded by an error-flag, and if it fails the
>   connection count is decremented (bounded to 0) again
> * as suggested by Thomas - added a debug print sub which includes the
>   package name, linenumber and function name where the printing happens
> * refactored all active debug-prints to use it.
> 
> original cover-letter for v1:
> This patchset is the result of investigating a report in our community forum:
> https://forum.proxmox.com/threads/pveproxy-eats-available-ram.79617/
> 
> The first patch fixes an issue where pveproxy worker processes would never
> exit (and eat quite a bit of ram+cpu) when 'getpeername' returned an error.
> 
> The second seemed to me like a sensible further cleanup, and the third patch
> will hopefully provide the needed information when debugging such things in
> the future.
> 
> Huge thanks to Dominik, who analyzed this issue with me!
> 
> 
> Stoiko Ivanov (4):
>   add debug print helper
>   accept-phase: fix conn_count "leak"
>   accept-phase: shutdown socket on early error
>   add debug log for problems during accept
> 
>  PVE/APIServer/AnyEvent.pm | 68 ++++++++++++++++++++++++++++++---------
>  1 file changed, 53 insertions(+), 15 deletions(-)
> 



applied series, much thanks!




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-12-10 19:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-10 14:02 [pve-devel] [PATCH http-server v3 0/4] improve error handling in accept_connections Stoiko Ivanov
2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 1/4] add debug print helper Stoiko Ivanov
2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 2/4] accept-phase: fix conn_count "leak" Stoiko Ivanov
2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 3/4] accept-phase: shutdown socket on early error Stoiko Ivanov
2020-12-10 14:02 ` [pve-devel] [PATCH http-server v3 4/4] add debug log for problems during accept Stoiko Ivanov
2020-12-10 19:25 ` [pve-devel] applied-series: [PATCH http-server v3 0/4] improve error handling in accept_connections Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal