public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing
@ 2023-02-17 15:25 Max Carrara
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests Max Carrara
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Max Carrara @ 2023-02-17 15:25 UTC (permalink / raw)
  To: pve-devel

This patch series refactors the request processing algorithm by
separating its data and logic from one another in order to make future
changes regarding requests more transparent and explicit.

Patch 1: Introduce the refactored algorithm
Patch 2: Replace and remove the existing algorithm with the new one

This refactor is conducted very carefully to ensure that none of the
existing behaviours change in any way. One exception however would be
the reading of the HTTP header itself: Instead of reading the
header by recursively appending / prepending callbacks to the handle's
and reading the header line-by-line, the entire header is read at
once, reducing the number of callbacks needed to parse the header from
the number of lines to one. This affects when the header line count
and size limits are checked - instead of being checked *while* the
header is being read, these limits are verified *after* the header
was read from the read buffer. If this change in behaviour is
undesirable, I'll gladly provide a v2.

Otherwise, the behaviour is identical; the logic is restructured,
but entirely preserved.


To give a more concrete example of what these changes would make more
transparent, the enhancement requested in #4494[1] would become just
another step (-> subroutine) in the refactored algorithm. That way it
becomes much clearer where the TLS handshake is being verified
and which responses (e.g. a `301`) are sent in case the handshake
never happened or was aborted.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4494


pve-http-server:

Max Carrara (2):
  http-server: anyevent: add new subroutine for processing requests
  http-server: anyevent: replace request processing subroutine

 src/PVE/APIServer/AnyEvent.pm | 671 ++++++++++++++++++++++------------
 1 file changed, 445 insertions(+), 226 deletions(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests
  2023-02-17 15:25 [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Max Carrara
@ 2023-02-17 15:25 ` Max Carrara
  2023-02-28 14:05   ` Fabian Grünbichler
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 2/2] http-server: anyevent: replace request processing subroutine Max Carrara
  2023-02-17 17:15 ` [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Thomas Lamprecht
  2 siblings, 1 reply; 7+ messages in thread
From: Max Carrara @ 2023-02-17 15:25 UTC (permalink / raw)
  To: pve-devel

The `begin_process_next_request` subroutine and its auxiliary
subroutines are added to serve as a replacement for
`push_request_header` and `unshift_read_header` in order to simplify
the parsing logic. The entire header is read by a single callback
in `begin_process_next_request` and then processed step-by-step
instead of reading each line and handling it separately.
This is done in order to separate reading the incoming header from
the processing algorithm's logic while simultaneously making the
steps used to process the header more transparent and explicit.

This comes with one noteworthy change in behaviour:
Because the header is now read all at once from the input buffer,
the header field count limit is only checked for after the entire
header was already read. In `unshift_read_header` the limit is
checked for each line that is read.

The behaviour otherwise remains fully identical.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
 src/PVE/APIServer/AnyEvent.pm | 492 ++++++++++++++++++++++++++++++++++
 1 file changed, 492 insertions(+)

Note: The subroutines in `@process_steps` of
`begin_process_next_request` mimic the steps the original algorithm
takes. If necessary, these can of course be split up into smaller
steps.

diff --git a/src/PVE/APIServer/AnyEvent.pm b/src/PVE/APIServer/AnyEvent.pm
index 3cd77fa..bf7957a 100644
--- a/src/PVE/APIServer/AnyEvent.pm
+++ b/src/PVE/APIServer/AnyEvent.pm
@@ -1570,6 +1570,498 @@ sub push_request_header {
     warn $@ if $@;
 }
 
+sub begin_process_next_request {
+    my ($self, $reqstate) = @_;
+
+    # Match for *end* of HTTP header fields
+    my $re_accept = qr<\r?\n\r?\n>;
+
+    # Don't reject anything - rejection causes EBADMSG otherwise, aborting the callback
+    # and might match for unexpected things (such as TLS handshake data!)
+    my $re_reject = undef;
+
+    # Skip over all header fields, but don't match for CRs and NLs
+    # as those are required for re_accept above
+    my $re_skip = qr<^.*[^\r\n]>;
+
+    eval {
+	$reqstate->{hdl}->push_read(regex => $re_accept, $re_reject, $re_skip, sub {
+	    my ($handle, $data) = @_;
+
+	    # $data now contains the entire (raw) HTTP header
+	    $self->dprint("parse_request_header: handle: $handle [$data]");
+
+	    eval {
+		$reqstate->{keep_alive}--;
+
+		my $http_header = $data =~ s/^\s+|\s+$//gr ;
+		my ($http_method_line, $http_header_field_lines) = split(/\r?\n/, $http_header, 2);
+
+		$reqstate->{http_method_line} = $http_method_line;
+		$reqstate->{http_header_field_lines} = $http_header_field_lines;
+
+		# Array of references to the subroutines used to process the request
+		my @process_steps = \(
+		    &parse_request_header_method,
+		    &parse_request_header_fields,
+		    &postprocess_header_fields,
+		    &authenticate_and_handle_request,
+		);
+
+		for my $step (@process_steps) {
+		    my $result = $step->($self, $reqstate);
+
+		    my $success = $result->{success} || 0;
+		    if (!$success) {
+			$self->make_response_from_unsuccessful_result($reqstate, $result);
+			return;
+		    }
+		}
+
+	    };
+
+	    $self->dprint("parse_request_header: handle $handle DONE");
+	    warn $@ if $@;
+
+	});
+    };
+    warn $@ if $@;
+}
+
+sub make_response_from_unsuccessful_result {
+    my ($self, $reqstate, $result) = @_;
+
+    die "No HTTP response status code provided\n"
+	if !$result->{http_code};
+
+    eval {
+	my $response_object = HTTP::Response->new(
+	    $result->{http_code},
+	    $result->{http_message},
+	    $result->{http_header},
+	    $result->{http_content}
+	);
+
+	$self->dprint("Sending response: " . $result->{http_code} . " " . $result->{http_message});
+
+	$self->response(
+	    $reqstate,
+	    $response_object,
+	    $result->{response_mtime},
+	    $result->{response_nocomp},
+	    $result->{response_delay},
+	    $result->{response_stream_fh}
+	);
+
+	$self->client_do_disconnect($reqstate);
+    };
+
+    warn $@ if $@;
+}
+
+sub parse_request_header_method {
+    my ($self, $reqstate) = @_;
+
+    my $http_method_line = $reqstate->{http_method_line};
+    die "http_method_line not within reqstate\n"
+	if !$http_method_line;
+
+    my $re_http_method = qr<^(\S+)\040(\S+)\040HTTP\/(\d+)\.(\d+)>;
+
+    $self->dprint("Processing initial header line: $http_method_line");
+
+    if ($http_method_line =~ $re_http_method) {
+	my ($method, $uri, $major, $minor) = ($1, $2, $3, $4);
+
+	if ($major != 1) {
+	    return {
+		success => 0,
+		http_code => 506,
+		http_message => "HTTP protocol version $major.$minor not supported",
+	    };
+	}
+
+	# if an '@' comes before the first slash, proxy forwarding might consider
+	# the frist part of the uri to be part of an authority
+	if ($uri =~ m|^[^/]*@|) {
+	    return {
+		success => 0,
+		http_code => 400,
+		http_message => 'invalid uri',
+	    };
+	}
+
+	$reqstate->{proto}->{str} = "HTTP/$major.$minor";
+	$reqstate->{proto}->{maj} = $major;
+	$reqstate->{proto}->{min} = $minor;
+	$reqstate->{proto}->{ver} = $major*1000 + $minor;
+
+	$reqstate->{request} = HTTP::Request->new($method, $uri);
+	$reqstate->{starttime} = [gettimeofday];
+
+	# Only requests with valid HTTP methods are counted
+	$self->{request_count}++;
+
+	if ($self->{request_count} >= $self->{max_requests}) {
+	    $self->{end_loop} = 1;
+	}
+
+	return { success => 1 };
+    }
+
+    $self->dprint("Could not match for HTTP method / proto: $http_method_line");
+    return {
+	success => 0,
+	http_code => 400,
+	http_message => 'bad request',
+    };
+}
+
+sub parse_request_header_fields {
+    my ($self, $reqstate) = @_;
+
+    my $http_header_field_lines = $reqstate->{http_header_field_lines};
+
+    die "http_header_field_lines not within reqstate\n"
+	if !$reqstate->{http_header_field_lines};
+
+    die "request object not within reqstate\n"
+	if !$reqstate->{request};
+
+    my $request = $reqstate->{request};
+    my $state = { size => 0 };
+
+    # Matches a header field definition, e.g.
+    # 'Foo: Bar'
+    my $re_field = qr<^([^:\s]+)\s*:\s*(.*)>;
+
+    # Matches a line beginning with at least one whitespace character,
+    # followed by some other data; used to match multiline headers,
+    # e.g. if 'Foo: Bar' was matched previously, and the next line
+    # contains '   Qux', the resulting field will be 'Foo: Bar Qux'
+    my $re_field_multiline = qr<^\s+(.*)>;
+
+    my @lines = split(/\r?\n/, $http_header_field_lines);
+
+    die "too many http header lines (> $limit_max_headers)\n"
+        if scalar(@lines) >= $limit_max_headers;
+
+
+    for my $line (@lines) {
+	$self->dprint("Processing header line: $line");
+
+	die "http header too large\n"
+	    if ($state->{size} += length($line)) >= $limit_max_header_size;
+
+	if ($line =~ $re_field) {
+	    $request->push_header($state->{key}, $state->{value})
+		if $state->{key};
+	    ($state->{key}, $state->{value}) = ($1, $2);
+
+	} elsif ($line =~ $re_field_multiline) {
+
+	    if (!$state->{key}) {  # sanity check for invalid header
+		return {
+		    success => 0,
+		    http_code => 400,
+		    http_message => 'bad request',
+		};
+	    }
+
+	    $state->{value} .= " $1";
+
+	} else {
+	    return {
+		success => 0,
+		http_code => 506,
+		http_message => 'unable to parse request header',
+	    };
+	}
+
+    }
+
+    $request->push_header($state->{key}, $state->{value})
+	if $state->{key};
+
+    return { success => 1 };
+}
+
+sub postprocess_header_fields {
+    my ($self, $reqstate) = @_;
+
+    die "request object not within reqstate\n"
+	if !$reqstate->{request};
+
+    my $request = $reqstate->{request};
+    my $method = $request->method();
+
+    if (!$known_methods->{$method}) {
+	return {
+	    success => 0,
+	    http_code => HTTP_NOT_IMPLEMENTED,
+	    http_message => "method '$method' not available",
+	};
+    }
+
+    my $h_connection = $request->header('Connection');
+    my $h_accept_encoding = $request->header('Accept-Encoding');
+
+    $reqstate->{accept_gzip} = ($h_accept_encoding && $h_accept_encoding =~ m/gzip/) ? 1 : 0;
+
+    if ($h_connection) {
+	$reqstate->{keep_alive} = 0 if $h_connection =~ m/close/oi;
+    } else {
+	if ($reqstate->{proto}->{ver} < 1001) {
+	    $reqstate->{keep_alive} = 0;
+	}
+    }
+
+    my $h_transfer_encoding  = $request->header('Transfer-Encoding');
+    if ($h_transfer_encoding && lc($h_transfer_encoding) eq 'chunked') {
+	# Handle chunked transfer encoding
+	return {
+	    success => 0,
+	    http_code => 501,
+	    http_message => 'chunked transfer encoding not supported',
+	};
+
+    } elsif ($h_transfer_encoding) {
+	return {
+	    success => 0,
+	    http_code => 501,
+	    http_message => "Unknown transfer encoding '$h_transfer_encoding'",
+	};
+    }
+
+    my $h_pveclientip = $request->header('PVEClientIP');
+
+    # FIXME: how can we make PVEClientIP header trusted?
+    if ($self->{trusted_env} && $h_pveclientip) {
+	$reqstate->{peer_host} = $h_pveclientip;
+    } else {
+	$request->header('PVEClientIP', $reqstate->{peer_host});
+    }
+
+    if (my $rpcenv = $self->{rpcenv}) {
+	$rpcenv->set_request_host($request->header('Host'));
+    }
+
+    return { success => 1 };
+}
+
+sub authenticate_and_handle_request {
+    my ($self, $reqstate) = @_;
+
+    die "request not within reqstate\n"
+	if !$reqstate->{request};
+
+    my $request = $reqstate->{request};
+    my $method = $request->method();
+    my $path = uri_unescape($request->uri->path());
+    my $base_uri = $self->{base_uri};
+
+    # Begin authentication
+    $self->dprint("Begin authentication");
+    my $auth = {};
+
+    if ($self->{spiceproxy}) {
+	$self->dprint("Authenticating via spiceproxy");
+	my $connect_str = $request->header('Host');
+	my ($vmid, $node, $port) = $self->verify_spice_connect_url($connect_str);
+
+	if (!(defined($vmid) && $node && $port)) {
+	    return {
+		success => 0,
+		http_code => HTTP_UNAUTHORIZED,
+		http_message => 'invalid ticket',
+	    };
+	}
+
+	$self->handle_spice_proxy_request($reqstate, $connect_str, $vmid, $node, $port);
+	$self->dprint("spiceproxy request handled");
+	return { success => 1 };
+
+    } elsif ($path =~ m/^\Q$base_uri\E/) {
+	$self->dprint("Authenticating via base_uri regex");
+	my $token = $request->header('CSRFPreventionToken');
+	my $cookie = $request->header('Cookie');
+	my $auth_header = $request->header('Authorization');
+
+	# prefer actual cookie
+	my $ticket = PVE::APIServer::Formatter::extract_auth_value($cookie, $self->{cookie_name});
+
+	# fallback to cookie in 'Authorization' header
+	$ticket = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{cookie_name})
+	    if !$ticket;
+
+	# finally, fallback to API token if no ticket has been provided so far
+	my $api_token;
+	$api_token = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{apitoken_name})
+	    if !$ticket;
+
+	my ($rel_uri, $format) = $split_abs_uri->($path, $base_uri);
+
+	if (!$format) {
+	    return {
+		success => 0,
+		http_code => HTTP_NOT_IMPLEMENTED,
+		http_message => 'no such uri',
+	    };
+	}
+
+	$self->dprint("Calling auth_handler");
+	eval {
+	    $auth = $self->auth_handler(
+		$method,
+		$rel_uri,
+		$ticket,
+		$token,
+		$api_token,
+		$reqstate->{peer_host}
+	    );
+	};
+	$self->dprint("auth_handler done");
+
+	if (my $err = $@) {
+	    $self->dprint("auth_handler threw error $@");
+	    # HACK: see Note 1
+	    Net::SSLeay::ERR_clear_error();
+
+	    my $result = {
+		success => 0,
+		response_delay => 3, # always delay unauthorized calls by 3 seconds
+		response_nocomp => 0,
+	    };
+
+	    if (ref($err) eq "PVE::Exception") {
+		$result->{http_code} = $err->{code} || HTTP_INTERNAL_SERVER_ERROR;
+		$result->{http_message} = $err->{msg};
+
+	    } elsif (my $formatter = PVE::APIServer::Formatter::get_login_formatter($format)) {
+		my ($raw, $ct, $nocomp) =
+		    $formatter->($path, $auth, $self->{formatter_config});
+
+		if (ref($raw) && (ref($raw) eq 'HTTP::Response')) {
+		    $result->{http_code} = $raw->code;
+		    $result->{http_message} = $raw->message;
+		    $result->{http_header} = $raw->headers;
+		    $result->{http_content} = $raw->content;
+
+		} else {
+		    $result->{http_code} = HTTP_UNAUTHORIZED;
+		    $result->{http_message} = 'Login Required';
+		    $result->{http_header} = HTTP::Headers->new('Content-Type' => $ct);
+		    $result->{http_content} = $raw;
+
+		}
+
+		$result->{response_nocomp} = $nocomp;
+
+	    } else {
+		$result->{http_code} = HTTP_UNAUTHORIZED;
+		$result->{http_message} = $err;
+	    }
+
+	    return $result;
+	}
+    }
+
+    $reqstate->{log}->{userid} = $auth->{userid};
+
+    # Continue handling request
+    $self->dprint("Continuing to handle request");
+
+    my $h_content_length = $request->header('Content-Length');
+    my $h_content_type = $request->header('Content-Type');
+
+    if (!$h_content_length) {
+	$self->dprint("No Content-Length provided; handling request the usual way");
+	$self->handle_request($reqstate, $auth, $method, $path);
+	return { success => 1 };
+    }
+
+    if (!($method eq 'PUT' || $method eq 'POST')) {
+	return {
+	    success => 0,
+	    http_code => 501,
+	    http_message => "Unexpected content for method '$method'",
+	};
+    }
+
+    my ($content, $boundary) = $h_content_type ? parse_content_type($h_content_type) : ();
+
+    if ($auth->{isUpload} && !$self->{trusted_env}) {
+	die "upload 'Content-Type '$h_content_type' not implemented\n"
+	    if !($boundary && $content && ($content eq 'multipart/form-data'));
+
+	die "upload without content length header not supported"
+	    if !$h_content_length;
+
+	$self->dprint("start upload $path $content $boundary");
+
+	my $tmpfilename = get_upload_filename();
+	my $outfh = IO::File->new($tmpfilename, O_RDWR|O_CREAT|O_EXCL, 0600) ||
+	    die "unable to create temporary upload file '$tmpfilename'";
+
+	$reqstate->{keep_alive} = 0;
+
+	my $boundlen = length($boundary) + 8; # \015?\012--$boundary--\015?\012
+
+	my $state = {
+	    size => $h_content_length,
+	    boundary => $boundary,
+	    ctx => Digest::MD5->new,
+	    boundlen =>  $boundlen,
+	    maxheader => 2048 + $boundlen, # should be large enough
+	    params => decode_urlencoded($request->url->query()),
+	    phase => 0,
+	    read => 0,
+	    post_size => 0,
+	    starttime => [gettimeofday],
+	    outfh => $outfh,
+	};
+
+	$reqstate->{tmpfilename} = $tmpfilename;
+	$reqstate->{hdl}->on_read(sub {
+	    $self->file_upload_multipart($reqstate, $auth, $method, $path, $state);
+	});
+
+	return { success => 1 };
+    }
+
+    if ($h_content_length > $limit_max_post) {
+	return {
+	    success => 0,
+	    http_code => 501,
+	    'for data too large',
+	};
+    }
+
+    if (
+	!$content
+	|| $content eq 'application/x-www-form-urlencoded'
+	|| $content eq 'application/json'
+    ) {
+	$self->dprint("Received content '$content', reading chunk length of '$h_content_length'");
+
+	$reqstate->{hdl}->unshift_read(chunk => $h_content_length, sub {
+	    my ($hdl, $data) = @_;
+	    $self->dprint("Read data for content '$content'");
+	    $self->dprint("Continuing to handle request after reading content");
+	    $request->content($data);
+	    $self->handle_request($reqstate, $auth, $method, $path);
+	});
+	return { success => 1 };
+
+    } else {
+	return {
+	    success => 0,
+	    http_code => 506,
+	    http_message => "upload 'Content-Type '$h_content_type' not implemented",
+	};
+    }
+}
+
 sub accept {
     my ($self) = @_;
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH http-server 2/2] http-server: anyevent: replace request processing subroutine
  2023-02-17 15:25 [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Max Carrara
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests Max Carrara
@ 2023-02-17 15:25 ` Max Carrara
  2023-02-17 17:15 ` [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Thomas Lamprecht
  2 siblings, 0 replies; 7+ messages in thread
From: Max Carrara @ 2023-02-17 15:25 UTC (permalink / raw)
  To: pve-devel

All usages of the `push_request_header` subroutine  are replaced with
`begin_process_next_request`. Consequently, the `push_request_header`
and `unshift_read_header` subroutines are removed, as they are not
used anywhere anymore.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
 src/PVE/APIServer/AnyEvent.pm | 277 +---------------------------------
 1 file changed, 2 insertions(+), 275 deletions(-)

diff --git a/src/PVE/APIServer/AnyEvent.pm b/src/PVE/APIServer/AnyEvent.pm
index bf7957a..86f5e07 100644
--- a/src/PVE/APIServer/AnyEvent.pm
+++ b/src/PVE/APIServer/AnyEvent.pm
@@ -181,7 +181,7 @@ sub finish_response {
     if (!$self->{end_loop} && $reqstate->{keep_alive} > 0) {
 	# print "KEEPALIVE $reqstate->{keep_alive}\n" if $self->{debug};
 	$hdl->on_read(sub {
-	    eval { $self->push_request_header($reqstate); };
+	    eval { $self->begin_process_next_request($reqstate); };
 	    warn $@ if $@;
 	});
     } else {
@@ -1297,279 +1297,6 @@ sub get_upload_filename {
     return "/var/tmp/pveupload-" . Digest::MD5::md5_hex($tmpfile_seq_no . time() . $$);
 }
 
-sub unshift_read_header {
-    my ($self, $reqstate, $state) = @_;
-
-    $state = { size => 0, count => 0 } if !$state;
-
-    $reqstate->{hdl}->unshift_read(line => sub {
-	my ($hdl, $line) = @_;
-
-	eval {
-	    # print "$$: got header: $line\n" if $self->{debug};
-
-	    die "too many http header lines (> $limit_max_headers)\n" if ++$state->{count} >= $limit_max_headers;
-	    die "http header too large\n" if ($state->{size} += length($line)) >= $limit_max_header_size;
-
-	    my $r = $reqstate->{request};
-	    if ($line eq '') {
-
-		my $path = uri_unescape($r->uri->path());
-		my $method = $r->method();
-
-		$r->push_header($state->{key}, $state->{val})
-		    if $state->{key};
-
-		if (!$known_methods->{$method}) {
-		    my $resp = HTTP::Response->new(HTTP_NOT_IMPLEMENTED, "method '$method' not available");
-		    $self->response($reqstate, $resp);
-		    return;
-		}
-
-		my $conn = $r->header('Connection');
-		my $accept_enc = $r->header('Accept-Encoding');
-		$reqstate->{accept_gzip} = ($accept_enc && $accept_enc =~ m/gzip/) ? 1 : 0;
-
-		if ($conn) {
-		    $reqstate->{keep_alive} = 0 if $conn =~ m/close/oi;
-		} else {
-		    if ($reqstate->{proto}->{ver} < 1001) {
-			$reqstate->{keep_alive} = 0;
-		    }
-		}
-
-		my $te  = $r->header('Transfer-Encoding');
-		if ($te && lc($te) eq 'chunked') {
-		    # Handle chunked transfer encoding
-		    $self->error($reqstate, 501, "chunked transfer encoding not supported");
-		    return;
-		} elsif ($te) {
-		    $self->error($reqstate, 501, "Unknown transfer encoding '$te'");
-		    return;
-		}
-
-		my $pveclientip = $r->header('PVEClientIP');
-		my $base_uri = $self->{base_uri};
-
-		# fixme: how can we make PVEClientIP header trusted?
-		if ($self->{trusted_env} && $pveclientip) {
-		    $reqstate->{peer_host} = $pveclientip;
-		} else {
-		    $r->header('PVEClientIP', $reqstate->{peer_host});
-		}
-
-		my $len = $r->header('Content-Length');
-
-		my $host_header = $r->header('Host');
-		if (my $rpcenv = $self->{rpcenv}) {
-		    $rpcenv->set_request_host($host_header);
-		}
-
-		# header processing complete - authenticate now
-
-		my $auth = {};
-		if ($self->{spiceproxy}) {
-		    my $connect_str = $host_header;
-		    my ($vmid, $node, $port) = $self->verify_spice_connect_url($connect_str);
-		    if (!(defined($vmid) && $node && $port)) {
-			$self->error($reqstate, HTTP_UNAUTHORIZED, "invalid ticket");
-			return;
-		    }
-		    $self->handle_spice_proxy_request($reqstate, $connect_str, $vmid, $node, $port);
-		    return;
-		} elsif ($path =~ m/^\Q$base_uri\E/) {
-		    my $token = $r->header('CSRFPreventionToken');
-		    my $cookie = $r->header('Cookie');
-		    my $auth_header = $r->header('Authorization');
-
-		    # prefer actual cookie
-		    my $ticket = PVE::APIServer::Formatter::extract_auth_value($cookie, $self->{cookie_name});
-
-		    # fallback to cookie in 'Authorization' header
-		    $ticket = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{cookie_name})
-			if !$ticket;
-
-		    # finally, fallback to API token if no ticket has been provided so far
-		    my $api_token;
-		    $api_token = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{apitoken_name})
-			if !$ticket;
-
-		    my ($rel_uri, $format) = &$split_abs_uri($path, $self->{base_uri});
-		    if (!$format) {
-			$self->error($reqstate, HTTP_NOT_IMPLEMENTED, "no such uri");
-			return;
-		    }
-
-		    eval {
-			$auth = $self->auth_handler($method, $rel_uri, $ticket, $token, $api_token,
-						    $reqstate->{peer_host});
-		    };
-		    if (my $err = $@) {
-			# HACK: see Note 1
-			Net::SSLeay::ERR_clear_error();
-			# always delay unauthorized calls by 3 seconds
-			my $delay = 3;
-
-			if (ref($err) eq "PVE::Exception") {
-
-			    $err->{code} ||= HTTP_INTERNAL_SERVER_ERROR,
-			    my $resp = HTTP::Response->new($err->{code}, $err->{msg});
-			    $self->response($reqstate, $resp, undef, 0, $delay);
-
-			} elsif (my $formatter = PVE::APIServer::Formatter::get_login_formatter($format)) {
-			    my ($raw, $ct, $nocomp) =
-				$formatter->($path, $auth, $self->{formatter_config});
-			    my $resp;
-			    if (ref($raw) && (ref($raw) eq 'HTTP::Response')) {
-				$resp = $raw;
-			    } else {
-				$resp = HTTP::Response->new(HTTP_UNAUTHORIZED, "Login Required");
-				$resp->header("Content-Type" => $ct);
-				$resp->content($raw);
-			    }
-			    $self->response($reqstate, $resp, undef, $nocomp, $delay);
-			} else {
-			    my $resp = HTTP::Response->new(HTTP_UNAUTHORIZED, $err);
-			    $self->response($reqstate, $resp, undef, 0, $delay);
-			}
-			return;
-		    }
-		}
-
-		$reqstate->{log}->{userid} = $auth->{userid};
-
-		if ($len) {
-
-		    if (!($method eq 'PUT' || $method eq 'POST')) {
-			$self->error($reqstate, 501, "Unexpected content for method '$method'");
-			return;
-		    }
-
-		    my $ctype = $r->header('Content-Type');
-		    my ($ct, $boundary) = $ctype ? parse_content_type($ctype) : ();
-
-		    if ($auth->{isUpload} && !$self->{trusted_env}) {
-			die "upload 'Content-Type '$ctype' not implemented\n"
-			    if !($boundary && $ct && ($ct eq 'multipart/form-data'));
-
-			die "upload without content length header not supported" if !$len;
-
-			die "upload without content length header not supported" if !$len;
-
-			$self->dprint("start upload $path $ct $boundary");
-
-			my $tmpfilename = get_upload_filename();
-			my $outfh = IO::File->new($tmpfilename, O_RDWR|O_CREAT|O_EXCL, 0600) ||
-			    die "unable to create temporary upload file '$tmpfilename'";
-
-			$reqstate->{keep_alive} = 0;
-
-			my $boundlen = length($boundary) + 8; # \015?\012--$boundary--\015?\012
-
-			my $state = {
-			    size => $len,
-			    boundary => $boundary,
-			    ctx => Digest::MD5->new,
-			    boundlen =>  $boundlen,
-			    maxheader => 2048 + $boundlen, # should be large enough
-			    params => decode_urlencoded($r->url->query()),
-			    phase => 0,
-			    read => 0,
-			    post_size => 0,
-			    starttime => [gettimeofday],
-			    outfh => $outfh,
-			};
-			$reqstate->{tmpfilename} = $tmpfilename;
-			$reqstate->{hdl}->on_read(sub {
-			    $self->file_upload_multipart($reqstate, $auth, $method, $path, $state);
-			});
-			return;
-		    }
-
-		    if ($len > $limit_max_post) {
-			$self->error($reqstate, 501, "for data too large");
-			return;
-		    }
-
-		    if (!$ct || $ct eq 'application/x-www-form-urlencoded' || $ct eq 'application/json') {
-			$reqstate->{hdl}->unshift_read(chunk => $len, sub {
-			    my ($hdl, $data) = @_;
-			    $r->content($data);
-			    $self->handle_request($reqstate, $auth, $method, $path);
-		        });
-		    } else {
-			$self->error($reqstate, 506, "upload 'Content-Type '$ctype' not implemented");
-		    }
-		} else {
-		    $self->handle_request($reqstate, $auth, $method, $path);
-		}
-	    } elsif ($line =~ /^([^:\s]+)\s*:\s*(.*)/) {
-		$r->push_header($state->{key}, $state->{val}) if $state->{key};
-		($state->{key}, $state->{val}) = ($1, $2);
-		$self->unshift_read_header($reqstate, $state);
-	    } elsif ($line =~ /^\s+(.*)/) {
-		$state->{val} .= " $1";
-		$self->unshift_read_header($reqstate, $state);
-	    } else {
-		$self->error($reqstate, 506, "unable to parse request header");
-	    }
-	};
-	warn $@ if $@;
-    });
-};
-
-sub push_request_header {
-    my ($self, $reqstate) = @_;
-
-    eval {
-	$reqstate->{hdl}->push_read(line => sub {
-	    my ($hdl, $line) = @_;
-
-	    eval {
-		# print "got request header: $line\n" if $self->{debug};
-
-		$reqstate->{keep_alive}--;
-
-		if ($line =~ /(\S+)\040(\S+)\040HTTP\/(\d+)\.(\d+)/o) {
-		    my ($method, $url, $maj, $min) = ($1, $2, $3, $4);
-
-		    if ($maj != 1) {
-			$self->error($reqstate, 506, "http protocol version $maj.$min not supported");
-			return;
-		    }
-		    if ($url =~ m|^[^/]*@|) {
-			# if an '@' comes before the first slash proxy forwarding might consider
-			# the frist part of the url to be part of an authority...
-			$self->error($reqstate, 400, "invalid url");
-			return;
-		    }
-
-		    $self->{request_count}++; # only count valid request headers
-		    if ($self->{request_count} >= $self->{max_requests}) {
-			$self->{end_loop} = 1;
-		    }
-		    $reqstate->{log} = { requestline => $line };
-		    $reqstate->{proto}->{str} = "HTTP/$maj.$min";
-		    $reqstate->{proto}->{maj} = $maj;
-		    $reqstate->{proto}->{min} = $min;
-		    $reqstate->{proto}->{ver} = $maj*1000+$min;
-		    $reqstate->{request} = HTTP::Request->new($method, $url);
-		    $reqstate->{starttime} = [gettimeofday],
-
-		    $self->unshift_read_header($reqstate);
-		} elsif ($line eq '') {
-		    # ignore empty lines before requests (browser bugs?)
-		    $self->push_request_header($reqstate);
-		} else {
-		    $self->error($reqstate, 400, 'bad request');
-		}
-	    };
-	    warn $@ if $@;
-	});
-    };
-    warn $@ if $@;
-}
-
 sub begin_process_next_request {
     my ($self, $reqstate) = @_;
 
@@ -2243,7 +1970,7 @@ sub accept_connections {
 
 	    $self->dprint("ACCEPT FH" .  $clientfh->fileno() . " CONN$self->{conn_count}");
 
-	    $self->push_request_header($reqstate);
+	    $self->begin_process_next_request($reqstate);
 	}
     };
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing
  2023-02-17 15:25 [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Max Carrara
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests Max Carrara
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 2/2] http-server: anyevent: replace request processing subroutine Max Carrara
@ 2023-02-17 17:15 ` Thomas Lamprecht
  2023-02-20  9:40   ` Max Carrara
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Lamprecht @ 2023-02-17 17:15 UTC (permalink / raw)
  To: Proxmox VE development discussion, Max Carrara

Am 17/02/2023 um 16:25 schrieb Max Carrara:
> This patch series refactors the request processing algorithm by
> separating its data and logic from one another in order to make future
> changes regarding requests more transparent and explicit.
> 
> Patch 1: Introduce the refactored algorithm
> Patch 2: Replace and remove the existing algorithm with the new one

See not much benefit in having this split into two patches in this way,
constructing (one or ideally more) patches such that the refactored (out)
code is directly put to use in(stead of) it's old place has normally more
benefits, even if just allows one to see what actually moved (e.g., using
the `--color-moved=zebra --color-moved-ws=ignore-all-space` git switches)

But no need to resend now, I can squash locally for doing a more in-depth
review first.

> 
> This refactor is conducted very carefully to ensure that none of the
> existing behaviours change in any way.

Careful as in? ^^ Doesn't seem to be a very conservative "just split up",
but more involved; and there isn't any testing added to ensure the changes
don't mess with semantics - which tbf might not be the easiest thing to do
in a sensible manner, that is actually covering a lot, here.

But especially header parsing with some edge cases could be quite nicely
done with the respective request methods mocked.

> One exception however would be
> the reading of the HTTP header itself: Instead of reading the
> header by recursively appending / prepending callbacks to the handle's
> and reading the header line-by-line, the entire header is read at
> once, reducing the number of callbacks needed to parse the header from
> the number of lines to one. This affects when the header line count
> and size limits are checked - instead of being checked *while* the
> header is being read, these limits are verified *after* the header
> was read from the read buffer. If this change in behaviour is
> undesirable, I'll gladly provide a v2.

that seems to make those limits mostly (i.e., besides maybe proxied
requests) irrelevant though? If one spent all of the resources to process
a huge header fully it doesn't makes sense to only then die if it's deemed
to big.

But again, I'd wait out for a more in-depth review before going on a v2.

> 
> Otherwise, the behaviour is identical; the logic is restructured,
> but entirely preserved.
> 
> 
> To give a more concrete example of what these changes would make more
> transparent, the enhancement requested in #4494[1] would become just
> another step (-> subroutine) in the refactored algorithm. That way it
> becomes much clearer where the TLS handshake is being verified
> and which responses (e.g. a `301`) are sent in case the handshake
> never happened or was aborted.

A patch, even if just POC, would be a more concrete example ;-) that could
then _actually_ show that the changes actually provide a nicer interface to
extend.

> 
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=4494
> 
> 
> pve-http-server:
> 
> Max Carrara (2):
>   http-server: anyevent: add new subroutine for processing requests
>   http-server: anyevent: replace request processing subroutine
> 
>  src/PVE/APIServer/AnyEvent.pm | 671 ++++++++++++++++++++++------------
>  1 file changed, 445 insertions(+), 226 deletions(-)
> 

puh, almost doubles SLOC...




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing
  2023-02-17 17:15 ` [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Thomas Lamprecht
@ 2023-02-20  9:40   ` Max Carrara
  2023-02-20  9:59     ` [pve-devel] [PATCH] http-server: redirect HTTP to HTTPS Max Carrara
  0 siblings, 1 reply; 7+ messages in thread
From: Max Carrara @ 2023-02-20  9:40 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox VE development discussion

On 2/17/23 18:15, Thomas Lamprecht wrote:
> Am 17/02/2023 um 16:25 schrieb Max Carrara:
>> This patch series refactors the request processing algorithm by
>> separating its data and logic from one another in order to make future
>> changes regarding requests more transparent and explicit.
>>
>> Patch 1: Introduce the refactored algorithm
>> Patch 2: Replace and remove the existing algorithm with the new one
> 
> See not much benefit in having this split into two patches in this way,
> constructing (one or ideally more) patches such that the refactored (out)
> code is directly put to use in(stead of) it's old place has normally more
> benefits, even if just allows one to see what actually moved (e.g., using
> the `--color-moved=zebra --color-moved-ws=ignore-all-space` git switches)
> 
> But no need to resend now, I can squash locally for doing a more in-depth
> review first.

Alright, noted! Will keep that in mind.


>> This refactor is conducted very carefully to ensure that none of the
>> existing behaviours change in any way.
> 
> Careful as in? ^^ Doesn't seem to be a very conservative "just split up",
> but more involved; and there isn't any testing added to ensure the changes
> don't mess with semantics - which tbf might not be the easiest thing to do
> in a sensible manner, that is actually covering a lot, here.
> 
> But especially header parsing with some edge cases could be quite nicely
> done with the respective request methods mocked.

Mocking in which way exactly? Via `curl` or similar, or via separate
tests? If I should add (Perl) tests, where should I put them and how
are they best implemented?


>> One exception however would be
>> the reading of the HTTP header itself: Instead of reading the
>> header by recursively appending / prepending callbacks to the handle's
>> and reading the header line-by-line, the entire header is read at
>> once, reducing the number of callbacks needed to parse the header from
>> the number of lines to one. This affects when the header line count
>> and size limits are checked - instead of being checked *while* the
>> header is being read, these limits are verified *after* the header
>> was read from the read buffer. If this change in behaviour is
>> undesirable, I'll gladly provide a v2.
> 
> that seems to make those limits mostly (i.e., besides maybe proxied
> requests) irrelevant though? If one spent all of the resources to process
> a huge header fully it doesn't makes sense to only then die if it's deemed
> to big.
> 
> But again, I'd wait out for a more in-depth review before going on a v2.

Very fair point - I hadn't viewed it as too problematic in this case,
because HTTP headers rarely get that big anyway, don't they?

>> Otherwise, the behaviour is identical; the logic is restructured,
>> but entirely preserved.
>>
>>
>> To give a more concrete example of what these changes would make more
>> transparent, the enhancement requested in #4494[1] would become just
>> another step (-> subroutine) in the refactored algorithm. That way it
>> becomes much clearer where the TLS handshake is being verified
>> and which responses (e.g. a `301`) are sent in case the handshake
>> never happened or was aborted.
> 
> A patch, even if just POC, would be a more concrete example ;-) that could
> then _actually_ show that the changes actually provide a nicer interface to
> extend.

Sure! I'll send one in.






^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH] http-server: redirect HTTP to HTTPS
  2023-02-20  9:40   ` Max Carrara
@ 2023-02-20  9:59     ` Max Carrara
  0 siblings, 0 replies; 7+ messages in thread
From: Max Carrara @ 2023-02-20  9:59 UTC (permalink / raw)
  To: pve-devel

Note: This change isn't final yet and shall only serve as an example.

This change adds another subroutine in the request processing
algorithm that verifies whether the TLS handshake has occurred
after the HTTP header was read, redirecting the user to HTTPS if
it hasn't.

In order to let AnyEvent handle TLS handshakes automatically,
the `tls_autostart` callback is added to the handle's read queue,
replacing the `tls` and `tls_ctx` constructor attributes.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
 src/PVE/APIServer/AnyEvent.pm | 115 ++++++++++++++++++++++++++++++----
 1 file changed, 102 insertions(+), 13 deletions(-)

diff --git a/src/PVE/APIServer/AnyEvent.pm b/src/PVE/APIServer/AnyEvent.pm
index 86f5e07..059d710 100644
--- a/src/PVE/APIServer/AnyEvent.pm
+++ b/src/PVE/APIServer/AnyEvent.pm
@@ -22,6 +22,7 @@ use Compress::Zlib;
 use Digest::MD5;
 use Digest::SHA;
 use Encode;
+use Errno qw(EBADMSG);
 use Fcntl ();
 use Fcntl;
 use File::Find;
@@ -1331,6 +1332,7 @@ sub begin_process_next_request {
 		my @process_steps = \(
 		    &parse_request_header_method,
 		    &parse_request_header_fields,
+                    &ensure_tls_connection,
 		    &postprocess_header_fields,
 		    &authenticate_and_handle_request,
 		);
@@ -1513,6 +1515,36 @@ sub parse_request_header_fields {
     return { success => 1 };
 }
 
+sub ensure_tls_connection {
+    my ($self, $reqstate) = @_;
+
+    $self->dprint("TLS session: " . $reqstate->{hdl}->{tls});
+    $self->dprint("TLS handshake succeeded: " . $reqstate->{tls_handshake_succeeded});
+
+    if ($reqstate->{hdl}->{tls} && $reqstate->{tls_handshake_succeeded}) {
+	return { success => 1 };
+    }
+
+    my $request = $reqstate->{request};
+    my $h_host = $reqstate->{request}->header('Host');
+
+    die "request object not found in reqstate\n"
+	if !$request;
+
+    die "Header field 'Host' not found in request\n"
+	if !$h_host;
+
+    my $secure_host = "https://" . ($h_host =~ s/^http(s)?:\/\///r);
+    my $h_location = $secure_host . $request->uri();
+
+    return {
+	success => 0,
+	http_code => 301,
+	http_message => 'Moved Permanently',
+	http_header => HTTP::Headers->new('Location' => $h_location),
+    };
+}
+
 sub postprocess_header_fields {
     my ($self, $reqstate) = @_;
 
@@ -1950,26 +1982,27 @@ sub accept_connections {
 		timeout => $self->{timeout},
 		linger => 0, # avoid problems with ssh - really needed ?
 		on_eof   => sub {
-		    my ($hdl) = @_;
-		    eval {
-			$self->log_aborted_request($reqstate);
-			$self->client_do_disconnect($reqstate);
-		    };
-		    if (my $err = $@) { syslog('err', $err); }
+		    $self->handle_on_eof($reqstate, @_);
 		},
 		on_error => sub {
-		    my ($hdl, $fatal, $message) = @_;
-		    eval {
-			$self->log_aborted_request($reqstate, $message);
-			$self->client_do_disconnect($reqstate);
-		    };
-		    if (my $err = $@) { syslog('err', "$err"); }
+		    $self->handle_on_error($reqstate, @_);
 		},
-		($self->{tls_ctx} ? (tls => "accept", tls_ctx => $self->{tls_ctx}) : ()));
+		on_starttls => sub {
+		    $self->handle_on_starttls($reqstate, @_);
+		},
+		on_stoptls => sub {
+		    $self->handle_on_stoptls($reqstate, @_);
+		}
+	    );
 	    $handle_creation = 0;
 
 	    $self->dprint("ACCEPT FH" .  $clientfh->fileno() . " CONN$self->{conn_count}");
 
+	    if ($self->{tls_ctx}) {
+		$self->dprint("Setting TLS to autostart");
+		$reqstate->{hdl}->unshift_read(tls_autostart => $self->{tls_ctx}, "accept");
+	    }
+
 	    $self->begin_process_next_request($reqstate);
 	}
     };
@@ -1991,6 +2024,62 @@ sub accept_connections {
     $self->wait_end_loop() if $self->{end_loop};
 }
 
+sub handle_on_eof {
+    my ($self, $reqstate, $handle) = @_;
+
+    eval {
+	$self->log_aborted_request($reqstate);
+	$self->client_do_disconnect($reqstate);
+    };
+
+    if (my $err = $@) { syslog('err', "$err"); }
+}
+
+sub handle_on_error {
+    my ($self, $reqstate, $handle, $fatal, $message) = @_;
+
+    $fatal ||= 0;
+    $message ||= '';
+
+    $self->dprint("Encountered error '$!' - fatal: $fatal; handle: $handle;");
+    $self->dprint("message: $message;");
+
+    eval {
+	if ($!{EBADMSG}) {
+	    $self->error($reqstate, 400, 'bad request');
+	} else {
+	    $self->dprint("Error '$!' not handled!");
+	}
+
+	$self->log_aborted_request($reqstate, $message);
+	$self->client_do_disconnect($reqstate);
+    };
+
+    if (my $err = $@) { syslog('err', "$err"); }
+}
+
+sub handle_on_starttls {
+    my ($self, $reqstate, $handle, $success, $message) = @_;
+
+    eval {
+	$reqstate->{tls_handshake_succeeded} = $success;
+
+	if ($success) {
+	    $self->dprint("TLS handshake for handle $handle successful ($message); session started");
+	} else {
+	    $self->dprint("TLS handshake for handle $handle failed");
+	}
+    };
+
+    if (my $err = $@) { syslog('err', "$err"); }
+}
+
+sub handle_on_stoptls {
+    my ($self, $reqstate, $handle) = @_;
+
+    $self->dprint("TLS session for handle $handle stopped");
+}
+
 # Note: We can't open log file in non-blocking mode and use AnyEvent::Handle,
 # because we write from multiple processes, and that would arbitrarily mix output
 # of all processes.
-- 
2.30.2





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests
  2023-02-17 15:25 ` [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests Max Carrara
@ 2023-02-28 14:05   ` Fabian Grünbichler
  0 siblings, 0 replies; 7+ messages in thread
From: Fabian Grünbichler @ 2023-02-28 14:05 UTC (permalink / raw)
  To: Proxmox VE development discussion

On February 17, 2023 4:25 pm, Max Carrara wrote:
> The `begin_process_next_request` subroutine and its auxiliary
> subroutines are added to serve as a replacement for
> `push_request_header` and `unshift_read_header` in order to simplify
> the parsing logic. The entire header is read by a single callback
> in `begin_process_next_request` and then processed step-by-step
> instead of reading each line and handling it separately.
> This is done in order to separate reading the incoming header from
> the processing algorithm's logic while simultaneously making the
> steps used to process the header more transparent and explicit.
> 
> This comes with one noteworthy change in behaviour:
> Because the header is now read all at once from the input buffer,
> the header field count limit is only checked for after the entire
> header was already read. In `unshift_read_header` the limit is
> checked for each line that is read.
> 
> The behaviour otherwise remains fully identical.
> 
> Signed-off-by: Max Carrara <m.carrara@proxmox.com>

Like Thomas already remarked - splitting these two patches doesn't make much
sense, since it's still needed to look at the total diff across both to see what
has changed..

I started going through the changes below and noticed some stuff, but quickly
came to the conclusion that we actually gain very little from switching to
parsing the full read buffer in one go instead of by line
- in the end, the actual parsing still happens by line
- unshift_read_header already contains the state machine handling all of the
steps except for parsing the very first line (and the state is rather
trivial/serial anyway)
- adding 'no TLS, respond with redirect' logic there should be a few lines of
code

OTOH, switching to non-line-based parsing here requires careful analysis of
interactions with AnyEvent internals and buffer sizes with potential for subtle
breakage (e.g., how does this fare under load).

it would still be possible to factor out the post-processing and
authentication/handling part, but that doesn't require turning the header
parsing onto its head.. unshift_read_header could simply call a post_processing
and a authenticate_and_handle_request sub instead of in-lining all that..

> ---
>  src/PVE/APIServer/AnyEvent.pm | 492 ++++++++++++++++++++++++++++++++++
>  1 file changed, 492 insertions(+)
> 
> Note: The subroutines in `@process_steps` of
> `begin_process_next_request` mimic the steps the original algorithm
> takes. If necessary, these can of course be split up into smaller
> steps.
> 
> diff --git a/src/PVE/APIServer/AnyEvent.pm b/src/PVE/APIServer/AnyEvent.pm
> index 3cd77fa..bf7957a 100644
> --- a/src/PVE/APIServer/AnyEvent.pm
> +++ b/src/PVE/APIServer/AnyEvent.pm
> @@ -1570,6 +1570,498 @@ sub push_request_header {
>      warn $@ if $@;
>  }
>  
> +sub begin_process_next_request {
> +    my ($self, $reqstate) = @_;
> +
> +    # Match for *end* of HTTP header fields
> +    my $re_accept = qr<\r?\n\r?\n>;
> +
> +    # Don't reject anything - rejection causes EBADMSG otherwise, aborting the callback
> +    # and might match for unexpected things (such as TLS handshake data!)
> +    my $re_reject = undef;
> +
> +    # Skip over all header fields, but don't match for CRs and NLs
> +    # as those are required for re_accept above
> +    my $re_skip = qr<^.*[^\r\n]>;
> +
> +    eval {
> +	$reqstate->{hdl}->push_read(regex => $re_accept, $re_reject, $re_skip, sub {
> +	    my ($handle, $data) = @_;
> +
> +	    # $data now contains the entire (raw) HTTP header
> +	    $self->dprint("parse_request_header: handle: $handle [$data]");
> +
> +	    eval {
> +		$reqstate->{keep_alive}--;
> +
> +		my $http_header = $data =~ s/^\s+|\s+$//gr ;

why the 'r' modifier? is this the replacement for removing empty lines at the
start of the request - if so, it lost the comment that states this reason and
the RE is unnecessarily opaque.

> +		my ($http_method_line, $http_header_field_lines) = split(/\r?\n/, $http_header, 2);

so now we are splitting by line again

> +
> +		$reqstate->{http_method_line} = $http_method_line;
> +		$reqstate->{http_header_field_lines} = $http_header_field_lines;
> +
> +		# Array of references to the subroutines used to process the request
> +		my @process_steps = \(
> +		    &parse_request_header_method,

and this one here will parse just the first line

> +		    &parse_request_header_fields,

and this one will parse by line again..

> +		    &postprocess_header_fields,
> +		    &authenticate_and_handle_request,

I can actually see the benefit of having these two split out since they actually
only operate on the parsed data - see top-level comment above.

> +		);
> +
> +		for my $step (@process_steps) {
> +		    my $result = $step->($self, $reqstate);
> +
> +		    my $success = $result->{success} || 0;
> +		    if (!$success) {
> +			$self->make_response_from_unsuccessful_result($reqstate, $result);
> +			return;
> +		    }
> +		}
> +
> +	    };
> +
> +	    $self->dprint("parse_request_header: handle $handle DONE");
> +	    warn $@ if $@;
> +
> +	});
> +    };
> +    warn $@ if $@;
> +}
> +
> +sub make_response_from_unsuccessful_result {
> +    my ($self, $reqstate, $result) = @_;
> +
> +    die "No HTTP response status code provided\n"
> +	if !$result->{http_code};
> +
> +    eval {
> +	my $response_object = HTTP::Response->new(
> +	    $result->{http_code},
> +	    $result->{http_message},
> +	    $result->{http_header},
> +	    $result->{http_content}
> +	);
> +
> +	$self->dprint("Sending response: " . $result->{http_code} . " " . $result->{http_message});
> +
> +	$self->response(
> +	    $reqstate,
> +	    $response_object,
> +	    $result->{response_mtime},
> +	    $result->{response_nocomp},
> +	    $result->{response_delay},
> +	    $result->{response_stream_fh}
> +	);
> +
> +	$self->client_do_disconnect($reqstate);
> +    };
> +
> +    warn $@ if $@;
> +}
> +
> +sub parse_request_header_method {
> +    my ($self, $reqstate) = @_;
> +
> +    my $http_method_line = $reqstate->{http_method_line};
> +    die "http_method_line not within reqstate\n"
> +	if !$http_method_line;
> +
> +    my $re_http_method = qr<^(\S+)\040(\S+)\040HTTP\/(\d+)\.(\d+)>;

this RE is different from the original one: qr<> vs =~ //o
- I don't think the 'o' does anything meaningful here, so dropping is probably okay
- the '/' that is escaped in the expression doesn't make sense now, since '/' is
not the delimiter anymore

> +
> +    $self->dprint("Processing initial header line: $http_method_line");

additional debug print, so it's not behaving identical as described by the
commit message ;)

> +
> +    if ($http_method_line =~ $re_http_method) {
> +	my ($method, $uri, $major, $minor) = ($1, $2, $3, $4);
> +
> +	if ($major != 1) {
> +	    return {
> +		success => 0,
> +		http_code => 506,
> +		http_message => "HTTP protocol version $major.$minor not supported",
> +	    };
> +	}
> +
> +	# if an '@' comes before the first slash, proxy forwarding might consider
> +	# the frist part of the uri to be part of an authority
> +	if ($uri =~ m|^[^/]*@|) {
> +	    return {
> +		success => 0,
> +		http_code => 400,
> +		http_message => 'invalid uri',
> +	    };
> +	}
> +

this used to also save the request line (the line being parsed here) for logging
purposes. so /var/log/pveproxy/access.log now looks like this:

::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:46 +0100] "-" 200 1024
::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:47 +0100] "-" 200 1352
::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:48 +0100] "-" 200 73
::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:50 +0100] "-" 200 1028
::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:51 +0100] "-" 200 1360
::ffff:192.168.X.Y - user@realm [28/02/2023:13:49:52 +0100] "-" 200 73

which is not very helpful ;)

> +	$reqstate->{proto}->{str} = "HTTP/$major.$minor";
> +	$reqstate->{proto}->{maj} = $major;
> +	$reqstate->{proto}->{min} = $minor;
> +	$reqstate->{proto}->{ver} = $major*1000 + $minor;
> +
> +	$reqstate->{request} = HTTP::Request->new($method, $uri);
> +	$reqstate->{starttime} = [gettimeofday];
> +
> +	# Only requests with valid HTTP methods are counted
> +	$self->{request_count}++;
> +
> +	if ($self->{request_count} >= $self->{max_requests}) {
> +	    $self->{end_loop} = 1;
> +	}
> +
> +	return { success => 1 };
> +    }
> +
> +    $self->dprint("Could not match for HTTP method / proto: $http_method_line");

same as above

> +    return {
> +	success => 0,
> +	http_code => 400,
> +	http_message => 'bad request',
> +    };
> +}
> +
> +sub parse_request_header_fields {
> +    my ($self, $reqstate) = @_;
> +
> +    my $http_header_field_lines = $reqstate->{http_header_field_lines};
> +
> +    die "http_header_field_lines not within reqstate\n"
> +	if !$reqstate->{http_header_field_lines};
> +
> +    die "request object not within reqstate\n"
> +	if !$reqstate->{request};
> +
> +    my $request = $reqstate->{request};
> +    my $state = { size => 0 };
> +
> +    # Matches a header field definition, e.g.
> +    # 'Foo: Bar'
> +    my $re_field = qr<^([^:\s]+)\s*:\s*(.*)>;
> +
> +    # Matches a line beginning with at least one whitespace character,
> +    # followed by some other data; used to match multiline headers,
> +    # e.g. if 'Foo: Bar' was matched previously, and the next line
> +    # contains '   Qux', the resulting field will be 'Foo: Bar Qux'
> +    my $re_field_multiline = qr<^\s+(.*)>;
> +
> +    my @lines = split(/\r?\n/, $http_header_field_lines);
> +
> +    die "too many http header lines (> $limit_max_headers)\n"
> +        if scalar(@lines) >= $limit_max_headers;
> +
> +
> +    for my $line (@lines) {
> +	$self->dprint("Processing header line: $line");
> +
> +	die "http header too large\n"
> +	    if ($state->{size} += length($line)) >= $limit_max_header_size;
> +
> +	if ($line =~ $re_field) {
> +	    $request->push_header($state->{key}, $state->{value})
> +		if $state->{key};
> +	    ($state->{key}, $state->{value}) = ($1, $2);
> +
> +	} elsif ($line =~ $re_field_multiline) {
> +
> +	    if (!$state->{key}) {  # sanity check for invalid header
> +		return {
> +		    success => 0,
> +		    http_code => 400,
> +		    http_message => 'bad request',
> +		};
> +	    }
> +
> +	    $state->{value} .= " $1";
> +
> +	} else {
> +	    return {
> +		success => 0,
> +		http_code => 506,
> +		http_message => 'unable to parse request header',
> +	    };
> +	}
> +
> +    }
> +
> +    $request->push_header($state->{key}, $state->{value})
> +	if $state->{key};
> +
> +    return { success => 1 };
> +}
> +
> +sub postprocess_header_fields {
> +    my ($self, $reqstate) = @_;
> +
> +    die "request object not within reqstate\n"
> +	if !$reqstate->{request};
> +
> +    my $request = $reqstate->{request};
> +    my $method = $request->method();
> +
> +    if (!$known_methods->{$method}) {
> +	return {
> +	    success => 0,
> +	    http_code => HTTP_NOT_IMPLEMENTED,
> +	    http_message => "method '$method' not available",
> +	};
> +    }
> +
> +    my $h_connection = $request->header('Connection');
> +    my $h_accept_encoding = $request->header('Accept-Encoding');
> +
> +    $reqstate->{accept_gzip} = ($h_accept_encoding && $h_accept_encoding =~ m/gzip/) ? 1 : 0;
> +
> +    if ($h_connection) {
> +	$reqstate->{keep_alive} = 0 if $h_connection =~ m/close/oi;
> +    } else {
> +	if ($reqstate->{proto}->{ver} < 1001) {
> +	    $reqstate->{keep_alive} = 0;
> +	}
> +    }
> +
> +    my $h_transfer_encoding  = $request->header('Transfer-Encoding');
> +    if ($h_transfer_encoding && lc($h_transfer_encoding) eq 'chunked') {
> +	# Handle chunked transfer encoding
> +	return {
> +	    success => 0,
> +	    http_code => 501,
> +	    http_message => 'chunked transfer encoding not supported',
> +	};
> +
> +    } elsif ($h_transfer_encoding) {
> +	return {
> +	    success => 0,
> +	    http_code => 501,
> +	    http_message => "Unknown transfer encoding '$h_transfer_encoding'",
> +	};
> +    }
> +
> +    my $h_pveclientip = $request->header('PVEClientIP');
> +
> +    # FIXME: how can we make PVEClientIP header trusted?
> +    if ($self->{trusted_env} && $h_pveclientip) {
> +	$reqstate->{peer_host} = $h_pveclientip;
> +    } else {
> +	$request->header('PVEClientIP', $reqstate->{peer_host});
> +    }
> +
> +    if (my $rpcenv = $self->{rpcenv}) {
> +	$rpcenv->set_request_host($request->header('Host'));
> +    }
> +
> +    return { success => 1 };
> +}
> +
> +sub authenticate_and_handle_request {
> +    my ($self, $reqstate) = @_;
> +
> +    die "request not within reqstate\n"
> +	if !$reqstate->{request};
> +
> +    my $request = $reqstate->{request};
> +    my $method = $request->method();
> +    my $path = uri_unescape($request->uri->path());
> +    my $base_uri = $self->{base_uri};
> +
> +    # Begin authentication
> +    $self->dprint("Begin authentication");
> +    my $auth = {};
> +
> +    if ($self->{spiceproxy}) {
> +	$self->dprint("Authenticating via spiceproxy");
> +	my $connect_str = $request->header('Host');
> +	my ($vmid, $node, $port) = $self->verify_spice_connect_url($connect_str);
> +
> +	if (!(defined($vmid) && $node && $port)) {
> +	    return {
> +		success => 0,
> +		http_code => HTTP_UNAUTHORIZED,
> +		http_message => 'invalid ticket',
> +	    };
> +	}
> +
> +	$self->handle_spice_proxy_request($reqstate, $connect_str, $vmid, $node, $port);
> +	$self->dprint("spiceproxy request handled");
> +	return { success => 1 };
> +
> +    } elsif ($path =~ m/^\Q$base_uri\E/) {
> +	$self->dprint("Authenticating via base_uri regex");
> +	my $token = $request->header('CSRFPreventionToken');
> +	my $cookie = $request->header('Cookie');
> +	my $auth_header = $request->header('Authorization');
> +
> +	# prefer actual cookie
> +	my $ticket = PVE::APIServer::Formatter::extract_auth_value($cookie, $self->{cookie_name});
> +
> +	# fallback to cookie in 'Authorization' header
> +	$ticket = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{cookie_name})
> +	    if !$ticket;
> +
> +	# finally, fallback to API token if no ticket has been provided so far
> +	my $api_token;
> +	$api_token = PVE::APIServer::Formatter::extract_auth_value($auth_header, $self->{apitoken_name})
> +	    if !$ticket;
> +
> +	my ($rel_uri, $format) = $split_abs_uri->($path, $base_uri);
> +
> +	if (!$format) {
> +	    return {
> +		success => 0,
> +		http_code => HTTP_NOT_IMPLEMENTED,
> +		http_message => 'no such uri',
> +	    };
> +	}
> +
> +	$self->dprint("Calling auth_handler");
> +	eval {
> +	    $auth = $self->auth_handler(
> +		$method,
> +		$rel_uri,
> +		$ticket,
> +		$token,
> +		$api_token,
> +		$reqstate->{peer_host}
> +	    );
> +	};
> +	$self->dprint("auth_handler done");
> +
> +	if (my $err = $@) {
> +	    $self->dprint("auth_handler threw error $@");
> +	    # HACK: see Note 1
> +	    Net::SSLeay::ERR_clear_error();
> +
> +	    my $result = {
> +		success => 0,
> +		response_delay => 3, # always delay unauthorized calls by 3 seconds
> +		response_nocomp => 0,
> +	    };
> +
> +	    if (ref($err) eq "PVE::Exception") {
> +		$result->{http_code} = $err->{code} || HTTP_INTERNAL_SERVER_ERROR;
> +		$result->{http_message} = $err->{msg};
> +
> +	    } elsif (my $formatter = PVE::APIServer::Formatter::get_login_formatter($format)) {
> +		my ($raw, $ct, $nocomp) =
> +		    $formatter->($path, $auth, $self->{formatter_config});
> +
> +		if (ref($raw) && (ref($raw) eq 'HTTP::Response')) {
> +		    $result->{http_code} = $raw->code;
> +		    $result->{http_message} = $raw->message;
> +		    $result->{http_header} = $raw->headers;
> +		    $result->{http_content} = $raw->content;
> +
> +		} else {
> +		    $result->{http_code} = HTTP_UNAUTHORIZED;
> +		    $result->{http_message} = 'Login Required';
> +		    $result->{http_header} = HTTP::Headers->new('Content-Type' => $ct);
> +		    $result->{http_content} = $raw;
> +
> +		}
> +
> +		$result->{response_nocomp} = $nocomp;
> +
> +	    } else {
> +		$result->{http_code} = HTTP_UNAUTHORIZED;
> +		$result->{http_message} = $err;
> +	    }
> +
> +	    return $result;
> +	}
> +    }
> +
> +    $reqstate->{log}->{userid} = $auth->{userid};
> +
> +    # Continue handling request
> +    $self->dprint("Continuing to handle request");
> +
> +    my $h_content_length = $request->header('Content-Length');
> +    my $h_content_type = $request->header('Content-Type');
> +
> +    if (!$h_content_length) {
> +	$self->dprint("No Content-Length provided; handling request the usual way");
> +	$self->handle_request($reqstate, $auth, $method, $path);
> +	return { success => 1 };
> +    }
> +
> +    if (!($method eq 'PUT' || $method eq 'POST')) {
> +	return {
> +	    success => 0,
> +	    http_code => 501,
> +	    http_message => "Unexpected content for method '$method'",
> +	};
> +    }
> +
> +    my ($content, $boundary) = $h_content_type ? parse_content_type($h_content_type) : ();
> +
> +    if ($auth->{isUpload} && !$self->{trusted_env}) {
> +	die "upload 'Content-Type '$h_content_type' not implemented\n"
> +	    if !($boundary && $content && ($content eq 'multipart/form-data'));
> +
> +	die "upload without content length header not supported"
> +	    if !$h_content_length;
> +
> +	$self->dprint("start upload $path $content $boundary");
> +
> +	my $tmpfilename = get_upload_filename();
> +	my $outfh = IO::File->new($tmpfilename, O_RDWR|O_CREAT|O_EXCL, 0600) ||
> +	    die "unable to create temporary upload file '$tmpfilename'";
> +
> +	$reqstate->{keep_alive} = 0;
> +
> +	my $boundlen = length($boundary) + 8; # \015?\012--$boundary--\015?\012
> +
> +	my $state = {
> +	    size => $h_content_length,
> +	    boundary => $boundary,
> +	    ctx => Digest::MD5->new,
> +	    boundlen =>  $boundlen,
> +	    maxheader => 2048 + $boundlen, # should be large enough
> +	    params => decode_urlencoded($request->url->query()),
> +	    phase => 0,
> +	    read => 0,
> +	    post_size => 0,
> +	    starttime => [gettimeofday],
> +	    outfh => $outfh,
> +	};
> +
> +	$reqstate->{tmpfilename} = $tmpfilename;
> +	$reqstate->{hdl}->on_read(sub {
> +	    $self->file_upload_multipart($reqstate, $auth, $method, $path, $state);
> +	});
> +
> +	return { success => 1 };
> +    }
> +
> +    if ($h_content_length > $limit_max_post) {
> +	return {
> +	    success => 0,
> +	    http_code => 501,
> +	    'for data too large',
> +	};
> +    }
> +
> +    if (
> +	!$content
> +	|| $content eq 'application/x-www-form-urlencoded'
> +	|| $content eq 'application/json'
> +    ) {
> +	$self->dprint("Received content '$content', reading chunk length of '$h_content_length'");
> +
> +	$reqstate->{hdl}->unshift_read(chunk => $h_content_length, sub {
> +	    my ($hdl, $data) = @_;
> +	    $self->dprint("Read data for content '$content'");
> +	    $self->dprint("Continuing to handle request after reading content");
> +	    $request->content($data);
> +	    $self->handle_request($reqstate, $auth, $method, $path);
> +	});
> +	return { success => 1 };
> +
> +    } else {
> +	return {
> +	    success => 0,
> +	    http_code => 506,
> +	    http_message => "upload 'Content-Type '$h_content_type' not implemented",
> +	};
> +    }
> +}
> +
>  sub accept {
>      my ($self) = @_;
>  
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-02-28 14:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-17 15:25 [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Max Carrara
2023-02-17 15:25 ` [pve-devel] [PATCH http-server 1/2] http-server: anyevent: add new subroutine for processing requests Max Carrara
2023-02-28 14:05   ` Fabian Grünbichler
2023-02-17 15:25 ` [pve-devel] [PATCH http-server 2/2] http-server: anyevent: replace request processing subroutine Max Carrara
2023-02-17 17:15 ` [pve-devel] [PATCH http-server 0/2] refactor HTTP request processing Thomas Lamprecht
2023-02-20  9:40   ` Max Carrara
2023-02-20  9:59     ` [pve-devel] [PATCH] http-server: redirect HTTP to HTTPS Max Carrara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal