From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <pve-devel-bounces@lists.proxmox.com> Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 0DF711FF15C for <inbox@lore.proxmox.com>; Wed, 5 Mar 2025 22:46:01 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id C9EF61F0A5; Wed, 5 Mar 2025 22:45:56 +0100 (CET) Date: Wed, 05 Mar 2025 21:44:55 +0000 To: pve-devel@lists.proxmox.com MIME-Version: 1.0 Message-ID: <mailman.799.1741211155.293.pve-devel@lists.proxmox.com> List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com> List-Post: <mailto:pve-devel@lists.proxmox.com> From: Rob Rozestraten via pve-devel <pve-devel@lists.proxmox.com> Precedence: list Cc: Rob Rozestraten <admin@truthsolo.net> X-Mailman-Version: 2.1.29 X-BeenThere: pve-devel@lists.proxmox.com List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe> List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe> List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/> Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com> List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help> Subject: [pve-devel] [PATCH pve-http-server v2 0/1] fix unexpected EOF for client when closing TLS session Content-Type: multipart/mixed; boundary="===============8245613465816549151==" Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com> --===============8245613465816549151== Content-Type: message/rfc822 Content-Disposition: inline Return-Path: <admin@truthsolo.net> X-Original-To: pve-devel@lists.proxmox.com Delivered-To: pve-devel@lists.proxmox.com Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 8B4EED3D0B for <pve-devel@lists.proxmox.com>; Wed, 5 Mar 2025 22:45:55 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 6DA821EFC2 for <pve-devel@lists.proxmox.com>; Wed, 5 Mar 2025 22:45:25 +0100 (CET) Received: from mail-41103.protonmail.ch (mail-41103.protonmail.ch [185.70.41.103]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for <pve-devel@lists.proxmox.com>; Wed, 5 Mar 2025 22:45:24 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=truthsolo.net; s=protonmail3; t=1741211102; x=1741470302; bh=rH6sjGHW8bFfBu1hj84KbSERC+tppg7EIA577xwUhzQ=; h=Date:To:From:Cc:Subject:Message-ID:Feedback-ID:From:To:Cc:Date: Subject:Reply-To:Feedback-ID:Message-ID:BIMI-Selector: List-Unsubscribe:List-Unsubscribe-Post; b=edZh/fKiJnMwzLFtqH7LmLBGxfOR/FcWmGEZ7xPb3znlrqlY+XWY0KZbc94aRfcy1 YL1t/iUQQbdiEodKWMm78zbyz38Die1X49TUsym1EsSPLRqbCB9SN2ceDNvlzHDTMu dVJZpCDmNPB48U/qWnzjG2VSRpddef+RP6EDQL5Lbwh1NIsuHASqGatJCsggF9QlcD CUJWNVrBsGF+PL/PWE0SefhplYpIiL/urKAPp+THRX7H8VAiNb6inkPe/oQ4jMhH/g RNt65kiYwZvm4dLic7JEDx60HMLmgzNeNutIEniCjkz2qwlchx/xeKlAQ8q93Wnlz8 CclEKm5HTTtzg== Date: Wed, 05 Mar 2025 21:44:55 +0000 To: pve-devel@lists.proxmox.com From: Rob Rozestraten <admin@truthsolo.net> Subject: [PATCH pve-http-server v2 0/1] fix unexpected EOF for client when closing TLS session Message-ID: <20250305214447.128975-1-admin@truthsolo.net> Feedback-ID: 48530542:user:proton X-Pm-Message-ID: 13dec5a5018347c36713fece96392a7c004bed09 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-SPAM-LEVEL: Spam detection results: 0 AWL 0.001 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DKIM_SIGNED 0.1 Message has a DKIM or DK signature, not necessarily valid DKIM_VALID -0.1 Message has at least one valid DKIM or DK signature DKIM_VALID_AU -0.1 Message has a valid DKIM or DK signature from author's domain DKIM_VALID_EF -0.1 Message has a valid DKIM or DK signature from envelope-from domain DMARC_PASS -0.1 DMARC pass policy RCVD_IN_DNSWL_NONE -0.0001 Sender listed at https://www.dnswl.org/, no trust SPF_HELO_PASS -0.001 SPF: HELO matches SPF record SPF_PASS -0.001 SPF: sender matches SPF record > A little bit of analysis of why this is needed and okay would be great to= have here in the commit message ;) Thanks for the feedback, Fabian! First time with git-email.. I'm adding details with logs here, and a detailed description of the issue & solution in the commit. > nit: the if is not needed, as `stoptls` only does anything anyway if $hdl= has a FH and is a TLS-enabled handle ;) Thanks! :D I added some debugging to AnyEvent.pm. Here is a curl request that returns = 200 OK: 1741209019.890976 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = ACCEPT FH10 CONN1 1741209019.891069 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = Setting TLS to autostart ::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:19 -0700] "GET= /api2/json/cluster HTTP/1.1" 200 360 1741209019.916084 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re= sponse: KEEPALIVE 99 1741209019.916693 worker[1636452]: AnyEvent::Handle +1350: __ANON__: EOF ac= cept_connections() 1741209019.916716 worker[1636452]: PVE::APIServer::AnyEvent +1951: (eval): = >>> accept_connections().on_eof 1741209019.916756 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d= o_disconnect: START client_do_disconnect() 1741209019.916827 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d= o_disconnect: close connection AnyEvent::Handle=3DHASH(0x595788db52a8) = = =20 1741209019.916845 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> shutdown($hdl->{fh}, 1) 1741209019.916889 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> ok 1741209019.916920 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d= o_disconnect: CLOSE FH10 CONN0 A curl request that returns 400:=20 1741209023.423134 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = ACCEPT FH10 CONN1 1741209023.423207 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = Setting TLS to autostart ::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:23 -0700] "GET= /api2/json/cluster/nextid?vmid=3D106 HTTP/1.1" 400 55 = = =20 1741209023.439439 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re= sponse: DRAIN finish_response() 1741209023.439486 worker[1636452]: PVE::APIServer::AnyEvent +219: (eval): >= >> finish_response().on_drain 1741209023.439547 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: START client_do_disconnect() 1741209023.439591 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: close connection AnyEvent::Handle=3DHASH(0x595788de2318) = = =20 1741209023.439610 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> shutdown($hdl->{fh}, 1) 1741209023.439648 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> ok 1741209023.439682 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: CLOSE FH10 CONN0 Requests returning 400 send "Connection: close" and tear down the socket im= mediately, resulting in an unexpected EOF at the client. Requests returning= 200 send "Connection: keep-alive" and have a 5 second timeout. If we make a request but let the connection sit idle, the 5 second timeout = is reached, and we receive unexpected EOF again: 1741209026.694101 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = ACCEPT FH10 CONN1 1741209026.694173 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): = Setting TLS to autostart ::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:26 -0700] "GET= /api2/json/cluster HTTP/1.1" 200 360 1741209026.698288 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re= sponse: KEEPALIVE 99 1741209031.699939 worker[1636452]: AnyEvent::Handle +655: __ANON__: ERROR a= ccept_connections() 1741209031.700154 worker[1636452]: PVE::APIServer::AnyEvent +1962: (eval): = >>> accept_connections().on_error 1741209031.700186 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: START client_do_disconnect() 1741209031.700212 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: close connection AnyEvent::Handle=3DHASH(0x595788de1e68) 1741209031.700240 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> shutdown($hdl->{fh}, 1) 1741209031.700362 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> ok 1741209031.700408 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: CLOSE FH10 CONN0 To fix this, we can send a TLS close notify before shutting down the socket= . If we do stoptls() before clearing the handlers, however, we enter client_d= o_disconnect() again: 1741209459.761847 worker[1639023]: PVE::APIServer::AnyEvent +1910: (eval): = ACCEPT FH10 CONN1 1741209459.761948 worker[1639023]: PVE::APIServer::AnyEvent +1910: (eval): = Setting TLS to autostart ::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:17:39 -0700] "GET= /api2/json/cluster/nextid?vmid=3D106 HTTP/1.1" 400 55 = = =20 1741209459.787932 worker[1639023]: PVE::APIServer::AnyEvent +423: finish_re= sponse: DRAIN finish_response() 1741209459.788003 worker[1639023]: PVE::APIServer::AnyEvent +219: (eval): >= >> finish_response().on_drain 1741209459.788030 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: START client_do_disconnect() 1741209459.788076 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: close connection AnyEvent::Handle=3DHASH(0x603027214400) = = =20 1741209459.788130 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> $hdl->stoptls() 1741209459.788221 worker[1639023]: PVE::APIServer::AnyEvent +219: (eval): >= >> finish_response().on_drain 1741209459.788248 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: START client_do_disconnect() 1741209459.788493 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: ERROR: detected empty handle 1741209459.788553 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> shutdown($hdl->{fh}, 1) 1741209459.788593 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> ok 1741209459.788649 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do= _disconnect: CLOSE FH10 CONN0 So we move these after the clearing of handlers. This was fixed by commit f= 737984 for some cases, perhaps for all cases by clearing handlers prior to = shutdown? > If I read the docs right, this could block (would that be an issue here?)= and could potentially destroy the handle (so that might need to be recheck= ed afterwards to prevent spurious warnings?) >=20 > what happens if we initiate the teardown, and the client never acks it? I don't see this blocking in my observations. The server continues with soc= ket teardown immediately after sending TLS close notify. If no response is = received from the client, it remains in FIN_WAIT1 until that timeout is rea= ched. This was tested by making a request with keepalive then immediately b= locking all outbound on the client prior to reaching the 5 second timeout: 1741209811.245870 worker[1640562]: PVE::APIServer::AnyEvent +1910: (eval): = ACCEPT FH10 CONN1 1741209811.245963 worker[1640562]: PVE::APIServer::AnyEvent +1910: (eval): = Setting TLS to autostart ::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:23:31 -0700] "GET= /api2/json/cluster HTTP/1.1" 200 360 1741209811.260657 worker[1640562]: PVE::APIServer::AnyEvent +423: finish_re= sponse: KEEPALIVE 99 1741209816.262383 worker[1640562]: AnyEvent::Handle +655: __ANON__: ERROR a= ccept_connections() 1741209816.262532 worker[1640562]: PVE::APIServer::AnyEvent +1962: (eval): = >>> accept_connections().on_error 1741209816.262631 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: START client_do_disconnect() 1741209816.262706 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: close connection AnyEvent::Handle=3DHASH(0x5eede90cc068) = = =20 1741209816.262750 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> $hdl->stoptls() 1741209816.262924 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> shutdown($hdl->{fh}, 1) 1741209816.262975 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:= >>> ok 1741209816.262997 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d= o_disconnect: CLOSE FH10 CONN0 There is almost no time between stoptls() and shutdown(). We can see the cl= ient never responded by inspecting the connection on the server: FIN-WAIT-1 0 25 [::ffff:10.36.88.27]:8006 [::ffff:10.36.88.28]:48= 382 As for the handle being destroyed, it gets deleted in client_do_disconnect(= ) and I do not see it being used beyond there. I observed no warnings durin= g testing, but it's certainly possible I missed something. If there are any= additional tests I can do here, please let me know! cheers, Rob Rozestraten (1): fix unexpected EOF for client when closing TLS session src/PVE/APIServer/AnyEvent.pm | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --=20 2.48.1 --===============8245613465816549151== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel --===============8245613465816549151==--