From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id 0DF711FF15C
	for <inbox@lore.proxmox.com>; Wed,  5 Mar 2025 22:46:01 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id C9EF61F0A5;
	Wed,  5 Mar 2025 22:45:56 +0100 (CET)
Date: Wed, 05 Mar 2025 21:44:55 +0000
To: pve-devel@lists.proxmox.com
MIME-Version: 1.0
Message-ID: <mailman.799.1741211155.293.pve-devel@lists.proxmox.com>
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
From: Rob Rozestraten via pve-devel <pve-devel@lists.proxmox.com>
Precedence: list
Cc: Rob Rozestraten <admin@truthsolo.net>
X-Mailman-Version: 2.1.29
X-BeenThere: pve-devel@lists.proxmox.com
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
Subject: [pve-devel] [PATCH pve-http-server v2 0/1] fix unexpected EOF for
 client when closing TLS session
Content-Type: multipart/mixed; boundary="===============8245613465816549151=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

--===============8245613465816549151==
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: <admin@truthsolo.net>
X-Original-To: pve-devel@lists.proxmox.com
Delivered-To: pve-devel@lists.proxmox.com
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by lists.proxmox.com (Postfix) with ESMTPS id 8B4EED3D0B
	for <pve-devel@lists.proxmox.com>; Wed,  5 Mar 2025 22:45:55 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 6DA821EFC2
	for <pve-devel@lists.proxmox.com>; Wed,  5 Mar 2025 22:45:25 +0100 (CET)
Received: from mail-41103.protonmail.ch (mail-41103.protonmail.ch [185.70.41.103])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by firstgate.proxmox.com (Proxmox) with ESMTPS
	for <pve-devel@lists.proxmox.com>; Wed,  5 Mar 2025 22:45:24 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=truthsolo.net;
	s=protonmail3; t=1741211102; x=1741470302;
	bh=rH6sjGHW8bFfBu1hj84KbSERC+tppg7EIA577xwUhzQ=;
	h=Date:To:From:Cc:Subject:Message-ID:Feedback-ID:From:To:Cc:Date:
	 Subject:Reply-To:Feedback-ID:Message-ID:BIMI-Selector:
	 List-Unsubscribe:List-Unsubscribe-Post;
	b=edZh/fKiJnMwzLFtqH7LmLBGxfOR/FcWmGEZ7xPb3znlrqlY+XWY0KZbc94aRfcy1
	 YL1t/iUQQbdiEodKWMm78zbyz38Die1X49TUsym1EsSPLRqbCB9SN2ceDNvlzHDTMu
	 dVJZpCDmNPB48U/qWnzjG2VSRpddef+RP6EDQL5Lbwh1NIsuHASqGatJCsggF9QlcD
	 CUJWNVrBsGF+PL/PWE0SefhplYpIiL/urKAPp+THRX7H8VAiNb6inkPe/oQ4jMhH/g
	 RNt65kiYwZvm4dLic7JEDx60HMLmgzNeNutIEniCjkz2qwlchx/xeKlAQ8q93Wnlz8
	 CclEKm5HTTtzg==
Date: Wed, 05 Mar 2025 21:44:55 +0000
To: pve-devel@lists.proxmox.com
From: Rob Rozestraten <admin@truthsolo.net>
Subject: [PATCH pve-http-server v2 0/1] fix unexpected EOF for client when closing TLS session
Message-ID: <20250305214447.128975-1-admin@truthsolo.net>
Feedback-ID: 48530542:user:proton
X-Pm-Message-ID: 13dec5a5018347c36713fece96392a7c004bed09
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-SPAM-LEVEL: Spam detection results:  0
	AWL                     0.001 Adjusted score from AWL reputation of From: address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DKIM_SIGNED               0.1 Message has a DKIM or DK signature, not necessarily valid
	DKIM_VALID               -0.1 Message has at least one valid DKIM or DK signature
	DKIM_VALID_AU            -0.1 Message has a valid DKIM or DK signature from author's domain
	DKIM_VALID_EF            -0.1 Message has a valid DKIM or DK signature from envelope-from domain
	DMARC_PASS               -0.1 DMARC pass policy
	RCVD_IN_DNSWL_NONE     -0.0001 Sender listed at https://www.dnswl.org/, no trust
	SPF_HELO_PASS          -0.001 SPF: HELO matches SPF record
	SPF_PASS               -0.001 SPF: sender matches SPF record

> A little bit of analysis of why this is needed and okay would be great to=
 have here in the commit message ;)

Thanks for the feedback, Fabian! First time with git-email..

I'm adding details with logs here, and a detailed description of the
issue & solution in the commit.

> nit: the if is not needed, as `stoptls` only does anything anyway if $hdl=
 has a FH and is a TLS-enabled handle ;)

Thanks! :D

I added some debugging to AnyEvent.pm. Here is a curl request that returns =
200 OK:

1741209019.890976 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
ACCEPT FH10 CONN1
1741209019.891069 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
Setting TLS to autostart
::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:19 -0700] "GET=
 /api2/json/cluster HTTP/1.1" 200 360
1741209019.916084 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re=
sponse: KEEPALIVE 99
1741209019.916693 worker[1636452]: AnyEvent::Handle +1350: __ANON__: EOF ac=
cept_connections()
1741209019.916716 worker[1636452]: PVE::APIServer::AnyEvent +1951: (eval): =
>>> accept_connections().on_eof
1741209019.916756 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d=
o_disconnect: START client_do_disconnect()
1741209019.916827 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d=
o_disconnect: close connection AnyEvent::Handle=3DHASH(0x595788db52a8)     =
                                                                           =
              =20
1741209019.916845 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> shutdown($hdl->{fh}, 1)
1741209019.916889 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> ok
1741209019.916920 worker[1636452]: PVE::APIServer::AnyEvent +1954: client_d=
o_disconnect: CLOSE FH10 CONN0

A curl request that returns 400:=20

1741209023.423134 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
ACCEPT FH10 CONN1
1741209023.423207 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
Setting TLS to autostart
::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:23 -0700] "GET=
 /api2/json/cluster/nextid?vmid=3D106 HTTP/1.1" 400 55                     =
                                                                           =
              =20
1741209023.439439 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re=
sponse: DRAIN finish_response()
1741209023.439486 worker[1636452]: PVE::APIServer::AnyEvent +219: (eval): >=
>> finish_response().on_drain
1741209023.439547 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: START client_do_disconnect()
1741209023.439591 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: close connection AnyEvent::Handle=3DHASH(0x595788de2318)      =
                                                                           =
              =20
1741209023.439610 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> shutdown($hdl->{fh}, 1)
1741209023.439648 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> ok
1741209023.439682 worker[1636452]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: CLOSE FH10 CONN0

Requests returning 400 send "Connection: close" and tear down the socket im=
mediately, resulting in an unexpected EOF at the client. Requests returning=
 200 send "Connection: keep-alive" and have a 5 second timeout.

If we make a request but let the connection sit idle, the 5 second timeout =
is reached, and we receive unexpected EOF again:

1741209026.694101 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
ACCEPT FH10 CONN1
1741209026.694173 worker[1636452]: PVE::APIServer::AnyEvent +1910: (eval): =
Setting TLS to autostart
::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:10:26 -0700] "GET=
 /api2/json/cluster HTTP/1.1" 200 360
1741209026.698288 worker[1636452]: PVE::APIServer::AnyEvent +423: finish_re=
sponse: KEEPALIVE 99
1741209031.699939 worker[1636452]: AnyEvent::Handle +655: __ANON__: ERROR a=
ccept_connections()
1741209031.700154 worker[1636452]: PVE::APIServer::AnyEvent +1962: (eval): =
>>> accept_connections().on_error
1741209031.700186 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: START client_do_disconnect()
1741209031.700212 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: close connection AnyEvent::Handle=3DHASH(0x595788de1e68)
1741209031.700240 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> shutdown($hdl->{fh}, 1)
1741209031.700362 worker[1636452]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> ok
1741209031.700408 worker[1636452]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: CLOSE FH10 CONN0

To fix this, we can send a TLS close notify before shutting down the socket=
.

If we do stoptls() before clearing the handlers, however, we enter client_d=
o_disconnect() again:

1741209459.761847 worker[1639023]: PVE::APIServer::AnyEvent +1910: (eval): =
ACCEPT FH10 CONN1
1741209459.761948 worker[1639023]: PVE::APIServer::AnyEvent +1910: (eval): =
Setting TLS to autostart
::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:17:39 -0700] "GET=
 /api2/json/cluster/nextid?vmid=3D106 HTTP/1.1" 400 55                     =
                                                                           =
              =20
1741209459.787932 worker[1639023]: PVE::APIServer::AnyEvent +423: finish_re=
sponse: DRAIN finish_response()
1741209459.788003 worker[1639023]: PVE::APIServer::AnyEvent +219: (eval): >=
>> finish_response().on_drain
1741209459.788030 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: START client_do_disconnect()
1741209459.788076 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: close connection AnyEvent::Handle=3DHASH(0x603027214400)      =
                                                                           =
              =20
1741209459.788130 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> $hdl->stoptls()
1741209459.788221 worker[1639023]: PVE::APIServer::AnyEvent +219: (eval): >=
>> finish_response().on_drain
1741209459.788248 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: START client_do_disconnect()
1741209459.788493 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: ERROR: detected empty handle
1741209459.788553 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> shutdown($hdl->{fh}, 1)
1741209459.788593 worker[1639023]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> ok
1741209459.788649 worker[1639023]: PVE::APIServer::AnyEvent +222: client_do=
_disconnect: CLOSE FH10 CONN0

So we move these after the clearing of handlers. This was fixed by commit f=
737984 for some cases, perhaps for all cases by clearing handlers prior to =
shutdown?

> If I read the docs right, this could block (would that be an issue here?)=
 and could potentially destroy the handle (so that might need to be recheck=
ed afterwards to prevent spurious warnings?)
>=20
> what happens if we initiate the teardown, and the client never acks it?

I don't see this blocking in my observations. The server continues with soc=
ket teardown immediately after sending TLS close notify. If no response is =
received from the client, it remains in FIN_WAIT1 until that timeout is rea=
ched. This was tested by making a request with keepalive then immediately b=
locking all outbound on the client prior to reaching the 5 second timeout:

1741209811.245870 worker[1640562]: PVE::APIServer::AnyEvent +1910: (eval): =
ACCEPT FH10 CONN1
1741209811.245963 worker[1640562]: PVE::APIServer::AnyEvent +1910: (eval): =
Setting TLS to autostart
::ffff:10.36.88.28 - root@pam!katello_test [05/03/2025:14:23:31 -0700] "GET=
 /api2/json/cluster HTTP/1.1" 200 360
1741209811.260657 worker[1640562]: PVE::APIServer::AnyEvent +423: finish_re=
sponse: KEEPALIVE 99
1741209816.262383 worker[1640562]: AnyEvent::Handle +655: __ANON__: ERROR a=
ccept_connections()
1741209816.262532 worker[1640562]: PVE::APIServer::AnyEvent +1962: (eval): =
>>> accept_connections().on_error
1741209816.262631 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: START client_do_disconnect()
1741209816.262706 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: close connection AnyEvent::Handle=3DHASH(0x5eede90cc068)     =
                                                                           =
              =20
1741209816.262750 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> $hdl->stoptls()
1741209816.262924 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> shutdown($hdl->{fh}, 1)
1741209816.262975 worker[1640562]: PVE::APIServer::AnyEvent +191: __ANON__:=
 >>> ok
1741209816.262997 worker[1640562]: PVE::APIServer::AnyEvent +1969: client_d=
o_disconnect: CLOSE FH10 CONN0

There is almost no time between stoptls() and shutdown(). We can see the cl=
ient never responded by inspecting the connection on the server:

FIN-WAIT-1 0      25     [::ffff:10.36.88.27]:8006  [::ffff:10.36.88.28]:48=
382

As for the handle being destroyed, it gets deleted in client_do_disconnect(=
) and I do not see it being used beyond there. I observed no warnings durin=
g testing, but it's certainly possible I missed something. If there are any=
 additional tests I can do here, please let me know!

cheers,

Rob Rozestraten (1):
  fix unexpected EOF for client when closing TLS session

 src/PVE/APIServer/AnyEvent.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--=20
2.48.1




--===============8245613465816549151==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============8245613465816549151==--