From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kefu Chai
To: pve-devel@lists.proxmox.com
Subject: [PATCH v3 http-server 0/1] fix pveproxy OOM in websocket and spice proxy handlers
Date: Fri, 24 Apr 2026 20:11:39 +0800
Message-ID: <20260424121140.3687865-1-k.chai@proxmox.com>
X-Mailer: git-send-email 2.47.3
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

see v2's cover letter [1] for the problem description and the approach.

Changes since v2:

* extract handle_proxy_eof(); the four on_eof sites were copies of each
  other, differing only in $reader and the peer handle.
* fix a busy-loop in the on_eof drain loop: v2's unguarded
  `while length($hdl->{rbuf})` spins when the reader's
  `return if !$peer` short-circuits without consuming rbuf. reachable on
  a ws client close that sets block_disconnect on the backend handle, so
  a final reply from the backend pins the worker at 100% CPU instead of
  completing teardown. the new loop bails on peer-gone or zero progress.
* clear on_drain in apply_read_backpressure() after firing instead of
  leaving the wrapper installed when prev_on_drain is undef. no
  functional impact (idempotent re-set of on_read) but stops pinning a
  reader reference for the rest of the connection.

both of the above are verified with the same synthetic AnyEvent setup
used for v1/v2. reverting just the busy-loop guard reproduces a spin
that trips a 2 s alarm; reverting just the on_drain clear leaves the
wrapper installed after the drain.

on the peer-gone branch the drain loop no-ops and rbuf is released on
handle teardown, same as the pre-v2 behavior (before this series added
on_eof draining, rbuf at on_eof was always discarded). I audited the
users:

* PDM migration's control tunnel (mtunnel) completes each command
  synchronously via write_tunnel, so its teardown carries no protocol
  data; disk data goes over separate NBD-over-ws tunnels set up by
  forward_unix_socket, and a connection drop there surfaces as a clean
  migration abort on the source side rather than silent corruption.
* NoVNC and SPICE display (plus termproxy shell output) lose at most a
  final frame or line, cosmetic.
* SPICE USB passthrough is the one case with potential real data loss,
  but that requires an abrupt ws client close mid-transfer, which is
  rare.

[1] https://lore.proxmox.com/pve-devel/20260413125650.2569621-1-k.chai@proxmox.com/

Kefu Chai (1):
  fix #7483: apiserver: add backpressure to proxy handlers

 src/PVE/APIServer/AnyEvent.pm | 178 +++++++++++++++++++++++++---------
 1 file changed, 133 insertions(+), 45 deletions(-)

-- 
2.47.3
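
for readers who want the shape of the busy-loop guard without opening the patch: below is a minimal sketch in Python, not the Perl code from the series. `Handle`, `drain_on_eof`, `consuming_reader` and `stuck_reader` are hypothetical names made up for illustration; only the two exit conditions (peer gone, zero progress) come from the changelog above.

```python
class Handle:
    """Hypothetical stand-in for a handle that owns a read buffer (rbuf)."""

    def __init__(self, rbuf=b""):
        self.rbuf = rbuf


def drain_on_eof(hdl, reader, peer_alive):
    """Run reader until rbuf is empty, the peer is gone, or nothing was consumed.

    Returns how many times reader was called.
    """
    calls = 0
    while len(hdl.rbuf) > 0:
        if not peer_alive():
            break  # peer gone: leftover rbuf is dropped at teardown
        before = len(hdl.rbuf)
        reader(hdl)
        calls += 1
        if len(hdl.rbuf) >= before:
            break  # zero progress: bail out instead of spinning
    return calls


def consuming_reader(hdl):
    """Assumed reader that forwards up to 4 bytes per call."""
    hdl.rbuf = hdl.rbuf[4:]


def stuck_reader(hdl):
    """Models a reader that returns early without touching rbuf."""
    return
```

with `stuck_reader`, an unguarded `while len(hdl.rbuf) > 0` would never terminate; the guarded version calls the reader once, sees no progress, and returns with rbuf untouched.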