From: Hannes Laimer
To: pve-devel@lists.proxmox.com
Subject: [PATCH access-control/common 0/2] address problem with SIGCHLD handler being temporarily overwritten
Date: Wed, 4 Mar 2026 14:46:47 +0100
Message-ID: <20260304134649.82272-1-h.laimer@proxmox.com>
List-Id: Proxmox VE development discussion

Thanks a lot @Fabian and @Fiona for helping me debug this!

The problem is that some libraries temporarily overwrite the SIGCHLD
handler. If such a library is called in quick enough succession, a CHLD
signal delivered while the handler is replaced can get lost, which in
turn prevents `worker_reaper` from being called in RESTEnvironment.
Finished tasks then don't get cleaned up until a different SIGCHLD
arrives at the same `pvedaemon` process and triggers `worker_reaper`.

As @Fabian mentioned in [1], a general rework of the task handling,
potentially with `pidfd`s, would make a lot of sense. These two patches
address the problem within the task handling structure as it currently
is. They
 - run the PAM library call in a fork, so that any signal handler
   changes the library makes are isolated from our process
 - run `worker_reaper` periodically (every 5s) to catch any other
   potential instances of this, since the same could happen with other
   libraries, not just PAM

[1] https://lore.proxmox.com/pve-devel/1772617908.i4bmsyq0kp.astroid@yuna.none/T/#m7b0f3873be5755f330e288cfa50905744f225b2b


pve-common:

Hannes Laimer (1):
  RESTEnvironment: periodically reap workers as SIGCHLD fallback

 src/PVE/RESTEnvironment.pm | 9 +++++++++
 1 file changed, 9 insertions(+)


pve-access-control:

Hannes Laimer (1):
  pam: fork for PAM authentication to isolate SIGCHLD handler

 src/PVE/Auth/PAM.pm | 74 +++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 32 deletions(-)


Summary over all repositories:
  2 files changed, 51 insertions(+), 32 deletions(-)

-- 
Generated by murpp 0.9.0
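
For readers who want to see the two mitigations from the cover letter in isolation, here is a rough sketch. It is written in Python, not the Perl of the actual patches, and is not taken from them. `clobbering_auth`, `auth_in_fork`, `spawn_worker`, `wait_for_workers` and the `workers` set are hypothetical names made up for illustration; `worker_reaper` only borrows the name and is assumed to reap tracked worker PIDs without blocking.

```python
import os
import signal
import time

workers = set()  # PIDs of forked task workers we still have to reap


def worker_reaper(*_args):
    """Reap finished workers without blocking; usable as a SIGCHLD handler."""
    for pid in list(workers):
        try:
            done, _ = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            done = pid  # already reaped elsewhere
        if done == pid:
            workers.discard(pid)


def clobbering_auth(user, password):
    """Stand-in for a library call that temporarily replaces the SIGCHLD handler."""
    old = signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    try:
        return password == "secret"  # placeholder check, not real PAM
    finally:
        signal.signal(signal.SIGCHLD, old)


def auth_in_fork(user, password):
    """Run the call in a child so handler changes never touch the parent."""
    pid = os.fork()
    if pid == 0:
        ok = False
        try:
            ok = clobbering_auth(user, password)
        finally:
            os._exit(0 if ok else 1)  # result travels back as exit status
    _, status = os.waitpid(pid, 0)  # wait for exactly this child
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def spawn_worker(seconds=0.0):
    """Fork a dummy worker that exits after `seconds` and track its PID."""
    pid = os.fork()
    if pid == 0:
        time.sleep(seconds)
        os._exit(0)
    workers.add(pid)
    return pid


def wait_for_workers(interval=5.0, timeout=60.0):
    """Fallback: call worker_reaper every `interval` seconds, signal or not."""
    deadline = time.monotonic() + timeout
    while True:
        worker_reaper()
        if not workers:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, worker_reaper)
    print("auth ok:", auth_in_fork("root", "secret"))
    print("handler intact:", signal.getsignal(signal.SIGCHLD) is worker_reaper)
    spawn_worker(0.1)
    print("all workers reaped:", wait_for_workers(interval=0.05, timeout=5.0))
```

The reaper in this sketch only waits on PIDs it tracks, so the short-lived authentication child can be collected by its own `waitpid` call. The periodic loop does not depend on a signal being delivered at all, which is why it also covers libraries other than PAM.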