* [PATCH access-control/common 0/2] address problem with SIGCHLD handler being temporarily overwritten
@ 2026-03-04 13:46 Hannes Laimer
  2026-03-04 13:46 ` [PATCH pve-common 1/1] RESTEnvironment: periodically reap workers as SIGCHLD fallback Hannes Laimer
  2026-03-04 13:46 ` [PATCH pve-access-control 1/1] pam: fork for PAM authentication to isolate SIGCHLD handler Hannes Laimer
  0 siblings, 2 replies; 3+ messages in thread
From: Hannes Laimer @ 2026-03-04 13:46 UTC (permalink / raw)
  To: pve-devel

Thanks a lot @Fabian and @Fiona for helping me debug this!

The problem is that some libraries temporarily overwrite the SIGCHLD
handler. If such a library is called in quick enough succession, CHLD
signals can get lost, which in turn prevents `worker_reaper` from being
called in RESTEnvironment. Tasks then won't get cleaned up until a
different SIGCHLD arrives at the same `pvedaemon` process and triggers
`worker_reaper`.
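
To make the failure mode concrete, here is a minimal stand-alone sketch
of it. It is not the RESTEnvironment code: `spawn_worker`,
`library_call` and the simplified `$worker_reaper` are hypothetical
stand-ins, and the timings are picked only so that the worker exits
while the replaced handler is active.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %workers;            # pid => 1 for every worker not yet reaped
my $reaper_calls = 0;

# Simplified stand-in for worker_reaper: non-blocking, checks tracked pids only.
my $worker_reaper = sub {
    $reaper_calls++;
    for my $pid (keys %workers) {
        my $res = waitpid($pid, WNOHANG);
        delete $workers{$pid} if $res != 0;    # exited (or no longer our child)
    }
};
$SIG{CHLD} = $worker_reaper;

# Fork a worker that exits after $lifetime seconds.
sub spawn_worker {
    my ($lifetime) = @_;
    my $pid = fork() // die "fork failed: $!\n";
    if ($pid == 0) {
        select(undef, undef, undef, $lifetime);
        POSIX::_exit(0);
    }
    $workers{$pid} = 1;
    return $pid;
}

# Hypothetical library call that swaps the handler for $duration seconds and
# restores the previous one on return.
sub library_call {
    my ($duration) = @_;
    local $SIG{CHLD} = 'DEFAULT';
    select(undef, undef, undef, $duration);
    return;
}

spawn_worker(0.2);
library_call(1.0);    # the worker exits while our handler is not installed

my $calls_after_library = $reaper_calls;           # did our handler run?
my $stale_after_library = scalar keys %workers;    # is the worker still tracked?

$worker_reaper->();   # what a later SIGCHLD or a periodic timer would do
my $stale_after_reap = scalar keys %workers;

print "handler calls: $calls_after_library, stale: $stale_after_library, ",
    "after manual reap: $stale_after_reap\n";
```

The exited worker stays in `%workers` until something else calls the
reaper, which matches the stuck tasks described above.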

As @Fabian mentioned in [1], a general rework of the task handling,
potentially with `pidfd`s, would make a lot of sense.

These two patches address the problem within the task handling structure
as it currently is. They
 - run the PAM lib call in a fork, so any signal handler changes the
   library makes are isolated from our process
 - run `worker_reaper` periodically (every 5s) to catch any other
   potential instances of this, since the same could happen with other
   libs, not just PAM

[1] https://lore.proxmox.com/pve-devel/1772617908.i4bmsyq0kp.astroid@yuna.none/T/#m7b0f3873be5755f330e288cfa50905744f225b2b


pve-common:

Hannes Laimer (1):
  RESTEnvironment: periodically reap workers as SIGCHLD fallback

 src/PVE/RESTEnvironment.pm | 9 +++++++++
 1 file changed, 9 insertions(+)


pve-access-control:

Hannes Laimer (1):
  pam: fork for PAM authentication to isolate SIGCHLD handler

 src/PVE/Auth/PAM.pm | 74 +++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 32 deletions(-)


Summary over all repositories:
  2 files changed, 51 insertions(+), 32 deletions(-)

-- 
Generated by murpp 0.9.0





* [PATCH pve-common 1/1] RESTEnvironment: periodically reap workers as SIGCHLD fallback
From: Hannes Laimer @ 2026-03-04 13:46 UTC (permalink / raw)
  To: pve-devel

Libraries may temporarily override $SIG{CHLD}, causing worker exit
signals to be lost. Poll worker_reaper every 5 seconds via an AnyEvent
timer to catch any missed signals in API server contexts where this can
be problematic.

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
---
 src/PVE/RESTEnvironment.pm | 9 +++++++++
 1 file changed, 9 insertions(+)
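
For illustration only, the fallback can be sketched without AnyEvent.
The patch uses an AnyEvent timer with a 5 second interval; here a plain
`select` timeout loop stands in for the event loop, the interval is
shortened, and `reap_workers` is a hypothetical simplification of
`worker_reaper`. No SIGCHLD handler is installed at all, which models
the worst case where every signal was missed.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %workers;    # pid => 1 for every worker not yet reaped

# Non-blocking and idempotent, so calling it from both a signal handler and a
# timer is harmless: a pid that was already reaped is simply no longer tracked.
sub reap_workers {
    my @done;
    for my $pid (sort keys %workers) {
        my $res = waitpid($pid, WNOHANG);
        next if $res == 0;    # still running
        delete $workers{$pid};
        push @done, $pid;
    }
    return @done;
}

# Start two workers with different lifetimes.
for my $lifetime (0.1, 0.3) {
    my $pid = fork() // die "fork failed: $!\n";
    if ($pid == 0) {
        select(undef, undef, undef, $lifetime);
        POSIX::_exit(0);
    }
    $workers{$pid} = 1;
}

my $interval = 0.2;    # shortened for the example, the patch uses 5 seconds
my $ticks = 0;
my @reaped;

# Stand-in for the periodic timer callback of an event loop.
while (%workers && $ticks < 50) {
    select(undef, undef, undef, $interval);
    $ticks++;
    push @reaped, reap_workers();
}

printf "reaped %d workers after %d timer ticks\n", scalar(@reaped), $ticks;
```

The cost of the fallback is one `waitpid` with `WNOHANG` per tracked
worker and tick, and a missed signal delays cleanup by at most roughly
one interval.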

diff --git a/src/PVE/RESTEnvironment.pm b/src/PVE/RESTEnvironment.pm
index 81d7e29..cb44823 100644
--- a/src/PVE/RESTEnvironment.pm
+++ b/src/PVE/RESTEnvironment.pm
@@ -37,6 +37,7 @@ my $rest_env;
 
 my $WORKER_PIDS;
 my $WORKER_FLAG = 0;
+my $worker_reaper_timer;
 
 my $log_task_result = sub {
     my ($upid, $user, $status) = @_;
@@ -124,6 +125,14 @@ sub init {
         }
     };
 
+    # Periodically reap workers as a fallback in case a library call temporarily overrides
+    # $SIG{CHLD} and causes us to miss a signal. Only useful when an event loop is running.
+    $worker_reaper_timer = AnyEvent->timer(
+        after => 5,
+        interval => 5,
+        cb => sub { $worker_reaper->() },
+    );
+
     # environment types
     # cli  ... command started fron command line
     # pub  ... access from public server (pveproxy)
-- 
2.47.3






* [PATCH pve-access-control 1/1] pam: fork for PAM authentication to isolate SIGCHLD handler
From: Hannes Laimer @ 2026-03-04 13:46 UTC (permalink / raw)
  To: pve-devel

PAM modules can temporarily override $SIG{CHLD}, causing SIGCHLDs from
RESTEnvironment worker processes to be lost. Run the PAM interaction in
a subprocess via PVE::Tools::run_fork to contain any signal handler
manipulation to the child.

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
---
 src/PVE/Auth/PAM.pm | 74 +++++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 32 deletions(-)
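
As a rough illustration of why the fork helps: the internals of
PVE::Tools::run_fork are not part of this thread, so `run_isolated`
below is a hypothetical, hand-rolled stand-in that only assumes "run
the closure in a child and re-raise its error in the parent". The
closure changes `$SIG{CHLD}` the way a PAM module might, and the
parent's handler stays untouched.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not PVE::Tools::run_fork): run $code in a child process
# and die in the parent with the child's error message if it failed.
sub run_isolated {
    my ($code) = @_;

    pipe(my $rd, my $wr) or die "pipe failed: $!\n";
    my $pid = fork() // die "fork failed: $!\n";

    if ($pid == 0) {
        close($rd);
        my $ok = eval { $code->(); 1 };
        my $msg = $ok ? '' : ($@ || "unknown error\n");
        print {$wr} $msg;
        close($wr);                   # flushes the message before exiting
        POSIX::_exit($ok ? 0 : 1);    # skip END blocks and inherited buffers
    }

    close($wr);
    my $err = do { local $/; <$rd> } // '';
    close($rd);

    # Wait for exactly this child, retrying if a signal interrupts the call.
    my $res;
    do { $res = waitpid($pid, 0) } while ($res == -1 && $!{EINTR});

    return 1 if $? == 0;
    die(length($err) ? $err : "child exited with status $?\n");
}

my $handler = sub { };    # stands in for the parent's worker reaper
$SIG{CHLD} = $handler;

# The closure tampers with the handler, but only inside the child.
my $ok = run_isolated(sub {
    $SIG{CHLD} = 'DEFAULT';
    return 1;
});
my $handler_intact = (ref($SIG{CHLD}) eq 'CODE' && $SIG{CHLD} == $handler) ? 1 : 0;

# Errors raised in the child surface as a die in the parent.
my $propagated = eval { run_isolated(sub { die "Authentication failure\n" }); 1 }
    ? '' : $@;

print "handler intact: $handler_intact, propagated error: $propagated";
```

The trade-off is one extra fork per authentication, and anything the
closure needs from the parent (like the client IP in this patch) has to
be fetched before forking.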

diff --git a/src/PVE/Auth/PAM.pm b/src/PVE/Auth/PAM.pm
index 3aacfc0..8586da5 100755
--- a/src/PVE/Auth/PAM.pm
+++ b/src/PVE/Auth/PAM.pm
@@ -27,45 +27,55 @@ sub authenticate_user {
     # user (www-data) need to be able to read /etc/passwd /etc/shadow
     die "no password\n" if !$password;
 
-    my $pamh = Authen::PAM->new(
-        'proxmox-ve-auth',
-        $username,
-        sub {
-            my @res;
-            while (@_) {
-                my $msg_type = shift;
-                my $msg = shift;
-                push @res, (0, $password);
-            }
-            push @res, 0;
-            return @res;
-        },
-    );
-
-    if (!ref($pamh)) {
-        my $err = $pamh->pam_strerror($pamh);
-        die "error during PAM init: $err";
+    # PAM modules may temporarily override $SIG{CHLD}, causing SIGCHLDs from
+    # RESTEnvironment workers to be lost. Running the PAM interaction in a fork
+    # isolates any such handler manipulation from the parent process.
+    my $client_ip;
+    if (my $rpcenv = PVE::RPCEnvironment::get()) {
+        $client_ip = $rpcenv->get_client_ip();
     }
 
-    if (my $rpcenv = PVE::RPCEnvironment::get()) {
-        if (my $ip = $rpcenv->get_client_ip()) {
-            $pamh->pam_set_item(PAM_RHOST(), $ip);
+    PVE::Tools::run_fork(sub {
+        my $pamh = Authen::PAM->new(
+            'proxmox-ve-auth',
+            $username,
+            sub {
+                my @res;
+                while (@_) {
+                    my $msg_type = shift;
+                    my $msg = shift;
+                    push @res, (0, $password);
+                }
+                push @res, 0;
+                return @res;
+            },
+        );
+
+        if (!ref($pamh)) {
+            my $err = $pamh->pam_strerror($pamh);
+            die "error during PAM init: $err";
         }
-    }
 
-    my $res;
+        if ($client_ip) {
+            $pamh->pam_set_item(PAM_RHOST(), $client_ip);
+        }
 
-    if (($res = $pamh->pam_authenticate(0)) != PAM_SUCCESS) {
-        my $err = $pamh->pam_strerror($res);
-        die "$err\n";
-    }
+        my $res;
 
-    if (($res = $pamh->pam_acct_mgmt(0)) != PAM_SUCCESS) {
-        my $err = $pamh->pam_strerror($res);
-        die "$err\n";
-    }
+        if (($res = $pamh->pam_authenticate(0)) != PAM_SUCCESS) {
+            my $err = $pamh->pam_strerror($res);
+            die "$err\n";
+        }
+
+        if (($res = $pamh->pam_acct_mgmt(0)) != PAM_SUCCESS) {
+            my $err = $pamh->pam_strerror($res);
+            die "$err\n";
+        }
+
+        $pamh = 0; # call destructor
 
-    $pamh = 0; # call destructor
+        return 1;
+    });
 
     return 1;
 }
-- 
2.47.3




