public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
@ 2024-11-29 15:37 Dominik Csapak
  2024-12-02  9:04 ` Fabian Grünbichler
  2024-12-02 16:47 ` Thomas Lamprecht
  0 siblings, 2 replies; 5+ messages in thread
From: Dominik Csapak @ 2024-11-29 15:37 UTC (permalink / raw)
  To: pbs-devel

so we don't leave around a zombie process when the old daemon still
needs to run, because of e.g. a running task.

Since this is mostly a cosmetic issue though, only try a clean up
once, so we don't unnecessarily block or run into other issues here.
(It could happen that it didn't exit at that point, but it's very
unlikely.)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
maybe the comment could be improved, but i tried to be not overly
verbose there, since it's not really an issue anyway

 proxmox-daemon/src/server.rs | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
index efea9078..edc64795 100644
--- a/proxmox-daemon/src/server.rs
+++ b/proxmox-daemon/src/server.rs
@@ -165,10 +165,12 @@ impl Reloader {
                 // No matter how we managed to get here, this is the time where we bail out quickly:
                 unsafe { libc::_exit(-1) }
             }
-            Ok(ForkResult::Parent { child }) => {
+            Ok(ForkResult::Parent {
+                child: middle_child,
+            }) => {
                 log::debug!(
                     "forked off a new server (first pid: {}), waiting for 2nd pid",
-                    child
+                    middle_child
                 );
                 std::mem::drop(pnew);
                 let mut pold = std::fs::File::from(pold);
@@ -211,6 +213,13 @@ impl Reloader {
                     log::error!("child vanished during reload: {}", e);
                 }
 
+                // try exactly once to get rid of the zombie process of middle_child, but
+                // non blocking and without error handling, since it's just cosmetic
+                let _ = nix::sys::wait::waitpid(
+                    middle_child,
+                    Some(nix::sys::wait::WaitPidFlag::WNOHANG),
+                );
+
                 Ok(())
             }
             Err(e) => {
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel



* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
  2024-11-29 15:37 [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork Dominik Csapak
@ 2024-12-02  9:04 ` Fabian Grünbichler
  2024-12-02 16:47 ` Thomas Lamprecht
  1 sibling, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2024-12-02  9:04 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On November 29, 2024 4:37 pm, Dominik Csapak wrote:
> so we don't leave around a zombie process when the old daemon still
> needs to run, because of e.g. a running task.
> 
> Since this is mostly a cosmetic issue though, only try a clean up
> once, so we don't unnecessarily block or run into other issues here.
> (It could happen that it didn't exit at that point, but it's very
> unlikely.)
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> maybe the comment could be improved, but i tried to be not overly
> verbose there, since it's not really an issue anyway
> 
>  proxmox-daemon/src/server.rs | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
> index efea9078..edc64795 100644
> --- a/proxmox-daemon/src/server.rs
> +++ b/proxmox-daemon/src/server.rs
> @@ -165,10 +165,12 @@ impl Reloader {
>                  // No matter how we managed to get here, this is the time where we bail out quickly:
>                  unsafe { libc::_exit(-1) }
>              }
> -            Ok(ForkResult::Parent { child }) => {
> +            Ok(ForkResult::Parent {
> +                child: middle_child,
> +            }) => {
>                  log::debug!(
>                      "forked off a new server (first pid: {}), waiting for 2nd pid",
> -                    child
> +                    middle_child
>                  );
>                  std::mem::drop(pnew);
>                  let mut pold = std::fs::File::from(pold);
> @@ -211,6 +213,13 @@ impl Reloader {
>                      log::error!("child vanished during reload: {}", e);
>                  }
>  
> +                // try exactly once to get rid of the zombie process of middle_child, but
> +                // non blocking and without error handling, since it's just cosmetic
> +                let _ = nix::sys::wait::waitpid(
> +                    middle_child,
> +                    Some(nix::sys::wait::WaitPidFlag::WNOHANG),
> +                );

looking at the possible errors here:

       EAGAIN The PID file descriptor specified in id is nonblocking and
       the process that it refers to has not terminated.

not using pidfds here, not applicable

       ECHILD (for wait()) The calling process does not have any
       unwaited-for children.

we are not calling wait, but waitpid, not applicable

       ECHILD (for waitpid() or waitid()) The process specified by pid
       (waitpid()) or idtype and id (waitid()) does not exist or is not
       a child of the calling process.  (This can happen for one's own
       child if the action for SIGCHLD is set to SIG_IGN.  See also the
       Linux Notes section about threads.)

this one would mean the code above is buggy, so logging the error would
make sense?

       EINTR  WNOHANG was not set and an unblocked signal or a SIGCHLD
       was caught; see signal(7).

we set WNOHANG, so not applicable

       EINVAL The options argument was invalid.

this would also mean we do something wrong and we should log the error

       ESRCH  (for wait() or waitpid()) pid is equal to INT_MIN.

shouldn't happen either

so I think logging the error here (which should never happen ;)) should
be fine?

other than that, consider this:

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
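
The suggestion above (probe once, but log instead of discarding the result) could be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in `std::process::Child::try_wait` (a non-blocking wait in the standard library) for the patch's `nix::sys::wait::waitpid` call with `WNOHANG`, uses `eprintln!` as a stand-in for `log::error!`, and the names `ReapOutcome` and `try_reap_once` are hypothetical and not part of proxmox-daemon.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Outcome of a single, non-blocking reap attempt (hypothetical helper type).
#[derive(Debug)]
pub enum ReapOutcome {
    /// The child had already exited and its zombie entry was collected.
    Reaped(ExitStatus),
    /// The child has not exited yet; nothing was collected.
    StillRunning,
    /// The wait call itself failed, which would point to a bug in the caller.
    Failed(io::Error),
}

/// Probe the child exactly once without blocking and report any error.
pub fn try_reap_once(child: &mut Child) -> ReapOutcome {
    match child.try_wait() {
        Ok(Some(status)) => ReapOutcome::Reaped(status),
        Ok(None) => ReapOutcome::StillRunning,
        Err(err) => {
            // Stand-in for log::error!() in a real daemon.
            eprintln!("could not reap child {}: {}", child.id(), err);
            ReapOutcome::Failed(err)
        }
    }
}
```

The `StillRunning` case is the one a single probe cannot resolve: the zombie would then remain until the old process exits.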

> +
>                  Ok(())
>              }
>              Err(e) => {



* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
  2024-11-29 15:37 [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork Dominik Csapak
  2024-12-02  9:04 ` Fabian Grünbichler
@ 2024-12-02 16:47 ` Thomas Lamprecht
  2024-12-03  9:14   ` Dominik Csapak
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2024-12-02 16:47 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

On 29.11.24 at 16:37, Dominik Csapak wrote:
> so we don't leave around a zombie process when the old daemon still
> needs to run, because of e.g. a running task.
> 
> Since this is mostly a cosmetic issue though, only try a clean up
> once, so we don't unnecessarily block or run into other issues here.
> (It could happen that it didn't exit at that point, but it's very
> unlikely.)
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> maybe the comment could be improved, but i tried to be not overly
> verbose there, since it's not really an issue anyway
> 
>  proxmox-daemon/src/server.rs | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
> index efea9078..edc64795 100644
> --- a/proxmox-daemon/src/server.rs
> +++ b/proxmox-daemon/src/server.rs
> @@ -165,10 +165,12 @@ impl Reloader {
>                  // No matter how we managed to get here, this is the time where we bail out quickly:
>                  unsafe { libc::_exit(-1) }
>              }
> -            Ok(ForkResult::Parent { child }) => {
> +            Ok(ForkResult::Parent {
> +                child: middle_child,
> +            }) => {
>                  log::debug!(
>                      "forked off a new server (first pid: {}), waiting for 2nd pid",
> -                    child
> +                    middle_child
>                  );
>                  std::mem::drop(pnew);
>                  let mut pold = std::fs::File::from(pold);
> @@ -211,6 +213,13 @@ impl Reloader {
>                      log::error!("child vanished during reload: {}", e);
>                  }
>  
> +                // try exactly once to get rid of the zombie process of middle_child, but
> +                // non blocking and without error handling, since it's just cosmetic
> +                let _ = nix::sys::wait::waitpid(
> +                    middle_child,
> +                    Some(nix::sys::wait::WaitPidFlag::WNOHANG),
> +                );

Why not block, though? If that does not work, something would be seriously
wrong. I have no strong feelings about it, though: as long as the old process
exits, this will be cleaned up by systemd anyway. But I really would like to
have some error handling here, as that can definitely only help.

> +
>                  Ok(())
>              }
>              Err(e) => {




* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
  2024-12-02 16:47 ` Thomas Lamprecht
@ 2024-12-03  9:14   ` Dominik Csapak
  2024-12-03  9:29     ` Thomas Lamprecht
  0 siblings, 1 reply; 5+ messages in thread
From: Dominik Csapak @ 2024-12-03  9:14 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 12/2/24 17:47, Thomas Lamprecht wrote:
> On 29.11.24 at 16:37, Dominik Csapak wrote:
>> so we don't leave around a zombie process when the old daemon still
>> needs to run, because of e.g. a running task.
>>
>> Since this is mostly a cosmetic issue though, only try a clean up
>> once, so we don't unnecessarily block or run into other issues here.
>> (It could happen that it didn't exit at that point, but it's very
>> unlikely.)
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> maybe the comment could be improved, but i tried to be not overly
>> verbose there, since it's not really an issue anyway
>>
>>   proxmox-daemon/src/server.rs | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
>> index efea9078..edc64795 100644
>> --- a/proxmox-daemon/src/server.rs
>> +++ b/proxmox-daemon/src/server.rs
>> @@ -165,10 +165,12 @@ impl Reloader {
>>                   // No matter how we managed to get here, this is the time where we bail out quickly:
>>                   unsafe { libc::_exit(-1) }
>>               }
>> -            Ok(ForkResult::Parent { child }) => {
>> +            Ok(ForkResult::Parent {
>> +                child: middle_child,
>> +            }) => {
>>                   log::debug!(
>>                       "forked off a new server (first pid: {}), waiting for 2nd pid",
>> -                    child
>> +                    middle_child
>>                   );
>>                   std::mem::drop(pnew);
>>                   let mut pold = std::fs::File::from(pold);
>> @@ -211,6 +213,13 @@ impl Reloader {
>>                       log::error!("child vanished during reload: {}", e);
>>                   }
>>   
>> +                // try exactly once to get rid of the zombie process of middle_child, but
>> +                // non blocking and without error handling, since it's just cosmetic
>> +                let _ = nix::sys::wait::waitpid(
>> +                    middle_child,
>> +                    Some(nix::sys::wait::WaitPidFlag::WNOHANG),
>> +                );
> 
> Why not block, though? If that does not work, something would be seriously
> wrong. I have no strong feelings about it, though: as long as the old process
> exits, this will be cleaned up by systemd anyway. But I really would like to
> have some error handling here, as that can definitely only help.

My fear was that if something is wrong with the middle child (e.g. it hangs),
we'll never close the parent process either, and have two old processes
hanging around instead of one.

But yes (as Fabian also said), at least logging the error here would be good.
I'll send a v2.

> 
>> +
>>                   Ok(())
>>               }
>>               Err(e) => {
> 




* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
  2024-12-03  9:14   ` Dominik Csapak
@ 2024-12-03  9:29     ` Thomas Lamprecht
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2024-12-03  9:29 UTC (permalink / raw)
  To: Dominik Csapak, Proxmox Backup Server development discussion

On 03.12.24 at 10:14, Dominik Csapak wrote:
> My fear was that if something is wrong with the middle child (e.g. it hangs),
> we'll never close the parent process either, and have two old processes
> hanging around instead of one.

FWIW: You could still add a generous timeout. IMO it's less likely that we
hang here if the fork of the actual re-exec-self process worked, as then
the middle one just needs to exit, compared to the middle child not being
scheduled for a while after the work and before it exits, which is still
unlikely, but less so.

The old process being left-over itself was not a problem per se after all.
If you really fear something can hang then it would be much better to use
a timeout here and log an error, as that way it would be even more likely
to notice such bugs which a single probe using WNOHANG will always mask.
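
A bounded wait of the kind described above might look roughly like the sketch below. It is an illustration under assumptions, not the eventual v2: it polls with `std::process::Child::try_wait` rather than the patch's `nix::sys::wait::waitpid`, uses `eprintln!` in place of `log::error!`, and the function name `reap_with_timeout` and the 10 ms poll interval are arbitrary choices made for the example.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Poll a child until it exits or `timeout` elapses (hypothetical helper).
///
/// Returns `Ok(Some(status))` if the child was reaped in time, `Ok(None)` if
/// the deadline passed first, and `Err` if the wait call itself failed.
pub fn reap_with_timeout(
    child: &mut Child,
    timeout: Duration,
) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking probe; errors are propagated to the caller.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Unlike a single silent probe, a timeout is reported.
            eprintln!("child {} did not exit within {:?}", child.id(), timeout);
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

With this shape the caller is blocked for at most roughly `timeout`, and a child that fails to exit shows up in the log instead of being skipped silently.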


