public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox v2] daemon: clean up middle process of double fork
Date: Fri, 7 Feb 2025 15:35:59 +0100	[thread overview]
Message-ID: <i2imrv7n43mpqp5cdh4wue2etvudhsaggk7d3lqwwcmp7uhyyw@hes3mbx3hhtt> (raw)
In-Reply-To: <20241203112349.2446383-1-d.csapak@proxmox.com>

On Tue, Dec 03, 2024 at 12:23:49PM +0100, Dominik Csapak wrote:
> so we don't leave around a zombie process when the old daemon still
> needs to run, because of e.g. a running task.
> 
> Since this is mostly a cosmetic issue though, only try a clean up with a
> 10 second timeout, so we don't block forever. (It could happen that it
> didn't exit at that point, but it's very unlikely.)
> 
> In case we do run into the timeout here, the process will not be
> collected until the parent process exits and the middle process is
> collected by pid 1.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v1:
> * use a timeout
> * log the error
> 
>  proxmox-daemon/src/server.rs | 33 +++++++++++++++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
> index efea9078..27153fc2 100644
> --- a/proxmox-daemon/src/server.rs
> +++ b/proxmox-daemon/src/server.rs
> @@ -8,6 +8,7 @@ use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};
>  use std::panic::UnwindSafe;
>  use std::path::PathBuf;
>  use std::pin::{pin, Pin};
> +use std::time::Duration;
>  
>  use anyhow::{bail, format_err, Error};
>  use futures::future::{self, Either};
> @@ -15,6 +16,7 @@ use nix::unistd::{fork, ForkResult};
>  
>  use proxmox_sys::fd::fd_change_cloexec;
>  use proxmox_sys::fs::CreateOptions;
> +use proxmox_sys::linux::timer;
>  
>  type BoxedStoreFunc = Box<dyn FnOnce() -> Result<String, Error> + UnwindSafe + Send>;
>  
> @@ -165,10 +167,12 @@ impl Reloader {

The error case above here does a `log::error!()` - that is literally the
only thing which could possibly hang.
We could have a 2nd socketpair for stringified errors (or turn the
existing one into a datagram based  socket with a type prefix for
messages) and just use a blocking wait().

But if we really want to use a timeout, I'd rather add a pidfd wrapper,
use `pidfd_open()` (it's our fork, so since we don't `wait()` for it, we
should get the correct process, and we could - in the future - replace
`fork()` with an equivalent `clone()` returning a pidfd).
The pidfd can be used with poll/epoll (so it would even work for async
code via `AsyncFd`).

This is also a case where we could consider switching to a vfork() type
of fork. That might even play nice with whatever logger we might be
using, making it even less likely for anything to block...

>                  // No matter how we managed to get here, this is the time where we bail out quickly:
>                  unsafe { libc::_exit(-1) }
>              }
> -            Ok(ForkResult::Parent { child }) => {
> +            Ok(ForkResult::Parent {
> +                child: middle_child,
> +            }) => {
>                  log::debug!(
>                      "forked off a new server (first pid: {}), waiting for 2nd pid",
> -                    child
> +                    middle_child
>                  );
>                  std::mem::drop(pnew);
>                  let mut pold = std::fs::File::from(pold);
> @@ -211,6 +215,10 @@ impl Reloader {
>                      log::error!("child vanished during reload: {}", e);
>                  }
>  
> +                if let Err(e) = waitpid_with_timeout(middle_child, Duration::from_secs(10)) {
> +                    log::error!("waitpid for middle process failed: {e}");
> +                }
> +
>                  Ok(())
>              }
>              Err(e) => {
> @@ -230,6 +238,27 @@ impl Reloader {
>      }
>  }
>  
> +fn waitpid_with_timeout(pid: nix::unistd::Pid, timeout: std::time::Duration) -> Result<(), Error> {
> +    // unblock the timeout signal temporarily
> +    let _sigblock_guard = timer::unblock_timeout_signal();
> +
> +    // setup a timeout timer
> +    let mut timer = timer::Timer::create(
> +        timer::Clock::Realtime,
> +        timer::TimerEvent::ThisThreadSignal(timer::SIGTIMEOUT),
> +    )?;
> +
> +    timer.arm(
> +        timer::TimerSpec::new()
> +            .value(Some(timeout))
> +            .interval(Some(Duration::from_millis(10))),
> +    )?;
> +
> +    nix::sys::wait::waitpid(pid, None)?;
> +
> +    Ok(())
> +}
> +
>  fn fd_store_func(fd: RawFd) -> Result<BoxedStoreFunc, Error> {
>      let fd = unsafe {
>          OwnedFd::from_raw_fd(nix::fcntl::fcntl(
> -- 
> 2.39.5


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


      reply	other threads:[~2025-02-07 14:36 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-03 11:23 Dominik Csapak
2025-02-07 14:35 ` Wolfgang Bumiller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=i2imrv7n43mpqp5cdh4wue2etvudhsaggk7d3lqwwcmp7uhyyw@hes3mbx3hhtt \
    --to=w.bumiller@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal