* [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
@ 2024-11-29 15:37 Dominik Csapak
2024-12-02 9:04 ` Fabian Grünbichler
2024-12-02 16:47 ` Thomas Lamprecht
0 siblings, 2 replies; 5+ messages in thread
From: Dominik Csapak @ 2024-11-29 15:37 UTC (permalink / raw)
To: pbs-devel
so we don't leave a zombie process around when the old daemon still
needs to run, e.g. because of a running task.
Since this is mostly a cosmetic issue, only try to clean up
once, so we don't block unnecessarily or run into other issues here.
(The middle process might not have exited by that point, but that is
very unlikely.)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
maybe the comment could be improved, but I tried not to be overly
verbose there, since it's not really an issue anyway
proxmox-daemon/src/server.rs | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
index efea9078..edc64795 100644
--- a/proxmox-daemon/src/server.rs
+++ b/proxmox-daemon/src/server.rs
@@ -165,10 +165,12 @@ impl Reloader {
// No matter how we managed to get here, this is the time where we bail out quickly:
unsafe { libc::_exit(-1) }
}
- Ok(ForkResult::Parent { child }) => {
+ Ok(ForkResult::Parent {
+ child: middle_child,
+ }) => {
log::debug!(
"forked off a new server (first pid: {}), waiting for 2nd pid",
- child
+ middle_child
);
std::mem::drop(pnew);
let mut pold = std::fs::File::from(pold);
@@ -211,6 +213,13 @@ impl Reloader {
log::error!("child vanished during reload: {}", e);
}
+ // try exactly once to get rid of the zombie process of middle_child, but
+ // non blocking and without error handling, since it's just cosmetic
+ let _ = nix::sys::wait::waitpid(
+ middle_child,
+ Some(nix::sys::wait::WaitPidFlag::WNOHANG),
+ );
+
Ok(())
}
Err(e) => {
--
2.39.5
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
2024-11-29 15:37 [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork Dominik Csapak
@ 2024-12-02 9:04 ` Fabian Grünbichler
2024-12-02 16:47 ` Thomas Lamprecht
1 sibling, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2024-12-02 9:04 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On November 29, 2024 4:37 pm, Dominik Csapak wrote:
> so we don't leave around a zombie process when the old daemon still
> needs to run, because of e.g. a running task.
>
> Since this is mostly a cosmetic issue though, only try a clean up
> once, so we don't unnecessarily block or run into other issues here.
> (It could happen that it didn't exit at that point, but it's very
> unlikely.)
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> maybe the comment could be improved, but i tried to be not overly
> verbose there, since it's not really an issue anyway
>
> proxmox-daemon/src/server.rs | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
> index efea9078..edc64795 100644
> --- a/proxmox-daemon/src/server.rs
> +++ b/proxmox-daemon/src/server.rs
> @@ -165,10 +165,12 @@ impl Reloader {
> // No matter how we managed to get here, this is the time where we bail out quickly:
> unsafe { libc::_exit(-1) }
> }
> - Ok(ForkResult::Parent { child }) => {
> + Ok(ForkResult::Parent {
> + child: middle_child,
> + }) => {
> log::debug!(
> "forked off a new server (first pid: {}), waiting for 2nd pid",
> - child
> + middle_child
> );
> std::mem::drop(pnew);
> let mut pold = std::fs::File::from(pold);
> @@ -211,6 +213,13 @@ impl Reloader {
> log::error!("child vanished during reload: {}", e);
> }
>
> + // try exactly once to get rid of the zombie process of middle_child, but
> + // non blocking and without error handling, since it's just cosmetic
> + let _ = nix::sys::wait::waitpid(
> + middle_child,
> + Some(nix::sys::wait::WaitPidFlag::WNOHANG),
> + );
looking at the possible errors here:
EAGAIN The PID file descriptor specified in id is nonblocking and
the process that it refers to has not terminated.
not using pidfds here, not applicable
ECHILD (for wait()) The calling process does not have any
unwaited-for children.
we are not calling wait, but waitpid, not applicable
ECHILD (for waitpid() or waitid()) The process specified by pid
(waitpid()) or idtype and id (waitid()) does not exist or is not
a child of the calling process. (This can happen for one's own
child if the action for SIGCHLD is set to SIG_IGN. See also the
Linux Notes section about threads.)
this one would mean the code above is buggy, so logging the error would
make sense?
EINTR WNOHANG was not set and an unblocked signal or a SIGCHLD
was caught; see signal(7).
we set WNOHANG, so not applicable
EINVAL The options argument was invalid.
this would also mean we do something wrong and we should log the error
ESRCH (for wait() or waitpid()) pid is equal to INT_MIN.
shouldn't happen either
so I think logging the error here (which should never happen ;)) should
be fine?
other than that, consider this:
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> +
> Ok(())
> }
> Err(e) => {
> --
> 2.39.5
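To illustrate the suggestion above, here is a minimal sketch of a single non-blocking probe that logs its outcome instead of discarding it. It is not the code from the patch: it uses `std::process::Child::try_wait` (std's non-blocking wait) rather than `nix::sys::wait::waitpid`, `eprintln!` stands in for `log::error!`, and the function name `probe_middle_child` is hypothetical.

```rust
use std::process::{Child, ExitStatus};

/// Try exactly once, without blocking, to reap `child`.
/// Returns the exit status if the child was reaped, `None` otherwise.
/// (Hypothetical helper; eprintln! is a stand-in for log::error!.)
fn probe_middle_child(child: &mut Child) -> Option<ExitStatus> {
    match child.try_wait() {
        // The child had exited and is now reaped, so no zombie remains.
        Ok(Some(status)) => Some(status),
        // The child is still running; a single probe cannot do more.
        Ok(None) => {
            eprintln!("middle child (pid {}) has not exited yet", child.id());
            None
        }
        // Should never happen, which is exactly why it is worth logging.
        Err(err) => {
            eprintln!("waiting for middle child failed: {}", err);
            None
        }
    }
}
```

The error arm is the part the review asks for: errors such as ECHILD or EINVAL would point to a bug in the surrounding code, so surfacing them costs little.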
* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
2024-11-29 15:37 [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork Dominik Csapak
2024-12-02 9:04 ` Fabian Grünbichler
@ 2024-12-02 16:47 ` Thomas Lamprecht
2024-12-03 9:14 ` Dominik Csapak
1 sibling, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2024-12-02 16:47 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Dominik Csapak
On 29.11.24 at 16:37, Dominik Csapak wrote:
> so we don't leave around a zombie process when the old daemon still
> needs to run, because of e.g. a running task.
>
> Since this is mostly a cosmetic issue though, only try a clean up
> once, so we don't unnecessarily block or run into other issues here.
> (It could happen that it didn't exit at that point, but it's very
> unlikely.)
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> maybe the comment could be improved, but i tried to be not overly
> verbose there, since it's not really an issue anyway
>
> proxmox-daemon/src/server.rs | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
> index efea9078..edc64795 100644
> --- a/proxmox-daemon/src/server.rs
> +++ b/proxmox-daemon/src/server.rs
> @@ -165,10 +165,12 @@ impl Reloader {
> // No matter how we managed to get here, this is the time where we bail out quickly:
> unsafe { libc::_exit(-1) }
> }
> - Ok(ForkResult::Parent { child }) => {
> + Ok(ForkResult::Parent {
> + child: middle_child,
> + }) => {
> log::debug!(
> "forked off a new server (first pid: {}), waiting for 2nd pid",
> - child
> + middle_child
> );
> std::mem::drop(pnew);
> let mut pold = std::fs::File::from(pold);
> @@ -211,6 +213,13 @@ impl Reloader {
> log::error!("child vanished during reload: {}", e);
> }
>
> + // try exactly once to get rid of the zombie process of middle_child, but
> + // non blocking and without error handling, since it's just cosmetic
> + let _ = nix::sys::wait::waitpid(
> + middle_child,
> + Some(nix::sys::wait::WaitPidFlag::WNOHANG),
> + );
Why not blocking, though? If that does not work, something would be seriously
wrong. I don't have _that_ strong feelings about it, as long as the old process
exits this will be cleaned up by systemd anyway, but I would really like to have
some error handling here, as that can definitely only help.
> +
> Ok(())
> }
> Err(e) => {
* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
2024-12-02 16:47 ` Thomas Lamprecht
@ 2024-12-03 9:14 ` Dominik Csapak
2024-12-03 9:29 ` Thomas Lamprecht
0 siblings, 1 reply; 5+ messages in thread
From: Dominik Csapak @ 2024-12-03 9:14 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox Backup Server development discussion
On 12/2/24 17:47, Thomas Lamprecht wrote:
> Am 29.11.24 um 16:37 schrieb Dominik Csapak:
>> so we don't leave around a zombie process when the old daemon still
>> needs to run, because of e.g. a running task.
>>
>> Since this is mostly a cosmetic issue though, only try a clean up
>> once, so we don't unnecessarily block or run into other issues here.
>> (It could happen that it didn't exit at that point, but it's very
>> unlikely.)
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> maybe the comment could be improved, but i tried to be not overly
>> verbose there, since it's not really an issue anyway
>>
>> proxmox-daemon/src/server.rs | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/proxmox-daemon/src/server.rs b/proxmox-daemon/src/server.rs
>> index efea9078..edc64795 100644
>> --- a/proxmox-daemon/src/server.rs
>> +++ b/proxmox-daemon/src/server.rs
>> @@ -165,10 +165,12 @@ impl Reloader {
>> // No matter how we managed to get here, this is the time where we bail out quickly:
>> unsafe { libc::_exit(-1) }
>> }
>> - Ok(ForkResult::Parent { child }) => {
>> + Ok(ForkResult::Parent {
>> + child: middle_child,
>> + }) => {
>> log::debug!(
>> "forked off a new server (first pid: {}), waiting for 2nd pid",
>> - child
>> + middle_child
>> );
>> std::mem::drop(pnew);
>> let mut pold = std::fs::File::from(pold);
>> @@ -211,6 +213,13 @@ impl Reloader {
>> log::error!("child vanished during reload: {}", e);
>> }
>>
>> + // try exactly once to get rid of the zombie process of middle_child, but
>> + // non blocking and without error handling, since it's just cosmetic
>> + let _ = nix::sys::wait::waitpid(
>> + middle_child,
>> + Some(nix::sys::wait::WaitPidFlag::WNOHANG),
>> + );
>
> why not blocking though? If that does not work something would be seriously
> wrong. But not _that_ hard feelings, as long as the old process exits this
> will be cleaned up by systemd anyway, but I really would like to have some
> error handling here, as that definitively can only help.
My fear was that if something is wrong with the middle child (e.g. something hangs),
we'll never close the parent process either and end up with two old processes hanging around instead of one.
But yes (as Fabian also said), at least logging the error here would be good.
I'll send a v2.
>
>> +
>> Ok(())
>> }
>> Err(e) => {
>
* Re: [pbs-devel] [PATCH proxmox] daemon: clean up middle process of double fork
2024-12-03 9:14 ` Dominik Csapak
@ 2024-12-03 9:29 ` Thomas Lamprecht
0 siblings, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2024-12-03 9:29 UTC (permalink / raw)
To: Dominik Csapak, Proxmox Backup Server development discussion
On 03.12.24 at 10:14, Dominik Csapak wrote:
> my fear was that if there's something wrong with the middle child (e.g. something hangs)
> we'll never close the parent process either and have two old processes hanging around instead of one.
FWIW: You could still add a generous timeout. IMO it's less likely that we
hang here if the fork of the actual re-exec-self process worked, as then
the middle one just needs to exit, compared to the middle child not being
scheduled for a while after the work and before it exits, which is still
unlikely, but less so.
The old process itself being left over was not a problem per se, after all.
If you really fear something can hang then it would be much better to use
a timeout here and log an error, as that way it would be even more likely
to notice such bugs which a single probe using WNOHANG will always mask.
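A rough sketch of the timeout variant described above, under the same assumptions as any illustration here: it polls with `std::process::Child::try_wait` instead of calling `nix::sys::wait::waitpid`, `eprintln!` stands in for `log::error!`, and the name `reap_with_timeout` and the 10 ms poll interval are made up for the example.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Poll `child` until it exits or `timeout` elapses.
/// Returns `Ok(Some(status))` once the child is reaped and `Ok(None)`
/// on timeout; wait errors are passed on to the caller.
/// (Hypothetical helper; eprintln! is a stand-in for log::error!.)
fn reap_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check; `?` propagates errors instead of hiding them.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Unlike a single WNOHANG probe, a hang becomes visible here.
            eprintln!("child {} did not exit within {:?}", child.id(), timeout);
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

The trade-off compared to a fully blocking wait is that the old process is held up for at most `timeout`, while a stuck middle child still produces a log entry.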