From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] PVE child process behavior question
Date: Thu, 22 May 2025 08:30:42 +0200 (CEST)
Message-ID: <1283184248.17536.1747895442851@webmail.proxmox.com>
In-Reply-To: <mailman.538.1747833190.394.pve-devel@lists.proxmox.com>
> On 21.05.2025 15:13 CEST, Denis Kanchev via pve-devel <pve-devel@lists.proxmox.com> wrote:
> Hello,
>
> We had an issue with a customer migrating a VM between nodes using our
> shared storage solution.
>
> On the target host the OOM killer killed the main migration process, but
> the child process (which actually performs the migration) kept on
> working, which we did not expect, and that caused some issues.
could you be more specific about which process got killed?
when you do a migration, a task worker is forked and its UPID is returned
to the caller for further querying.
as part of the migration, other processes get spawned:
- ssh tunnel to the target node
- storage migration processes (on both nodes)
- VM state management CLI calls (on the target node)
which of those is the "main migration process"? which is the child process?
> This leads us to the broader question - after a request is submitted,
> the parent can be terminated, and not return a response to the client,
> while the work is being done, and the request can be wrongly retried or
> considered unfinished.
the parent should return almost immediately, as all it is doing at that
point is returning the UPID to the client (the process then continues to
do other work, but that is no longer related to this task).
the exceptions are "sync" task workers, like in a CLI context, where the
"parent" has no other work to do, so it waits for the child/task to
finish and prints its output while doing so, and some "bulk action"
style API calls that fork multiple task workers and poll them themselves.
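
as a rough illustration of the "sync" case described above, here is a minimal sketch of a parent that forks a child, relays its output and then collects its exit status. this is not the actual fork_worker code; run_sync_worker and its callback interface are hypothetical names made up for this example.

```perl
use strict;
use warnings;
use POSIX ();
use IO::Handle;

# Hypothetical helper (not part of PVE): run $code in a forked child,
# relay everything it writes to the pipe, then reap it.
sub run_sync_worker {
    my ($code) = @_;

    pipe(my $rd, my $wr) or die "pipe failed: $!";

    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        # child: do the work and report through the pipe
        close($rd);
        $wr->autoflush(1);
        my $ok = eval { $code->($wr); 1 };
        print $wr "ERROR: $@" if !$ok;
        POSIX::_exit($ok ? 0 : 1);
    }

    # parent: nothing else to do, so print the child's output until EOF
    close($wr);
    my @lines;
    while (my $line = <$rd>) {
        print $line;
        push @lines, $line;
    }
    close($rd);

    # collect the exit status only after the output has been drained
    waitpid($pid, 0);
    return ($? >> 8, \@lines);
}

my ($rc, $lines) = run_sync_worker(sub {
    my ($out) = @_;
    print $out "step 1\n";
    print $out "step 2\n";
});
```

in this sketch the parent's lifetime is tied to the child only because it chooses to block on the pipe and on waitpid; nothing forces it to.
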
> Should the child processes terminate together with the parent to guard
> against this, or is this expected behavior?
the parent (API worker process) and child (task worker process) have no
direct relation after the task worker has been spawned.
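
to make that concrete, here is a small self-contained sketch (not PVE code, all names are made up for illustration) in which a stand-in for the API worker forks a stand-in for the task worker, the task worker calls POSIX::setsid() like the quoted fork_worker hunk does, and the "API worker" is then killed with SIGKILL. the task worker notices that it got re-parented and keeps running:

```perl
use strict;
use warnings;
use POSIX ();
use IO::Handle;
use Time::HiRes qw(sleep);

# Hypothetical demo: returns the two status lines written by the
# orphaned "task worker".
sub demo_orphaned_worker {
    pipe(my $rd, my $wr) or die "pipe failed: $!";
    $wr->autoflush(1);

    my $api_pid = fork() // die "fork failed: $!";
    if ($api_pid == 0) {
        # stand-in for the API worker process
        close($rd);
        my $api_self = $$;

        my $task_pid = fork() // die "fork failed: $!";
        if ($task_pid == 0) {
            # stand-in for the task worker: own session, no tie to parent
            POSIX::setsid();
            print $wr "started\n";

            # wait (at most ~5s) until we have been re-parented
            for (1 .. 100) {
                last if getppid() != $api_self;
                sleep(0.05);
            }
            my $status = getppid() != $api_self ? "survived" : "timeout";
            print $wr "$status\n";
            POSIX::_exit(0);
        }

        close($wr);
        sleep(30);    # "other work" until we get killed
        POSIX::_exit(0);
    }

    close($wr);
    my $first = <$rd>;          # task worker is up
    kill('KILL', $api_pid);     # simulate the OOM killer
    waitpid($api_pid, 0);
    my $second = <$rd>;         # written after the parent is gone
    close($rd);

    chomp($first, $second);
    return ($first, $second);
}

my ($first, $second) = demo_orphaned_worker();
print "$first / $second\n";
```

the kill only affects the process it is aimed at; the child is simply re-parented and continues, which matches the behaviour described in the original report.
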
> Here is an example patch to do this:
>
> diff --git a/src/PVE/RESTEnvironment.pm b/src/PVE/RESTEnvironment.pm
> index bfde7e6..744fffc 100644
> --- a/src/PVE/RESTEnvironment.pm
> +++ b/src/PVE/RESTEnvironment.pm
> @@ -13,8 +13,9 @@ use Fcntl qw(:flock);
> use IO::File;
> use IO::Handle;
> use IO::Select;
> -use POSIX qw(:sys_wait_h EINTR);
> +use POSIX qw(:sys_wait_h EINTR SIGKILL);
> use AnyEvent;
> +use Linux::Prctl qw(set_pdeathsig);
>
> use PVE::Exception qw(raise raise_perm_exc);
> use PVE::INotify;
> @@ -549,6 +550,9 @@ sub fork_worker {
> POSIX::setsid();
> }
>
> + # The signal that the calling process will get when its parent dies
> + set_pdeathsig(SIGKILL);
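that has weird implications with regard to threads (per prctl(2), the
signal is sent when the thread that forked the child terminates, not when
the whole parent process exits), so I don't think that is a good idea.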
> +
> POSIX::close ($psync[0]);
> POSIX::close ($ctrlfd[0]) if $sync;
> POSIX::close ($csync[1]);