From: Stefan Reiter <s.reiter@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH common] allow longer timeout for cancelling 'vzdump' jobs
Date: Wed, 27 Jan 2021 12:11:01 +0100 [thread overview]
Message-ID: <a8286e0c-fc75-3128-4abb-dac1d0d643c3@proxmox.com> (raw)
In-Reply-To: <37a43b7e-1919-bc0b-ac84-08411c86bd4d@proxmox.com>
On 26/01/2021 19:23, Thomas Lamprecht wrote:
> On 14.01.21 16:39, Stefan Reiter wrote:
>> This attempts to solve the issue where on slow network storages,
>> aborting a backup job (which may wait for buffers to flush) could take
>> longer than 5 seconds, and would thus result in the task being killed by
>> SIGKILL, not removing the backup lock in the process.
>>
>> Make the implementation future-proof by using a map from task type to a
>> timeout value. Default stays at 5, so tasks other than 'vzdump' are not
>> affected.
>>
>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
>> src/PVE/RESTEnvironment.pm | 16 ++++++++++++----
>> 1 file changed, 12 insertions(+), 4 deletions(-)
>>
>
> Not to sure about that map there in pve-common, that module should stay rather
> agnostic of user special treatment.
>
> Did you thought about passing that explicitly on worker creation, or setting it
> in the RPCEnv inside a worker?
I generally agree that it's a bit misplaced, but I don't see a way to
encode it in the worker - the only info we have in check_worker and
stop_task is the UPID, and I don't think it makes sense to encode a
timeout in that? Or is there a way I'm not seeing to retrieve additional
info about a worker from the UPID alone?
We could at least put the map in pve-manager, but I'm not sure if that's
any better.
>
>> diff --git a/src/PVE/RESTEnvironment.pm b/src/PVE/RESTEnvironment.pm
>> index d5b84d0..8a0cb9a 100644
>> --- a/src/PVE/RESTEnvironment.pm
>> +++ b/src/PVE/RESTEnvironment.pm
>> @@ -365,8 +365,16 @@ sub active_workers {
>> return $res;
>> }
>>
>> +my $timeout_map = {
>> + # backup cancellation on slow target storages might take a while, avoid
>> + # leaving the VM in locked state
>> + "vzdump" => 60,
>> +};
>> +
>> my $kill_process_group = sub {
>> - my ($pid, $pstart) = @_;
>> + my ($pid, $pstart, $timeout) = @_;
>> +
>> + $timeout //= 5;
>>
>> # send kill to process group (negative pid)
>> my $kpid = -$pid;
>> @@ -374,8 +382,7 @@ my $kill_process_group = sub {
>> # always send signal to all pgrp members
>> kill(15, $kpid); # send TERM signal
>>
>> - # give max 5 seconds to shut down
>> - for (my $i = 0; $i < 5; $i++) {
>> + for (my $i = 0; $i < $timeout; $i++) {
>> return if !PVE::ProcFSTools::check_process_running($pid, $pstart);
>> sleep (1);
>> }
>> @@ -394,7 +401,8 @@ sub check_worker {
>> return 0 if !$running;
>>
>> if ($killit) {
>> - &$kill_process_group($task->{pid});
>> + my $type = $task->{type};
>> + &$kill_process_group($task->{pid}, undef, $timeout_map->{$type});
>> return 0;
>> }
>>
>>
>
>
prev parent reply other threads:[~2021-01-27 11:11 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-14 15:39 Stefan Reiter
2021-01-26 18:23 ` Thomas Lamprecht
2021-01-27 11:11 ` Stefan Reiter [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a8286e0c-fc75-3128-4abb-dac1d0d643c3@proxmox.com \
--to=s.reiter@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
--cc=t.lamprecht@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal