From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster v2 2/2] cfs lock: unlock when encountering signal
Date: Thu, 19 Feb 2026 14:45:55 +0100
Message-ID: <a447c6cd-e6c1-4c25-958b-34eb9fbd22a6@proxmox.com>
In-Reply-To: <f216de9f-fa99-43fa-8404-963d6ab67c88@proxmox.com>

On 18.02.26 at 7:33 PM, Thomas Lamprecht wrote:
> On 18.02.26 at 16:45, Fiona Ebner wrote:
>> If the lock directory is not removed after failing because of a
>> signal, it won't be possible to acquire the lock anymore before the
>> 120 second timeout imposed on the lock by pmxcfs. This can easily
>> happen by a second, unrelated task in production and is quite
>> surprising. Install a signal handler that releases the lock if it was
>> already acquired. If an old handler is defined, it is invoked,
>> otherwise the signal is raised again. Just using 'die' would change
>> the execution flow compared to before the change.
>>
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>>  src/PVE/Cluster.pm | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
>> index bdb465f..7165d1c 100644
>> --- a/src/PVE/Cluster.pm
>> +++ b/src/PVE/Cluster.pm
>> @@ -615,6 +615,22 @@ my $cfs_lock = sub {
>>  
>>      my $is_code_err = 0;
>>      eval {
>> +        # catch signals to release the lock - further defer to old handler if one was set
>> +        my $old_sig;
>> +        $old_sig->{$_} = $SIG{$_} for qw(INT TERM QUIT HUP PIPE);
> 
> really a non-issue in practice and basically the same thing under the hood, but
> this could probably just be a map, something like (untested):
> 
> my $old_sig = { map { $_ => $SIG{$_} } qw(INT TERM QUIT HUP PIPE) };

Will do!

>> +
>> +        local $SIG{INT} = local $SIG{TERM} = local $SIG{QUIT} = local $SIG{HUP} =
>> +            local $SIG{PIPE} = sub {
>> +                my $signame = $_[0];
>> +                rmdir $filename if $got_lock; # if we held the lock always unlock again
> 
> Could be nice to output a warning if above rmdir fails?

Good point! Will also add it to the original line I copied this from.
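
Roughly along these lines (untested sketch; release_lock_dir() is a
hypothetical helper used only for illustration, not existing PVE::Cluster
code, and the warning text is just a placeholder):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: remove a lock directory and warn, including the
# OS error, when the removal fails. Returns 1 on success, 0 otherwise.
sub release_lock_dir {
    my ($filename) = @_;
    return 1 if rmdir $filename;
    warn "unable to remove lock directory '$filename' - $!\n";
    return 0;
}

# Stand-in for the pmxcfs lock directory, created below a temporary dir.
my $base = tempdir(CLEANUP => 1);
my $lock = "$base/example.lock";
mkdir $lock or die "unable to create '$lock' - $!\n";

release_lock_dir($lock); # directory exists, removed without a warning
release_lock_dir($lock); # directory is already gone, so this one warns
```

Inside the signal handler the call would stay guarded by $got_lock, as in
the patch.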

>> +                if ($old_sig->{$signame}) {
>> +                    $old_sig->{$signame}->(@_);
>> +                } else {
>> +                    $SIG{$signame} = 'DEFAULT';
>> +                    POSIX::raise($signame);
> 
> hmm, this reads alright, but then I'm wondering if it should be added elsewhere?
> I did not find a single "POSIX::raise" or "raise\(" instance in our perl code
> inside the /usr/share/perl5/{PVE,Proxmox} directories on a recent PVE 9 system, but
> we have quite a few signal overrides, and while I did not check those, I seem
> to remember that some of them fall back to the handler defined by the calling site.

The only ones I found that do invoke the previous handler are in
PVE::Daemon. They also do not use raise, but terminate the server.

For some other ones it's most likely intentional to convert the signal
to a simple die. For example PVE::VZDump::QemuServer, where it makes
sense to just catch the signal and proceed with aborting the backup
rather than raise it again.

Compared to those, cfs_lock() is quite low in the call chains and there
are callers that just warn about an error from cfs_lock(). So while it
is essential to not convert a signal to a simple die in cfs_lock(), it
might not be for other current signal overrides.
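
To illustrate the difference, here is a small standalone sketch. It is not
taken from PVE::Cluster; run_child() and both handlers are made up for the
demonstration. A forked child sends itself SIGTERM inside an eval whose
"caller" only warns about errors, once with a handler that just dies and
once with a handler that resets the disposition to DEFAULT and raises the
signal again. The sketch uses plain kill on $$ where the patch uses
POSIX::raise().

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child that installs $handler for SIGTERM,
# signals itself inside an eval, and only warns about the resulting error.
# Returns the child's wait status ($?) as seen by the parent.
sub run_child {
    my ($handler) = @_;
    defined(my $pid = fork()) or die "fork failed - $!\n";
    if ($pid == 0) {
        eval {
            local $SIG{TERM} = $handler;
            kill 'TERM', $$;
            sleep 1; # following statement, so the deferred handler can run
        };
        warn "caller only warns: $@" if $@;
        POSIX::_exit(0); # only reached if the signal was swallowed
    }
    waitpid($pid, 0);
    return $?;
}

# Variant 1: the signal is converted to a plain die.
my $die_status = run_child(sub { die "interrupted by signal\n" });

# Variant 2: reset to the default disposition and raise the signal again.
my $raise_status = run_child(sub {
    my ($signame) = @_;
    $SIG{$signame} = 'DEFAULT';
    kill $signame, $$;
    die "interrupted by signal\n";
});

printf "die variant:   exit code %d, signal %d\n",
    $die_status >> 8, $die_status & 127;
printf "raise variant: exit code %d, signal %d\n",
    $raise_status >> 8, $raise_status & 127;
```

With the plain die, the child should carry on after the warning and exit
normally, while the re-raising variant should be reported to the parent as
terminated by the signal. That is the behavior a caller of cfs_lock() that
merely warns would otherwise lose.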

> Describing how exactly the code flow changes would be nice in any case.

Do you mean expanding on the sentence mentioning "code flow" in the
commit message or something else?

>> +                }
>> +                die "interrupted by signal\n";
>> +            };
>>  
>>          mkdir $lockdir;
>>  




