From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster v2 2/2] cfs lock: unlock when encountering signal
Date: Thu, 19 Feb 2026 14:45:55 +0100
Message-ID: <a447c6cd-e6c1-4c25-958b-34eb9fbd22a6@proxmox.com>
In-Reply-To: <f216de9f-fa99-43fa-8404-963d6ab67c88@proxmox.com>

On 18.02.26 at 7:33 PM, Thomas Lamprecht wrote:
> On 18.02.26 at 16:45, Fiona Ebner wrote:
>> If the lock directory is not removed after failing because of a
>> signal, it won't be possible to acquire the lock anymore before the
>> 120 second timeout imposed on the lock by pmxcfs. A second, unrelated
>> task in production can easily run into this, which is quite
>> surprising. Install a signal handler that releases the lock if it was
>> already acquired. If an old handler is defined, it is invoked,
>> otherwise the signal is raised again. Just using 'die' would change
>> the execution flow compared to before the change.
>>
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>> src/PVE/Cluster.pm | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>>
>> diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
>> index bdb465f..7165d1c 100644
>> --- a/src/PVE/Cluster.pm
>> +++ b/src/PVE/Cluster.pm
>> @@ -615,6 +615,22 @@ my $cfs_lock = sub {
>>
>> my $is_code_err = 0;
>> eval {
>> + # catch signals to release the lock - further defer to old handler if one was set
>> + my $old_sig;
>> + $old_sig->{$_} = $SIG{$_} for qw(INT TERM QUIT HUP PIPE);
>
> really a non-issue in practice and basically the same thing under the hood, but
> this could probably just be a map, something like (untested):
>
> my $old_sig = { map { $_ => $SIG{$_} } qw(INT TERM QUIT HUP PIPE) };

Will do!
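For reference, a minimal standalone sketch comparing the loop from the patch with the suggested map form. The variable names @signals, $old_loop and $old_map are made up for this illustration and do not appear in the patch.

```perl
use strict;
use warnings;

# Signal list taken from the patch; the variable names are illustrative only.
my @signals = qw(INT TERM QUIT HUP PIPE);

# Loop form, as in the patch: autovivifies the hash reference.
my $old_loop;
$old_loop->{$_} = $SIG{$_} for @signals;

# map form, as suggested in the review: builds the hash reference in one go.
my $old_map = { map { $_ => $SIG{$_} } @signals };
```

Both forms should end up with one entry per signal name, with an undefined value where no handler was installed.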
>> +
>> + local $SIG{INT} = local $SIG{TERM} = local $SIG{QUIT} = local $SIG{HUP} =
>> + local $SIG{PIPE} = sub {
>> + my $signame = $_[0];
>> + rmdir $filename if $got_lock; # if we held the lock always unlock again
>
> Could be nice to output a warning if the rmdir above fails?

Good point! I will also add it to the original line I copied this from.
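A rough sketch of what such a warning could look like. release_lock_dir() is a hypothetical helper written for this illustration, not a function in PVE::Cluster, and the message wording is an assumption.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: remove the lock directory and warn if that fails.
# rmdir returns false on failure and sets $! to the reason.
sub release_lock_dir {
    my ($filename) = @_;
    return 1 if rmdir($filename);
    warn "unable to remove lock directory '$filename' - $!\n";
    return 0;
}

# Scratch directory for the demonstration, cleaned up at exit.
my $base = tempdir(CLEANUP => 1);
my $lock = "$base/demo.lock";
```

Calling release_lock_dir($lock) after a successful mkdir $lock should return 1 silently; calling it again, once the directory is gone, should warn and return 0.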
>> + if ($old_sig->{$signame}) {
>> + $old_sig->{$signame}->(@_);
>> + } else {
>> + $SIG{$signame} = 'DEFAULT';
>> + POSIX::raise($signame);
>
> hmm, this reads alright, but then I'm wondering if it should be added elsewhere?
> As I found not a single "POSIX::raise" or "raise\(" instance in our perl code
> inside the /usr/share/perl5/{PVE,Proxmox} directories on a recent PVE 9 system, but
> we have quite a few signal overrides, and while I did not check those, I believe
> I remember that some of those fall back to the handler defined by the calling site.

The only ones I found that do invoke the previous handler are in
PVE::Daemon. They also do not use raise, but terminate the server.

For some other ones it's most likely intentional to convert the signal
to a simple die. For example PVE::VZDump::QemuServer, where it makes
sense to just catch the signal and proceed with aborting the backup
rather than raise it again.

Compared to those, cfs_lock() is quite low in the call chains and there
are callers that just warn about an error from cfs_lock(). So while it
is essential to not convert a signal to a simple die in cfs_lock(), it
might not be for other current signal overrides.

> Describing how exactly the code flow changes would be nice in any case.

Do you mean expanding on the sentence mentioning "code flow" in the
commit message or something else?
>> + }
>> + die "interrupted by signal\n";
>> + };
>>
>> mkdir $lockdir;
>>
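To make the flow discussed above easier to follow, here is a standalone sketch of the pattern: clean up first, then defer to a previously installed handler, otherwise restore the default disposition and raise the signal again. with_signal_cleanup() and its $cleanup/$code parameters are hypothetical and only stand in for the lock handling in cfs_lock(). Checking ref($old) eq 'CODE' and skipping the re-raise for 'IGNORE' are choices made for this sketch, not something taken from the patch.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper, not the actual cfs_lock() code: run $code with
# temporary signal handlers that call $cleanup before anything else.
sub with_signal_cleanup {
    my ($cleanup, $code) = @_;

    # Remember what the caller had installed before overriding it.
    my $old_sig = { map { $_ => $SIG{$_} } qw(INT TERM QUIT HUP PIPE) };

    local $SIG{INT} = local $SIG{TERM} = local $SIG{QUIT} = local $SIG{HUP} =
        local $SIG{PIPE} = sub {
            my ($signame) = @_;
            $cleanup->(); # e.g. remove the lock directory if it is held
            my $old = $old_sig->{$signame};
            if (ref($old) eq 'CODE') {
                # Defer to the handler of the calling site.
                $old->(@_);
            } elsif (!$old || $old eq 'DEFAULT') {
                # No handler was set: restore the default and raise again.
                $SIG{$signame} = 'DEFAULT';
                POSIX::raise($signame);
            }
            # Only reached if the signal did not terminate the process.
            die "interrupted by signal\n";
        };

    return $code->();
}
```

With a caller-side handler for TERM installed, a TERM arriving while $code runs should trigger $cleanup, then the caller's handler, and finally surface as the "interrupted by signal" error to whoever wraps the call in eval.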
Thread overview: 6+ messages
2026-02-18 15:44 [PATCH-SERIES cluster v2 0/2] cfs lock: small improvements Fiona Ebner
2026-02-18 15:44 ` [PATCH cluster v2 1/2] cfs lock: attempt to acquire lock more frequently Fiona Ebner
2026-02-18 15:44 ` [PATCH cluster v2 2/2] cfs lock: unlock when encountering signal Fiona Ebner
2026-02-18 18:33 ` Thomas Lamprecht
2026-02-19 13:45 ` Fiona Ebner [this message]
2026-02-18 18:33 ` partially-applied: [PATCH-SERIES cluster v2 0/2] cfs lock: small improvements Thomas Lamprecht