From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Subject: [PATCH cluster v2 2/2] cfs lock: unlock when encountering signal
Date: Wed, 18 Feb 2026 16:44:30 +0100
Message-ID: <20260218154438.184685-3-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260218154438.184685-1-f.ebner@proxmox.com>
References: <20260218154438.184685-1-f.ebner@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: list
List-Id: Proxmox VE development discussion

If the lock directory is not removed after a failure caused by a
signal, the lock cannot be acquired again until the 120 second timeout
that pmxcfs imposes on the lock has expired. In production, this can
easily hit a second, unrelated task and is quite surprising.

Install a signal handler that releases the lock if it was already
acquired. If an old handler is defined, it is invoked; otherwise, the
signal is raised again. Just using 'die' would change the execution
flow compared to the behavior before this change.

Signed-off-by: Fiona Ebner
---
 src/PVE/Cluster.pm | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
index bdb465f..7165d1c 100644
--- a/src/PVE/Cluster.pm
+++ b/src/PVE/Cluster.pm
@@ -615,6 +615,22 @@ my $cfs_lock = sub {
     my $is_code_err = 0;
     eval {
+        # catch signals to release the lock - further defer to old handler if one was set
+        my $old_sig;
+        $old_sig->{$_} = $SIG{$_} for qw(INT TERM QUIT HUP PIPE);
+
+        local $SIG{INT} = local $SIG{TERM} = local $SIG{QUIT} = local $SIG{HUP} =
+            local $SIG{PIPE} = sub {
+                my $signame = $_[0];
+                rmdir $filename if $got_lock; # if we held the lock always unlock again
+                if ($old_sig->{$signame}) {
+                    $old_sig->{$signame}->(@_);
+                } else {
+                    $SIG{$signame} = 'DEFAULT';
+                    POSIX::raise($signame);
+                }
+                die "interrupted by signal\n";
+            };
         mkdir $lockdir;
-- 
2.47.3
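
To illustrate the handler chaining described in the commit message, here is a minimal standalone sketch. It is not the PVE::Cluster code: `with_dir_lock` and the temporary lock path are hypothetical names made up for this example, and the sketch only chains to previous handlers that are code references, which is a simplification. It uses Perl's built-in `kill` to re-raise the signal.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of PVE::Cluster): run $code while holding a
# mkdir-based lock and release the lock if a signal interrupts the call.
sub with_dir_lock {
    my ($lockpath, $code) = @_;

    my $got_lock = 0;
    my $res = eval {
        # remember the handlers that were active before overriding them
        my %old_sig;
        $old_sig{$_} = $SIG{$_} for qw(INT TERM QUIT HUP PIPE);

        my $handler = sub {
            my ($signame) = @_;
            if ($got_lock) {    # release the lock before anything else
                rmdir $lockpath;
                $got_lock = 0;
            }
            if (ref($old_sig{$signame}) eq 'CODE') {
                $old_sig{$signame}->(@_);      # defer to the previous handler
            } else {
                $SIG{$signame} = 'DEFAULT';    # no handler: default action
                kill $signame, $$;             # raise the signal again
            }
            die "interrupted by signal\n";
        };
        local $SIG{INT} = local $SIG{TERM} = local $SIG{QUIT} =
            local $SIG{HUP} = local $SIG{PIPE} = $handler;

        mkdir $lockpath or die "cannot acquire lock '$lockpath' - $!\n";
        $got_lock = 1;
        $code->();
    };
    my $err = $@;
    if ($got_lock) {    # regular release, also after an error in $code
        rmdir $lockpath;
        $got_lock = 0;
    }
    die $err if $err;
    return $res;
}

# Demo: an outer TERM handler is set, so the process survives the signal.
my $lock = tempdir(CLEANUP => 1) . "/demo.lock";
my $outer_called = 0;
$SIG{TERM} = sub { $outer_called++ };

my $held = 0;
eval {
    with_dir_lock($lock, sub {
        $held = -d $lock ? 1 : 0;
        kill 'TERM', $$;    # simulate a signal arriving while the lock is held
        return 1;
    });
};
my $err = $@;

print "lock held while running: $held\n";
print "outer handler called: $outer_called\n";
print "lock released: ", (-d $lock ? "no" : "yes"), "\n";
print "error: $err";
```

Because an outer `TERM` handler exists in the demo, the sketch takes the "invoke old handler" branch; without it, the default action would terminate the process after the lock directory has been removed.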