From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH storage] fix #7295: lvm plugin: rollback: keep original volume when allocation fails and saferemove is used
Date: Mon, 9 Feb 2026 11:40:00 +0100
Message-ID: <20260209104009.44272-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.47.3
List-Id: Proxmox VE development discussion

When the 'saferemove' storage configuration option is used, the 'current'
top volume of the backing chain is renamed and removal is delegated to a
worker task. Thus, the space for that volume is still allocated and cannot
be re-used for the new top volume (to be backed by the snapshot volume).
With a full enough pool, allocating the new top volume fails.

In this case, executing the saferemove worker and removing the 'current'
volume is bad, because there won't be any top volume anymore. Fixing that
situation would mean manually allocating a new top volume backed by the
snapshot afterwards.

Improve the situation by keeping the original top volume around and
renaming it back if allocating the new volume failed.
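For illustration, a minimal standalone sketch of the rename-back fallback
this patch introduces. The volume group and volume names below are made-up
example values; 'lvrename', the 'del-' name prefix, and run_command() with
its 'errmsg' option are taken from the patch itself.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use PVE::Tools qw(run_command);

    # Example values only; in the plugin, these come from $scfg->{vgname}
    # and the volume name passed to volume_snapshot_rollback().
    my $vg = 'pve';
    my $volname = 'vm-100-disk-0.qcow2';

    # 'saferemove' first renames the LV to "del-<volname>" and defers the
    # actual removal to a worker task. If allocating the new top volume
    # fails afterwards, renaming the LV back keeps the original top volume
    # of the backing chain intact:
    run_command(
        ['lvrename', $vg, "del-${volname}", $volname],
        errmsg => "lvrename '${vg}/del-${volname}' error",
    );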
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/Storage/LVMPlugin.pm | 41 ++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
index 32a8339..987e60b 100644
--- a/src/PVE/Storage/LVMPlugin.pm
+++ b/src/PVE/Storage/LVMPlugin.pm
@@ -1117,46 +1117,45 @@ sub volume_rollback_is_possible {
 }
 
 my sub volume_snapshot_rollback_locked {
-    my ($class, $scfg, $storeid, $volname, $snap, $cleanup_worker) = @_;
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
 
     my $format = ($class->parse_volname($volname))[6];
 
     die "can't rollback snapshot for '$format' volume\n" if $format ne 'qcow2';
 
-    $cleanup_worker->$* = eval { free_snap_image($class, $storeid, $scfg, $volname, 'current'); };
+    my $cleanup_worker = eval { free_snap_image($class, $storeid, $scfg, $volname, 'current'); };
     die "error deleting snapshot $snap $@\n" if $@;
 
     eval { alloc_snap_image($class, $storeid, $scfg, $volname, $snap) };
-    die "can't allocate new volume $volname: $@\n" if $@;
+    if (my $err = $@) {
+        if ($cleanup_worker) { # rename original image back
+            eval {
+                my $vg = $scfg->{vgname};
+                my $cmd = ['lvrename', $vg, "del-${volname}", $volname];
+                run_command($cmd, errmsg => "lvrename '${vg}/del-${volname}' error");
+            };
+            warn $@ if $@;
+        }
+        die "can't allocate new volume $volname: $err\n";
+    }
 
-    return undef;
+    return $cleanup_worker;
 }
 
 sub volume_snapshot_rollback {
     my ($class, $scfg, $storeid, $volname, $snap) = @_;
 
-    my $cleanup_worker;
-
-    eval {
-        $class->cluster_lock_storage(
-            $storeid,
-            $scfg->{shared},
-            undef,
-            sub {
-                volume_snapshot_rollback_locked(
-                    $class, $scfg, $storeid, $volname, $snap, \$cleanup_worker,
-                );
-            },
-        );
-    };
-    my $err = $@;
+    my $cleanup_worker = $class->cluster_lock_storage(
+        $storeid,
+        $scfg->{shared},
+        undef,
+        sub { volume_snapshot_rollback_locked($class, $scfg, $storeid, $volname, $snap); },
+    );
 
     # Spawn outside of the locked section, because with 'saferemove', the cleanup worker also needs
     # to obtain the lock, and in CLI context, it will be awaited synchronously, see fork_worker().
     fork_cleanup_worker($cleanup_worker);
 
-    die $err if $err;
-
     return;
 }
 
-- 
2.47.3