* [pve-devel] [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs
@ 2025-08-13 13:40 Max R. Carrara
  2025-08-13 13:40 ` [pve-devel] [PATCH pve-manager master v2 1/2] fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation Max R. Carrara
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Max R. Carrara @ 2025-08-13 13:40 UTC (permalink / raw)
  To: pve-devel

Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs - v2
===========================================================

In short:  When creating an OSD via the API, the logical volumes backing
the OSD's DB and WAL do not have autoactivation enabled. Ceph requires
autoactivation on LVs, as it otherwise never activates them directly
itself. Fix this by setting autoactivation when creating those LVs as
well as providing a helper script that enables autoactivation for them
during an upgrade.

Notable Changes
---------------

- Add missing replacement for `LVMPlugin::lvcreate()` helper in
  OSD.pm (thanks Fabian!)

- Drastically limit the scope of the helper script that runs during
  upgrades, most of which was discussed off-list with Fabian as well
  (thanks!). In particular:
  - No longer activate matched LVs
  - No longer try to bring up OSDs on the node
  - Limit enabling autoactivation to LVs which back an OSD WAL or OSD DB
    * previously, LVs that back OSD devices with type "block" were also
      matched

- Limit the output of the helper script, most of which was also
  discussed off-list with Fabian (thanks again!)
  - The script is now completely silent unless an error is encountered
  - If an invoked command (ceph-volume / lvs / lvchange) fails, the
    captured stderr of that command is dumped before exiting
  - Otherwise, stdout is processed (or swallowed) and stderr is
    suppressed in order to not worry any users with spurious /
    unwarranted LVM errors

- Since pve-manager was bumped in the meantime, run the helper script
  when upgrading from a version < 9.0.6 instead of < 9.0.5

- Additionally require that the script runs when upgrading from a
  version >= 9.0~~
  - In other words, the script is only called in postinst when upgrading
    from versions where 9.0~~ <= version < 9.0.6
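The silent-unless-error behavior described in the output bullets above can be
sketched as follows. This is a minimal Python analogue of the helper's command
handling (the actual script uses Perl's PVE::Tools::run_command with
outfunc/errfunc callbacks), not the real implementation:

```python
import subprocess
import sys

def run_quiet(cmd):
    """Run a command silently; dump its captured stderr only if it fails."""
    res = subprocess.run(cmd, capture_output=True, text=True)
    if res.returncode != 0:
        # only on failure does the user see anything: the command's stderr
        sys.stderr.write(res.stderr)
        raise RuntimeError(f"'{cmd[0]}' failed with exit code {res.returncode}")
    # on success, stdout is returned for processing (or simply discarded)
    return res.stdout
```

With this pattern, something like run_quiet(["lvs", "--noheadings", ...])
produces no output at all unless lvs itself exits non-zero.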

Additional Notes
----------------

The changes to the helper script and to when it is executed are made to
reduce any potential side effects that calls to LVM might have. If
calling `lvs` or `lvchange --setautoactivation y` fails, then
something's *really* wrong with the device anyway--and in that case, we
also dump the captured stderr.

Furthermore, if a node hasn't been rebooted since new OSDs with DB/WAL were
set up, enabling autoactivation is all that needs to be done.

Even if the user rebooted the node before updating it, the update should
enable autoactivation for the OSD LVs--after that, another reboot is
sufficient to bring the OSDs back up. 

Alternatively, if one wants to avoid a reboot for whatever reason, the
affected OSDs can be brought up again as follows:

1. Check for affected OSD LVs:

  # lvs --options lv_name,vg_name,autoactivation,active

  The names of the affected LVs should begin with "osd-wal" or "osd-db"
  followed by a UUID4.

  Example output:
  # lvs --options lv_name,vg_name,autoactivation,active
    LV                                             VG                                        AutoAct Active
    osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f    ceph-1dce1129-a411-4a14-8508-edcc8626c594               
    osd-wal-cc31c0cf-2b40-4ea6-afc7-eea8b767f7f5   ceph-31a0a43c-990a-40dc-9027-6412b0f6673c               
    osd-wal-20db00cf-b3c2-491a-bf17-4fd7c29aba6a   ceph-534964a2-b764-4ed5-a3c2-fd21aeda116a               
    osd-block-dd00fa96-695e-442c-99bd-ba09c2d3bd03 ceph-6bdd82ad-09bb-4c36-9382-1c302b462d7a enabled active
    osd-db-42648fa6-eb1f-43db-91d3-16ef6605c62b    ceph-778869db-311b-475c-ba40-a61e531cc127               
    osd-block-3a178277-9ccc-488e-bfc5-4089a347195c ceph-e7e5ac3e-56d4-4fc2-b853-7ed2465d6a69 enabled active
    data                                           pve                                       enabled active
    root                                           pve                                       enabled active
    swap                                           pve                                       enabled active

2. For OSD LVs for which the "AutoAct" column is empty, run the
   following command:

  # lvchange --setautoactivation y <vg_name>/<lv_name>

3. For OSD LVs which aren't active, run the following command:

  # lvchange --activate y <vg_name>/<lv_name>

4. Double-check that autoactivation is set and that the affected LVs are
   activated:

  # lvs --options lv_name,vg_name,autoactivation,active

  Example output:
  # lvs --options lv_name,vg_name,autoactivation,active
    LV                                             VG                                        AutoAct Active
    osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f    ceph-1dce1129-a411-4a14-8508-edcc8626c594 enabled active
    osd-wal-cc31c0cf-2b40-4ea6-afc7-eea8b767f7f5   ceph-31a0a43c-990a-40dc-9027-6412b0f6673c enabled active
    osd-wal-20db00cf-b3c2-491a-bf17-4fd7c29aba6a   ceph-534964a2-b764-4ed5-a3c2-fd21aeda116a enabled active
    osd-block-dd00fa96-695e-442c-99bd-ba09c2d3bd03 ceph-6bdd82ad-09bb-4c36-9382-1c302b462d7a enabled active
    osd-db-42648fa6-eb1f-43db-91d3-16ef6605c62b    ceph-778869db-311b-475c-ba40-a61e531cc127 enabled active
    osd-block-3a178277-9ccc-488e-bfc5-4089a347195c ceph-e7e5ac3e-56d4-4fc2-b853-7ed2465d6a69 enabled active
    data                                           pve                                       enabled active
    root                                           pve                                       enabled active
    swap                                           pve                                       enabled active


5. Finally, bring up all the OSDs again:

  # ceph-volume lvm activate --all
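Steps 1-3 above can also be scripted. A rough Python sketch that turns the
output of `lvs --noheadings --separator : --options
lv_name,vg_name,autoactivation` into the needed `lvchange` commands (the
sample data below is hypothetical):

```python
import re

UUID4 = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
LV_NAME = re.compile(rf"^(?:osd-db|osd-wal)-{UUID4}$")
VG_NAME = re.compile(rf"^ceph-{UUID4}$")

def lvchange_commands(lvs_output):
    """Emit an 'lvchange --setautoactivation y VG/LV' command for every
    OSD DB/WAL LV whose autoactivation field is empty."""
    cmds = []
    for line in lvs_output.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue
        lv, vg, autoact = fields
        if LV_NAME.match(lv) and VG_NAME.match(vg) and autoact == "":
            cmds.append(f"lvchange --setautoactivation y {vg}/{lv}")
    return cmds

# hypothetical `lvs` output; only the first line needs fixing
sample = """\
  osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f:ceph-1dce1129-a411-4a14-8508-edcc8626c594:
  osd-block-dd00fa96-695e-442c-99bd-ba09c2d3bd03:ceph-6bdd82ad-09bb-4c36-9382-1c302b462d7a:enabled
  root:pve:enabled
"""
print("\n".join(lvchange_commands(sample)))
```

Note that osd-block LVs and non-Ceph LVs fall through the name filters, so
only DB/WAL LVs created via the PVE API are touched--mirroring the intent of
the helper script.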

Previous Versions
-----------------

v1: https://lore.proxmox.com/pve-devel/20250812164631.428424-1-m.carrara@proxmox.com/T/

Summary of Changes
------------------

Max R. Carrara (2):
  fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation
  fix #6652: d/postinst: enable autoactivation for Ceph OSD LVs

 PVE/API2/Ceph/OSD.pm                  |  27 +++-
 bin/Makefile                          |   3 +-
 bin/pve-osd-lvm-enable-autoactivation | 176 ++++++++++++++++++++++++++
 debian/postinst                       |  16 +++
 4 files changed, 219 insertions(+), 3 deletions(-)
 create mode 100644 bin/pve-osd-lvm-enable-autoactivation

-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH pve-manager master v2 1/2] fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation
  2025-08-13 13:40 [pve-devel] [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs Max R. Carrara
@ 2025-08-13 13:40 ` Max R. Carrara
  2025-08-13 13:40 ` [pve-devel] [PATCH pve-manager master v2 2/2] fix #6652: d/postinst: enable autoactivation for Ceph OSD LVs Max R. Carrara
  2025-08-13 14:14 ` [pve-devel] applied: (subset) [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing " Fabian Grünbichler
  2 siblings, 0 replies; 4+ messages in thread
From: Max R. Carrara @ 2025-08-13 13:40 UTC (permalink / raw)
  To: pve-devel

... by adding an inline helper sub for `lvcreate` instead of using the
LVM storage plugin's helper sub.

Autoactivation is required for LVs used by Ceph OSDs, as Ceph
otherwise doesn't activate them by itself.

This is a regression from f296ffc4e4d in pve-storage [0].

[0]: https://git.proxmox.com/?p=pve-storage.git;a=commitdiff;h=f296ffc4e4d64b574c3001dc7cc6af3da1406441

Fixes: #6652
Signed-off-by: Max R. Carrara <m.carrara@proxmox.com>
---
 PVE/API2/Ceph/OSD.pm | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index 23e187ce..0f850415 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -423,6 +423,29 @@ __PACKAGE__->register_method({
         # See FIXME below
         my @udev_trigger_devs = ();
 
+        # $size is in kibibytes
+        my $osd_lvcreate = sub {
+            my ($vg, $lv, $size) = @_;
+
+            my $cmd = [
+                '/sbin/lvcreate',
+                '-aly',
+                '-Wy',
+                '--yes',
+                '--size',
+                $size . "k",
+                '--name',
+                $lv,
+                # explicitly enable autoactivation, because Ceph never explicitly
+                # activates LVs by itself
+                '--setautoactivation',
+                'y',
+                $vg,
+            ];
+
+            run_command($cmd, errmsg => "lvcreate '$vg/$lv' error");
+        };
+
         my $create_part_or_lv = sub {
             my ($dev, $size, $type) = @_;
 
@@ -443,7 +466,7 @@ __PACKAGE__->register_method({
                 my $lv = $type . "-" . UUID::uuid();
 
                 PVE::Storage::LVMPlugin::lvm_create_volume_group($dev->{devpath}, $vg);
-                PVE::Storage::LVMPlugin::lvcreate($vg, $lv, "${size}k");
+                $osd_lvcreate->($vg, $lv, $size);
 
                 if (PVE::Diskmanage::is_partition($dev->{devpath})) {
                     eval { PVE::Diskmanage::change_parttype($dev->{devpath}, '8E00'); };
@@ -475,7 +498,7 @@ __PACKAGE__->register_method({
 
                 my $lv = $type . "-" . UUID::uuid();
 
-                PVE::Storage::LVMPlugin::lvcreate($vg, $lv, "${size}k");
+                $osd_lvcreate->($vg, $lv, $size);
 
                 return "$vg/$lv";
 
-- 
2.47.2




* [pve-devel] [PATCH pve-manager master v2 2/2] fix #6652: d/postinst: enable autoactivation for Ceph OSD LVs
  2025-08-13 13:40 [pve-devel] [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs Max R. Carrara
  2025-08-13 13:40 ` [pve-devel] [PATCH pve-manager master v2 1/2] fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation Max R. Carrara
@ 2025-08-13 13:40 ` Max R. Carrara
  2025-08-13 14:14 ` [pve-devel] applied: (subset) [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing " Fabian Grünbichler
  2 siblings, 0 replies; 4+ messages in thread
From: Max R. Carrara @ 2025-08-13 13:40 UTC (permalink / raw)
  To: pve-devel

Introduce a new helper script named pve-osd-lvm-enable-autoactivation,
which silently enables autoactivation for certain logical volumes
used by Ceph OSDs. The helper script is called in debian/postinst when
upgrading from a version >= '9.0~~' and < '9.0.6'. This means that
only OSD LVs that were created on PVE 9 before version 9.0.6 of
pve-manager are touched. This is done to limit the number of
installations on which the script is executed.
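This gate relies on dpkg's version ordering, in which `~` sorts before
everything else (including end-of-string), so '9.0~~' precedes every 9.0.x
release. A simplified Python sketch of that comparison (upstream versions
only; no epochs, revisions, or full dpkg semantics -- and `runs_helper` is a
hypothetical name for the postinst guard):

```python
import re

def _order(c):
    # dpkg ordering for non-digit characters: '~' first, then letters,
    # then all other characters
    if c == "~":
        return -1
    if c.isalpha():
        return ord(c)
    return ord(c) + 256

def dpkg_compare(a, b):
    """Return -1/0/1 like dpkg --compare-versions (simplified sketch)."""
    while a or b:
        # compare the leading non-digit parts character by character;
        # end-of-string sorts before everything except '~'
        while (a and not a[0].isdigit()) or (b and not b[0].isdigit()):
            ac = _order(a[0]) if a and not a[0].isdigit() else 0
            bc = _order(b[0]) if b and not b[0].isdigit() else 0
            if ac != bc:
                return -1 if ac < bc else 1
            a, b = a[1:], b[1:]
        # compare the leading numeric parts as integers
        am, bm = re.match(r"\d*", a), re.match(r"\d*", b)
        an, bn = int(am.group() or "0"), int(bm.group() or "0")
        if an != bn:
            return -1 if an < bn else 1
        a, b = a[am.end():], b[bm.end():]
    return 0

def runs_helper(prev):
    # postinst guard: 9.0~~ <= prev < 9.0.6
    return dpkg_compare(prev, "9.0~~") >= 0 and dpkg_compare(prev, "9.0.6") < 0
```

So upgrades from e.g. 9.0.5 trigger the helper, while upgrades from 8.x (or
fresh installs of 9.0.6 and later) do not.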

Additionally, each LV used by an OSD must belong to either an OSD WAL
or OSD DB. Ensure this by checking whether the LV's name and its VG's
name match the format we use in our API on creation [0][1].

For VGs, the name must begin with "ceph-" and be followed by a
UUID4. For example, "ceph-31a0a43c-990a-40dc-9027-6412b0f6673c" is a
matching VG name.

For LVs, the name must begin with either "osd-db-" or "osd-wal-" and
be followed by a UUID4. For example,
"osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f" and
"osd-wal-20db00cf-b3c2-491a-bf17-4fd7c29aba6a" are matching LV names.

To prevent users from getting worried (or the like) by spurious or
unwarranted errors / warnings that might occur, the script itself does
not produce any output. The STDERR of all commands the script calls is
only dumped if the invoked command itself fails. If a call to
`ceph-volume`, `lvs` or `lvchange` fails, something is definitely
wrong with the user's cluster anyway.

[0]: https://git.proxmox.com/?p=pve-manager.git;a=blob;f=PVE/API2/Ceph/OSD.pm;h=23e187ce1884cef705daa3bda7e3800b05518f3d;hb=refs/heads/master#l442
[1]: https://git.proxmox.com/?p=pve-manager.git;a=blob;f=PVE/API2/Ceph/OSD.pm;h=23e187ce1884cef705daa3bda7e3800b05518f3d;hb=refs/heads/master#l476

Fixes: #6652
Signed-off-by: Max R. Carrara <m.carrara@proxmox.com>
---
 bin/Makefile                          |   3 +-
 bin/pve-osd-lvm-enable-autoactivation | 176 ++++++++++++++++++++++++++
 debian/postinst                       |  16 +++
 3 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 bin/pve-osd-lvm-enable-autoactivation

diff --git a/bin/Makefile b/bin/Makefile
index 777e6759..0a0df34d 100644
--- a/bin/Makefile
+++ b/bin/Makefile
@@ -32,7 +32,8 @@ HELPERS =			\
 	pve-startall-delay	\
 	pve-init-ceph-crash	\
 	pve-firewall-commit	\
-	pve-sdn-commit
+	pve-sdn-commit		\
+	pve-osd-lvm-enable-autoactivation
 
 MIGRATIONS =			\
 	pve-lvm-disable-autoactivation		\
diff --git a/bin/pve-osd-lvm-enable-autoactivation b/bin/pve-osd-lvm-enable-autoactivation
new file mode 100644
index 00000000..e2e222bd
--- /dev/null
+++ b/bin/pve-osd-lvm-enable-autoactivation
@@ -0,0 +1,176 @@
+#!/usr/bin/perl
+
+use v5.36;
+
+use JSON qw(decode_json);
+
+use PVE::Tools;
+
+my sub ceph_volume_lvm_osd_info : prototype() () {
+    my $cmd = [
+        "/usr/sbin/ceph-volume", "lvm", "list", "--format", "json",
+    ];
+
+    my $stdout = '';
+    my $outfunc = sub($line) {
+        $stdout .= "$line\n";
+    };
+
+    my $stderr = '';
+    my $errfunc = sub($line) {
+        $stderr .= "$line\n";
+    };
+
+    eval {
+        PVE::Tools::run_command(
+            $cmd,
+            timeout => 10,
+            outfunc => $outfunc,
+            errfunc => $errfunc,
+        );
+    };
+    if (my $err = $@) {
+        $err = "$err\n" if $err !~ m/\n$/;
+
+        print STDERR $stderr;
+        *STDERR->flush();
+
+        die $err;
+    }
+
+    my $osd_info = decode_json($stdout);
+
+    return $osd_info;
+}
+
+my sub lvs : prototype() () {
+    my $cmd = [
+        "/usr/sbin/lvs",
+        "--noheadings",
+        "--separator",
+        ":",
+        "--options",
+        "lv_name,vg_name,autoactivation",
+    ];
+
+    my $all_lvs = {};
+
+    my $outfunc = sub($line) {
+        $line = PVE::Tools::trim($line);
+
+        my ($lv_name, $vg_name, $autoactivation) = split(':', $line, -1);
+
+        return undef if ($lv_name eq '' || $vg_name eq '');
+
+        $all_lvs->{"$vg_name/$lv_name"} = {
+            autoactivation => $autoactivation,
+        };
+    };
+
+    my $stderr = '';
+    my $errfunc = sub($line) {
+        $stderr .= "$line\n";
+    };
+
+    eval {
+        PVE::Tools::run_command(
+            $cmd,
+            timeout => 10,
+            outfunc => $outfunc,
+            errfunc => $errfunc,
+        );
+    };
+    if (my $err = $@) {
+        $err = "$err\n" if $err !~ m/\n$/;
+
+        print STDERR $stderr;
+        *STDERR->flush();
+
+        die $err;
+    }
+
+    return $all_lvs;
+}
+
+my sub main : prototype() () {
+    my $osd_info = ceph_volume_lvm_osd_info();
+    my $all_lvs = lvs();
+
+    my $re_uuid4 = qr/
+	\b
+	[0-9a-fA-F]{8}
+	- [0-9a-fA-F]{4}
+	- [0-9a-fA-F]{4}
+	- [0-9a-fA-F]{4}
+	- [0-9a-fA-F]{12}
+	\b
+    /x;
+
+    # $re_lv_name and $re_vg_name specifically match the LV and VG names we
+    # assign in OSD.pm in order to avoid modifying LVs created through means
+    # other than our API
+    my $re_lv_name = qr/^ (osd-db|osd-wal) - $re_uuid4 $/nx;
+    my $re_vg_name = qr/^ (ceph) - $re_uuid4 $/nx;
+
+    my @osd_lvs_no_autoactivation = ();
+
+    for my $osd (keys $osd_info->%*) {
+        for my $osd_lv ($osd_info->{$osd}->@*) {
+            my ($lv_name, $vg_name) = $osd_lv->@{qw(lv_name vg_name)};
+
+            next if $lv_name !~ $re_lv_name;
+            next if $vg_name !~ $re_vg_name;
+
+            my $lv = "$vg_name/$lv_name";
+
+            next if $all_lvs->{$lv}->{autoactivation};
+
+            push(@osd_lvs_no_autoactivation, $lv);
+        }
+    }
+
+    my $has_err = 0;
+
+    # Logical volumes are formatted as "vg_name/lv_name", which is necessary for lvchange
+    for my $lv (@osd_lvs_no_autoactivation) {
+        my $log = '';
+        my $logfunc = sub($line) {
+            $log .= "$line\n";
+        };
+
+        eval {
+            my $cmd = [
+                '/usr/sbin/lvchange', '--setautoactivation', 'y', $lv,
+            ];
+
+            PVE::Tools::run_command(
+                $cmd,
+                logfunc => $logfunc,
+                timeout => 10,
+            );
+        };
+        if (my $err = $@) {
+            $has_err = 1;
+
+            $err = "$err\n" if $err !~ m/\n$/;
+
+            print STDERR $log;
+            *STDERR->flush();
+
+            warn("Error: Failed to enable autoactivation for Ceph OSD logical volume '$lv'\n");
+            warn("$err");
+
+            next;
+        }
+
+    }
+
+    if ($has_err) {
+        warn("Couldn't enable autoactivation for all Ceph OSD DB/WAL logical volumes.\n");
+        exit 1;
+    }
+
+    return undef;
+}
+
+main();
diff --git a/debian/postinst b/debian/postinst
index b6e07fd9..8e8f1a07 100755
--- a/debian/postinst
+++ b/debian/postinst
@@ -133,6 +133,18 @@ migrate_apt_auth_conf() {
     fi
 }
 
+ceph_osd_lvm_enable_autoactivation() {
+    if ! test -e /usr/sbin/ceph-volume; then
+        return
+    fi
+
+    if ! /usr/share/pve-manager/helpers/pve-osd-lvm-enable-autoactivation; then
+        printf "\nEnabling autoactivation for logical volumes used by Ceph OSDs failed.";
+        printf " Check the output above for errors and try to enable autoactivation for OSD LVs";
+        printf " manually by running '/usr/share/pve-manager/helpers/pve-osd-lvm-enable-autoactivation'";
+    fi
+}
+
 # Copied from dh_installtmpfiles/13.24.2
 if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
        if [ -x "$(command -v systemd-tmpfiles)" ]; then
@@ -246,6 +258,10 @@ case "$1" in
         fi
     fi
 
+    if test -n "$2" && dpkg --compare-versions "$2" 'ge' '9.0~~' && dpkg --compare-versions "$2" 'lt' '9.0.6'; then
+        ceph_osd_lvm_enable_autoactivation
+    fi
+
     ;;
 
   abort-upgrade|abort-remove|abort-deconfigure)
-- 
2.47.2




* [pve-devel] applied: (subset) [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs
  2025-08-13 13:40 [pve-devel] [PATCH pve-manager master v2 0/2] Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs Max R. Carrara
  2025-08-13 13:40 ` [pve-devel] [PATCH pve-manager master v2 1/2] fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation Max R. Carrara
  2025-08-13 13:40 ` [pve-devel] [PATCH pve-manager master v2 2/2] fix #6652: d/postinst: enable autoactivation for Ceph OSD LVs Max R. Carrara
@ 2025-08-13 14:14 ` Fabian Grünbichler
  2 siblings, 0 replies; 4+ messages in thread
From: Fabian Grünbichler @ 2025-08-13 14:14 UTC (permalink / raw)
  To: pve-devel, Max R. Carrara


On Wed, 13 Aug 2025 15:40:24 +0200, Max R. Carrara wrote:
> Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs - v2
> ===========================================================
> 
> In short:  When creating an OSD via the API, the logical volumes backing
> the OSD's DB and WAL do not have autoactivation enabled. Ceph requires
> autoactivation on LVs, as it otherwise never activates them directly
> itself. Fix this by setting autoactivation when creating those LVs as
> well as providing a helper script that enables autoactivation for them
> during an upgrade.
> 
> [...]

Applied the first patch, thanks!

@Thomas - I am torn on the second one, after some off-list discussion it's now
limited to just upgrades from PVE 9 to later, but it still does a lot in
postinst... In any case the version number is one behind now though, so that
needs to be bumped if it gets applied!

[1/2] fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation
      commit: 92bbc0c89fe7331ab122ff396f5e23ab31fa0765

Best regards,
-- 
Fabian Grünbichler <f.gruenbichler@proxmox.com>


