public inbox for pve-devel@lists.proxmox.com
* [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand
@ 2026-04-28  2:45 Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 1/5] pve8to9: extract ceph checks into PVE::Ceph::UpgradeCheck Kefu Chai
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Hi all,

This series grew out of a chat with Shannon about how easily upgraded
Ceph clusters silently miss out on features unlocked by later releases.
After a Ceph upgrade, two OSDMap settings often stay at their old
values, and admins forget to bump them:

- require_osd_release: leaving this behind blocks OSD-side features
  the new release would otherwise enable. We warn if it is behind the
  running version.
  https://docs.ceph.com/en/latest/rados/operations/require-osd-release/

- require_min_compat_client: bumping this unlocks newer OSDMap
  features like pg-upmap-primary and the read balancer, but it's a
  one-way change that locks out older clients once any dependent
  feature is enabled. We emit a notice (rather than a warning) and
  walk the operator through checking 'ceph features' first, so they
  don't lock anyone out by accident.
  https://docs.ceph.com/en/latest/rados/operations/require-min-compat-client/
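
As a rough illustration of the two advisories, here is a Python sketch
(the actual implementation is Perl in PVE::Ceph::UpgradeCheck; only the
{ level, msg } record shape mirrors the series, all other names and the
exact wording are invented):

```python
# Illustrative only: advisory checks for the two OSDMap settings above.
# Records carry {level, msg}; the caller decides how to print them.
ORDER = ['luminous', 'mimic', 'nautilus', 'octopus', 'pacific',
         'quincy', 'reef', 'squid', 'tentacle']

def osdmap_advisories(osd_dump, running):
    """Return advisory records; never changes cluster state."""
    out = []
    ror = osd_dump.get('require_osd_release')
    if ror is None or ORDER.index(ror) < ORDER.index(running):
        out.append({'level': 'warn',
                    'msg': f"require_osd_release is {ror!r}, "
                           f"consider bumping it to {running!r}"})
    else:
        out.append({'level': 'pass',
                    'msg': f"require_osd_release is {running!r}"})
    rmcc = osd_dump.get('require_min_compat_client')
    if rmcc is not None and ORDER.index(rmcc) < ORDER.index(running):
        # one-way change: only a notice, and point at 'ceph features' first
        out.append({'level': 'notice',
                    'msg': f"require_min_compat_client is {rmcc!r}; review "
                           f"'ceph features' before raising it, the change "
                           f"is irreversible"})
    return out
```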

Both checks are advisory only. Shannon and I agreed that turning these
knobs on the operator's behalf during an unattended upgrade would be a
show-stopper, so we just nudge.

While I was in there, I also added a 'pveceph upgrade-check' subcommand
so operators can run a Ceph sanity check without having to wade through
the full pve8to9 output. Ceph upgrades can happen multiple times across
a PVE major-release cycle, and the standalone command makes more sense
for that cadence.

The plumbing: a new PVE::Ceph::UpgradeCheck module returns structured
{ level, msg } records, so 'pveceph upgrade-check' and the pve8to9
"CHECKING HYPER-CONVERGED CEPH STATUS" section share the same checks
through their own log helpers. Patch 1 is a pure refactor of pve8to9's
check_ceph() into the new module, byte-diff verified on a real cluster
to make sure pve8to9 emits the same messages in the same order.
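
The module/caller split can be pictured with a small Python sketch (the
real code is Perl; the stub checks and helper names here are invented):

```python
# Sketch of the plumbing described above: the check module returns plain
# {level, msg} records, and each frontend maps levels to its own loggers.
def run_checks():
    # stand-in for PVE::Ceph::UpgradeCheck::run_checks()
    return [{'level': 'info', 'msg': 'checking local Ceph version..'},
            {'level': 'pass', 'msg': 'found expected Ceph release.'}]

def make_logger(prefix):
    # a frontend-specific log helper (pve8to9 and pveceph each have their own)
    return lambda msg: f"{prefix}: {msg}"

HANDLERS = {lvl: make_logger(lvl.upper())
            for lvl in ('pass', 'info', 'notice', 'warn', 'fail', 'skip')}

def render(records):
    # presentation stays entirely on the caller's side
    return [HANDLERS[r['level']](r['msg']) for r in records]
```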

Tested on a live PVE 9 / Ceph Squid cluster: pve8to9 output is
byte-identical before/after the refactor (modulo the
require_min_compat_client message wording this series polishes), and
'pveceph upgrade-check' correctly reports both the passing case
(require_osd_release == squid) and the advisory case
(require_min_compat_client == luminous). The summary line counts
match the records emitted.
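
One reason the summary counts cannot drift from the messages is that
both are derived from the same records; a hedged Python sketch of that
idea (the real summary lives in the Perl CLI handler):

```python
from collections import Counter

# Illustrative: derive the summary line from the very records the
# frontend prints, so counts always match the emitted messages.
def summarize(records):
    c = Counter(r['level'] for r in records)
    return (f"Summary: {c['pass']} pass, {c['notice']} notices, "
            f"{c['warn']} warnings, {c['fail']} failures.")
```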

A pve-docs follow-up documenting the new subcommand will be sent
separately once this series is merged.

Thanks!

Kefu Chai (5):
  pve8to9: extract ceph checks into PVE::Ceph::UpgradeCheck
  ceph: add pveceph upgrade-check command
  ceph: add require_osd_release upgrade check
  ceph: add require_min_compat_client upgrade check
  ceph: drop duplicate release-to-codename map in upgrade checks

 PVE/CLI/pve8to9.pm       | 203 ++----------------
 PVE/CLI/pveceph.pm       |  45 ++++
 PVE/Ceph/Makefile        |   1 +
 PVE/Ceph/Releases.pm     |  14 ++
 PVE/Ceph/Tools.pm        |   6 +
 PVE/Ceph/UpgradeCheck.pm | 441 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 527 insertions(+), 183 deletions(-)
 create mode 100644 PVE/Ceph/UpgradeCheck.pm

-- 
2.47.3


* [PATCH manager 1/5] pve8to9: extract ceph checks into PVE::Ceph::UpgradeCheck
  2026-04-28  2:45 [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand Kefu Chai
@ 2026-04-28  2:45 ` Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 2/5] ceph: add pveceph upgrade-check command Kefu Chai
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Move the body of check_ceph() into a new PVE::Ceph::UpgradeCheck module.
The module exposes run_checks() which returns an arrayref of
{ level, msg } records, and each caller formats the records with its
own log_* helpers. This matches the idiomatic PVE pattern where modules
return data and callers handle presentation.

Prepares the ground for adding more ceph upgrade checks and for
exposing the same checks via a standalone 'pveceph upgrade-check'
subcommand in a follow-up. No behaviour change: pve8to9 emits the
same messages, in the same order, through the same log_* helpers.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
---
 PVE/CLI/pve8to9.pm       | 203 +++--------------------
 PVE/Ceph/Makefile        |   1 +
 PVE/Ceph/UpgradeCheck.pm | 342 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 363 insertions(+), 183 deletions(-)
 create mode 100644 PVE/Ceph/UpgradeCheck.pm

diff --git a/PVE/CLI/pve8to9.pm b/PVE/CLI/pve8to9.pm
index afc4785e..06dde101 100644
--- a/PVE/CLI/pve8to9.pm
+++ b/PVE/CLI/pve8to9.pm
@@ -14,6 +14,7 @@ use PVE::API2::Cluster::Ceph;
 
 use PVE::AccessControl;
 use PVE::Ceph::Tools;
+use PVE::Ceph::UpgradeCheck;
 use PVE::Cluster;
 use PVE::Corosync;
 use PVE::INotify;
@@ -61,21 +62,6 @@ my $older_suites = {
 
 my ($min_pve_major, $min_pve_minor, $min_pve_pkgrel) = (8, 4, 0);
 
-my $ceph_release2code = {
-    '12' => 'Luminous',
-    '13' => 'Mimic',
-    '14' => 'Nautilus',
-    '15' => 'Octopus',
-    '16' => 'Pacific',
-    '17' => 'Quincy',
-    '18' => 'Reef',
-    '19' => 'Squid',
-    '20' => 'Tentacle',
-};
-my $ceph_supported_release = 19; # the version we support for upgrading (i.e., available on both)
-my $ceph_supported_code_name = $ceph_release2code->{"$ceph_supported_release"}
-    or die "inconsistent source code, could not map expected ceph version to code name!";
-
 my $forced_legacy_cgroup = 0;
 
 my $counters = {
@@ -588,180 +574,31 @@ sub check_cluster_corosync {
     }
 }
 
-sub check_ceph {
-    print_header("CHECKING HYPER-CONVERGED CEPH STATUS");
-
-    if (PVE::Ceph::Tools::check_ceph_inited(1)) {
-        log_info("hyper-converged ceph setup detected!");
-    } else {
-        log_skip("no hyper-converged ceph setup detected!");
-        return;
-    }
-
-    log_info("getting Ceph status/health information..");
-    my $ceph_status = eval { PVE::API2::Ceph->status({ node => $nodename }); };
-    my $noout = eval { PVE::API2::Cluster::Ceph->get_flag({ flag => "noout" }); };
-    if ($@) {
-        log_fail("failed to get 'noout' flag status - $@");
-    }
-
-    my $noout_wanted = 1;
-
-    if (!$ceph_status || !$ceph_status->{health}) {
-        log_fail("unable to determine Ceph status!");
-    } else {
-        my $ceph_health = $ceph_status->{health}->{status};
-        if (!$ceph_health) {
-            log_fail("unable to determine Ceph health!");
-        } elsif ($ceph_health eq 'HEALTH_OK') {
-            log_pass("Ceph health reported as 'HEALTH_OK'.");
-        } elsif (
-            $ceph_health eq 'HEALTH_WARN'
-            && $noout
-            && (keys %{ $ceph_status->{health}->{checks} } == 1)
-        ) {
-            log_pass(
-                "Ceph health reported as 'HEALTH_WARN' with a single failing check and 'noout' flag set."
-            );
-        } else {
-            log_warn(
-                "Ceph health reported as '$ceph_health'.\n      Use the PVE dashboard or 'ceph -s'"
-                    . " to determine the specific issues and try to resolve them.");
-        }
-    }
-
-    # TODO: check OSD min-required version, if to low it breaks stuff!
-
-    log_info("checking local Ceph version..");
-    if (my $release = eval { PVE::Ceph::Tools::get_local_version(1) }) {
-        my $code_name = $ceph_release2code->{"$release"} || 'unknown';
-        if ($release == $ceph_supported_release) {
-            log_pass(
-                "found expected Ceph $ceph_supported_release $ceph_supported_code_name release.");
-        } elsif ($release > $ceph_supported_release) {
-            log_warn(
-                "found newer Ceph release $release $code_name as the expected $ceph_supported_release"
-                    . " $ceph_supported_code_name, installed third party repos?!");
-        } else {
-            log_fail("Hyper-converged Ceph $release $code_name is to old for upgrade!\n"
-                . "      Upgrade Ceph first to $ceph_supported_code_name following our how-to:\n"
-                . "      <https://pve.proxmox.com/wiki/Category:Ceph_Upgrade>");
-        }
-    } else {
-        log_fail("unable to determine local Ceph version!");
-    }
-
-    log_info("getting Ceph daemon versions..");
-    my $ceph_versions = eval { PVE::Ceph::Tools::get_cluster_versions(undef, 1); };
-    if (!$ceph_versions) {
-        log_fail("unable to determine Ceph daemon versions!");
-    } else {
-        my $services = [
-            { 'key' => 'mon', 'name' => 'monitor' },
-            { 'key' => 'mgr', 'name' => 'manager' },
-            { 'key' => 'mds', 'name' => 'MDS' },
-            { 'key' => 'osd', 'name' => 'OSD' },
-        ];
-
-        my $ceph_versions_simple = {};
-        my $ceph_versions_commits = {};
-        for my $type (keys %$ceph_versions) {
-            for my $full_version (keys $ceph_versions->{$type}->%*) {
-                if ($full_version =~ m/^(.*) \((.*)\).*\(.*\)$/) {
-                    # String is in the form of
-                    # ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)
-                    # only check the first part, e.g. 'ceph version 17.2.6', the commit hash can
-                    # be different
-                    $ceph_versions_simple->{$type}->{$1} = 1;
-                    $ceph_versions_commits->{$type}->{$2} = 1;
-                }
-            }
-        }
-
-        for my $service (@$services) {
-            my ($name, $key) = $service->@{ 'name', 'key' };
-            if (my $service_versions = $ceph_versions_simple->{$key}) {
-                if (keys %$service_versions == 0) {
-                    log_skip("no running instances detected for daemon type $name.");
-                } elsif (keys %$service_versions == 1) {
-                    log_pass("single running version detected for daemon type $name.");
-                } else {
-                    log_warn("multiple running versions detected for daemon type $name!");
-                }
-            } else {
-                log_skip("unable to determine versions of running Ceph $name instances.");
-            }
-            my $service_commits = $ceph_versions_commits->{$key};
-            log_info(
-                "different builds of same version detected for an $name. Are you in the middle of the upgrade?"
-            ) if $service_commits && keys %$service_commits > 1;
-        }
+sub log_ceph_upgrade_message {
+    my ($message) = @_;
 
-        my $overall_versions = $ceph_versions->{overall};
-        if (!$overall_versions) {
-            log_warn("unable to determine overall Ceph daemon versions!");
-        } elsif (keys %$overall_versions == 1) {
-            log_pass("single running overall version detected for all Ceph daemon types.");
-            $noout_wanted = !$upgraded; # off post-upgrade, on pre-upgrade
-        } elsif (keys $ceph_versions_simple->{overall}->%* != 1) {
-            log_warn(
-                "overall version mismatch detected, check 'ceph versions' output for details!");
-        }
-    }
-
-    if ($noout) {
-        if ($noout_wanted) {
-            log_pass("'noout' flag set to prevent rebalancing during cluster-wide upgrades.");
-        } else {
-            log_warn("'noout' flag set, Ceph cluster upgrade seems finished.");
-        }
-    } elsif ($noout_wanted) {
-        log_warn("'noout' flag not set - recommended to prevent rebalancing during upgrades.");
-    }
+    my ($level, $msg) = $message->@{qw(level msg)};
 
-    log_info("checking Ceph config..");
-    my $conf = PVE::Cluster::cfs_read_file('ceph.conf');
-    if (%$conf) {
-        my $global = $conf->{global};
+    return log_pass($msg) if $level eq 'pass';
+    return log_info($msg) if $level eq 'info';
+    return log_notice($msg) if $level eq 'notice';
+    return log_warn($msg) if $level eq 'warn';
+    return log_fail($msg) if $level eq 'fail';
+    return log_skip($msg) if $level eq 'skip';
 
-        my $global_monhost = $global->{mon_host} // $global->{"mon host"} // $global->{"mon-host"};
-        if (!defined($global_monhost)) {
-            log_warn(
-                "No 'mon_host' entry found in ceph config.\n  It's recommended to add mon_host with"
-                    . " all monitor addresses (without ports) to the global section.");
-        }
-
-        my $ipv6 = $global->{ms_bind_ipv6} // $global->{"ms bind ipv6"}
-            // $global->{"ms-bind-ipv6"};
-        if ($ipv6) {
-            my $ipv4 = $global->{ms_bind_ipv4} // $global->{"ms bind ipv4"}
-                // $global->{"ms-bind-ipv4"};
-            if ($ipv6 eq 'true' && (!defined($ipv4) || $ipv4 ne 'false')) {
-                log_warn(
-                    "'ms_bind_ipv6' is enabled but 'ms_bind_ipv4' is not disabled.\n  Make sure to"
-                        . " disable 'ms_bind_ipv4' for ipv6 only clusters, or add an ipv4 network to public/cluster network."
-                );
-            }
-        }
+    return log_info($msg);
+}
 
-        if (defined($global->{keyring})) {
-            log_warn(
-                "[global] config section contains 'keyring' option, which will prevent services from"
-                    . " starting with Nautilus.\n Move 'keyring' option to [client] section instead."
-            );
-        }
+sub check_ceph {
+    print_header("CHECKING HYPER-CONVERGED CEPH STATUS");
 
-    } else {
-        log_warn("Empty ceph config found");
-    }
+    my $messages = PVE::Ceph::UpgradeCheck::run_checks(
+        nodename => $nodename,
+        upgraded => $upgraded,
+    );
 
-    my $local_ceph_ver = PVE::Ceph::Tools::get_local_version(1);
-    if (defined($local_ceph_ver)) {
-        if ($local_ceph_ver <= 14) {
-            log_fail("local Ceph version too low, at least Octopus required..");
-        }
-    } else {
-        log_fail("unable to determine local Ceph version.");
+    for my $m ($messages->@*) {
+        log_ceph_upgrade_message($m);
     }
 }
 
diff --git a/PVE/Ceph/Makefile b/PVE/Ceph/Makefile
index 2901ebe5..b64912bb 100644
--- a/PVE/Ceph/Makefile
+++ b/PVE/Ceph/Makefile
@@ -4,6 +4,7 @@ PERLSOURCE =   \
 	Releases.pm \
 	Services.pm \
 	Tools.pm \
+	UpgradeCheck.pm \
 
 all:
 
diff --git a/PVE/Ceph/UpgradeCheck.pm b/PVE/Ceph/UpgradeCheck.pm
new file mode 100644
index 00000000..6998caf2
--- /dev/null
+++ b/PVE/Ceph/UpgradeCheck.pm
@@ -0,0 +1,342 @@
+package PVE::Ceph::UpgradeCheck;
+
+# Produces advisory messages about a Ceph cluster's upgrade-readiness.
+#
+# Callers (PVE::CLI::pve8to9, 'pveceph upgrade-check') invoke run_checks()
+# and format the returned records with their own log_* helpers.
+#
+# Each record is a hashref of the form:
+#     { level => 'pass'|'info'|'notice'|'warn'|'fail'|'skip', msg => 'text' }
+
+use strict;
+use warnings;
+
+use PVE::API2::Ceph;
+use PVE::API2::Cluster::Ceph;
+use PVE::Ceph::Tools;
+use PVE::Cluster;
+
+my $ceph_release2code = {
+    '12' => 'Luminous',
+    '13' => 'Mimic',
+    '14' => 'Nautilus',
+    '15' => 'Octopus',
+    '16' => 'Pacific',
+    '17' => 'Quincy',
+    '18' => 'Reef',
+    '19' => 'Squid',
+    '20' => 'Tentacle',
+};
+my $default_supported_release = 19; # available before and after the current major upgrade
+my $default_supported_code_name = $ceph_release2code->{"$default_supported_release"}
+    or die "inconsistent source code, could not map expected ceph version to code name!";
+
+sub run_checks {
+    my (%args) = @_;
+
+    my $nodename = $args{nodename}
+        or die "run_checks: 'nodename' argument is required\n";
+    my $supported_release = $args{supported_release} // $default_supported_release;
+    my $upgraded = $args{upgraded} // 0;
+
+    my @messages;
+
+    if (!PVE::Ceph::Tools::check_ceph_inited(1)) {
+        push @messages, { level => 'skip', msg => "no hyper-converged ceph setup detected!" };
+        return \@messages;
+    }
+    push @messages, { level => 'info', msg => "hyper-converged ceph setup detected!" };
+
+    my ($health_msgs, $noout) = check_health($nodename);
+    push @messages, $health_msgs->@*;
+
+    # TODO: check OSD min-required version, if to low it breaks stuff!
+
+    my ($version_msgs, $noout_wanted) = check_versions($supported_release, $upgraded);
+    push @messages, $version_msgs->@*;
+
+    push @messages, check_noout_flag($noout, $noout_wanted)->@*;
+
+    push @messages, check_config()->@*;
+
+    push @messages, check_local_version_minimum()->@*;
+
+    return \@messages;
+}
+
+sub check_health {
+    my ($nodename) = @_;
+
+    my @out;
+    push @out, { level => 'info', msg => "getting Ceph status/health information.." };
+
+    my $ceph_status = eval { PVE::API2::Ceph->status({ node => $nodename }); };
+    my $noout = eval { PVE::API2::Cluster::Ceph->get_flag({ flag => "noout" }); };
+    if ($@) {
+        push @out, { level => 'fail', msg => "failed to get 'noout' flag status - $@" };
+    }
+
+    if (!$ceph_status || !$ceph_status->{health}) {
+        push @out, { level => 'fail', msg => "unable to determine Ceph status!" };
+        return (\@out, $noout);
+    }
+
+    my $ceph_health = $ceph_status->{health}->{status};
+    if (!$ceph_health) {
+        push @out, { level => 'fail', msg => "unable to determine Ceph health!" };
+    } elsif ($ceph_health eq 'HEALTH_OK') {
+        push @out, { level => 'pass', msg => "Ceph health reported as 'HEALTH_OK'." };
+    } elsif (
+        $ceph_health eq 'HEALTH_WARN'
+        && $noout
+        && (keys %{ $ceph_status->{health}->{checks} } == 1)
+    ) {
+        push @out,
+            {
+                level => 'pass',
+                msg =>
+                "Ceph health reported as 'HEALTH_WARN' with a single failing check and 'noout' flag set.",
+            };
+    } else {
+        push @out,
+            {
+                level => 'warn',
+                msg =>
+                "Ceph health reported as '$ceph_health'.\n      Use the PVE dashboard or 'ceph -s'"
+                . " to determine the specific issues and try to resolve them.",
+            };
+    }
+
+    return (\@out, $noout);
+}
+
+sub check_versions {
+    my ($supported_release, $upgraded) = @_;
+
+    my @out;
+    my $noout_wanted = 1;
+
+    my $supported_code_name = $supported_release == $default_supported_release
+        ? $default_supported_code_name
+        : ($ceph_release2code->{"$supported_release"} // 'unknown');
+
+    push @out, { level => 'info', msg => "checking local Ceph version.." };
+    if (my $release = eval { PVE::Ceph::Tools::get_local_version(1) }) {
+        my $code_name = $ceph_release2code->{"$release"} || 'unknown';
+        if ($release == $supported_release) {
+            push @out,
+                {
+                    level => 'pass',
+                    msg => "found expected Ceph $supported_release $supported_code_name release.",
+                };
+        } elsif ($release > $supported_release) {
+            push @out,
+                {
+                    level => 'warn',
+                    msg => "found newer Ceph release $release $code_name as the expected"
+                    . " $supported_release $supported_code_name, installed third party repos?!",
+                };
+        } else {
+            push @out,
+                {
+                    level => 'fail',
+                    msg => "Hyper-converged Ceph $release $code_name is to old for upgrade!\n"
+                    . "      Upgrade Ceph first to $supported_code_name following our how-to:\n"
+                    . "      <https://pve.proxmox.com/wiki/Category:Ceph_Upgrade>",
+                };
+        }
+    } else {
+        push @out, { level => 'fail', msg => "unable to determine local Ceph version!" };
+    }
+
+    push @out, { level => 'info', msg => "getting Ceph daemon versions.." };
+    my $ceph_versions = eval { PVE::Ceph::Tools::get_cluster_versions(undef, 1); };
+    if (!$ceph_versions) {
+        push @out, { level => 'fail', msg => "unable to determine Ceph daemon versions!" };
+        return (\@out, $noout_wanted);
+    }
+
+    my $services = [
+        { 'key' => 'mon', 'name' => 'monitor' },
+        { 'key' => 'mgr', 'name' => 'manager' },
+        { 'key' => 'mds', 'name' => 'MDS' },
+        { 'key' => 'osd', 'name' => 'OSD' },
+    ];
+
+    my $ceph_versions_simple = {};
+    my $ceph_versions_commits = {};
+    for my $type (keys %$ceph_versions) {
+        for my $full_version (keys $ceph_versions->{$type}->%*) {
+            if ($full_version =~ m/^(.*) \((.*)\).*\(.*\)$/) {
+                # String is in the form of
+                # ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)
+                # only check the first part, e.g. 'ceph version 17.2.6', the commit hash can
+                # be different
+                $ceph_versions_simple->{$type}->{$1} = 1;
+                $ceph_versions_commits->{$type}->{$2} = 1;
+            }
+        }
+    }
+
+    for my $service (@$services) {
+        my ($name, $key) = $service->@{ 'name', 'key' };
+        if (my $service_versions = $ceph_versions_simple->{$key}) {
+            if (keys %$service_versions == 0) {
+                push @out,
+                    {
+                        level => 'skip',
+                        msg => "no running instances detected for daemon type $name.",
+                    };
+            } elsif (keys %$service_versions == 1) {
+                push @out,
+                    {
+                        level => 'pass',
+                        msg => "single running version detected for daemon type $name.",
+                    };
+            } else {
+                push @out,
+                    {
+                        level => 'warn',
+                        msg => "multiple running versions detected for daemon type $name!",
+                    };
+            }
+        } else {
+            push @out,
+                {
+                    level => 'skip',
+                    msg => "unable to determine versions of running Ceph $name instances.",
+                };
+        }
+        my $service_commits = $ceph_versions_commits->{$key};
+        if ($service_commits && keys %$service_commits > 1) {
+            push @out,
+                {
+                    level => 'info',
+                    msg =>
+                    "different builds of same version detected for an $name. Are you in the middle of the upgrade?",
+                };
+        }
+    }
+
+    my $overall_versions = $ceph_versions->{overall};
+    if (!$overall_versions) {
+        push @out, { level => 'warn', msg => "unable to determine overall Ceph daemon versions!" };
+    } elsif (keys %$overall_versions == 1) {
+        push @out,
+            {
+                level => 'pass',
+                msg => "single running overall version detected for all Ceph daemon types.",
+            };
+        $noout_wanted = !$upgraded; # off post-upgrade, on pre-upgrade
+    } elsif (keys $ceph_versions_simple->{overall}->%* != 1) {
+        push @out,
+            {
+                level => 'warn',
+                msg =>
+                "overall version mismatch detected, check 'ceph versions' output for details!",
+            };
+    }
+
+    return (\@out, $noout_wanted);
+}
+
+sub check_noout_flag {
+    my ($noout, $noout_wanted) = @_;
+
+    my @out;
+    if ($noout) {
+        if ($noout_wanted) {
+            push @out,
+                {
+                    level => 'pass',
+                    msg => "'noout' flag set to prevent rebalancing during cluster-wide upgrades.",
+                };
+        } else {
+            push @out,
+                {
+                    level => 'warn',
+                    msg => "'noout' flag set, Ceph cluster upgrade seems finished.",
+                };
+        }
+    } elsif ($noout_wanted) {
+        push @out,
+            {
+                level => 'warn',
+                msg => "'noout' flag not set - recommended to prevent rebalancing during upgrades.",
+            };
+    }
+
+    return \@out;
+}
+
+sub check_config {
+    my @out;
+
+    push @out, { level => 'info', msg => "checking Ceph config.." };
+    my $conf = PVE::Cluster::cfs_read_file('ceph.conf');
+    if (!%$conf) {
+        push @out, { level => 'warn', msg => "Empty ceph config found" };
+        return \@out;
+    }
+
+    my $global = $conf->{global};
+
+    my $global_monhost = $global->{mon_host} // $global->{"mon host"} // $global->{"mon-host"};
+    if (!defined($global_monhost)) {
+        push @out,
+            {
+                level => 'warn',
+                msg =>
+                "No 'mon_host' entry found in ceph config.\n  It's recommended to add mon_host with"
+                . " all monitor addresses (without ports) to the global section.",
+            };
+    }
+
+    my $ipv6 = $global->{ms_bind_ipv6} // $global->{"ms bind ipv6"} // $global->{"ms-bind-ipv6"};
+    if ($ipv6) {
+        my $ipv4 = $global->{ms_bind_ipv4} // $global->{"ms bind ipv4"}
+            // $global->{"ms-bind-ipv4"};
+        if ($ipv6 eq 'true' && (!defined($ipv4) || $ipv4 ne 'false')) {
+            push @out,
+                {
+                    level => 'warn',
+                    msg =>
+                    "'ms_bind_ipv6' is enabled but 'ms_bind_ipv4' is not disabled.\n  Make sure to"
+                    . " disable 'ms_bind_ipv4' for ipv6 only clusters, or add an ipv4 network to public/cluster network.",
+                };
+        }
+    }
+
+    if (defined($global->{keyring})) {
+        push @out,
+            {
+                level => 'warn',
+                msg =>
+                "[global] config section contains 'keyring' option, which will prevent services from"
+                . " starting with Nautilus.\n Move 'keyring' option to [client] section instead.",
+            };
+    }
+
+    return \@out;
+}
+
+sub check_local_version_minimum {
+    my @out;
+
+    my $local_ceph_ver = PVE::Ceph::Tools::get_local_version(1);
+    if (defined($local_ceph_ver)) {
+        if ($local_ceph_ver <= 14) {
+            push @out,
+                {
+                    level => 'fail',
+                    msg => "local Ceph version too low, at least Octopus required..",
+                };
+        }
+    } else {
+        push @out, { level => 'fail', msg => "unable to determine local Ceph version." };
+    }
+
+    return \@out;
+}
+
+1;
-- 
2.47.3


* [PATCH manager 2/5] ceph: add pveceph upgrade-check command
  2026-04-28  2:45 [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 1/5] pve8to9: extract ceph checks into PVE::Ceph::UpgradeCheck Kefu Chai
@ 2026-04-28  2:45 ` Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 3/5] ceph: add require_osd_release upgrade check Kefu Chai
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Expose the Ceph upgrade checks via a new 'pveceph upgrade-check'
subcommand, so operators can run a post-upgrade Ceph readiness check
against the release they are currently running, independently of a PVE
major-version upgrade.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
---
 PVE/CLI/pveceph.pm | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/PVE/CLI/pveceph.pm b/PVE/CLI/pveceph.pm
index d8867106..e5bbbdce 100755
--- a/PVE/CLI/pveceph.pm
+++ b/PVE/CLI/pveceph.pm
@@ -24,6 +24,7 @@ use PVE::Tools qw(run_command);
 use PVE::Ceph::Releases;
 use PVE::Ceph::Services;
 use PVE::Ceph::Tools;
+use PVE::Ceph::UpgradeCheck;
 
 use PVE::API2::Ceph;
 use PVE::API2::Ceph::FS;
@@ -498,6 +499,49 @@ __PACKAGE__->register_method({
     },
 });
 
+__PACKAGE__->register_method({
+    name => 'upgrade-check',
+    path => 'upgrade-check',
+    method => 'GET',
+    description =>
+        "Run post-upgrade Ceph readiness checks for the currently installed release.",
+    parameters => {
+        additionalProperties => 0,
+        properties => {
+            node => get_standard_option('pve-node'),
+        },
+    },
+    returns => { type => 'null' },
+    code => sub {
+        my ($param) = @_;
+
+        my $supported_release = PVE::Ceph::Tools::get_local_version(1);
+        if (!$supported_release) {
+            my $default_codename = PVE::Ceph::Releases::get_default_ceph_release_codename();
+            my $info = PVE::Ceph::Releases::get_ceph_release_info($default_codename);
+            $supported_release = int($info->{release}) if $info;
+        }
+        die "could not determine local Ceph major release\n" if !$supported_release;
+
+        my $messages = PVE::Ceph::UpgradeCheck::run_checks(
+            nodename => $param->{node},
+            supported_release => $supported_release,
+        );
+
+        my $counters = { pass => 0, info => 0, notice => 0, warn => 0, fail => 0, skip => 0 };
+        for my $m ($messages->@*) {
+            $counters->{ $m->{level} }++ if exists $counters->{ $m->{level} };
+            print uc($m->{level}) . ": $m->{msg}\n";
+        }
+
+        print "\n";
+        print "Summary: $counters->{pass} pass, $counters->{notice} notices,"
+            . " $counters->{warn} warnings, $counters->{fail} failures.\n";
+
+        return undef;
+    },
+});
+
 my $format_osddetails = sub {
     my ($data, $schema, $options) = @_;
 
@@ -616,6 +660,7 @@ our $cmddef = {
     install => [__PACKAGE__, 'install', []],
     purge => [__PACKAGE__, 'purge', []],
     status => [__PACKAGE__, 'status', []],
+    'upgrade-check' => [__PACKAGE__, 'upgrade-check', [], { node => $nodename }],
 };
 
 1;
-- 
2.47.3


* [PATCH manager 3/5] ceph: add require_osd_release upgrade check
  2026-04-28  2:45 [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 1/5] pve8to9: extract ceph checks into PVE::Ceph::UpgradeCheck Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 2/5] ceph: add pveceph upgrade-check command Kefu Chai
@ 2026-04-28  2:45 ` Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 4/5] ceph: add require_min_compat_client " Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 5/5] ceph: drop duplicate release-to-codename map in upgrade checks Kefu Chai
  4 siblings, 0 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Add a get_osd_dump() helper in PVE::Ceph::Tools that wraps the
'osd dump' mon command, and use it to implement a new advisory check
in PVE::Ceph::UpgradeCheck: require_osd_release.

The check warns if require_osd_release is unset or older than the
currently installed Ceph release on the node, and suggests running
'ceph osd require-osd-release <codename>' once all OSDs are upgraded.
Not setting this flag blocks a number of features that the OSDs would
otherwise support.

See
https://docs.ceph.com/en/latest/rados/operations/require-osd-release/

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
---
 PVE/Ceph/Releases.pm     | 14 ++++++++++
 PVE/Ceph/Tools.pm        |  6 +++++
 PVE/Ceph/UpgradeCheck.pm | 57 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/PVE/Ceph/Releases.pm b/PVE/Ceph/Releases.pm
index 324fcfaf..7eabda60 100644
--- a/PVE/Ceph/Releases.pm
+++ b/PVE/Ceph/Releases.pm
@@ -131,6 +131,20 @@ sub get_available_ceph_release_codenames($include_unstable_releases = 0) {
 
 my $_default_ceph_release_codename;
 
+# Return the codename (e.g. 'squid') whose major release matches $major,
+# searching all Ceph releases tracked in this module (i.e., not restricted
+# to releases available on the current PVE version). Returns undef if no
+# release matches. The inverse of looking up the major from
+# get_ceph_release_info($codename).
+sub get_codename_for_major_release($major) {
+    my $releases = get_ceph_release_def();
+    for my $codename (keys $releases->%*) {
+        my ($release_major) = split(/\./, $releases->{$codename}->{release});
+        return $codename if $release_major == $major;
+    }
+    return undef;
+}
+
 sub get_default_ceph_release_codename {
     if (!defined($_default_ceph_release_codename)) {
         my $ceph_releases = get_all_available_ceph_releases();
diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm
index c731ac14..4edc967b 100644
--- a/PVE/Ceph/Tools.pm
+++ b/PVE/Ceph/Tools.pm
@@ -112,6 +112,12 @@ sub get_cluster_versions {
     return $rados->mon_command({ prefix => $cmd });
 }
 
+sub get_osd_dump {
+    my ($rados) = @_;
+    $rados = PVE::RADOS->new() if !$rados;
+    return $rados->mon_command({ prefix => 'osd dump', format => 'json' });
+}
+
 sub get_config {
     my $key = shift;
 
diff --git a/PVE/Ceph/UpgradeCheck.pm b/PVE/Ceph/UpgradeCheck.pm
index 6998caf2..5c454fd1 100644
--- a/PVE/Ceph/UpgradeCheck.pm
+++ b/PVE/Ceph/UpgradeCheck.pm
@@ -13,6 +13,7 @@ use warnings;
 
 use PVE::API2::Ceph;
 use PVE::API2::Cluster::Ceph;
+use PVE::Ceph::Releases;
 use PVE::Ceph::Tools;
 use PVE::Cluster;
 
@@ -50,7 +51,7 @@ sub run_checks {
     my ($health_msgs, $noout) = check_health($nodename);
     push @messages, $health_msgs->@*;
 
-    # TODO: check OSD min-required version, if to low it breaks stuff!
+    push @messages, check_require_osd_release($supported_release)->@*;
 
     my ($version_msgs, $noout_wanted) = check_versions($supported_release, $upgraded);
     push @messages, $version_msgs->@*;
@@ -339,4 +340,58 @@ sub check_local_version_minimum {
     return \@out;
 }
 
+# returns the numeric release value (e.g. 19.2) for a given codename, or undef
+# if the codename is not known to PVE::Ceph::Releases.
+my sub release_number {
+    my ($codename) = @_;
+    return undef if !$codename;
+    my $info = PVE::Ceph::Releases::get_ceph_release_info($codename);
+    return $info ? $info->{release} : undef;
+}
+
+sub check_require_osd_release {
+    my ($supported_release) = @_;
+
+    my @out;
+
+    my $osdmap = eval { PVE::Ceph::Tools::get_osd_dump() };
+    if ($@ || !$osdmap) {
+        my $err = $@ || 'empty osd dump';
+        push @out, { level => 'warn', msg => "could not query osd dump: $err" };
+        return \@out;
+    }
+
+    my $current = $osdmap->{require_osd_release} // '';
+    if (!$current) {
+        push @out,
+            {
+                level => 'warn',
+                msg => "require_osd_release is not set. Run"
+                . " 'ceph osd require-osd-release <codename>' after all OSDs are upgraded to"
+                . " the new release.",
+            };
+        return \@out;
+    }
+
+    my $expected_codename = PVE::Ceph::Releases::get_codename_for_major_release($supported_release)
+        // PVE::Ceph::Releases::get_default_ceph_release_codename();
+    my $expected_release = release_number($expected_codename);
+    my $current_release = release_number($current);
+
+    if (!defined($current_release) || $current_release < $expected_release) {
+        push @out,
+            {
+                level => 'warn',
+                msg => "require_osd_release is '$current', older than '$expected_codename'."
+                . " Once all OSDs are upgraded, run"
+                . " 'ceph osd require-osd-release $expected_codename' to unlock features"
+                . " that depend on the newer release.",
+            };
+    } else {
+        push @out, { level => 'pass', msg => "require_osd_release is at '$current'." };
+    }
+
+    return \@out;
+}
+
 1;
-- 
2.47.3






* [PATCH manager 4/5] ceph: add require_min_compat_client upgrade check
  2026-04-28  2:45 [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand Kefu Chai
                   ` (2 preceding siblings ...)
  2026-04-28  2:45 ` [PATCH manager 3/5] ceph: add require_osd_release upgrade check Kefu Chai
@ 2026-04-28  2:45 ` Kefu Chai
  2026-04-28  2:45 ` [PATCH manager 5/5] ceph: drop duplicate release-to-codename map in upgrade checks Kefu Chai
  4 siblings, 0 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Emit a notice if require_min_compat_client is unset or older than the
current backend default, with a reminder to run 'ceph features' first to
check connected clients. Bumping the flag unlocks commands that use newer
on-map features such as pg-upmap-primary and the read-balancer;
enabling any of those features afterwards will exclude older clients.

The check is notice-only. Admins need to decide on a case-by-case
basis whether it is safe to bump.

See
https://docs.ceph.com/en/latest/rados/operations/require-min-compat-client/
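The "check clients first" step the notice asks for boils down to: bumping
is only safe when no connected client reports a release older than the
target. A minimal Python sketch of that decision (hypothetical input shape;
'ceph features' reports per-release client counts in richer form):

```python
# Chronological Ceph release order; only what the comparison needs.
RELEASE_ORDER = ["luminous", "mimic", "nautilus", "octopus",
                 "pacific", "quincy", "reef", "squid"]

def safe_to_bump(client_releases, target):
    """client_releases: mapping of client release codename -> connection
    count, a simplified stand-in for 'ceph features' output. Returns True
    only if every connected client is at or past the target release."""
    target_idx = RELEASE_ORDER.index(target)
    return all(RELEASE_ORDER.index(r) >= target_idx for r in client_releases)

print(safe_to_bump({"reef": 3, "squid": 2}, "reef"))      # safe
print(safe_to_bump({"luminous": 1, "squid": 4}, "reef"))  # would lock out a client
```

This is exactly why the check stays notice-only: the OSDMap alone cannot
tell whether such an older client exists.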

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
---
 PVE/Ceph/UpgradeCheck.pm | 49 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/PVE/Ceph/UpgradeCheck.pm b/PVE/Ceph/UpgradeCheck.pm
index 5c454fd1..09418df3 100644
--- a/PVE/Ceph/UpgradeCheck.pm
+++ b/PVE/Ceph/UpgradeCheck.pm
@@ -52,6 +52,7 @@ sub run_checks {
     push @messages, $health_msgs->@*;
 
     push @messages, check_require_osd_release($supported_release)->@*;
+    push @messages, check_require_min_compat_client($supported_release)->@*;
 
     my ($version_msgs, $noout_wanted) = check_versions($supported_release, $upgraded);
     push @messages, $version_msgs->@*;
@@ -394,4 +395,52 @@ sub check_require_osd_release {
     return \@out;
 }
 
+sub check_require_min_compat_client {
+    my ($supported_release) = @_;
+
+    my @out;
+
+    my $osdmap = eval { PVE::Ceph::Tools::get_osd_dump() };
+    if ($@ || !$osdmap) {
+        my $err = $@ || 'empty osd dump';
+        push @out, { level => 'warn', msg => "could not query osd dump: $err" };
+        return \@out;
+    }
+
+    my $current = $osdmap->{require_min_compat_client} // '';
+    my $expected_codename = PVE::Ceph::Releases::get_codename_for_major_release($supported_release)
+        // PVE::Ceph::Releases::get_default_ceph_release_codename();
+
+    if (!$current) {
+        push @out,
+            {
+                level => 'notice',
+                msg => "require_min_compat_client is unset. Check connected clients with"
+                . " 'ceph features', then 'ceph osd set-require-min-compat-client <release>'"
+                . " to unlock features like pg-upmap-primary and the read-balancer."
+                . " Enabling any of those features afterwards will exclude older clients.",
+            };
+        return \@out;
+    }
+
+    my $expected_release = release_number($expected_codename);
+    my $current_release = release_number($current);
+
+    if (!defined($current_release) || $current_release < $expected_release) {
+        push @out,
+            {
+                level => 'notice',
+                msg => "require_min_compat_client is '$current' (< '$expected_codename')."
+                . " If 'ceph features' shows no clients older than '$expected_codename',"
+                . " 'ceph osd set-require-min-compat-client $expected_codename' unlocks"
+                . " features like pg-upmap-primary and the read-balancer."
+                . " Enabling any of those features afterwards will exclude older clients.",
+            };
+    } else {
+        push @out, { level => 'pass', msg => "require_min_compat_client is at '$current'." };
+    }
+
+    return \@out;
+}
+
 1;
-- 
2.47.3






* [PATCH manager 5/5] ceph: drop duplicate release-to-codename map in upgrade checks
  2026-04-28  2:45 [PATCH manager 0/5] ceph: add 'pveceph upgrade-check' subcommand Kefu Chai
                   ` (3 preceding siblings ...)
  2026-04-28  2:45 ` [PATCH manager 4/5] ceph: add require_min_compat_client " Kefu Chai
@ 2026-04-28  2:45 ` Kefu Chai
  4 siblings, 0 replies; 6+ messages in thread
From: Kefu Chai @ 2026-04-28  2:45 UTC (permalink / raw)
  To: pve-devel

Drop the local $ceph_release2code map in PVE::Ceph::UpgradeCheck and
look up codenames through PVE::Ceph::Releases::get_codename_for_major_release()
instead. The map duplicated data already maintained in Releases.pm, so
adding a new Ceph release would have required two updates.

No functional change.
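The inverse lookup the patch switches to is straightforward; a Python
sketch of the idea, with illustrative version strings (the real data lives
in Releases.pm):

```python
# Codename -> full version string, as a single source of truth; the major
# release is derived from it instead of being duplicated in a second map.
RELEASES = {"reef": "18.2.4", "squid": "19.2.1", "tentacle": "20.1.0"}

def codename_for_major(major):
    """Return the codename whose version has the given major release,
    or None if no tracked release matches."""
    for codename, release in RELEASES.items():
        if int(release.split(".")[0]) == major:
            return codename
    return None

print(codename_for_major(19))
```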

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
---
 PVE/Ceph/UpgradeCheck.pm | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/PVE/Ceph/UpgradeCheck.pm b/PVE/Ceph/UpgradeCheck.pm
index 09418df3..4103c136 100644
--- a/PVE/Ceph/UpgradeCheck.pm
+++ b/PVE/Ceph/UpgradeCheck.pm
@@ -17,20 +17,10 @@ use PVE::Ceph::Releases;
 use PVE::Ceph::Tools;
 use PVE::Cluster;
 
-my $ceph_release2code = {
-    '12' => 'Luminous',
-    '13' => 'Mimic',
-    '14' => 'Nautilus',
-    '15' => 'Octopus',
-    '16' => 'Pacific',
-    '17' => 'Quincy',
-    '18' => 'Reef',
-    '19' => 'Squid',
-    '20' => 'Tentacle',
-};
 my $default_supported_release = 19; # available before and after the current major upgrade
-my $default_supported_code_name = $ceph_release2code->{"$default_supported_release"}
-    or die "inconsistent source code, could not map expected ceph version to code name!";
+my $default_supported_code_name =
+    ucfirst(PVE::Ceph::Releases::get_codename_for_major_release($default_supported_release)
+        // die "inconsistent source code, could not map expected ceph version to code name!\n");
 
 sub run_checks {
     my (%args) = @_;
@@ -118,13 +108,18 @@ sub check_versions {
     my @out;
     my $noout_wanted = 1;
 
-    my $supported_code_name = $supported_release == $default_supported_release
+    my $supported_code_name =
+        $supported_release == $default_supported_release
         ? $default_supported_code_name
-        : ($ceph_release2code->{"$supported_release"} // 'unknown');
+        : do {
+            my $codename = PVE::Ceph::Releases::get_codename_for_major_release($supported_release);
+            defined($codename) ? ucfirst($codename) : 'unknown';
+        };
 
     push @out, { level => 'info', msg => "checking local Ceph version.." };
     if (my $release = eval { PVE::Ceph::Tools::get_local_version(1) }) {
-        my $code_name = $ceph_release2code->{"$release"} || 'unknown';
+        my $codename = PVE::Ceph::Releases::get_codename_for_major_release($release);
+        my $code_name = defined($codename) ? ucfirst($codename) : 'unknown';
         if ($release == $supported_release) {
             push @out,
                 {
-- 
2.47.3





