From: Kefu Chai <k.chai@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH v2 manager 1/1] ceph: mds: reimplement hotstandby via ceph fs set allow_standby_replay
Date: Mon, 29 Jun 2026 12:30:05 +0800
Message-ID: <20260629043005.1663891-2-k.chai@proxmox.com>
In-Reply-To: <20260629043005.1663891-1-k.chai@proxmox.com>
References: <20260629043005.1663891-1-k.chai@proxmox.com>
List-Id: Proxmox VE development discussion

PVE was writing two per-MDS config options into ceph.conf on every MDS
creation:

    [mds.]
    mds_standby_for_name = pve
    mds_standby_replay = true   (when hotstandby=1)

Neither exists in Ceph Squid or Tentacle; both are absent from
src/common/options/mds.yaml.in and silently ignored.

mds_standby_for_name = 'pve' was always wrong: the PVE default
filesystem name is 'cephfs', not 'pve' (FS.pm: $fs_name =
$param->{name} // 'cephfs'), so the option always pointed at a
nonexistent filesystem. The option is a no-op in modern Ceph
regardless.

mds_standby_replay was the old per-daemon standby replay knob. The
feature still exists in Squid/Tentacle but moved to a per-filesystem
setting: 'ceph fs set allow_standby_replay true'. So the old key has
had no effect since Squid.

Fix:
- Drop the unconditional mds_standby_for_name write.
- When 'hotstandby' is set, call 'ceph fs set allow_standby_replay
  true' instead. A new optional 'filesystem' parameter (defaults to
  'cephfs') names the target filesystem.
- If the mon command fails, warn and continue: the MDS can serve as a
  standby regardless, and standby replay can be enabled later.

Signed-off-by: Kefu Chai
---
 PVE/API2/Ceph/MDS.pm | 39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/PVE/API2/Ceph/MDS.pm b/PVE/API2/Ceph/MDS.pm
index 31b6fb7e..9025802b 100644
--- a/PVE/API2/Ceph/MDS.pm
+++ b/PVE/API2/Ceph/MDS.pm
@@ -151,8 +151,18 @@ __PACKAGE__->register_method({
                 optional => 1,
                 default => 0,
                 description =>
-                    "Determines whether a ceph-mds daemon should poll and replay the log of an active MDS. "
-                    . "Faster switch on MDS failure, but needs more idle resources.",
+                    "Use together with 'filesystem' to enable standby-replay "
+                    . "for the given CephFS. Keeps a standby MDS replaying the "
+                    . "active MDS journal for faster failover.",
+            },
+            filesystem => {
+                type => 'string',
+                optional => 1,
+                default => 'cephfs',
+                pattern => qr|^[^:/\s]+$|,
+                description =>
+                    "The name of the CephFS filesystem to enable standby replay "
+                    . "for when 'hotstandby' is set. Defaults to 'cephfs'.",
             },
         },
     },
@@ -194,11 +204,6 @@ __PACKAGE__->register_method({
             }
 
             $cfg->{$section}->{host} = $nodename;
-            $cfg->{$section}->{'mds_standby_for_name'} = 'pve';
-
-            if ($param->{hotstandby}) {
-                $cfg->{$section}->{'mds_standby_replay'} = 'true';
-            }
 
             cfs_write_file('ceph.conf', $cfg);
 
@@ -214,6 +219,26 @@ __PACKAGE__->register_method({
 
                 die "$err\n";
             }
+
+            if ($param->{hotstandby}) {
+                my $fs_name = $param->{filesystem} // 'cephfs';
+                print "Enabling standby replay for filesystem '$fs_name'...\n";
+                eval {
+                    $rados->mon_command({
+                        prefix => 'fs set',
+                        fs_name => $fs_name,
+                        var => 'allow_standby_replay',
+                        val => 'true',
+                        format => 'plain',
+                    });
+                };
+                if (my $err = $@) {
+                    chomp $err;
+                    warn "Could not enable standby replay for '$fs_name': $err\n"
+                        . "Run 'ceph fs set $fs_name allow_standby_replay true'"
+                        . " manually once the filesystem exists.\n";
+                }
+            }
         };
 
         return $rpcenv->fork_worker('cephcreatemds', "mds.$mds_id", $authuser, $worker);
-- 
2.47.3
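
The warn-and-continue handling added in the last hunk can be tried
outside a PVE node with a stub in place of PVE::RADOS. The following is
a minimal sketch and not part of the patch: StubRados and
enable_standby_replay are hypothetical names, and the stub's error text
is invented for illustration. Only the mon_command arguments are taken
from the diff above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PVE::RADOS: records the commands it receives
# and can be told to fail, so both code paths can be exercised.
package StubRados;

sub new {
    my ($class, %opts) = @_;
    return bless { fail => $opts{fail} // 0, sent => [] }, $class;
}

sub mon_command {
    my ($self, $cmd) = @_;
    # Invented error text; a real monitor would report its own message.
    die "filesystem '$cmd->{fs_name}' not found\n" if $self->{fail};
    push @{ $self->{sent} }, $cmd;
    return;
}

package main;

# Same shape as the block added by the patch: send 'fs set', and turn a
# failure into a warning instead of aborting the caller.
# Returns 1 if the command was accepted, 0 if it failed.
sub enable_standby_replay {
    my ($rados, $fs_name) = @_;
    $fs_name //= 'cephfs';

    eval {
        $rados->mon_command({
            prefix => 'fs set',
            fs_name => $fs_name,
            var => 'allow_standby_replay',
            val => 'true',
            format => 'plain',
        });
    };
    if (my $err = $@) {
        chomp $err;
        warn "Could not enable standby replay for '$fs_name': $err\n";
        return 0;
    }
    return 1;
}

# Success path: the default filesystem name 'cephfs' is used.
print enable_standby_replay(StubRados->new()), "\n";

# Failure path: a warning goes to STDERR and execution continues.
print enable_standby_replay(StubRados->new(fail => 1), 'missing'), "\n";
```

The first call prints 1. The second call emits the warning and prints 0,
which corresponds to MDS creation finishing even though standby replay
could not be enabled.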