From: Maximiliano Sandoval
To: Aaron Lauterer
Cc: pve-devel@lists.proxmox.com
Date: Fri, 02 Jan 2026 16:04:24 +0100
Subject: Re: [pve-devel] [PATCH docs] pveceph: document the change of ceph networks
In-Reply-To: <20260102134635.458369-1-a.lauterer@proxmox.com>

Aaron Lauterer writes:

Some small points below:

> ceph networks (public, cluster) can be changed on the fly in a running
> cluster. But the procedure, especially for the ceph public network is
> a bit more involved. By documenting it, we will hopefully reduce the
> number of issues our users run into when they try to attempt a network
> change on their own.
>
> Signed-off-by: Aaron Lauterer
> ---
> Before I apply this commit I would like to get at least one T-b where you tested
> both scenarios to make sure the instructions are clear to follow and that I
> didn't miss anything.
>
> pveceph.adoc | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 186 insertions(+)
>
> diff --git a/pveceph.adoc b/pveceph.adoc
> index 63c5ca9..c4a4f91 100644
> --- a/pveceph.adoc
> +++ b/pveceph.adoc
> @@ -1192,6 +1192,192 @@ ceph osd unset noout
> You can now start up the guests. Highly available guests will change their state
> to 'started' when they power on.
>
> +
> +[[pveceph_network_change]]
> +Network Changes
> +~~~~~~~~~~~~~~~
> +
> +It is possible to change the networks used by Ceph in a HCI setup without any
> +downtime if *both the old and new networks can be configured at the same time*.
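
Maybe it would be worth spelling out what "configured at the same time" means
in practice, i.e. that the new subnet is simply added alongside the existing
one on every node until the migration is done. A minimal sketch, assuming the
new network is a VLAN on an existing bridge (interface name and addresses are
made up, adjust to the actual setup):

    # /etc/network/interfaces (ifupdown2), added next to the existing config
    auto vmbr0.40
    iface vmbr0.40 inet static
        address 10.9.9.31/24

Feel free to drop this if you think it is out of scope here.
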
> +
> +The procedure differs depending on which network you want to change.
> +
> +After the new network has been configured on all hosts, make sure you test it
> +before proceeding with the changes. One way is to ping all hosts on the new
> +network. If you use a large MTU, make sure to also test that it works. For
> +example by sending ping packets that will result in a final packet at the max
> +MTU size.
> +
> +To test an MTU of 9000, you will need the following packet sizes:
> +
> +[horizontal]
> +IPv4:: The overhead of IP and ICMP is '28' bytes; the resulting packet size for
> +the ping then is '8972' bytes.

I would personally mention that this is "generally" the case, as one could be
dealing with bigger headers, e.g. when QinQ is used.

> +IPv6:: The overhead is '48' bytes and the resulting packet size is
> +'8952' bytes.
> +
> +The resulting ping command will look like this for an IPv4:
> +[source,bash]
> +----
> +ping -M do -s 8972 {target IP}
> +----
> +
> +When you are switching between IPv4 and IPv6 networks, you need to make sure
> +that the following options in the `ceph.conf` file are correctly set to `true`
> +or `false`. These config options configure if Ceph services should bind to IPv4
> +or IPv6 addresses.
> +----
> +ms_bind_ipv4 = true
> +ms_bind_ipv6 = false
> +----
> +
> +[[pveceph_network_change_public]]
> +Change the Ceph Public Network
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The Ceph Public network is the main communication channel in a Ceph cluster
> +between the different services and clients (for example, a VM). Changing it to
> +a different network is not as simple as changing the Ceph Cluster network. The
> +main reason is that besides the configuration in the `ceph.conf` file, the Ceph
> +MONs (monitors) have an internal configuration where they keep track of all the
> +other MONs that are part of the cluster, the 'monmap'.
> +
> +Therefore, the procedure to change the Ceph Public network is a bit more
> +involved:
> +
> +1. Change `public_network` in the `ceph.conf` file

This is mentioned in the warning below, but maybe it could already be
emphasized here that only this one value should be touched. Additionally,
please use the full path here. There are versions at /etc/pve/ceph.conf and
/etc/ceph/ceph.conf, and this is the first time in this new section where one
needs to modify one of them (even if the path is mentioned below in the
expanded version).

> +2. Restart non MON services: OSDs, MGRs and MDS on one host
> +3. Wait until Ceph is back to 'Health_OK'

Should be HEALTH_OK instead.

> +4. Verify services are using the new network
> +5. Continue restarting services on the next host
> +6. Destroy one MON
> +7. Recreate MON
> +8. Wait until Ceph is back to 'Health_OK'

Should be HEALTH_OK instead.

> +9. Continue destroying and recreating MONs
> +
> +You first need to edit the `/etc/pve/ceph.conf` file. Change the
> +`public_network` line to match the new subnet.
> +
> +----
> +cluster_network = 10.9.9.30/24
> +----
> +
> +WARNING: Do not change the `mon_host` line or any `[mon.HOSTNAME]` sections.
> +These will be updated automatically when the MONs are destroyed and recreated.
> +
> +NOTE: Don't worry if the host bits (for example, the last octet) are set by
> +default, the netmask in CIDR notation defines the network part.
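
Two more small things for this part:

The example block above still shows `cluster_network = 10.9.9.30/24` although
the text talks about changing the `public_network` line; I assume it is meant
to read

    public_network = 10.9.9.30/24

here.

And for the MTU test further up, it might be nice to also show the IPv6
variant of the ping for completeness, something like (untested):

    ping -6 -M do -s 8952 {target IP}
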
> +
> +After you have changed the network, you need to restart the non MON services in
> +the cluster for the changes to take effect. Do so one node at a time! To restart all
> +non MON services on one node, you can use the following commands on that node.
> +Ceph has `systemd` targets for each type of service.
> +
> +[source,bash]
> +----
> +systemctl restart ceph-osd.target
> +systemctl restart ceph-mgr.target
> +systemctl restart ceph-mds.target
> +----
> +NOTE: You will only have MDS' (Metadata Server) if you use CephFS.
> +
> +NOTE: After the first OSD service got restarted, the GUI will complain that
> +the OSD is not reachable anymore. This is not an issue,; VMs can still reach

Is the double punctuation here intentional?

> +them. The reason for the message is that the MGR service cannot reach the OSD
> +anymore. The error will vanish after the MGR services get restarted.
> +
> +WARNING: Do not restart OSDs on multiple hosts at the same time. Chances are
> +that for some PGs (placement groups), 2 out of the (default) 3 replicas will
> +be down. This will result in I/O being halted until the minimum required number
> +(`min_size`) of replicas is available again.
> +
> +To verify that the services are listening on the new network, you can run the
> +following command on each node:
> +
> +[source,bash]
> +----
> +ss -tulpn | grep ceph
> +----
> +
> +NOTE: Since OSDs will also listen on the Ceph Cluster network, expect to see that
> +network too in the output of `ss -tulpn`.
> +
> +Once the Ceph cluster is back in a fully healthy state ('Health_OK'), and the

Same here, HEALTH_OK.

> +services are listening on the new network, continue to restart the services on
> +the host.
> +
> +The last services that need to be moved to the new network are the Ceph MONs
> +themselves. The easiest way is to destroy and recreate each monitor one by
> +one. This way, any mention of it in the `ceph.conf` and the monitor internal
> +`monmap` is handled automatically.
> +
> +Destroy the first MON and create it again. Wait a few moments before you
> +continue on to the next MON in the cluster, and make sure the cluster reports
> +'Health_OK' before proceeding.
> +
> +Once all MONs are recreated, you can verify that any mention of MONs in the
> +`ceph.conf` file references the new network. That means mainly the `mon_host`
> +line and the `[mon.HOSTNAME]` sections.
> +
> +One final `ss -tulpn | grep ceph` should show that the old network is not used
> +by any Ceph service anymore.
> +
> +[[pveceph_network_change_cluster]]
> +Change the Ceph Cluster Network
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The Ceph Cluster network is used for the replication traffic between the OSDs.
> +Therefore, it can be beneficial to place it on its own fast physical network.
> +
> +The overall procedure is:
> +
> +1. Change `cluster_network` in the `ceph.conf` file
> +2. Restart OSDs on one host
> +3. Wait until Ceph is back to 'Health_OK'
> +4. Verify OSDs are using the new network
> +5. Continue restarting OSDs on the next host
> +
> +You first need to edit the `/etc/pve/ceph.conf` file. Change the
> +`cluster_network` line to match the new subnet.
> +
> +----
> +cluster_network = 10.9.9.30/24
> +----
> +
> +NOTE: Don't worry if the host bits (for example, the last octet) are set by
> +default; the netmask in CIDR notation defines the network part.
> +
> +After you have changed the network, you need to restart the OSDs in the cluster
> +for the changes to take effect. Do so one node at a time!
> +To restart all OSDs on one node, you can use the following command on the CLI on
> +that node:
> +
> +[source,bash]
> +----
> +systemctl restart ceph-osd.target
> +----
> +
> +WARNING: Do not restart OSDs on multiple hosts at the same time. Chances are
> +that for some PGs (placement groups), 2 out of the (default) 3 replicas will
> +be down. This will result in I/O being halted until the minimum required number
> +(`min_size`) of replicas is available again.
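
Not strictly necessary, but since the OSD restarts here (and in the public
network section above) are done host by host anyway, it might be worth a short
note that one can set the `noout` flag for the duration of the restarts to
avoid unnecessary rebalancing, similar to the shutdown procedure above.
Roughly:

    ceph osd set noout
    # restart the OSDs on this host, wait for HEALTH_OK, continue with the next
    ceph osd unset noout

Just a suggestion though.
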
> +
> +To verify that the OSD services are listening on the new network, you can either
> +check the *OSD Details -> Network* tab in the *Ceph -> OSD* panel or by running
> +the following command on the host:
> +[source,bash]
> +----
> +ss -tulpn | grep ceph-osd
> +----
> +
> +NOTE: Since OSDs will also listen on the Ceph Public network, expect to see that
> +network too in the output of `ss -tulpn`.
> +
> +Once the Ceph cluster is back in a fully healthy state ('Health_OK'), and the

Same, should be HEALTH_OK.

> +OSDs are listening on the new network, continue to restart the OSDs on the next
> +host.
> +
> +
> [[pve_ceph_mon_and_ts]]
> Ceph Monitoring and Troubleshooting
> -----------------------------------

--
Maximiliano