public inbox for pve-devel@lists.proxmox.com
From: Alexander Zeidler <a.zeidler@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH docs v2 5/6] ceph: maintenance: revise and expand section "Replace OSDs"
Date: Wed,  5 Feb 2025 11:08:49 +0100
Message-ID: <20250205100850.3-5-a.zeidler@proxmox.com>
In-Reply-To: <20250205100850.3-1-a.zeidler@proxmox.com>

Remove redundant information that is already described in section
“Destroy OSDs” and link to it.

Mention and link to the troubleshooting section, as replacing the OSD
may not fix the underlying problem.

Mention that the replacement disk should be of the same type and size
and comply with the recommendations.

Mention how to acknowledge warnings of crashed OSDs.

Signed-off-by: Alexander Zeidler <a.zeidler@proxmox.com>
---
v2
* no changes

 pveceph.adoc | 45 +++++++++++++--------------------------------
 1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/pveceph.adoc b/pveceph.adoc
index 81a6cc7..a471fb9 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -1035,43 +1035,24 @@ Ceph Maintenance
 Replace OSDs
 ~~~~~~~~~~~~
 
-One of the most common maintenance tasks in Ceph is to replace the disk of an
-OSD. If a disk is already in a failed state, then you can go ahead and run
-through the steps in xref:pve_ceph_osd_destroy[Destroy OSDs]. Ceph will recreate
-those copies on the remaining OSDs if possible. This rebalancing will start as
-soon as an OSD failure is detected or an OSD was actively stopped.
+With the following steps you can replace the disk of an OSD, which is
+one of the most common maintenance tasks in Ceph. If there is a
+problem with an OSD while its disk still seems to be healthy, read the
+xref:pve_ceph_mon_and_ts[troubleshooting] section first.
 
-NOTE: With the default size/min_size (3/2) of a pool, recovery only starts when
-`size + 1` nodes are available. The reason for this is that the Ceph object
-balancer xref:pve_ceph_device_classes[CRUSH] defaults to a full node as
-`failure domain'.
+. If the disk failed, get a
+xref:pve_ceph_recommendation_disk[recommended] replacement disk of the
+same type and size.
 
-To replace a functioning disk from the GUI, go through the steps in
-xref:pve_ceph_osd_destroy[Destroy OSDs]. The only addition is to wait until
-the cluster shows 'HEALTH_OK' before stopping the OSD to destroy it.
+. xref:pve_ceph_osd_destroy[Destroy] the OSD in question.
 
-On the command line, use the following commands:
+. Detach the old disk from the server and attach the new one.
 
-----
-ceph osd out osd.<id>
-----
-
-You can check with the command below if the OSD can be safely removed.
-
-----
-ceph osd safe-to-destroy osd.<id>
-----
-
-Once the above check tells you that it is safe to remove the OSD, you can
-continue with the following commands:
-
-----
-systemctl stop ceph-osd@<id>.service
-pveceph osd destroy <id>
-----
+. xref:pve_ceph_osd_create[Create] the OSD again.
 
-Replace the old disk with the new one and use the same procedure as described
-in xref:pve_ceph_osd_create[Create OSDs].
+. After automatic rebalancing, the cluster status should switch back
+to `HEALTH_OK`. Any crashes that are still listed can be acknowledged
+by running, for example, `ceph crash archive-all`.
 
 Trim/Discard
 ~~~~~~~~~~~~
-- 
2.39.5
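
For readers who want the command-line view of the revised steps, here is
a minimal sketch, not part of the patch. It only prints the commands in
order and runs nothing. The helper name `osd_replace_plan` and its two
arguments are hypothetical. The out, safe-to-destroy, stop and destroy
commands are the ones removed from this section by the diff above, and
`ceph crash archive-all` is the one it adds. `pveceph osd create` and
`ceph health` are assumed here as the CLI counterparts of the linked
"Create OSDs" step and of the `HEALTH_OK` check.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for replacing
# the disk of a single OSD, in the order described by the steps above.
# Usage: osd_replace_plan <osd-id> <new-device>
osd_replace_plan() {
    osd_id="$1"
    new_dev="$2"
    if [ -z "$osd_id" ] || [ -z "$new_dev" ]; then
        echo "usage: osd_replace_plan <osd-id> <new-device>" >&2
        return 1
    fi
    cat <<EOF
ceph osd out osd.${osd_id}
ceph osd safe-to-destroy osd.${osd_id}
systemctl stop ceph-osd@${osd_id}.service
pveceph osd destroy ${osd_id}
# detach the old disk and attach the new one before continuing
pveceph osd create ${new_dev}
ceph health
ceph crash archive-all
EOF
}
```

Calling `osd_replace_plan 3 /dev/sdb` prints the plan for OSD 3 with the
assumed new device `/dev/sdb`. Review each line before running it by
hand; in particular, wait until `ceph osd safe-to-destroy` reports that
the OSD can be removed before stopping and destroying it.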



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
