public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH docs] pveceph: document cluster shutdown
@ 2024-03-19 15:00 Aaron Lauterer
  2024-03-19 16:48 ` Stefan Sterz
  0 siblings, 1 reply; 2+ messages in thread
From: Aaron Lauterer @ 2024-03-19 15:00 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 pveceph.adoc | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/pveceph.adoc b/pveceph.adoc
index 089ac80..7b493c5 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -1080,6 +1080,56 @@ scrubs footnote:[Ceph scrubbing {cephdocs-url}/rados/configuration/osd-config-re
 are executed.
 
 
+[[pveceph_shutdown]]
+Shutdown {pve} + Ceph HCI cluster
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To shut down the whole {pve} + Ceph cluster, first stop all Ceph clients. These
+will mainly be VMs and containers. If you have additional clients that might
+access a CephFS or an installed RADOS Gateway, stop these as well.
+Highly available guests will switch their state to 'stopped' when powered down
+via the {pve} tooling.
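With many guests, this first step can be scripted. A minimal sketch, assuming the standard `qm` and `pct` CLIs are available on each node; the `list_vmids` helper is made up for illustration and is demonstrated here against canned output rather than a live node:

```shell
# Hypothetical helper: extract guest IDs from `qm list`/`pct list` style
# output by skipping the header row and printing the first column.
list_vmids() { awk 'NR>1 {print $1}'; }

# Intended usage on a node (not run here):
#   qm list  | list_vmids | xargs -rn1 qm shutdown
#   pct list | list_vmids | xargs -rn1 pct shutdown

# Demonstrate against canned `qm list`-style output:
printf 'VMID NAME STATUS\n100 web running\n101 db running\n' | list_vmids
```

As noted above, guests managed by HA should be powered down via the {pve} tooling so their state switches to 'stopped'.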
+
+Once all clients, VMs and containers are off or no longer accessing the Ceph
+cluster, verify that the Ceph cluster is in a healthy state, either via the
+web UI or the CLI:
+
+----
+ceph -s
+----
+
+Then set the following OSD flags via the Ceph -> OSD panel or the CLI:
+
+----
+ceph osd set noout
+ceph osd set norecover
+ceph osd set norebalance
+ceph osd set nobackfill
+ceph osd set nodown
+ceph osd set pause
+----
+
+This will halt all of Ceph's self-healing actions, and the 'pause' flag will
+stop any client IO.
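The six flags can also be set in one loop. A sketch that only prints the commands as a dry run (remove the `echo` to apply them on a live cluster):

```shell
# OSD flags to set before shutdown, in the same order as above.
flags='noout norecover norebalance nobackfill nodown pause'

for f in $flags; do
    # `echo` keeps this a dry run; drop it to actually set the flag.
    echo ceph osd set "$f"
done
```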
+
+Start powering down the nodes without a Monitor (MON). After these nodes are
+down, also shut down the nodes with Monitors.
+
+When powering the cluster back on, start the nodes with Monitors (MONs) first.
+Once all nodes are online, confirm that all Ceph services are up and running
+before you unset the OSD flags:
+
+----
+ceph osd unset noout
+ceph osd unset norecover
+ceph osd unset norebalance
+ceph osd unset nobackfill
+ceph osd unset nodown
+ceph osd unset pause
+----
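The "all Ceph services are up and running" check above can be scripted by polling the cluster status before unsetting the flags. A sketch, assuming the plain-text `ceph -s` output contains `HEALTH_OK` once the cluster has recovered; `wait_healthy` is a hypothetical helper that takes the status command as a parameter so it can be tried with canned input:

```shell
# Hypothetical helper: poll the given status command (default: `ceph -s`)
# until its output reports HEALTH_OK.
wait_healthy() {
    status_cmd=${1:-"ceph -s"}
    until $status_cmd | grep -q 'HEALTH_OK'; do
        sleep 10
    done
}

# Intended usage before unsetting the flags (not run here):
#   wait_healthy && ceph osd unset noout   # ...and the remaining flags

# Demonstrate with canned status output:
wait_healthy 'echo HEALTH_OK' && echo 'cluster healthy'
```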
+
+You can now start up the guests. Highly available guests will change their
+state to 'started' when they are powered on.
+
 Ceph Monitoring and Troubleshooting
 -----------------------------------
 
-- 
2.39.2






* Re: [pve-devel] [PATCH docs] pveceph: document cluster shutdown
  2024-03-19 15:00 [pve-devel] [PATCH docs] pveceph: document cluster shutdown Aaron Lauterer
@ 2024-03-19 16:48 ` Stefan Sterz
  0 siblings, 0 replies; 2+ messages in thread
From: Stefan Sterz @ 2024-03-19 16:48 UTC (permalink / raw)
  To: Proxmox VE development discussion

On Tue Mar 19, 2024 at 4:00 PM CET, Aaron Lauterer wrote:
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
>  pveceph.adoc | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/pveceph.adoc b/pveceph.adoc
> index 089ac80..7b493c5 100644
> --- a/pveceph.adoc
> +++ b/pveceph.adoc
> @@ -1080,6 +1080,56 @@ scrubs footnote:[Ceph scrubbing {cephdocs-url}/rados/configuration/osd-config-re
>  are executed.
>
>
> +[[pveceph_shutdown]]
> +Shutdown {pve} + Ceph HCI cluster
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +To shut down the whole {pve} + Ceph cluster, first stop all Ceph clients. This
> +will mainly be VMs and containers. If you have additional clients that might
> +access a Ceph FS or an installed RADOS GW, stop these as well.
> +High available guests will switch their state to 'stopped' when powered down

I think this should be "Highly available" or "High availability".

> +via the {pve} tooling.
> +
> +Once all clients, VMs and containers are off or not accessing the Ceph cluster
> +anymore, verify that the Ceph cluster is in a healthy state. Either via the Web UI
> +or the CLI:
> +
> +----
> +ceph -s
> +----
> +
> +Then enable the following OSD flags in the Ceph -> OSD panel or the CLI:
> +
> +----
> +ceph osd set noout
> +ceph osd set norecover
> +ceph osd set norebalance
> +ceph osd set nobackfill
> +ceph osd set nodown
> +ceph osd set pause
> +----
> +
> +This will halt all self-healing actions for Ceph and the 'pause' will stop any client IO.
> +
> +Start powering down the nodes one node at a time. Power down nodes with a
> +Monitor (MON) last.

Might benefit from re-phrasing to avoid people only reading this while
already in the middle of shutting down:

Start powering down your nodes without a monitor (MON). After these
nodes are down, also shut down hosts with monitors.

> +
> +When powering on the cluster, start the nodes with Monitors (MONs) first. Once
> +all nodes are up and running, confirm that all Ceph services are up and running
> +before you unset the OSD flags:
> +
> +----
> +ceph osd unset noout
> +ceph osd unset norecover
> +ceph osd unset norebalance
> +ceph osd unset nobackfill
> +ceph osd unset nodown
> +ceph osd unset pause
> +----
> +
> +You can now start up the guests. High available guests will change their state

see above

> +to 'started' when they power on.
> +
>  Ceph Monitoring and Troubleshooting
>  -----------------------------------
>





