all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [PATCH manager 0/1] ceph: add opt-in locality-aware replica reads (crush_location_hook)
@ 2026-03-25  3:51 Kefu Chai
  2026-03-25  3:51 ` [PATCH manager 1/1] " Kefu Chai
  2026-03-26  3:44 ` [PATCH manager 0/1] " Kefu Chai
  0 siblings, 2 replies; 3+ messages in thread
From: Kefu Chai @ 2026-03-25  3:51 UTC (permalink / raw)
  To: pve-devel

This patch was prompted by a forum thread [1] in which a user reported
persistent high IO wait on PostgreSQL VMs running on a three-AZ Ceph
cluster. The discussion surfaced a general optimization opportunity:
librbd, by default, always reads from the primary OSD regardless of
its location. In a multi-AZ deployment, that can mean every read pays
a cross-AZ round-trip even when a same-AZ replica is available.

rbd_read_from_replica_policy = localize addresses this by directing
librbd to prefer the nearest replica, but it requires the client to
declare its own position in the CRUSH hierarchy. This patch ships a
hook script that supplies that position by querying the live CRUSH map
(ceph osd crush find), and wires it up as an opt-in in pveceph init.

The benefit scales with topology: in a multi-AZ cluster it keeps reads
within the same AZ; in a hyperconverged setup, reads to a co-located
OSD never leave the host at all. The feature is opt-in because it can
degrade performance when replicas are equidistant or when the hook
falls back to an incorrect CRUSH root — see the commit message for
details.

[1] https://forum.proxmox.com/threads/ceph-vm-with-high-io-wait.181751/
  

Kefu Chai (1):
  ceph: add opt-in locality-aware replica reads (crush_location_hook)

 PVE/API2/Ceph.pm                       | 17 ++++++++++
 bin/Makefile                           |  3 +-
 bin/ceph-crush-location                | 43 ++++++++++++++++++++++++++
 www/manager6/ceph/CephInstallWizard.js |  8 ++++-
 4 files changed, 69 insertions(+), 2 deletions(-)
 create mode 100644 bin/ceph-crush-location

-- 
2.47.3





^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-03-26  3:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-25  3:51 [PATCH manager 0/1] ceph: add opt-in locality-aware replica reads (crush_location_hook) Kefu Chai
2026-03-25  3:51 ` [PATCH manager 1/1] " Kefu Chai
2026-03-26  3:44 ` [PATCH manager 0/1] " Kefu Chai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal