public inbox for pve-devel@lists.proxmox.com
* [PATCH manager 0/1] ceph: add opt-in locality-aware replica reads (crush_location_hook)
@ 2026-03-25  3:51 Kefu Chai
  2026-03-25  3:51 ` [PATCH manager 1/1] " Kefu Chai
  2026-03-26  3:44 ` [PATCH manager 0/1] " Kefu Chai
  0 siblings, 2 replies; 3+ messages in thread
From: Kefu Chai @ 2026-03-25  3:51 UTC (permalink / raw)
  To: pve-devel

This patch was prompted by a forum thread [1] in which a user reported
persistent high IO wait on PostgreSQL VMs running on a three-AZ Ceph
cluster. The discussion surfaced a general optimization opportunity:
librbd, by default, always reads from the primary OSD regardless of
its location. In a multi-AZ deployment, that can mean every read pays
a cross-AZ round-trip even when a same-AZ replica is available.

rbd_read_from_replica_policy = localize addresses this by directing
librbd to prefer the nearest replica, but it requires the client to
declare its own position in the CRUSH hierarchy. This patch ships a
hook script that supplies that position by querying the live CRUSH map
(ceph osd crush find), and wires it up as an opt-in in pveceph init.
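
As a rough illustration of the hook contract described above, the sketch
below shows only the output stage: Ceph invokes a crush_location_hook
with --cluster, --id and --type and expects one line of key=value pairs
on stdout. It is not the script shipped in this patch. The helper names,
the bucket ordering, and the hard-coded example location are assumptions
made for illustration; a real hook would obtain the location from the
live CRUSH map.

```python
import argparse

# Assumed subset of CRUSH bucket types, ordered from the root down.
# The ordering is cosmetic here; it only makes the output deterministic.
BUCKET_ORDER = ["root", "datacenter", "rack", "host"]


def format_crush_location(location: dict) -> str:
    """Render {"root": "default", "host": "pve1"} as "root=default host=pve1"."""
    known = [(t, location[t]) for t in BUCKET_ORDER if t in location]
    # Keep bucket types outside the assumed list too, sorted by type name.
    extra = sorted((t, n) for t, n in location.items() if t not in BUCKET_ORDER)
    return " ".join(f"{t}={n}" for t, n in known + extra)


def main(argv=None) -> int:
    parser = argparse.ArgumentParser()
    # Arguments Ceph passes to a CRUSH location hook.
    parser.add_argument("--cluster", default="ceph")
    parser.add_argument("--id")
    parser.add_argument("--type")
    parser.parse_known_args(argv)
    # Hypothetical placeholder: a real hook would look this up
    # from the live CRUSH map instead of hard-coding it.
    location = {"root": "default", "datacenter": "az1", "host": "pve1"}
    print(format_crush_location(location))
    return 0
```

Calling main(["--cluster", "ceph", "--id", "admin", "--type", "client"])
prints a single line that a client could use as its CRUSH location.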

The benefit scales with topology: in a multi-AZ cluster it keeps reads
within the same AZ; in a hyperconverged setup, reads served by a
co-located OSD never leave the host at all. The feature is opt-in
because it can degrade performance when replicas are equidistant or
when the hook falls back to an incorrect CRUSH root; see the commit
message for details.
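
To make the topology argument concrete, here is a toy model of
locality-based replica choice. It is not librbd's selection code. The
three-level hierarchy, the function names, and the tie-breaking rule
(keep the first entry, taken here to be the primary) are all assumptions
for illustration. It only shows that a replica is preferred when it
shares a deeper part of the CRUSH hierarchy with the client, and that
no replica stands out when all are equidistant or when the client
reports a root that matches none of them.

```python
# Toy model only; the hierarchy below is an assumed example, top-down.
LEVELS = ["root", "datacenter", "host"]


def shared_depth(client: dict, osd: dict) -> int:
    """Count how many levels, from the root down, client and OSD share."""
    depth = 0
    for level in LEVELS:
        if level not in client or client[level] != osd.get(level):
            break
        depth += 1
    return depth


def pick_replica(client: dict, replicas: list) -> int:
    """Return the index of the closest replica.

    Ties keep the earliest entry; index 0 stands for the primary here.
    """
    depths = [shared_depth(client, r) for r in replicas]
    return depths.index(max(depths))


replicas = [
    {"root": "default", "datacenter": "az1", "host": "pve1"},  # primary
    {"root": "default", "datacenter": "az2", "host": "pve5"},
    {"root": "default", "datacenter": "az3", "host": "pve7"},
]

# Client in az2: the az2 replica shares two levels, the others only one.
same_az = pick_replica({"root": "default", "datacenter": "az2", "host": "pve4"}, replicas)

# Client reporting a root that matches nothing: every replica ties at depth 0.
wrong_root = pick_replica({"root": "other", "host": "pve4"}, replicas)
```

In this model same_az selects the az2 replica, while wrong_root falls
back to the first entry because no replica is strictly closer.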

[1] https://forum.proxmox.com/threads/ceph-vm-with-high-io-wait.181751/
  

Kefu Chai (1):
  ceph: add opt-in locality-aware replica reads (crush_location_hook)

 PVE/API2/Ceph.pm                       | 17 ++++++++++
 bin/Makefile                           |  3 +-
 bin/ceph-crush-location                | 43 ++++++++++++++++++++++++++
 www/manager6/ceph/CephInstallWizard.js |  8 ++++-
 4 files changed, 69 insertions(+), 2 deletions(-)
 create mode 100644 bin/ceph-crush-location

-- 
2.47.3




