From: "Max R. Carrara" <m.carrara@proxmox.com>
To: "Maximiliano Sandoval" <m.sandoval@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH pve-manager v2 0/6] Fix #6816: Prevent ceph-exporter Daemon from Crashing on Startup - v2
Date: Tue, 23 Dec 2025 16:51:15 +0100
Message-ID: <DF5PSCOC546P.1LW5IZGGYU7IL@proxmox.com>
In-Reply-To: <s8oldit4bdh.fsf@proxmox.com>

On Tue Dec 23, 2025 at 1:43 PM CET, Maximiliano Sandoval wrote:
> "Max R. Carrara" <m.carrara@proxmox.com> writes:
>
> > Fix #6816: Prevent ceph-exporter Daemon from Crashing on Startup - v2
> > =====================================================================
> >
> > tl;dr: Stop ceph-exporter.service from ending up in a crash loop by
> > handing it a custom keyring file and setting its group to `www-data`,
> > similar to what we did for ceph-crash.service [0] before.
> >
> > This is a refresh of a somewhat older series that has been rebased, with
> > the version guard in `debian/postinst` adapted. The description from the
> > previous version is provided here again for the reader's convenience.
> >
> > Currently, the `ceph-exporter` daemon ends up in a short startup crash
> > loop before ultimately failing to start at all, because it tries to
> > access the keyring file at `/etc/pve/priv/ceph.client.admin.keyring`,
> > which it doesn't have permission to read.
> >
> > Instead of giving it access to the admin keyring, give it its own keyring
> > located at `/etc/pve/ceph/ceph.client.exporter.keyring`. This file and
> > its corresponding section in `/etc/pve/ceph.conf` are created when the
> > first MON is created via the API. If the cluster has already been set
> > up, a postinst hook creates the keyring file and adapts
> > `/etc/pve/ceph.conf` instead.
> >
> > The core logic of all of this was already added for `ceph-crash` a while
> > ago [0] and is reused throughout the series, with some alterations to
> > the original code in order to make it a little more generic.
>
> I tested this series and it works as advertised modulo a race condition:
>
> When the ceph-exporter unit is started before installing this series, it
> will fail and systemd will retry a handful of times. During this time,
> `systemctl is-failed ceph-exporter.service` returns 'activating' instead
> of 'failed', which might explain why reset-failed is never called. As a
> result, ceph-exporter is restarted as part of the postinst script but
> fails, because reset-failed was never called and there have already been
> too many start attempts.
>
> Otherwise, it works as expected. Thanks!
>
> Tested-by: Maximiliano Sandoval <m.sandoval@proxmox.com>

Thanks a ton for testing this! That's a really good catch.

As discussed off-list, `ceph-exporter` won't be reset and restarted
anymore in `debian/postinst`. See v3 [0] for an update.

[0]: https://lore.proxmox.com/pve-devel/20251223153419.507507-1-m.carrara@proxmox.com/
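
For readers following along, here is a minimal sketch of the window
described in the quoted report. It is not code from v2 or v3: the
function names `naive_guard` and `wider_guard` are hypothetical, and the
actual guard in the v2 postinst is not shown in this mail. It only
assumes what the report states, namely that
`systemctl is-failed ceph-exporter.service` prints 'activating' while
systemd is still retrying, and 'failed' once it has given up.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only. Each takes the state
# string printed by `systemctl is-failed UNIT` as $1 and returns 0 if
# a `systemctl reset-failed UNIT` would be issued for that state.

# Assumed shape of a guard that only reacts to the 'failed' state.
naive_guard() {
    [ "$1" = "failed" ]
}

# Variant that also covers the retry window reported above.
wider_guard() {
    case "$1" in
        failed|activating) return 0 ;;
        *) return 1 ;;
    esac
}

# Example wiring, commented out so the sketch has no side effects:
# state="$(systemctl is-failed ceph-exporter.service || true)"
# if wider_guard "$state"; then
#     systemctl reset-failed ceph-exporter.service
# fi
```

With 'activating' as input, `naive_guard` declines to reset while
`wider_guard` does not, which matches the behaviour seen in testing.
Note that v3 takes a different route and drops the reset and restart
from postinst entirely.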