Date: Tue, 09 Apr 2024 12:28:46 +0200
To: "Proxmox VE development discussion"
From: "Max Carrara"
References: <20240402145523.683008-1-m.carrara@proxmox.com>
Subject: Re: [pve-devel] [PATCH v5 pve-storage, pve-manager 00/11] Fix #4759: Configure Permissions for ceph-crash.service
List-Id: Proxmox VE development discussion

On Tue Apr 9, 2024 at 11:48 AM CEST, Maximiliano Sandoval wrote:
>
> Max Carrara writes:
>
> > Fix #4759: Configure Permissions for ceph-crash.service - Version 5
> > ===================================================================
>
> I tested this patch series on a testing cluster updated to
> no-subscription with ceph-base 18.2.2-pve1. For the purposes of testing
> I removed the version check against 0.0.0.
>
> The following things were working as expected:
>
> - There are no more ceph-crash errors in the journal
> - /etc/pve/ceph.conf contains:
> ```
> [client.crash]
> keyring = /etc/pve/ceph/$cluster.$name.keyring
> ```
> - The new keyring is the right place at
> ```
> # ls /etc/pve/ceph
> ceph.client.crash.keyring
> ```
> - After a few minutes the crash reports at /var/lib/ceph/crash/ were
>   moved to /var/lib/ceph/crash/posted.

Thanks a lot for testing this, much appreciated!
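For anyone wondering how the `keyring` entry quoted above ends up pointing
at the file shown by `ls`: Ceph substitutes `$cluster` with the cluster name
and `$name` with the client's `type.id`. Below is a minimal Python sketch of
that substitution. Note that `expand_metavariables` is a hypothetical helper
written for illustration (it is not part of Ceph or of this series), and the
cluster name `ceph` is assumed to be the default one.

```python
from string import Template


def expand_metavariables(value: str, cluster: str, name: str) -> str:
    """Hypothetical helper: substitute Ceph-style $cluster and $name."""
    # string.Template uses the same $identifier syntax as the ceph.conf
    # entry; the "." after each variable ends the identifier.
    return Template(value).substitute(cluster=cluster, name=name)


keyring_option = "/etc/pve/ceph/$cluster.$name.keyring"

# Assumption: default cluster name "ceph"; the client is "client.crash".
path = expand_metavariables(keyring_option, cluster="ceph", name="client.crash")
print(path)
```

With those assumed values the result matches the file name in the `ls`
output above.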
>
> One thing that was broken is running the ceph-crash binary directly:
>
> ```
> # ceph-crash
> INFO:ceph-crash:pinging cluster to exercise our key
> 2024-04-09T11:42:31.591+0200 7009fca926c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
> 2024-04-09T11:42:31.595+0200 7009fca926c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
> 2024-04-09T11:42:31.595+0200 7009fca926c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
> 2024-04-09T11:42:31.595+0200 7009fca926c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.admin.keyring: (13) Permission denied
> 2024-04-09T11:42:31.595+0200 7009fca926c0 -1 monclient: keyring not found
> [errno 13] RADOS permission denied (error connecting to the cluster)

That's not actually "broken" (even though it looks like it, tbh) - that's
just how Ceph rolls in this case ...

On startup `ceph-crash` will first check if the cluster is even
reachable [0]. I'm not sure why it resorts to looking up the admin
keyring first.

> INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s

Here it does actually then monitor the crash dir as expected, so it
works just fine. The usual errors that appear every 10 minutes are
otherwise silenced by a patch on our side [1] (which were the most
annoying kinds of errors anyway).

> ```

[0]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/ceph-crash.in;h=0e02837fadd4dde8abd66985b485836402e10a37;hb=HEAD#l131
[1]: https://git.proxmox.com/?p=ceph.git;a=blob;f=patches/0017-ceph-crash-change-order-of-client-names.patch;h=8131fced55f3e4c757bd22c16539070f83480a19;hb=HEAD

>
> --
> Maximiliano
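
P.S.: To illustrate why the permission errors at startup are harmless, here
is a rough Python model of the flow as described in this mail: one
reachability check at startup whose failure is only reported, followed by
periodic passes over the crash directory. This is a sketch, not the actual
`ceph-crash` source; `run_crash_monitor`, its callbacks and the bounded
`cycles` parameter are assumptions made so the example terminates (the real
tool keeps running).

```python
import time
from typing import Callable, List


def run_crash_monitor(
    ping: Callable[[], bool],
    scrape: Callable[[], None],
    cycles: int,
    delay_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Hypothetical model of the flow: ping once, then poll regardless."""
    events = []
    # Startup check: a failed ping is only recorded, it does not abort.
    events.append("ping ok" if ping() else "ping failed")
    events.append(f"monitoring, delay {delay_s:.0f}s")
    for _ in range(cycles):
        scrape()  # stand-in for handling new reports in the crash dir
        events.append("scraped")
        sleep(delay_s)  # wait before the next pass
    return events


# Usage with stand-ins: the ping is denied, yet scraping still happens.
events = run_crash_monitor(
    ping=lambda: False,
    scrape=lambda: None,
    cycles=2,
    sleep=lambda _s: None,  # skip real waiting in this demo
)
print(events)
```

The point of the sketch is only the ordering: the failed check is logged
and the monitoring loop starts anyway, which matches the output quoted
above.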