From: Fabian Grünbichler
To: Proxmox VE development discussion
Date: Wed, 10 Sep 2025 09:00:03 +0200
Subject: Re: [pve-devel] [PATCH ceph master v1] pybind/rbd: disable on_progress callbacks to prevent MGR segfaults
Message-Id: <1757487297.c08wh8yz7v.astroid@yuna.none>
In-Reply-To: <20250909170515.606422-1-m.carrara@proxmox.com>
References: <20250909170515.606422-1-m.carrara@proxmox.com>

On September 9, 2025 7:05 pm, Max R. Carrara wrote:
> Currently, *all* MGRs collectively segfault on Ceph v19.2.3 running on
> Debian Trixie if a client requests the removal of an RBD image from
> the RBD trash (#6635 [0]).
>
> After a lot of investigation, the cause of this still isn't clear to
> me; the most likely culprit are some internal changes to Python
> sub-interpreters that happened between Python versions 3.12 and 3.13.
>
> What leads me to this conclusion is the following:
> 1. A user on our forum noted [0] that the issue disappeared as soon as
>    they set up a Ceph MGR inside a Debian Bookworm VM. Bookworm has
>    Python version 3.11, before any substantial changes to
>    sub-interpreters [1][2] were made.

did you try with stock Debian Trixie packages (the Ceph version is still
18.2 there, which might help narrow it down)? in any case, it would
be good to bring this issue to upstream's attention as well!

> 2. There is an upstream issue [3] regarding another segfault during
>    MGR startup. The author concluded that this problem is related to
>    sub-interpreters and opened another issue [4] on Python's issue
>    tracker that goes into more detail.
>
>    Even though this is for a completely different code path, it shows
>    that issues related to sub-interpreters are popping up elsewhere
>    at the very least.

did you try reproducing that one? it seems it requires an optional
ceph-mgr plugin that we have packaged as well, so it should be fairly
straightforward.