From: Aaron Lauterer
To: Proxmox VE development discussion, Thomas Lamprecht, Fabian Grünbichler
Date: Mon, 11 Apr 2022 11:08:38 +0200
Subject: Re: [pve-devel] [PATCH v2 storage] rbd: alloc image: fix #3970 avoid ambiguous rbd path
In-Reply-To: <2a67ca76-f10f-5c2f-44a7-9d9da0c36c78@proxmox.com>
References: <20220406114657.452190-1-a.lauterer@proxmox.com> <1649404843.ds1yioa8qv.astroid@nora.none> <2a67ca76-f10f-5c2f-44a7-9d9da0c36c78@proxmox.com>

On 4/11/22 09:39, Thomas Lamprecht wrote:
> On 08.04.22 10:04, Fabian Grünbichler wrote:
>> On April 6, 2022 1:46 pm, Aaron Lauterer wrote:
>>> If two RBD storages use the same pool, but connect to different
>>> clusters, we cannot say to which cluster the mapped RBD image belongs to
>>> if krbd is used. To avoid potential data loss, we need to verify that no
>>> other storage is configured that could have a volume mapped under the
>>> same path before we create the image.
>>>
>>> The ambiguous mapping is in
>>> /dev/rbd/// where the namespace is optional.
>>>
>>> Once we can tell the clusters apart in the mapping, we can remove these
>>> checks again.
>>>
>>> See bug #3969 for more information on the root cause.
>>>
>>> Signed-off-by: Aaron Lauterer
>>
>> Acked-by: Fabian Grünbichler
>> Reviewed-by: Fabian Grünbichler
>>
>> (small nit below, and given the rather heavy-handed approach a 2nd ack
>> might not hurt.. IMHO, a few easily fixable false-positives beat more
>> users actually running into this with move disk/volume and losing
>> data..)
>
> The obvious question to me is: why bother with this workaround when we can
> make udev create the symlink now already?
>
> Patching the rules file and/or binary shipped by ceph-common, or shipping our
> own such script + rule, would seem relatively simple.

The thinking was to implement a stop gap to have more time to consider a
solution that we can upstream.
Fabian might have some more thoughts on it, but yeah, right now we could
patch the udev rules and ceph-rbdnamer, which the rule calls, so that they
create the current paths and, in addition, cluster-specific ones.
Unfortunately, it seems like the unwieldy cluster fsid is the only
identifier we have for the cluster.

Some more (smaller) changes might be necessary if the implementation we
manage to upstream turns out a bit different. But that should not be much
of an issue AFAICT.
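
To make the two ideas a bit more concrete, here is a rough sketch in Python. It is not the actual pve-storage (Perl) code and not what ceph-rbdnamer does; it only illustrates the ambiguity check from the commit message and what a cluster-specific path keyed on the fsid could look like. The `/dev/rbd-by-fsid` prefix, the function names and the `fsid`/`pool`/`namespace` dictionary keys are all assumptions made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical prefix for cluster-specific links; not taken from the thread.
CLUSTER_PREFIX = "/dev/rbd-by-fsid"


def current_path(pool: str, image: str, namespace: Optional[str] = None) -> str:
    """Path without any cluster identifier; the namespace is optional."""
    parts = [pool] + ([namespace] if namespace else []) + [image]
    return "/dev/rbd/" + "/".join(parts)


def cluster_path(fsid: str, pool: str, image: str,
                 namespace: Optional[str] = None) -> str:
    """Hypothetical cluster-specific path that puts the cluster fsid first."""
    parts = [fsid, pool] + ([namespace] if namespace else []) + [image]
    return CLUSTER_PREFIX + "/" + "/".join(parts)


def ambiguous_storages(
    storages: Dict[str, dict],
) -> Dict[Tuple[str, str], List[str]]:
    """Find storages that share pool/namespace but point at different clusters.

    `storages` maps a storage ID to a dict with the (assumed) keys
    "fsid", "pool" and, optionally, "namespace".
    """
    grouped: Dict[Tuple[str, str], Dict[str, str]] = defaultdict(dict)
    for storage_id, cfg in storages.items():
        key = (cfg["pool"], cfg.get("namespace") or "")
        grouped[key][storage_id] = cfg["fsid"]
    # Only groups that span more than one cluster fsid are ambiguous.
    return {
        key: sorted(members)
        for key, members in grouped.items()
        if len(set(members.values())) > 1
    }


# Made-up example configuration: two storages on the same pool name,
# but connected to different clusters.
example_storages = {
    "ceph-a": {"fsid": "11111111-aaaa", "pool": "rbd"},
    "ceph-b": {"fsid": "22222222-bbbb", "pool": "rbd"},
    "ceph-c": {"fsid": "11111111-aaaa", "pool": "vms"},
}

if __name__ == "__main__":
    print(ambiguous_storages(example_storages))
    print(current_path("rbd", "vm-100-disk-0"))
    print(cluster_path("11111111-aaaa", "rbd", "vm-100-disk-0"))
```

With the fsid as part of the path, the two storages on pool "rbd" no longer end up under the same link, which is the point at which the check could be dropped again.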