From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <a.lauterer@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id C1A7388A0
 for <pve-devel@lists.proxmox.com>; Tue, 22 Aug 2023 11:42:39 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 983A0A51E
 for <pve-devel@lists.proxmox.com>; Tue, 22 Aug 2023 11:42:39 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Tue, 22 Aug 2023 11:42:38 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id E3BBF432AD
 for <pve-devel@lists.proxmox.com>; Tue, 22 Aug 2023 11:42:37 +0200 (CEST)
Message-ID: <c5a08c77-992e-48b3-81e8-8e70a0c0136c@proxmox.com>
Date: Tue, 22 Aug 2023 11:42:37 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Fiona Ebner <f.ebner@proxmox.com>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <20230614111022.1432946-1-a.lauterer@proxmox.com>
 <edd3bab8-8474-2085-45b0-b5a78d51b6e5@proxmox.com>
Content-Language: en-US
From: Aaron Lauterer <a.lauterer@proxmox.com>
In-Reply-To: <edd3bab8-8474-2085-45b0-b5a78d51b6e5@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.084 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pve-devel] [PATCH v2 storage 1/2] rbd: improve handling of
 missing images
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 22 Aug 2023 09:42:39 -0000

Thanks. I'll look into your comments regarding the actual error handling. But 
there is one important question that we need to decide on. See below.

On 8/21/23 17:05, Fiona Ebner wrote:
> On 14.06.23 at 13:10, Aaron Lauterer wrote:
>> It can happen, that an RBD image isn't cleaned up 100%. Calling 'rbd ls
>> -l' will then show errors that it is not possible to open the image in
>> question:
>> ```
>> rbd: error opening vm-103-disk-1: (2) No such file or directory
>> rbd: listing images failed: (2) No such file or directory
>> ```
>>
>> Originally we only showed the last error line which is too generic and
>> doesn't give a good hint what is actually wrong.
>>
>> We can improve that by catching these specific errors and add the
>> problematic disk images to the returned list with a size of '-1'.
>>
> 
> What do you think about logging a warning instead, hinting that it might
> be a partially removed image? The thing I'm a bit worried about is that
> existing scripts/tools interacting with our API might get confused by
> the -1. And if I use the UI, I don't see it with either approach,
> because your next patch hides it. If I use the CLI, I'll see either the
> warning or the -1 depending on the approach.
> 

Showing warnings on the CLI is a good idea either way, but the question is: do 
we want to list a broken image? If we do, users are more likely to notice that 
something is amiss. If we only log a warning, the chances are rather low, since 
only users who use the CLI will see it.
The downside of listing it, as you mentioned, is that some external 
tools/scripts might get confused by the '-1' size.

I am not sure how easy it is to pass a dedicated parameter through to the 
storage plugin. If it is easy, we could indicate that we want the broken disk 
images listed when we call it from the GUI, for example. That might introduce 
much more complexity, though, and discrepancies depending on which tool is 
used. Therefore it is probably not a good idea.

How we render a broken image in the GUI is something that can then be done 
almost any way we see fit. It could even be something like "-1 (broken?)" in 
the Size column.
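
To make the two options easier to compare, here is a rough sketch of the idea. 
It is in Python only for illustration (the real plugin is Perl), and all names 
in it (find_broken_images, list_images, include_broken, render_size) are made 
up and not part of the existing code. It assumes the error lines look exactly 
like the ones quoted above.

```python
import re

# Matches per-image errors as quoted above, e.g.
# "rbd: error opening vm-103-disk-1: (2) No such file or directory"
# The generic "rbd: listing images failed: ..." line does not match.
MISSING_IMAGE_RE = re.compile(
    r"^rbd: error opening (?P<image>\S+): \(2\) No such file or directory$"
)


def find_broken_images(stderr_text):
    """Return the names of images that 'rbd ls -l' could not open."""
    broken = []
    for line in stderr_text.splitlines():
        match = MISSING_IMAGE_RE.match(line.strip())
        if match:
            broken.append(match.group("image"))
    return broken


def list_images(images, stderr_text, include_broken=False):
    """Combine the parsed image list with the broken images.

    'images' is a list of dicts with 'name' and 'size' keys (hypothetical
    shape). A warning is always produced for each broken image; it is only
    added to the list, with size -1, if 'include_broken' is set.
    """
    result = list(images)
    warnings = []
    for name in find_broken_images(stderr_text):
        warnings.append(
            f"image '{name}' could not be opened, "
            "it might be only partially removed"
        )
        if include_broken:
            result.append({"name": name, "size": -1})
    return result, warnings


def render_size(size):
    """Hypothetical renderer for the Size column."""
    return "-1 (broken?)" if size < 0 else str(size)
```

With include_broken left off, callers get the unchanged list plus warnings, 
which is the variant that should not confuse existing scripts. With it set, 
the broken image shows up with a size of -1 and the GUI can render it however 
we like.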