From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (2048 bits))
    (No client certificate requested)
    by lists.proxmox.com (Postfix) with ESMTPS id 9C0CB97E6B
    for ; Wed, 6 Mar 2024 15:04:42 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
    by firstgate.proxmox.com (Proxmox) with ESMTP id 7B4FB189B3
    for ; Wed, 6 Mar 2024 15:04:12 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
    (No client certificate requested)
    by firstgate.proxmox.com (Proxmox) with ESMTPS
    for ; Wed, 6 Mar 2024 15:04:11 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
    by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 3DCA5487BC
    for ; Wed, 6 Mar 2024 15:04:11 +0100 (CET)
Message-ID: <5a2c2cae-1974-4f45-8a58-30ff7792a8f7@proxmox.com>
Date: Wed, 6 Mar 2024 15:04:10 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: Friedrich Weber, Proxmox VE development discussion, Hannes Duerr
References: <20240306104703.115366-1-h.duerr@proxmox.com>
    <1f999e2b-7ada-4978-9f40-27481a81bd3b@proxmox.com>
From: Fiona Ebner
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
    AWL                    -0.071  Adjusted score from AWL reputation of From: address
    BAYES_00                 -1.9  Bayes spam probability is 0 to 1%
    DMARC_MISSING             0.1  Missing DMARC policy
    KAM_DMARC_STATUS         0.01  Test Rule for DKIM or SPF Failure with Strict Alignment
    SPF_HELO_NONE           0.001  SPF: HELO does not publish an SPF Record
    SPF_PASS               -0.001  SPF: sender matches SPF record
    T_SCC_BODY_TEXT_LINE    -0.01  -
Subject: Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if
    deactivation fails demote error to warning
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Mar 2024 14:04:42 -0000

On 06.03.24 at 14:14, Friedrich Weber wrote:
> On 06/03/2024 13:40, Fiona Ebner wrote:
>> On 06.03.24 at 11:47, Hannes Duerr wrote:
>>> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>>>
>>>          if ($target) {
>>>              # always deactivate volumes - avoid lvm LVs to be active on several nodes
>>> -            PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>>> +            eval {
>>> +                PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>>> +            };
>>> +            my $err = $@;
>>> +            if ($err) {
>>> +                log_warn("$err\n");
>>> +            }
>>>              PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>>
>> We might also want to catch errors here. Otherwise, the whole clone
>> operation (which might've taken hours) can still fail just because of a
>> deactivation error. But when failing here, we shouldn't move the config
>> file (or the LV can get active on multiple nodes more easily).
>
> I think succeeding but not moving the config file when deactivating
> $newvollist fails sounds like it could lead to unexpected behavior.
> Right now, when running `qm clone 101 [...] --target node2` on node1
> succeeds, one can be sure there will be a VM 101 on node2. But if we
> cannot deactivate $newvollist and thus don't move the config file, the
> command succeeds but VM 101 instead exists on node1 (correct me if I'm
> wrong), which may be confusing e.g. if the clone is automated.
>

Yes, but the question is what is worse: Needing to re-do the clone or
having the VM config on the wrong node?

> To avoid that, I'd lean towards keeping the behavior of failing the task
> if deactivating $newvollist fails. After all, at least in case of LVM
> not being able to deactivate because the device is in use, we just
> created $newvollist so hopefully nobody else should be accessing it.

Fine by me. Yes, it's unlikely to fail. And we can still adapt later if
users ever complain about it.
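
For illustration only, the behavior agreed on above could be sketched roughly like this. It is a self-contained sketch, not the qemu-server code: `deactivate_volumes` and `log_warn` are local stand-ins for the real PVE helpers, while `finish_clone_to_target`, the `$busy` hash and the volume names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Stand-ins for illustration only. The real patch calls
# PVE::Storage::deactivate_volumes() and log_warn(); these local
# versions just simulate their behavior.
my @warnings;

sub log_warn {
    my ($msg) = @_;
    chomp $msg;
    push @warnings, $msg;
}

# Hypothetical helper: dies if any volume in the list is marked busy.
sub deactivate_volumes {
    my ($vollist, $busy) = @_;
    for my $volid (@$vollist) {
        die "can't deactivate '$volid': volume in use\n" if $busy->{$volid};
    }
    return;
}

# Hypothetical wrapper for the tail end of a clone to another node.
# Returns 1 if the config file may be moved to the target node.
sub finish_clone_to_target {
    my ($vollist, $newvollist, $busy, $running) = @_;

    # Source volumes: a deactivation failure is demoted to a warning.
    eval { deactivate_volumes($vollist, $busy) if !$running; };
    if (my $err = $@) {
        log_warn($err);
    }

    # New volumes: a failure still aborts the task, so the config is
    # never moved while the new volumes are active on this node.
    deactivate_volumes($newvollist, $busy);

    return 1;
}
```

With these stand-ins, a busy source volume only adds an entry to `@warnings` and the function still returns 1, whereas a busy new volume makes it die before the config would be moved.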