Date: Wed, 4 Jan 2023 11:50:38 +0100
To: pve-devel@lists.proxmox.com, c.heiss@proxmox.com
References: <20230102123633.2493599-1-c.heiss@proxmox.com> <20230102123633.2493599-3-c.heiss@proxmox.com>
From: Fiona Ebner
In-Reply-To: <20230102123633.2493599-3-c.heiss@proxmox.com>
Subject: Re: [pve-devel] [PATCH storage] fix #4289: pbs:
 wait for backup verification to finish before updating volume attribute
List-Id: Proxmox VE development discussion

On 02.01.23 at 13:36, Christoph Heiss wrote:
> diff --git a/PVE/Storage/PBSPlugin.pm b/PVE/Storage/PBSPlugin.pm
> index 4320974..1cdbc11 100644
> --- a/PVE/Storage/PBSPlugin.pm
> +++ b/PVE/Storage/PBSPlugin.pm
> @@ -906,8 +906,30 @@ sub get_volume_attribute {
>      return;
>  }
>
> +sub wait_for_verify_finish {
> +    my ($conn, $node, $datastore, $attrs) = @_;
> +
> +    my $param = {
> +        running => 'true',
> +        since => $attrs->{'backup-time'},
> +        store => $datastore,
> +        typefilter => 'verify',
> +    };
> +
> +    my $taskname = sprintf('%s:%s/%s/%X',
> +        $datastore,
> +        @{$attrs}{qw(backup-type backup-id backup-time)},
> +    );

I don't think it's likely that the task name format here will change
often, but as you already mentioned in the cover letter, it's not ideal
to have it hard-coded here.

> +
> +    while (1) {
> +        my $res = eval { $conn->get("/api2/json/nodes/$node/tasks", $param); };
> +        last if !grep { $_->{worker_id} eq $taskname } @$res;
> +        sleep(1);
> +    }
> +}
> +
> @@ -921,6 +943,9 @@ sub update_volume_attribute {
>      my $conn = pbs_api_connect($scfg, $password);
>      my $datastore = $scfg->{datastore};
>
> +    $logfunc->('info', 'waiting for server to finish backup verification...') if $logfunc;

This should only be printed if there is actually a verification we need
to wait for.

> +    wait_for_verify_finish($conn, $scfg->{server}, $datastore, $param);

To me, it feels out of place to be concerned with waiting on
verification in (the rather low-level) update_volume_attribute(), which
is a rather specific thing to do. I'd say it's fine to fail there when
the snapshot is locked by verification or some other operation.
Waiting for verification can also increase the backup duration, and
with it the time the vzdump lock is held on the PVE side, quite a bit.
It might not seem like that big of a deal, because usually only manual
backups use 'protected'. But by doing it in update_volume_attribute(),
you also do it for 'notes', where it's not needed and which is relevant
to backup jobs, where the increased wait might be very noticeable. So
at the very least, it should only be done for 'protected' if doing it
in update_volume_attribute().

It would be better if the protected flag could be specified upon
creation already. That would also fix the following race, I guess:
1. backup finishes
2. prune running on PBS
3. protected status set from PVE

If going for the waiting approach after all, I think it should rather
be done in vzdump, before calling update_volume_attribute(). And the
helper to wait on verification should likely be part of PBSClient.pm
(would need to teach it to use an API connection first).
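
For illustration only, here is a rough sketch of what such a helper could
look like if it logged only when there is really something to wait for
and gave up after a while. It is not a tested implementation. The option
names (logfunc, timeout, interval), their defaults and the MockConn
stand-in are made up for this example. It also assumes, like the patch
does, that get() returns an array reference of task entries.

```perl
use strict;
use warnings;

# Stand-in for the API connection, only here to make the sketch runnable.
# Each get() call returns the next prepared task list, then empty lists.
package MockConn {
    sub new {
        my ($class, @responses) = @_;
        return bless { responses => [@responses], calls => 0 }, $class;
    }
    sub get {
        my ($self) = @_;
        $self->{calls}++;
        return shift(@{$self->{responses}}) // [];
    }
}

# Wait until no running verify task for the given snapshot is left.
# Returns 1 if a matching task was seen (we had to wait), 0 otherwise.
sub wait_for_verify_finish {
    my ($conn, $node, $datastore, $attrs, %opts) = @_;

    my $logfunc = $opts{logfunc};
    my $timeout = $opts{timeout} // 300; # hypothetical default, in seconds
    my $interval = $opts{interval} // 1; # hypothetical default, in seconds

    my $param = {
        running => 'true',
        since => $attrs->{'backup-time'},
        store => $datastore,
        typefilter => 'verify',
    };

    # Worker ID format copied from the patch, still hard-coded.
    my $taskname = sprintf('%s:%s/%s/%X',
        $datastore,
        @{$attrs}{qw(backup-type backup-id backup-time)},
    );

    my $waited = 0;
    my $deadline = time() + $timeout;

    while (1) {
        my $res = eval { $conn->get("/api2/json/nodes/$node/tasks", $param) };
        die "unable to query verification tasks - $@" if $@;

        last if !grep { $_->{worker_id} eq $taskname } @{ $res // [] };

        # Log once, and only if a verification is actually running.
        if (!$waited) {
            $logfunc->('info', 'waiting for server to finish backup verification...')
                if $logfunc;
            $waited = 1;
        }

        die "timeout while waiting for backup verification\n"
            if time() >= $deadline;
        sleep($interval);
    }

    return $waited;
}

# Example run against the stand-in: one poll sees the task, the next does not.
my $attrs = { 'backup-type' => 'vm', 'backup-id' => '100', 'backup-time' => 1672829438 };
my $worker_id = sprintf('store1:vm/100/%X', $attrs->{'backup-time'});
my $conn = MockConn->new([{ worker_id => $worker_id }], []);

my $waited = wait_for_verify_finish(
    $conn, 'localhost', 'store1', $attrs,
    logfunc => sub { my ($level, $msg) = @_; print "$level: $msg\n" },
    interval => 0,
);
print "waited: $waited\n";
```

A caller would then invoke this only when setting 'protected', ideally
from vzdump before calling update_volume_attribute(), so that updating
'notes' never pays for the wait.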