From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id E383B941C7 for ; Tue, 10 Jan 2023 12:11:45 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id BFF559CEA for ; Tue, 10 Jan 2023 12:11:45 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Tue, 10 Jan 2023 12:11:45 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id BCD7744695 for ; Tue, 10 Jan 2023 12:11:44 +0100 (CET) Date: Tue, 10 Jan 2023 12:11:41 +0100 From: Christoph Heiss To: Fiona Ebner Cc: pve-devel@lists.proxmox.com Message-ID: <20230110111141.2hxrozsr7fatvswj@maui.proxmox.com> References: <20230102123633.2493599-1-c.heiss@proxmox.com> <20230102123633.2493599-3-c.heiss@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-SPAM-LEVEL: Spam detection results: 0 AWL -0.004 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [pbsclient.pm, pbsplugin.pm] Subject: Re: [pve-devel] [PATCH storage] fix #4289: pbs: wait for backup verification to finish before updating volume attribute X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 10 Jan 2023 11:11:45 -0000 Thanks for the review! On Wed, Jan 04, 2023 at 11:50:38AM +0100, Fiona Ebner wrote: > Am 02.01.23 um 13:36 schrieb Christoph Heiss: > > diff --git a/PVE/Storage/PBSPlugin.pm b/PVE/Storage/PBSPlugin.pm > > index 4320974..1cdbc11 100644 > > --- a/PVE/Storage/PBSPlugin.pm > > +++ b/PVE/Storage/PBSPlugin.pm > > @@ -906,8 +906,30 @@ sub get_volume_attribute { > > return; > > } > > > > +sub wait_for_verify_finish { > > + my ($conn, $node, $datastore, $attrs) = @_; > > + > > + my $param = { > > + running => 'true', > > + since => $attrs->{'backup-time'}, > > + store => $datastore, > > + typefilter => 'verify', > > + }; > > + > > + my $taskname = sprintf('%s:%s/%s/%X', > > + $datastore, > > + @{$attrs}{qw(backup-type backup-id backup-time)}, > > + ); > > I don't think it's likely that the task name format here will change > often, but as you already mentioned in the cover letter, it's not ideal > to have it hard-coded here. > > > + > > + while (1) { > > + my $res = eval { $conn->get("/api2/json/nodes/$node/tasks", $param); }; > > + last if !grep { $_->{worker_id} eq $taskname } @$res; > > + sleep(1); > > + } > > +} > > + > > @@ -921,6 +943,9 @@ sub update_volume_attribute { > > my $conn = pbs_api_connect($scfg, $password); > > my $datastore = $scfg->{datastore}; > > > > + $logfunc->('info', 'waiting for server to finish backup verification...') if $logfunc; > > Should only be printed if there is actually a verification we need to > wait for. Makes sense. > > > + wait_for_verify_finish($conn, $scfg->{server}, $datastore, $param); > > To me, it feels out of place to be concerned with waiting on > verification in (the rather low-level) update_volume_attribute(), which > is a rather specific thing to do. I'd say it's fine to fail there when > the snapshot is locked by verification or some other operation. > > Waiting for verification also can increase the backup duration/time > holding the vzdump lock on the PVE side quite a bit. That was one of my concerns too. Especially for very big VMs this can probably delay the task quite a bit. > It might not seem that big of a deal, because usually only manual > backups use 'protected'. But by doing it in > update_volume_attribute(), you also do it for 'notes', where it's not > needed and which is relevant to backup jobs where the increased wait > might be very noticeable. So at least, it should only be done for > 'protected' if doing it in update_volume_attribute(). That is actually the case now - updating notes takes a different path through update_volume_notes(). > > It would be better if the protected flag could be specified upon > creation already. Would also fix the following race I guess: It definitely would be a lot cleaner. I'll see what I can do and rework the whole series. Probably involves adding a new parameter to the `proxmox-backup-client backup` command and API(?) AFAICS. But this would not be all that bad of a feature for the backup client in general, I think. And I guess I need to figure out a way how to detect whether the new parameter is supported or not? In case this it not supported, just keeping the current behavior (i.e. best-effort via the API and maybe failing) is probably the sensible way. > 1. backup finishes > 2. prune running on PBS > 3. protected status set from PVE > > If going for the waiting approach after all, I think it should rather be > done in vzdump, before calling update_volume_attribute(). And the helper > to wait on verification should likely be part of PBSClient.pm (would > need to teach it to use an API connection first).