From mboxrd@z Thu Jan 1 00:00:00 1970
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8
Date: Mon, 04 Mar 2024 12:12:09 +0100
From: "Hannes Laimer"
To: "Thomas Lamprecht", "Proxmox Backup Server development discussion"
X-Mailer: aerc 0.14.0
References: <20240301150315.12253-1-h.laimer@proxmox.com> <07b5578e-52c4-4c41-83e2-20f5f73fce93@proxmox.com>
In-Reply-To: <07b5578e-52c4-4c41-83e2-20f5f73fce93@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2] datastore: remove datastore from internal cache based on maintenance mode
List-Id: Proxmox Backup Server development discussion

On Mon Mar 4, 2024 at 11:42 AM CET, Thomas Lamprecht wrote:
> On 01/03/2024 at 16:03, Hannes Laimer wrote:
> > We keep a DataStore cache, so ChunkStore's and lock files are kept by
> > the proxy process and don't have to be reopened every time. However, for
> > specific maintenance modes, e.g. 'offline', our process should not keep
> > files in that datastore open. This clears the cache entry of a datastore
> > if it is in a specific maintenance mode and the last task finished, which
> > also drops any files still open by the process.
>
> One always asks themselves if command sockets are the right approach, but
> for this it seems alright.
>
> Some code style comments inline.
> > Signed-off-by: Hannes Laimer
> > Tested-by: Gabriel Goller
> > Reviewed-by: Gabriel Goller
> > ---
> >
> > v2, thanks @Gabriel:
> > - improve comments
> > - remove not needed &'s and .clone()'s
> >
> >  pbs-api-types/src/maintenance.rs   |  6 +++++
> >  pbs-datastore/src/datastore.rs     | 41 ++++++++++++++++++++++++++++--
> >  pbs-datastore/src/task_tracking.rs | 23 ++++++++++-------
> >  src/api2/config/datastore.rs       | 18 +++++++++++++
> >  src/bin/proxmox-backup-proxy.rs    |  8 ++++++
> >  5 files changed, 85 insertions(+), 11 deletions(-)
> >
> > diff --git a/pbs-api-types/src/maintenance.rs b/pbs-api-types/src/maintenance.rs
> > index 1b03ca94..a1564031 100644
> > --- a/pbs-api-types/src/maintenance.rs
> > +++ b/pbs-api-types/src/maintenance.rs
> > @@ -77,6 +77,12 @@ pub struct MaintenanceMode {
> >  }
> >
> >  impl MaintenanceMode {
> > +    /// Used for deciding whether the datastore is cleared from the internal cache after the last
> > +    /// task finishes, so all open files are closed.
> > +    pub fn clear_from_cache(&self) -> bool {
>
> That function name makes it sound like calling it actively clears the
> cache, but this only checks whether a required condition for clearing is
> met.
>
> So maybe use a name that better conveys that, and maybe even avoid
> coupling this to an action that a user of ours executes, as this might
> have some use for other call sites too.
>
> Off the top of my head one could use `is_offline` as the name; adding a
> note to the doc-comment that this is e.g. used to check if a datastore
> can be removed from the cache would still be fine though.
>

I agree, the name is somewhat misleading. The idea was to make it easy to
potentially add more modes here in the future, so maybe something a little
more general like `is_accessible` would make sense?
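For illustration, a minimal self-contained sketch of the predicate under
the `is_offline` name suggested above (the types here are stripped-down
stand-ins for the real definitions in pbs-api-types, not the actual code):

```rust
// Stand-in for the real MaintenanceType in pbs-api-types.
#[derive(PartialEq)]
enum MaintenanceType {
    ReadOnly,
    Offline,
    Delete,
}

struct MaintenanceMode {
    ty: MaintenanceType,
}

impl MaintenanceMode {
    /// Whether the mode implies the proxy should keep no files in the
    /// datastore open; e.g. used to decide if the cache entry can be
    /// cleared once the last task finishes.
    fn is_offline(&self) -> bool {
        self.ty == MaintenanceType::Offline
    }
}

fn main() {
    assert!(MaintenanceMode { ty: MaintenanceType::Offline }.is_offline());
    assert!(!MaintenanceMode { ty: MaintenanceType::ReadOnly }.is_offline());
    println!("ok");
}
```

A more general `is_accessible` would simply invert the sense and could
grow additional match arms as new modes are added.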
> > +        self.ty == MaintenanceType::Offline
> > +    }
> > +
> >      pub fn check(&self, operation: Option<Operation>) -> Result<(), Error> {
> >          if self.ty == MaintenanceType::Delete {
> >              bail!("datastore is being deleted");
> > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> > index 2f0e5279..f26dff83 100644
> > --- a/pbs-datastore/src/datastore.rs
> > +++ b/pbs-datastore/src/datastore.rs
> > @@ -104,8 +104,27 @@ impl Clone for DataStore {
> >  impl Drop for DataStore {
> >      fn drop(&mut self) {
> >          if let Some(operation) = self.operation {
> > -            if let Err(e) = update_active_operations(self.name(), operation, -1) {
> > -                log::error!("could not update active operations - {}", e);
> > +            let mut last_task = false;
> > +            match update_active_operations(self.name(), operation, -1) {
> > +                Err(e) => log::error!("could not update active operations - {}", e),
> > +                Ok(updated_operations) => {
> > +                    last_task = updated_operations.read + updated_operations.write == 0;
> > +                }
> > +            }
> > +
> > +            // remove datastore from cache iff
> > +            //   - last task finished, and
> > +            //   - datastore is in a maintenance mode that mandates it
> > +            let remove_from_cache = last_task
> > +                && pbs_config::datastore::config()
> > +                    .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", self.name()))
> > +                    .map_or(false, |c| {
> > +                        c.get_maintenance_mode()
> > +                            .map_or(false, |m| m.clear_from_cache())
> > +                    });
> > +
> > +            if remove_from_cache {
> > +                DATASTORE_MAP.lock().unwrap().remove(self.name());
> >              }
> >          }
> >      }
> > @@ -193,6 +212,24 @@ impl DataStore {
> >          Ok(())
> >      }
> >
> > +    /// trigger clearing cache entries based on maintenance mode. Entries will only
> > +    /// be cleared iff there is no other task running; if there is, the end of the
> > +    /// last running task will trigger the clearing of the cache entry.
> > +    pub fn update_datastore_cache() -> Result<(), Error> {
>
> Why does this work on all datastores and not a single one? After all we
> always want to remove a specific one.
>

Actually just missed that our command_socket also does args, will update
this in v3.

> > +        let (config, _digest) = pbs_config::datastore::config()?;
> > +        for (store, (_, _)) in &config.sections {
> > +            let datastore: DataStoreConfig = config.lookup("datastore", store)?;
> > +            if datastore
> > +                .get_maintenance_mode()
> > +                .map_or(false, |m| m.clear_from_cache())
> > +            {
> > +                let _ = DataStore::lookup_datastore(store, Some(Operation::Lookup));
>
> A comment that the actual removal from the cache happens through the drop
> handler would be good, as this is a bit too subtle for my taste; if one
> stumbles over this a few months down the line it might cause a bit too
> much easily avoidable head scratching...
>
> Alternatively, factor the actual check-maintenance-mode-and-remove-from-cache
> out of the drop handler and call that explicitly here; all you need of
> outside info there is the name anyway.

I think that would entail having to open the file twice in the drop
handler, once for updating it and once for reading it. But just reading it
here and explicitly clearing it from the cache seems reasonable, it makes
it way clearer what's happening. I'll change that in a v3.

Thanks for the review!
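To make the "factor it out and call it explicitly" idea concrete, here is
a minimal self-contained sketch of the pattern; `Cache` and
`remove_from_cache_if_offline` are illustrative stand-ins (the real
`DATASTORE_MAP` holds `Arc<DataStore>` and the offline check would be the
config lookup plus the maintenance-mode predicate):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-in for DATASTORE_MAP.
type Cache = Mutex<HashMap<String, ()>>;

/// Factored-out helper: drop the cache entry for `name` if the store is
/// offline. Both the Drop handler and the command-socket handler could
/// call this, passing only the datastore name.
fn remove_from_cache_if_offline(cache: &Cache, name: &str, is_offline: bool) {
    // `is_offline` stands in for reading the datastore config and
    // checking its maintenance mode.
    if is_offline {
        cache.lock().unwrap().remove(name);
    }
}

fn main() {
    let cache: Cache = Mutex::new(HashMap::from([("store1".to_string(), ())]));
    remove_from_cache_if_offline(&cache, "store1", true);
    assert!(cache.lock().unwrap().is_empty());
    println!("store1 evicted");
}
```

This keeps the eviction site explicit at the call site instead of hiding
it behind the side effect of dropping the last `DataStore` handle.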