public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH proxmox-backup] datastore: remove datastore from internal cache based on maintenance mode
@ 2024-02-22 10:33 Hannes Laimer
  2024-02-29 17:13 ` Gabriel Goller
  0 siblings, 1 reply; 3+ messages in thread
From: Hannes Laimer @ 2024-02-22 10:33 UTC (permalink / raw)
  To: pbs-devel

We keep a DataStore cache, so ChunkStore's and lock files are kept by
the proxy process and don't have to be reopened every time. However, for
specific maintenance modes, e.g. 'offline', our process should not keep
file in that datastore open. This clears the cache entry of a datastore
if it is in a specific maintanance mode and the last task finished, which
also drops any files still open by the process.

Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
---

Setting a maintanance mode happends in the API process, but the
datastore cache is used by the proxy process. So we have to somehow
tell the proxy process it has drop entries from the cache(so files for
that datatsore are closed). If a task is runnig while the maintenance
mode is changed, this happens when the last reference to the datastore
is dropped(the last task finished). If no task is running, this would
not happen. So we either have to do some kind of polling, or trigger it
manually. Triggering it manually through the command socket seemed
reasonable, and it also works(does nothing) if there is still a task
running, which would clear the cache entry once it is done. So that is
what this patch does.

 pbs-api-types/src/maintenance.rs   |  6 +++++
 pbs-datastore/src/datastore.rs     | 40 ++++++++++++++++++++++++++++--
 pbs-datastore/src/task_tracking.rs | 23 ++++++++++-------
 src/api2/config/datastore.rs       | 18 ++++++++++++++
 src/bin/proxmox-backup-proxy.rs    |  8 ++++++
 5 files changed, 84 insertions(+), 11 deletions(-)

diff --git a/pbs-api-types/src/maintenance.rs b/pbs-api-types/src/maintenance.rs
index 1b03ca94..9aa29a76 100644
--- a/pbs-api-types/src/maintenance.rs
+++ b/pbs-api-types/src/maintenance.rs
@@ -77,6 +77,12 @@ pub struct MaintenanceMode {
 }
 
 impl MaintenanceMode {
+    /// Used for deciding weather the datastore is cleared from the internal cache after the last
+    /// task finishes, so locks on files get freed up.
+    pub fn clear_from_cache(&self) -> bool {
+        self.ty == MaintenanceType::Offline
+    }
+
     pub fn check(&self, operation: Option<Operation>) -> Result<(), Error> {
         if self.ty == MaintenanceType::Delete {
             bail!("datastore is being deleted");
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 2f0e5279..79209e7f 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -104,8 +104,26 @@ impl Clone for DataStore {
 impl Drop for DataStore {
     fn drop(&mut self) {
         if let Some(operation) = self.operation {
-            if let Err(e) = update_active_operations(self.name(), operation, -1) {
-                log::error!("could not update active operations - {}", e);
+            let mut last_task = false;
+            match update_active_operations(self.name(), operation, -1) {
+                Err(e) => log::error!("could not update active operations - {}", e),
+                Ok(updated_operations) => {
+                    last_task = updated_operations.read + updated_operations.write == 0;
+                }
+            }
+
+            // remove datastore from cache iff last task finished
+            // and datastore is in a maintenance mode that says it should be
+            let remove_from_cache = last_task
+                && pbs_config::datastore::config()
+                    .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", &self.name()))
+                    .map_or(false, |c| {
+                        c.get_maintenance_mode()
+                            .map_or(false, |m| m.clear_from_cache())
+                    });
+
+            if remove_from_cache {
+                DATASTORE_MAP.lock().unwrap().remove(self.name());
             }
         }
     }
@@ -193,6 +211,24 @@ impl DataStore {
         Ok(())
     }
 
+    /// trigger clearing cache entries based on maintenance mode. Entries will only
+    /// be cleared iff there is no other task running, if there is, the end of the
+    /// last running task will trigger the clearing of the cache entry.
+    pub fn update_datastore_cache() -> Result<(), Error> {
+        let (config, _digest) = pbs_config::datastore::config()?;
+        for (store, (_, _)) in &config.sections {
+            let datastore: DataStoreConfig = config.lookup("datastore", &store)?;
+            if datastore
+                .get_maintenance_mode()
+                .map_or(false, |m| m.clear_from_cache())
+            {
+                let _ = DataStore::lookup_datastore(&store, Some(Operation::Lookup));
+            }
+        }
+
+        Ok(())
+    }
+
     /// Open a raw database given a name and a path.
     ///
     /// # Safety
diff --git a/pbs-datastore/src/task_tracking.rs b/pbs-datastore/src/task_tracking.rs
index 18fbd4ba..85772099 100644
--- a/pbs-datastore/src/task_tracking.rs
+++ b/pbs-datastore/src/task_tracking.rs
@@ -91,15 +91,23 @@ pub fn get_active_operations_locked(
     Ok((data, lock.unwrap()))
 }
 
-pub fn update_active_operations(name: &str, operation: Operation, count: i64) -> Result<(), Error> {
+pub fn update_active_operations(
+    name: &str,
+    operation: Operation,
+    count: i64,
+) -> Result<ActiveOperationStats, Error> {
     let path = PathBuf::from(format!("{}/{}", crate::ACTIVE_OPERATIONS_DIR, name));
 
     let (_lock, options) = open_lock_file(name)?;
 
     let pid = std::process::id();
     let starttime = procfs::PidStat::read_from_pid(Pid::from_raw(pid as pid_t))?.starttime;
-    let mut updated = false;
 
+    let mut updated_active_operations = match operation {
+        Operation::Read => ActiveOperationStats { read: 1, write: 0 },
+        Operation::Write => ActiveOperationStats { read: 0, write: 1 },
+        Operation::Lookup => ActiveOperationStats { read: 0, write: 0 },
+    };
     let mut updated_tasks: Vec<TaskOperations> = match file_read_optional_string(&path)? {
         Some(data) => serde_json::from_str::<Vec<TaskOperations>>(&data)?
             .iter_mut()
@@ -108,12 +116,12 @@ pub fn update_active_operations(name: &str, operation: Operation, count: i64) ->
                     Some(stat) if pid == task.pid && stat.starttime != task.starttime => None,
                     Some(_) => {
                         if pid == task.pid {
-                            updated = true;
                             match operation {
                                 Operation::Read => task.active_operations.read += count,
                                 Operation::Write => task.active_operations.write += count,
                                 Operation::Lookup => (), // no IO must happen there
                             };
+                            updated_active_operations = task.active_operations.clone();
                         }
                         Some(task.clone())
                     }
@@ -124,15 +132,11 @@ pub fn update_active_operations(name: &str, operation: Operation, count: i64) ->
         None => Vec::new(),
     };
 
-    if !updated {
+    if updated_tasks.is_empty() {
         updated_tasks.push(TaskOperations {
             pid,
             starttime,
-            active_operations: match operation {
-                Operation::Read => ActiveOperationStats { read: 1, write: 0 },
-                Operation::Write => ActiveOperationStats { read: 0, write: 1 },
-                Operation::Lookup => ActiveOperationStats { read: 0, write: 0 },
-            },
+            active_operations: updated_active_operations,
         })
     }
     replace_file(
@@ -141,4 +145,5 @@ pub fn update_active_operations(name: &str, operation: Operation, count: i64) ->
         options,
         false,
     )
+    .map(|_| updated_active_operations)
 }
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index d571334c..519fadc9 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -389,7 +389,9 @@ pub fn update_datastore(
         data.tuning = update.tuning;
     }
 
+    let mut maintenance_mode_changed = false;
     if update.maintenance_mode.is_some() {
+        maintenance_mode_changed = data.maintenance_mode != update.maintenance_mode;
         data.maintenance_mode = update.maintenance_mode;
     }
 
@@ -403,6 +405,22 @@ pub fn update_datastore(
         jobstate::update_job_last_run_time("garbage_collection", &name)?;
     }
 
+    // tell the proxy it might have to clear a cache entry
+    if maintenance_mode_changed {
+        tokio::spawn(async {
+            if let Ok(proxy_pid) =
+                proxmox_rest_server::read_pid(pbs_buildcfg::PROXMOX_BACKUP_PROXY_PID_FN)
+            {
+                let sock = proxmox_rest_server::ctrl_sock_from_pid(proxy_pid);
+                let _ = proxmox_rest_server::send_raw_command(
+                    sock,
+                    "{\"command\":\"update-datastore-cache\"}\n",
+                )
+                .await;
+            }
+        });
+    }
+
     Ok(())
 }
 
diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
index 5d5bcdd3..d5e3e810 100644
--- a/src/bin/proxmox-backup-proxy.rs
+++ b/src/bin/proxmox-backup-proxy.rs
@@ -289,6 +289,14 @@ async fn run() -> Result<(), Error> {
         Ok(Value::Null)
     })?;
 
+    // clear cache entries for datastores that are in a specific maintenance mode
+    command_sock.register_command("update-datastore-cache".to_string(), |_value| {
+        if let Err(err) = DataStore::update_datastore_cache() {
+            log::error!("could not trigger update datastore cache: {err}");
+        }
+        Ok(Value::Null)
+    })?;
+
     let connections = proxmox_rest_server::connection::AcceptBuilder::new()
         .debug(debug)
         .rate_limiter_lookup(Arc::new(lookup_rate_limiter))
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup] datastore: remove datastore from internal cache based on maintenance mode
  2024-02-22 10:33 [pbs-devel] [PATCH proxmox-backup] datastore: remove datastore from internal cache based on maintenance mode Hannes Laimer
@ 2024-02-29 17:13 ` Gabriel Goller
  2024-03-01 10:43   ` Hannes Laimer
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Goller @ 2024-02-29 17:13 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

Tested this thoroughly and it works as advertised!

Just some small nits inline:

>  impl MaintenanceMode {
> +    /// Used for deciding weather the datastore is cleared from the internal cache after the last

Small typo: weather -> whether

> +            // remove datastore from cache iff last task finished
> +            // and datastore is in a maintenance mode that says it should be

Did something get cut off here? The sentence doesn't sound correct...

> +            let remove_from_cache = last_task
> +                && pbs_config::datastore::config()
> +                    .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", &self.name()))

`self.name()` is enough here, we don't need `&self.name()`

> +    pub fn update_datastore_cache() -> Result<(), Error> {
> +        let (config, _digest) = pbs_config::datastore::config()?;
> +        for (store, (_, _)) in &config.sections {
> +            let datastore: DataStoreConfig = config.lookup("datastore", &store)?;

`store` is already a &String, we don't need to write `&store`

> +            if datastore
> +                .get_maintenance_mode()
> +                .map_or(false, |m| m.clear_from_cache())
> +            {
> +                let _ = DataStore::lookup_datastore(&store, Some(Operation::Lookup));

Same here

>                              match operation {
>                                  Operation::Read => task.active_operations.read += count,
>                                  Operation::Write => task.active_operations.write += count,
>                                  Operation::Lookup => (), // no IO must happen there
>                              };
> +                            updated_active_operations = task.active_operations.clone();

You can remove the `.clone()` call here


Consider:

Tested-by: Gabriel Goller <g.goller@proxmox.com>
Reviewed-by: Gabriel Goller <g.goller@proxmox.com>




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup] datastore: remove datastore from internal cache based on maintenance mode
  2024-02-29 17:13 ` Gabriel Goller
@ 2024-03-01 10:43   ` Hannes Laimer
  0 siblings, 0 replies; 3+ messages in thread
From: Hannes Laimer @ 2024-03-01 10:43 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

Thanks for the review, will send a v2

On Thu Feb 29, 2024 at 6:13 PM CET, Gabriel Goller wrote:
> Tested this thoroughly and it works as advertised!
>
> Just some small nits inline:
>
> >  impl MaintenanceMode {
> > +    /// Used for deciding weather the datastore is cleared from the internal cache after the last
>
> Small typo: weather -> whether
>
> > +            // remove datastore from cache iff last task finished
> > +            // and datastore is in a maintenance mode that says it should be
>
> Did something get cut off here? The sentence doesn't sound correct...
>

I see where you're coming from, if the maintanance mode says it should
be clear from cache, since the maintenance mode defines weather it
should be or not. But I'll phrase this better

> > +            let remove_from_cache = last_task
> > +                && pbs_config::datastore::config()
> > +                    .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", &self.name()))
>
> `self.name()` is enough here, we don't need `&self.name()`
>
> > +    pub fn update_datastore_cache() -> Result<(), Error> {
> > +        let (config, _digest) = pbs_config::datastore::config()?;
> > +        for (store, (_, _)) in &config.sections {
> > +            let datastore: DataStoreConfig = config.lookup("datastore", &store)?;
>
> `store` is already a &String, we don't need to write `&store`
>
> > +            if datastore
> > +                .get_maintenance_mode()
> > +                .map_or(false, |m| m.clear_from_cache())
> > +            {
> > +                let _ = DataStore::lookup_datastore(&store, Some(Operation::Lookup));
>
> Same here
>
> >                              match operation {
> >                                  Operation::Read => task.active_operations.read += count,
> >                                  Operation::Write => task.active_operations.write += count,
> >                                  Operation::Lookup => (), // no IO must happen there
> >                              };
> > +                            updated_active_operations = task.active_operations.clone();
>
> You can remove the `.clone()` call here
>
>
> Consider:
>
> Tested-by: Gabriel Goller <g.goller@proxmox.com>
> Reviewed-by: Gabriel Goller <g.goller@proxmox.com>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel





^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-03-01 10:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-22 10:33 [pbs-devel] [PATCH proxmox-backup] datastore: remove datastore from internal cache based on maintenance mode Hannes Laimer
2024-02-29 17:13 ` Gabriel Goller
2024-03-01 10:43   ` Hannes Laimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal