public inbox for pdm-devel@lists.proxmox.com
From: Lukas Wagner <l.wagner@proxmox.com>
To: pdm-devel@lists.proxmox.com
Subject: [pdm-devel] [PATCH proxmox-datacenter-manager v7 24/24] api: pve: rrd: trigger and wait for metric collection when requesting RRD data
Date: Tue, 26 Aug 2025 15:51:19 +0200
Message-ID: <20250826135119.336510-25-l.wagner@proxmox.com>
In-Reply-To: <20250826135119.336510-1-l.wagner@proxmox.com>

Since we now default to a much longer collection interval (10 min), the hourly
RRD data might have a noticeable gap at the end. To circumvent this, we
now trigger metric collection for a single remote when requesting
hourly RRD data, waiting for the completion of metric collection up to a
short timeout of 5 seconds. If the timeout expires, which can happen if
the metric collection for this remote is particularly slow (bad
connection), or if the metric collection task is currently busy with a
full run that's taking a long time, we simply return the data that we
already have.

Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---

Notes:
    New in v7.
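
The wait-then-fall-back behavior described in the commit message can be
sketched in isolation. This is a minimal, std-only analogue and not the
code from this patch: it swaps tokio::time::timeout for a thread plus
mpsc::Receiver::recv_timeout, and the names collect and newest_or_cached
are hypothetical, introduced only for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for one metric collection run against a remote.
/// It sleeps for `delay` and then returns a single new data point.
fn collect(delay: Duration) -> u64 {
    thread::sleep(delay);
    42
}

/// Wait up to `timeout` for a fresh data point. If the collection does not
/// finish in time, return only the data we already have.
fn newest_or_cached(cached: Vec<u64>, delay: Duration, timeout: Duration) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        // The receiver may already have given up; ignore a failed send.
        let _ = tx.send(collect(delay));
    });

    let mut data = cached;
    // Like the patch, a timeout is not treated as an error.
    if let Ok(point) = rx.recv_timeout(timeout) {
        data.push(point);
    }
    data
}
```

With a 10 ms collection and a 500 ms timeout the result should contain the
cached points plus the new one; with the two durations swapped, only the
cached points come back. As in the patch, the slow collection is not
cancelled here, it just finishes in the background.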

 server/src/api/pve/rrddata.rs | 43 +++++++++++++++++++++++++++--------
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/server/src/api/pve/rrddata.rs b/server/src/api/pve/rrddata.rs
index b16c2313..b6a04037 100644
--- a/server/src/api/pve/rrddata.rs
+++ b/server/src/api/pve/rrddata.rs
@@ -1,3 +1,5 @@
+use std::time::Duration;
+
 use anyhow::Error;
 use serde_json::Value;
 
@@ -10,6 +12,7 @@ use pdm_api_types::rrddata::{LxcDataPoint, NodeDataPoint, QemuDataPoint};
 use pdm_api_types::{NODE_SCHEMA, PRIV_RESOURCE_AUDIT, VMID_SCHEMA};
 
 use crate::api::rrd_common::{self, DataPoint};
+use crate::metric_collection;
 
 impl DataPoint for NodeDataPoint {
     fn new(time: u64) -> Self {
@@ -161,7 +164,7 @@ impl DataPoint for LxcDataPoint {
     },
 )]
 /// Read qemu stats
-fn get_qemu_rrd_data(
+async fn get_qemu_rrd_data(
     remote: String,
     vmid: u32,
     timeframe: RrdTimeframe,
@@ -169,8 +172,7 @@ fn get_qemu_rrd_data(
     _param: Value,
 ) -> Result<Vec<QemuDataPoint>, Error> {
     let base = format!("pve/{remote}/qemu/{vmid}");
-
-    rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+    get_rrd_datapoints(remote, base, timeframe, cf).await
 }
 
 #[api(
@@ -191,7 +193,7 @@ fn get_qemu_rrd_data(
     },
 )]
 /// Read lxc stats
-fn get_lxc_rrd_data(
+async fn get_lxc_rrd_data(
     remote: String,
     vmid: u32,
     timeframe: RrdTimeframe,
@@ -199,8 +201,7 @@ fn get_lxc_rrd_data(
     _param: Value,
 ) -> Result<Vec<LxcDataPoint>, Error> {
     let base = format!("pve/{remote}/lxc/{vmid}");
-
-    rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+    get_rrd_datapoints(remote, base, timeframe, cf).await
 }
 
 #[api(
@@ -221,7 +222,7 @@ fn get_lxc_rrd_data(
     },
 )]
 /// Read node stats
-fn get_node_rrd_data(
+async fn get_node_rrd_data(
     remote: String,
     node: String,
     timeframe: RrdTimeframe,
@@ -229,9 +230,33 @@ fn get_node_rrd_data(
     _param: Value,
 ) -> Result<Vec<NodeDataPoint>, Error> {
     let base = format!("pve/{remote}/node/{node}");
-
-    rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+    get_rrd_datapoints(remote, base, timeframe, cf).await
 }
+
+async fn get_rrd_datapoints<T: DataPoint + Send + 'static>(
+    remote: String,
+    basepath: String,
+    timeframe: RrdTimeframe,
+    mode: RrdMode,
+) -> Result<Vec<T>, Error> {
+    const WAIT_FOR_NEWEST_METRIC_TIMEOUT: Duration = Duration::from_secs(5);
+
+    if timeframe == RrdTimeframe::Hour {
+        // Let's wait for a limited time for the most recent metrics. If the connection to the remote
+        // is super slow or if the metric collection task is currently busy collecting other
+        // metrics, we just return the data we already have, not the newest one.
+        let _ = tokio::time::timeout(WAIT_FOR_NEWEST_METRIC_TIMEOUT, async {
+            metric_collection::trigger_metric_collection(Some(remote), true).await
+        })
+        .await;
+    }
+
+    tokio::task::spawn_blocking(move || {
+        rrd_common::create_datapoints_from_rrd(&basepath, timeframe, mode)
+    })
+    .await?
+}
+
 pub const QEMU_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_QEMU_RRD_DATA);
 pub const LXC_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_LXC_RRD_DATA);
 pub const NODE_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_NODE_RRD_DATA);
-- 
2.47.2



_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel


Thread overview: 26+ messages
2025-08-26 13:50 [pdm-devel] [PATCH proxmox-datacenter-manager v7 00/24] metric collection improvements (concurrency, API, CLI) Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 01/24] metric collection: split top_entities split into separate module Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 02/24] metric collection: save metric data to RRD in separate task Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 03/24] metric collection: rework metric poll task Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 04/24] metric collection: persist state after metric collection Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 05/24] metric collection: skip if last_collection < MIN_COLLECTION_INTERVAL Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 06/24] metric collection: collect overdue metrics on startup/timer change Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 07/24] metric collection: add tests for the fetch_remotes function Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 08/24] metric collection: add test for fetch_overdue Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 09/24] metric collection: pass rrd cache instance as function parameter Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 10/24] metric collection: add test for rrd task Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 11/24] metric collection: wrap rrd_cache::Cache in a struct Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 12/24] metric collection: record remote response time in metric database Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 13/24] metric collection: save time needed for collection run to RRD Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 14/24] metric collection: periodically clean removed remotes from statefile Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 15/24] api: add endpoint to trigger metric collection Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 16/24] api: remotes: trigger immediate metric collection for newly added nodes Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 17/24] api: add api for querying metric collection RRD data Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 18/24] api: metric-collection: add status endpoint Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 19/24] pdm-client: add metric collection API methods Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 20/24] cli: add commands for metric-collection trigger and status Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 21/24] metric collection: skip missed timer ticks Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 22/24] metric collection: use JoinSet instead of joining from handles in a Vec Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 23/24] metric collection: allow to wait until completion when triggering collection manually Lukas Wagner
2025-08-26 13:51 ` Lukas Wagner [this message]
2025-08-28 19:37 ` [pdm-devel] applied: [PATCH proxmox-datacenter-manager v7 00/24] metric collection improvements (concurrency, API, CLI) Thomas Lamprecht
