From: Lukas Wagner <l.wagner@proxmox.com>
To: pdm-devel@lists.proxmox.com
Subject: [pdm-devel] [PATCH proxmox-datacenter-manager v7 24/24] api: pve: rrd: trigger and wait for metric collection when requesting RRD data
Date: Tue, 26 Aug 2025 15:51:19 +0200
Message-ID: <20250826135119.336510-25-l.wagner@proxmox.com>
In-Reply-To: <20250826135119.336510-1-l.wagner@proxmox.com>
Since we now default to a much longer collection interval (10 min), the hourly
RRD data might have a noticeable gap at the end. To circumvent this, we
now trigger metric collection for a single remote when requesting
hourly RRD data, waiting for the completion of metric collection up to a
short timeout of 5 seconds. If the timeout expires, which can happen if
the metric collection for this remote is particularly slow (bad
connection), or if the metric collection task is currently busy with a
full run that's taking a long time, we simply return the data that we
already have.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
---
Notes:
New in v7.
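For illustration only, and not part of the patch: a minimal sketch of the
"wait up to a timeout, otherwise fall back to the data we already have"
behaviour described in the commit message. It swaps in a plain thread and
std::sync::mpsc::Receiver::recv_timeout for the tokio::time::timeout call
used in the actual change, so that it builds with the standard library alone.
The names spawn_collection and wait_for_collection are hypothetical and do
not exist in the code base.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a metric collection run that needs `delay`
/// to finish. Completion is signalled through the returned channel.
fn spawn_collection(delay: Duration) -> mpsc::Receiver<()> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(delay);
        // The receiver may already be gone if the caller gave up waiting;
        // that is not an error for the collector, so the result is ignored.
        let _ = tx.send(());
    });
    rx
}

/// Triggers a (simulated) collection and waits for it, but at most `timeout`.
/// Returns `true` if the collection finished in time, `false` otherwise.
/// In both cases the caller would go on to read whatever is stored.
fn wait_for_collection(delay: Duration, timeout: Duration) -> bool {
    let done = spawn_collection(delay);
    done.recv_timeout(timeout).is_ok()
}
```

Calling wait_for_collection with a short delay and a generous timeout models
the normal case where fresh data arrives before the read. With a delay longer
than the timeout, it models a slow remote or a busy collection task. As in the
patch, the outcome of the wait only decides how fresh the returned data is,
not whether the request succeeds.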
server/src/api/pve/rrddata.rs | 43 +++++++++++++++++++++++++++--------
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/server/src/api/pve/rrddata.rs b/server/src/api/pve/rrddata.rs
index b16c2313..b6a04037 100644
--- a/server/src/api/pve/rrddata.rs
+++ b/server/src/api/pve/rrddata.rs
@@ -1,3 +1,5 @@
+use std::time::Duration;
+
use anyhow::Error;
use serde_json::Value;
@@ -10,6 +12,7 @@ use pdm_api_types::rrddata::{LxcDataPoint, NodeDataPoint, QemuDataPoint};
use pdm_api_types::{NODE_SCHEMA, PRIV_RESOURCE_AUDIT, VMID_SCHEMA};
use crate::api::rrd_common::{self, DataPoint};
+use crate::metric_collection;
impl DataPoint for NodeDataPoint {
fn new(time: u64) -> Self {
@@ -161,7 +164,7 @@ impl DataPoint for LxcDataPoint {
},
)]
/// Read qemu stats
-fn get_qemu_rrd_data(
+async fn get_qemu_rrd_data(
remote: String,
vmid: u32,
timeframe: RrdTimeframe,
@@ -169,8 +172,7 @@ fn get_qemu_rrd_data(
_param: Value,
) -> Result<Vec<QemuDataPoint>, Error> {
let base = format!("pve/{remote}/qemu/{vmid}");
-
- rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+ get_rrd_datapoints(remote, base, timeframe, cf).await
}
#[api(
@@ -191,7 +193,7 @@ fn get_qemu_rrd_data(
},
)]
/// Read lxc stats
-fn get_lxc_rrd_data(
+async fn get_lxc_rrd_data(
remote: String,
vmid: u32,
timeframe: RrdTimeframe,
@@ -199,8 +201,7 @@ fn get_lxc_rrd_data(
_param: Value,
) -> Result<Vec<LxcDataPoint>, Error> {
let base = format!("pve/{remote}/lxc/{vmid}");
-
- rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+ get_rrd_datapoints(remote, base, timeframe, cf).await
}
#[api(
@@ -221,7 +222,7 @@ fn get_lxc_rrd_data(
},
)]
/// Read node stats
-fn get_node_rrd_data(
+async fn get_node_rrd_data(
remote: String,
node: String,
timeframe: RrdTimeframe,
@@ -229,9 +230,33 @@ fn get_node_rrd_data(
_param: Value,
) -> Result<Vec<NodeDataPoint>, Error> {
let base = format!("pve/{remote}/node/{node}");
-
- rrd_common::create_datapoints_from_rrd(&base, timeframe, cf)
+ get_rrd_datapoints(remote, base, timeframe, cf).await
}
+
+async fn get_rrd_datapoints<T: DataPoint + Send + 'static>(
+ remote: String,
+ basepath: String,
+ timeframe: RrdTimeframe,
+ mode: RrdMode,
+) -> Result<Vec<T>, Error> {
+ const WAIT_FOR_NEWEST_METRIC_TIMEOUT: Duration = Duration::from_secs(5);
+
+ if timeframe == RrdTimeframe::Hour {
+ // Let's wait for a limited time for the most recent metrics. If the connection to the remote
+ // is super slow or if the metric collection task is currently busy with collecting other
+ // metrics, we just return the data we already have, not the newest one.
+ let _ = tokio::time::timeout(WAIT_FOR_NEWEST_METRIC_TIMEOUT, async {
+ metric_collection::trigger_metric_collection(Some(remote), true).await
+ })
+ .await;
+ }
+
+ tokio::task::spawn_blocking(move || {
+ rrd_common::create_datapoints_from_rrd(&basepath, timeframe, mode)
+ })
+ .await?
+}
+
pub const QEMU_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_QEMU_RRD_DATA);
pub const LXC_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_LXC_RRD_DATA);
pub const NODE_RRD_ROUTER: Router = Router::new().get(&API_METHOD_GET_NODE_RRD_DATA);
--
2.47.2
Thread overview: 26+ messages
2025-08-26 13:50 [pdm-devel] [PATCH proxmox-datacenter-manager v7 00/24] metric collection improvements (concurrency, API, CLI) Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 01/24] metric collection: split top_entities split into separate module Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 02/24] metric collection: save metric data to RRD in separate task Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 03/24] metric collection: rework metric poll task Lukas Wagner
2025-08-26 13:50 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 04/24] metric collection: persist state after metric collection Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 05/24] metric collection: skip if last_collection < MIN_COLLECTION_INTERVAL Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 06/24] metric collection: collect overdue metrics on startup/timer change Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 07/24] metric collection: add tests for the fetch_remotes function Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 08/24] metric collection: add test for fetch_overdue Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 09/24] metric collection: pass rrd cache instance as function parameter Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 10/24] metric collection: add test for rrd task Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 11/24] metric collection: wrap rrd_cache::Cache in a struct Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 12/24] metric collection: record remote response time in metric database Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 13/24] metric collection: save time needed for collection run to RRD Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 14/24] metric collection: periodically clean removed remotes from statefile Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 15/24] api: add endpoint to trigger metric collection Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 16/24] api: remotes: trigger immediate metric collection for newly added nodes Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 17/24] api: add api for querying metric collection RRD data Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 18/24] api: metric-collection: add status endpoint Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 19/24] pdm-client: add metric collection API methods Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 20/24] cli: add commands for metric-collection trigger and status Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 21/24] metric collection: skip missed timer ticks Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 22/24] metric collection: use JoinSet instead of joining from handles in a Vec Lukas Wagner
2025-08-26 13:51 ` [pdm-devel] [PATCH proxmox-datacenter-manager v7 23/24] metric collection: allow to wait until completion when triggering collection manually Lukas Wagner
2025-08-26 13:51 ` Lukas Wagner [this message]
2025-08-28 19:37 ` [pdm-devel] applied: [PATCH proxmox-datacenter-manager v7 00/24] metric collection improvements (concurrency, API, CLI) Thomas Lamprecht