public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data
@ 2025-05-23 16:00 Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 1/2] cfs status.c: drop old pve2-vm rrd schema support Aaron Lauterer
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:00 UTC (permalink / raw)
  To: pve-devel

This patch series expands the RRD format for nodes and VMs. For all types
(nodes, VMs, storage) we adjust the aggregation to align it with how it is
done on the Backup Server. Therefore, we have new RRD definitions for all
three types.

New values are added for nodes and VMs. In particular:

Nodes:
* memfree
* membuffers
* memcached
* arcsize
* pressures:
  * cpu some
  * io some
  * io full
  * mem some
  * mem full

VMs:
* memhost (memory consumption of all processes in the guests cgroup, host view)
* pressures:
  * cpu some
  * cpu full
  * io some
  * io full
  * mem some
  * mem full

To not lose old RRD data, we need to migrate the old RRD files to the ones with
the new schema. Some initial performance tests showed that migrating 10k VM
RRD files took ~2m40s single-threaded. This is way too long to do within the
pmxcfs itself. Therefore this will be a dedicated step. I wrote a small Rust
tool that binds to librrd to do the migration.

We could include it in a post-install step when upgrading to PVE 9.

To avoid missing data and key errors in the journal, we need to ship some
changes to PVE 8 that can handle the new format sent out by pvestatd. Those
patches are the first in the series and are marked with a "-pve8" postfix in the
repo name.

This RFC series so far only handles migration and any changes needed for the new
fields. It does not yet include any GUI patches to add additional graphs to the
summary pages of nodes and guests.

Plans:
* Add GUI parts:
  * Additional graphs, mostly for pressures.
  * Add more info to the memory graph, e.g. ZFS ARC
  * Add host memory view of guests in graph and gauge

* pve8to9:
  * Have a check for how many RRD files are present and verify that there is
    enough space on the root FS (see the sketch below).
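
A rough sketch of that check's logic (the numbers, the per-file size estimate
and the use of C instead of the pve8to9 Perl script are assumptions for
illustration only):

    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void) {
        /* hypothetical numbers, e.g. gathered by walking /var/lib/rrdcached/db */
        unsigned long long rrd_files = 10000;
        unsigned long long est_bytes_per_file = 2ULL * 1024 * 1024; /* assumed estimate */
        unsigned long long needed = rrd_files * est_bytes_per_file;

        struct statvfs st;
        if (statvfs("/", &st) != 0)
            return 1;
        unsigned long long avail = (unsigned long long)st.f_bavail * st.f_frsize;

        printf("need ~%llu MiB, %llu MiB available\n",
               needed / (1024 * 1024), avail / (1024 * 1024));
        return avail > needed ? 0 : 2;
    }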


How to test:
1. build pve-cluster with the pve8 patches and install it on all nodes.
2. build all the other packages and install them.
   build the migration tool with cargo and copy the binary to the nodes for now.
3. run the migration tool on the first host
4. continue running the migration tool on the other nodes one by one

If you uncomment the extra logging in pmxcfs/status.c you should see how
the different situations are handled.
In the PVE8 patches it starts at line 1373, in the later patches for PVE9 it
starts at line 1565.

cluster-pve8:

Aaron Lauterer (2):
  cfs status.c: drop old pve2-vm rrd schema support
  status: handle new pve9- metrics update data

 src/pmxcfs/status.c | 56 ++++++++++++++++++++++++++++++++++-----------
 src/pmxcfs/status.h |  2 ++
 2 files changed, 45 insertions(+), 13 deletions(-)


pve9-rrd-migration-tool:

Aaron Lauterer (1):
  introduce rrd migration tool for pve8 -> pve9


cluster:

Aaron Lauterer (1):
  status: introduce new pve9- rrd and metric format

 src/pmxcfs/status.c | 242 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 217 insertions(+), 25 deletions(-)


common:

Aaron Lauterer (1):
  add helper to fetch value from smaps_rollup for pid

Folke Gleumes (3):
  fix error in pressure parsing
  add functions to retrieve pressures for vm/ct
  metrics: add buffer and cache to meminfo

 src/PVE/ProcFSTools.pm | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)


manager:

Aaron Lauterer (5):
  api2tools: drop old VM rrd schema
  pvestatd: collect and distribute new pve9- metrics
  api: nodes: rrd and rrddata fetch from new pve9-node rrd files if
    present
  api2tools: extract stats: handle existence of new pve9- data
  ui: rrdmodels: add new columns

 PVE/API2/Nodes.pm                    |   8 +-
 PVE/API2Tools.pm                     |  24 +----
 PVE/Service/pvestatd.pm              | 128 +++++++++++++++++++++------
 www/manager6/data/model/RRDModels.js |  16 ++++
 4 files changed, 126 insertions(+), 50 deletions(-)


storage:

Aaron Lauterer (1):
  status: rrddata: use new pve9 rrd location if file is present

 src/PVE/API2/Storage/Status.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


qemu-server:

Aaron Lauterer (3):
  vmstatus: add memhost for host view of vm mem consumption
  vmstatus: switch mem stat to PSS of VM cgroup
  rrddata: use new pve9 rrd location if file is present

Folke Gleumes (1):
  metrics: add pressure to metrics

 PVE/API2/Qemu.pm  |  4 +++-
 PVE/QemuServer.pm | 23 +++++++++++++++++++----
 2 files changed, 22 insertions(+), 5 deletions(-)


container:

Aaron Lauterer (1):
  rrddata: use new pve9 rrd location if file is present

 src/PVE/API2/LXC.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


Summary over all repositories:
  12 files changed, 457 insertions(+), 98 deletions(-)

-- 
Generated by git-murpp 0.8.1


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH cluster-pve8 1/2] cfs status.c: drop old pve2-vm rrd schema support
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
@ 2025-05-23 16:00 ` Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data Aaron Lauterer
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:00 UTC (permalink / raw)
  To: pve-devel

The newer pve2.3-vm schema was introduced with commit ba9dcfc1 back
in 2013. By now there should be no cluster where an older node might
still send the old pve2-vm schema.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/pmxcfs/status.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
index cda3921..77a18d8 100644
--- a/src/pmxcfs/status.c
+++ b/src/pmxcfs/status.c
@@ -1277,17 +1277,10 @@ update_rrd_data(
 			create_rrd_file(filename, argcount, rrd_def_node);
 		}
 
-	} else if ((strncmp(key, "pve2-vm/", 8) == 0) ||
-		   (strncmp(key, "pve2.3-vm/", 10) == 0)) {
-		const char *vmid;
+	} else if (strncmp(key, "pve2.3-vm/", 10) == 0) {
+		const char *vmid = key + 10;
 
-		if (strncmp(key, "pve2-vm/", 8) == 0) {
-			vmid = key + 8;
-			skip = 2;
-		} else {
-			vmid = key + 10;
-			skip = 4;
-		}
+		skip = 4;
 
 		if (strchr(vmid, '/') != NULL)
 			goto keyerror;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 1/2] cfs status.c: drop old pve2-vm rrd schema support Aaron Lauterer
@ 2025-05-23 16:00 ` Aaron Lauterer
  2025-05-23 16:35   ` Aaron Lauterer
  2025-06-02 13:31   ` Thomas Lamprecht
  2025-05-23 16:00 ` [pve-devel] [PATCH pve9-rrd-migration-tool 1/1] introduce rrd migration tool for pve8 -> pve9 Aaron Lauterer
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:00 UTC (permalink / raw)
  To: pve-devel

For PVE9 there will be additional fields in the metrics that are
collected. The new columns/fields are added at the end of the current
ones. Therefore, if we get the new format, we need to cut off the extra
fields.

Paths to RRD filenames need to be set manually to 'pve2-...' and use
the 'node' part instead of the full key, as the key could also be
'pve9-...', which does not exist on PVE 8.
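
A minimal sketch of the cutting, with a hypothetical update string and
column count (the actual handling is done via rrd_skip_data() and the
update buffer in the patch below):

    #include <stdio.h>
    #include <string.h>

    /* keep only the first `keep` colon-separated columns of an update string */
    static void cut_columns(const char *data, int keep, char *out, size_t outlen) {
        const char *p = data;
        for (int i = 0; i < keep && p; i++) {
            const char *next = strchr(p, ':');
            p = next ? next + 1 : NULL;
        }
        /* -1 drops the colon trailing the last kept column */
        size_t len = p ? (size_t)(p - data - 1) : strlen(data);
        snprintf(out, outlen, "%.*s", (int)len, data);
    }

    int main(void) {
        char buf[64];
        cut_columns("1716480000:1:2:3:4:5", 4, buf, sizeof(buf));
        printf("%s\n", buf); /* prints "1716480000:1:2:3" */
        return 0;
    }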

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/pmxcfs/status.c | 51 ++++++++++++++++++++++++++++++++++++++-------
 src/pmxcfs/status.h |  2 ++
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
index 77a18d8..3fdb179 100644
--- a/src/pmxcfs/status.c
+++ b/src/pmxcfs/status.c
@@ -1236,6 +1236,8 @@ rrd_skip_data(
 	return data;
 }
 
+static char* rrd_format_update_buffer = NULL;
+
 static void
 update_rrd_data(
 	const char *key,
@@ -1255,9 +1257,15 @@ update_rrd_data(
 
 	char *filename = NULL;
 
+	if (!rrd_format_update_buffer) {
+	    rrd_format_update_buffer = (char*)malloc(RRD_FORMAT_BUFFER_SIZE);
+	}
+
 	int skip = 0;
+	int data_cutoff = 0; // how many columns after initial skip should be a cut-off
 
-	if (strncmp(key, "pve2-node/", 10) == 0) {
+	if (strncmp(key, "pve2-node/", 10) == 0 ||
+		strncmp(key, "pve9-node/", 10) == 0) {
 		const char *node = key + 10;
 
 		skip = 2;
@@ -1268,7 +1276,11 @@ update_rrd_data(
 		if (strlen(node) < 1)
 			goto keyerror;
 
-		filename = g_strdup_printf(RRDDIR "/%s", key);
+		if (strncmp(key, "pve9-node/", 10) == 0) {
+		    data_cutoff = 13;
+		}
+
+		filename = g_strdup_printf(RRDDIR "/pve2-node/%s", node);
 
 		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
 
@@ -1277,8 +1289,15 @@ update_rrd_data(
 			create_rrd_file(filename, argcount, rrd_def_node);
 		}
 
-	} else if (strncmp(key, "pve2.3-vm/", 10) == 0) {
-		const char *vmid = key + 10;
+	} else if (strncmp(key, "pve2.3-vm/", 10) == 0 ||
+		strncmp(key, "pve9-vm/", 8) == 0) {
+
+		const char *vmid;
+		if (strncmp(key, "pve2.3-vm/", 10) == 0) {
+		    vmid = key + 10;
+		} else {
+		    vmid = key + 8;
+		}
 
 		skip = 4;
 
@@ -1288,6 +1307,10 @@ update_rrd_data(
 		if (strlen(vmid) < 1)
 			goto keyerror;
 
+		if (strncmp(key, "pve9-vm/", 8) == 0) {
+		    data_cutoff = 11;
+		}
+
 		filename = g_strdup_printf(RRDDIR "/%s/%s", "pve2-vm", vmid);
 
 		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
@@ -1297,7 +1320,8 @@ update_rrd_data(
 			create_rrd_file(filename, argcount, rrd_def_vm);
 		}
 
-	} else if (strncmp(key, "pve2-storage/", 13) == 0) {
+	} else if (strncmp(key, "pve2-storage/", 13) == 0 ||
+		strncmp(key, "pve9-storage/", 13) == 0) {
 		const char *node = key + 13;
 
 		const char *storage = node;
@@ -1315,7 +1339,7 @@ update_rrd_data(
 		if (strlen(storage) < 1)
 			goto keyerror;
 
-		filename = g_strdup_printf(RRDDIR "/%s", key);
+		filename = g_strdup_printf(RRDDIR "/pve2-storage/%s", node);
 
 		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
 
@@ -1335,7 +1359,20 @@ update_rrd_data(
 
 	const char *dp = skip ? rrd_skip_data(data, skip) : data;
 
-	const char *update_args[] = { dp, NULL };
+	if (data_cutoff) {
+	    const char *cut = rrd_skip_data(dp, data_cutoff);
+	    const int data_len = cut - dp - 1; // -1 to remove last colon
+	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%.*s", data_len, dp);
+	} else {
+	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%s", dp);
+	}
+
+	const char *update_args[] = { rrd_format_update_buffer, NULL };
+
+	// TODO: remove in non RFC, but useful for debug logging to see if data is handled correctly
+	// cfs_message("KEY: %s", key);
+	// cfs_message("DATA: %s", dp);
+	// cfs_message("BUFFER: %s", rrd_format_update_buffer);
 
 	if (use_daemon) {
 		int status;
diff --git a/src/pmxcfs/status.h b/src/pmxcfs/status.h
index 041cb34..1a38f43 100644
--- a/src/pmxcfs/status.h
+++ b/src/pmxcfs/status.h
@@ -34,6 +34,8 @@
 
 #define CFS_MAX_STATUS_SIZE (32*1024)
 
+#define RRD_FORMAT_BUFFER_SIZE (1024 * 1024) // 1 MiB
+
 typedef struct cfs_clnode cfs_clnode_t;
 typedef struct cfs_clinfo cfs_clinfo_t;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH pve9-rrd-migration-tool 1/1] introduce rrd migration tool for pve8 -> pve9
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 1/2] cfs status.c: drop old pve2-vm rrd schema support Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data Aaron Lauterer
@ 2025-05-23 16:00 ` Aaron Lauterer
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster 1/1] status: introduce new pve9- rrd and metric format Aaron Lauterer
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:00 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 .cargo/config.toml      |   8 +
 .gitignore              |   5 +
 Cargo.toml              |  20 ++
 build.rs                |  29 +++
 src/lib.rs              |   5 +
 src/main.rs             | 504 ++++++++++++++++++++++++++++++++++++++++
 src/parallel_handler.rs | 162 +++++++++++++
 wrapper.h               |   1 +
 8 files changed, 734 insertions(+)
 create mode 100644 .cargo/config.toml
 create mode 100644 .gitignore
 create mode 100644 Cargo.toml
 create mode 100644 build.rs
 create mode 100644 src/lib.rs
 create mode 100644 src/main.rs
 create mode 100644 src/parallel_handler.rs
 create mode 100644 wrapper.h

diff --git a/.cargo/config.toml b/.cargo/config.toml
new file mode 100644
index 0000000..a439c97
--- /dev/null
+++ b/.cargo/config.toml
@@ -0,0 +1,8 @@
+[source]
+[source.debian-packages]
+directory = "/usr/share/cargo/registry"
+[source.crates-io]
+replace-with = "debian-packages"
+
+[profile.release]
+debug=true
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..7741e63
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,5 @@
+./target
+./build
+
+Cargo.lock
+
diff --git a/Cargo.toml b/Cargo.toml
new file mode 100644
index 0000000..d3523f3
--- /dev/null
+++ b/Cargo.toml
@@ -0,0 +1,20 @@
+[package]
+name = "proxmox_rrd_migration_8-9"
+version = "0.1.0"
+edition = "2021"
+authors = [
+    "Aaron Lauterer <a.lauterer@proxmox.com>",
+    "Proxmox Support Team <support@proxmox.com>",
+]
+license = "AGPL-3"
+homepage = "https://www.proxmox.com"
+
+[dependencies]
+anyhow = "1.0.86"
+pico-args = "0.5.0"
+proxmox-async = "0.4"
+crossbeam-channel = "0.5"
+
+[build-dependencies]
+bindgen = "0.66.1"
+pkg-config = "0.3"
diff --git a/build.rs b/build.rs
new file mode 100644
index 0000000..56d07cc
--- /dev/null
+++ b/build.rs
@@ -0,0 +1,29 @@
+use std::env;
+use std::path::PathBuf;
+
+fn main() {
+    println!("cargo:rustc-link-lib=rrd");
+
+    println!("cargo:rerun-if-changed=wrapper.h");
+    // The bindgen::Builder is the main entry point
+    // to bindgen, and lets you build up options for
+    // the resulting bindings.
+
+    let bindings = bindgen::Builder::default()
+        // The input header we would like to generate
+        // bindings for.
+        .header("wrapper.h")
+        // Tell cargo to invalidate the built crate whenever any of the
+        // included header files changed.
+        .parse_callbacks(Box::new(bindgen::CargoCallbacks))
+        // Finish the builder and generate the bindings.
+        .generate()
+        // Unwrap the Result and panic on failure.
+        .expect("Unable to generate bindings");
+
+    // Write the bindings to the $OUT_DIR/bindings.rs file.
+    let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
+    bindings
+        .write_to_file(out_path.join("bindings.rs"))
+        .expect("Couldn't write bindings!");
+}
diff --git a/src/lib.rs b/src/lib.rs
new file mode 100644
index 0000000..a38a13a
--- /dev/null
+++ b/src/lib.rs
@@ -0,0 +1,5 @@
+#![allow(non_upper_case_globals)]
+#![allow(non_camel_case_types)]
+#![allow(non_snake_case)]
+
+include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
diff --git a/src/main.rs b/src/main.rs
new file mode 100644
index 0000000..43f181c
--- /dev/null
+++ b/src/main.rs
@@ -0,0 +1,504 @@
+use anyhow::{bail, Error, Result};
+use proxmox_rrd_migration_8_9::{rrd_clear_error, rrd_create_r2, rrd_get_context, rrd_get_error};
+use std::ffi::{CStr, CString, OsString};
+use std::fs;
+use std::os::unix::ffi::OsStrExt;
+use std::os::unix::fs::PermissionsExt;
+use std::path::PathBuf;
+use std::sync::Arc;
+
+use crate::parallel_handler::ParallelHandler;
+
+pub mod parallel_handler;
+
+const BASE_DIR: &str = "/var/lib/rrdcached/db";
+const SOURCE_SUBDIR_NODE: &str = "pve2-node";
+const SOURCE_SUBDIR_GUEST: &str = "pve2-vm";
+const SOURCE_SUBDIR_STORAGE: &str = "pve2-storage";
+const TARGET_SUBDIR_NODE: &str = "pve9-node";
+const TARGET_SUBDIR_GUEST: &str = "pve9-vm";
+const TARGET_SUBDIR_STORAGE: &str = "pve9-storage";
+const MAX_THREADS: usize = 4;
+const RRD_STEP_SIZE: usize = 60;
+
+// RRAs are defined in the following way:
+//
+// RRA:CF:xff:step:rows
+// CF: AVERAGE or MAX
+// xff: 0.5
+// steps: stepsize is defined on rrd file creation! example: with 60 seconds step size:
+//	e.g. 1 => 60 sec, 30 => 1800 seconds or 30 min
+// rows: how many aggregated rows are kept, as in how far back in time we store data
+//
+// how many seconds are aggregated per RRA: steps * stepsize * rows
+// how many hours are aggregated per RRA: steps * stepsize * rows / 3600
+// how many days are aggregated per RRA: steps * stepsize * rows / 3600 / 24
+// https://oss.oetiker.ch/rrdtool/tut/rrd-beginners.en.html#Understanding_by_an_example
+
+const RRD_VM_DEF: [&CStr; 25] = [
+    c"DS:maxcpu:GAUGE:120:0:U",
+    c"DS:cpu:GAUGE:120:0:U",
+    c"DS:maxmem:GAUGE:120:0:U",
+    c"DS:mem:GAUGE:120:0:U",
+    c"DS:maxdisk:GAUGE:120:0:U",
+    c"DS:disk:GAUGE:120:0:U",
+    c"DS:netin:DERIVE:120:0:U",
+    c"DS:netout:DERIVE:120:0:U",
+    c"DS:diskread:DERIVE:120:0:U",
+    c"DS:diskwrite:DERIVE:120:0:U",
+    c"DS:memhost:GAUGE:120:0:U",
+    c"DS:pressurecpusome:GAUGE:120:0:U",
+    c"DS:pressurecpufull:GAUGE:120:0:U",
+    c"DS:pressureiosome:GAUGE:120:0:U",
+    c"DS:pressureiofull:GAUGE:120:0:U",
+    c"DS:pressurememorysome:GAUGE:120:0:U",
+    c"DS:pressurememoryfull:GAUGE:120:0:U",
+    c"RRA:AVERAGE:0.5:1:1440",    // 1 min * 1440 => 1 day
+    c"RRA:AVERAGE:0.5:30:1440",   // 30 min * 1440 => 30 day
+    c"RRA:AVERAGE:0.5:360:1440",  // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+    c"RRA:MAX:0.5:1:1440",        // 1 min * 1440 => 1 day
+    c"RRA:MAX:0.5:30:1440",       // 30 min * 1440 => 30 day
+    c"RRA:MAX:0.5:360:1440",      // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:MAX:0.5:10080:570",     // 1 week * 570 => ~10 years
+];
+
+const RRD_NODE_DEF: [&CStr; 29] = [
+    c"DS:loadavg:GAUGE:120:0:U",
+    c"DS:maxcpu:GAUGE:120:0:U",
+    c"DS:cpu:GAUGE:120:0:U",
+    c"DS:iowait:GAUGE:120:0:U",
+    c"DS:memtotal:GAUGE:120:0:U",
+    c"DS:memused:GAUGE:120:0:U",
+    c"DS:swaptotal:GAUGE:120:0:U",
+    c"DS:swapused:GAUGE:120:0:U",
+    c"DS:roottotal:GAUGE:120:0:U",
+    c"DS:rootused:GAUGE:120:0:U",
+    c"DS:netin:DERIVE:120:0:U",
+    c"DS:netout:DERIVE:120:0:U",
+    c"DS:memfree:GAUGE:120:0:U",
+    c"DS:membuffers:GAUGE:120:0:U",
+    c"DS:memcached:GAUGE:120:0:U",
+    c"DS:arcsize:GAUGE:120:0:U",
+    c"DS:pressurecpusome:GAUGE:120:0:U",
+    c"DS:pressureiosome:GAUGE:120:0:U",
+    c"DS:pressureiofull:GAUGE:120:0:U",
+    c"DS:pressurememorysome:GAUGE:120:0:U",
+    c"DS:pressurememoryfull:GAUGE:120:0:U",
+    c"RRA:AVERAGE:0.5:1:1440",    // 1 min * 1440 => 1 day
+    c"RRA:AVERAGE:0.5:30:1440",   // 30 min * 1440 => 30 day
+    c"RRA:AVERAGE:0.5:360:1440",  // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+    c"RRA:MAX:0.5:1:1440",        // 1 min * 1440 => 1 day
+    c"RRA:MAX:0.5:30:1440",       // 30 min * 1440 => 30 day
+    c"RRA:MAX:0.5:360:1440",      // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:MAX:0.5:10080:570",     // 1 week * 570 => ~10 years
+];
+
+const RRD_STORAGE_DEF: [&CStr; 10] = [
+    c"DS:total:GAUGE:120:0:U",
+    c"DS:used:GAUGE:120:0:U",
+    c"RRA:AVERAGE:0.5:1:1440",    // 1 min * 1440 => 1 day
+    c"RRA:AVERAGE:0.5:30:1440",   // 30 min * 1440 => 30 day
+    c"RRA:AVERAGE:0.5:360:1440",  // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+    c"RRA:MAX:0.5:1:1440",        // 1 min * 1440 => 1 day
+    c"RRA:MAX:0.5:30:1440",       // 30 min * 1440 => 30 day
+    c"RRA:MAX:0.5:360:1440",      // 6 hours * 1440 => 360 day ~1 year
+    c"RRA:MAX:0.5:10080:570",     // 1 week * 570 => ~10 years
+];
+
+const HELP: &str = "\
+proxmox-rrd-migration tool
+
+Migrates existing RRD graph data to the new format.
+
+Use this only in the process of upgrading from Proxmox VE 8 to 9 according to the upgrade guide!
+
+USAGE:
+    proxmox-rrd-migration [OPTIONS]
+
+    FLAGS:
+        -h, --help                  Prints this help information
+
+    OPTIONS:
+        --force                     Migrate, even if the target already exists.
+                                    This will overwrite any migrated RRD files!
+
+        --threads THREADS           Number of parallel threads. Default from 1 to 4.
+
+        --test                      For internal use only.
+                                    Tests parallel guest migration only!
+        --source                    For internal use only. Source directory.
+        --target                    For internal use only. Target directory.
+      ";
+
+#[derive(Debug)]
+struct Args {
+    force: bool,
+    threads: Option<usize>,
+    test: bool,
+    source: Option<PathBuf>,
+    target: Option<PathBuf>,
+}
+
+fn parse_args() -> Result<Args, Error> {
+    let mut pargs = pico_args::Arguments::from_env();
+
+    // Help has a higher priority and should be handled separately.
+    if pargs.contains(["-h", "--help"]) {
+        print!("{}", HELP);
+        std::process::exit(0);
+    }
+
+    let mut args = Args {
+        threads: pargs.opt_value_from_str("--threads").unwrap(),
+        force: false,
+        test: false,
+        source: pargs.opt_value_from_str("--source").unwrap(),
+        target: pargs.opt_value_from_str("--target").unwrap(),
+    };
+
+    if pargs.contains("--test") {
+        args.test = true;
+    }
+    if pargs.contains("--force") {
+        args.force = true;
+    }
+
+    // It's up to the caller what to do with the remaining arguments.
+    let remaining = pargs.finish();
+    if !remaining.is_empty() {
+        bail!(format!("Warning: unused arguments left: {:?}", remaining));
+    }
+
+    Ok(args)
+}
+
+fn main() {
+    let args = match parse_args() {
+        Ok(v) => v,
+        Err(e) => {
+            eprintln!("Error: {}.", e);
+            std::process::exit(1);
+        }
+    };
+
+    let mut source_dir_guests: PathBuf = [BASE_DIR, SOURCE_SUBDIR_GUEST].iter().collect();
+    let mut target_dir_guests: PathBuf = [BASE_DIR, TARGET_SUBDIR_GUEST].iter().collect();
+    let source_dir_nodes: PathBuf = [BASE_DIR, SOURCE_SUBDIR_NODE].iter().collect();
+    let target_dir_nodes: PathBuf = [BASE_DIR, TARGET_SUBDIR_NODE].iter().collect();
+    let source_dir_storage: PathBuf = [BASE_DIR, SOURCE_SUBDIR_STORAGE].iter().collect();
+    let target_dir_storage: PathBuf = [BASE_DIR, TARGET_SUBDIR_STORAGE].iter().collect();
+
+    if args.test {
+        source_dir_guests = args.source.clone().unwrap();
+        target_dir_guests = args.target.clone().unwrap();
+    }
+
+    if !args.force && target_dir_guests.exists() {
+        eprintln!(
+            "Aborting! Target path for guests already exists. Use '--force' to still migrate. It will overwrite existing files!"
+        );
+        std::process::exit(1);
+    }
+    if !args.force && target_dir_nodes.exists() {
+        eprintln!(
+            "Aborting! Target path for nodes already exists. Use '--force' to still migrate. It will overwrite existing files!"
+        );
+        std::process::exit(1);
+    }
+    if !args.force && target_dir_storage.exists() {
+        eprintln!(
+            "Aborting! Target path for storages already exists. Use '--force' to still migrate. It will overwrite existing files!"
+        );
+        std::process::exit(1);
+    }
+
+    if !args.test {
+        if let Err(e) = migrate_nodes(source_dir_nodes, target_dir_nodes) {
+            eprintln!("Error migrating nodes: {}", e);
+            std::process::exit(1);
+        }
+        if let Err(e) = migrate_storage(source_dir_storage, target_dir_storage) {
+            eprintln!("Error migrating storage: {}", e);
+            std::process::exit(1);
+        }
+    }
+    if let Err(e) = migrate_guests(source_dir_guests, target_dir_guests, set_threads(&args)) {
+        eprintln!("Error migrating guests: {}", e);
+        std::process::exit(1);
+    }
+}
+
+/// Set number of threads
+///
+/// Either a fixed parameter or determining a range between 1 to 4 threads
+///  based on the number of CPU cores available in the system.
+fn set_threads(args: &Args) -> usize {
+    if args.threads.is_some() {
+        return args.threads.unwrap();
+    }
+    // check for a way to get physical cores and not threads?
+    let cpus: usize = String::from_utf8_lossy(
+        std::process::Command::new("nproc")
+            .output()
+            .expect("Error running nproc")
+            .stdout
+            .as_slice()
+            .trim_ascii(),
+    )
+    .parse::<usize>()
+    .expect("Could not parse nproc output");
+
+    if cpus < 32 {
+        let threads = cpus / 8;
+        if threads == 0 {
+            return 1;
+        }
+        return threads;
+    }
+    return MAX_THREADS;
+}
+
+/// Migrate guest RRD files
+///
+/// In parallel to speed up the process as most time is spent on converting the
+/// data to the new format.
+fn migrate_guests(
+    source_dir_guests: PathBuf,
+    target_dir_guests: PathBuf,
+    threads: usize,
+) -> Result<(), Error> {
+    println!("Migrating RRD data for guests…");
+    println!("Using {} thread(s)", threads);
+
+    let mut guest_source_files: Vec<(CString, OsString)> = Vec::new();
+
+    fs::read_dir(&source_dir_guests)?
+        .filter(|f| f.is_ok())
+        .map(|f| f.unwrap().path())
+        .filter(|f| f.is_file())
+        .for_each(|file| {
+            let path = CString::new(file.as_path().as_os_str().as_bytes())
+                .expect("Could not convert path to CString.");
+            let fname = file
+                .file_name()
+                .map(|v| v.to_os_string())
+                .expect("Could not convert fname to OsString.");
+            guest_source_files.push((path, fname))
+        });
+    if !target_dir_guests.exists() {
+        println!("Creating new directory: '{}'", target_dir_guests.display());
+        std::fs::create_dir(&target_dir_guests)?;
+    }
+
+    let total_guests = guest_source_files.len();
+    let guests = Arc::new(std::sync::atomic::AtomicUsize::new(0));
+    let guests2 = guests.clone();
+    let start_time = std::time::SystemTime::now();
+
+    let migration_pool = ParallelHandler::new(
+        "guest rrd migration",
+        threads,
+        move |(path, fname): (CString, OsString)| {
+            let mut source: [*const i8; 2] = [std::ptr::null(); 2];
+            source[0] = path.as_ptr();
+
+            let node_name = fname;
+            let mut target_path = target_dir_guests.clone();
+            target_path.push(node_name);
+
+            let target_path = CString::new(target_path.to_str().unwrap()).unwrap();
+
+            unsafe {
+                rrd_get_context();
+                rrd_clear_error();
+                let res = rrd_create_r2(
+                    target_path.as_ptr(),
+                    RRD_STEP_SIZE as u64,
+                    0,
+                    0,
+                    source.as_mut_ptr(),
+                    std::ptr::null(),
+                    RRD_VM_DEF.len() as i32,
+                    RRD_VM_DEF.map(|v| v.as_ptr()).as_mut_ptr(),
+                );
+                if res != 0 {
+                    bail!(
+                        "RRD create Error: {}",
+                        CStr::from_ptr(rrd_get_error()).to_string_lossy()
+                    );
+                }
+            }
+            let current_guests = guests2.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
+            if current_guests > 0 && current_guests % 200 == 0 {
+                println!("Migrated {} of {} guests", current_guests, total_guests);
+            }
+            Ok(())
+        },
+    );
+    let migration_channel = migration_pool.channel();
+
+    for file in guest_source_files {
+        let migration_channel = migration_channel.clone();
+        migration_channel.send(file)?;
+    }
+
+    drop(migration_channel);
+    migration_pool.complete()?;
+
+    let elapsed = start_time.elapsed()?.as_secs_f64();
+    let guests = guests.load(std::sync::atomic::Ordering::SeqCst);
+    println!("Migrated {} guests in {:.2}s", guests, elapsed,);
+
+    Ok(())
+}
+
+/// Migrate node RRD files
+///
+/// In serial as the number of nodes will not be high.
+fn migrate_nodes(source_dir_nodes: PathBuf, target_dir_nodes: PathBuf) -> Result<(), Error> {
+    println!("Migrating RRD data for nodes…");
+
+    if !target_dir_nodes.exists() {
+        println!("Creating new directory: '{}'", target_dir_nodes.display());
+        std::fs::create_dir(&target_dir_nodes)?;
+    }
+
+    let mut node_source_files: Vec<(CString, OsString)> = Vec::new();
+    fs::read_dir(&source_dir_nodes)?
+        .filter(|f| f.is_ok())
+        .map(|f| f.unwrap().path())
+        .filter(|f| f.is_file())
+        .for_each(|file| {
+            let path = CString::new(file.as_path().as_os_str().as_bytes())
+                .expect("Could not convert path to CString.");
+            let fname = file
+                .file_name()
+                .map(|v| v.to_os_string())
+                .expect("Could not convert fname to OsString.");
+            node_source_files.push((path, fname))
+        });
+
+    for file in node_source_files {
+        println!("Node: '{}'", PathBuf::from(file.1.clone()).display());
+        let mut source: [*const i8; 2] = [std::ptr::null(); 2];
+
+        source[0] = file.0.as_ptr();
+
+        let node_name = file.1;
+        let mut target_path = target_dir_nodes.clone();
+        target_path.push(node_name);
+
+        let target_path = CString::new(target_path.to_str().unwrap()).unwrap();
+
+        unsafe {
+            rrd_get_context();
+            rrd_clear_error();
+            let res = rrd_create_r2(
+                target_path.as_ptr(),
+                RRD_STEP_SIZE as u64,
+                0,
+                0,
+                source.as_mut_ptr(),
+                std::ptr::null(),
+                RRD_NODE_DEF.len() as i32,
+                RRD_NODE_DEF.map(|v| v.as_ptr()).as_mut_ptr(),
+            );
+            if res != 0 {
+                bail!(
+                    "RRD create Error: {}",
+                    CStr::from_ptr(rrd_get_error()).to_string_lossy()
+                );
+            }
+        }
+    }
+    println!("Migrated all nodes");
+
+    Ok(())
+}
+
+/// Migrate storage RRD files
+///
+/// In serial as the number of storage will not be that high.
+fn migrate_storage(source_dir_storage: PathBuf, target_dir_storage: PathBuf) -> Result<(), Error> {
+    println!("Migrating RRD data for storages…");
+
+    if !target_dir_storage.exists() {
+        println!("Creating new directory: '{}'", target_dir_storage.display());
+        std::fs::create_dir(&target_dir_storage)?;
+    }
+
+    // storage has another layer of directories per node over which we need to iterate
+    fs::read_dir(&source_dir_storage)?
+        .filter(|f| f.is_ok())
+        .map(|f| f.unwrap().path())
+        .filter(|f| f.is_dir())
+        .try_for_each(|node| {
+            let mut storage_source_files: Vec<(CString, OsString)> = Vec::new();
+
+            let mut source_node_subdir = source_dir_storage.clone();
+            source_node_subdir.push(&node.file_name().unwrap());
+
+            let mut target_node_subdir = target_dir_storage.clone();
+            target_node_subdir.push(&node.file_name().unwrap());
+
+            fs::create_dir(target_node_subdir.as_path())?;
+            let metadata = target_node_subdir.metadata()?;
+            let mut permissions = metadata.permissions();
+            permissions.set_mode(0o755);
+
+            fs::read_dir(&source_node_subdir)?
+                .filter(|f| f.is_ok())
+                .map(|f| f.unwrap().path())
+                .filter(|f| f.is_file())
+                .for_each(|file| {
+                    let path = CString::new(file.as_path().as_os_str().as_bytes())
+                        .expect("Could not convert path to CString.");
+                    let fname = file
+                        .file_name()
+                        .map(|v| v.to_os_string())
+                        .expect("Could not convert fname to OsString.");
+                    storage_source_files.push((path, fname))
+                });
+
+            for file in storage_source_files {
+                println!("Storage: '{}'", PathBuf::from(file.1.clone()).display());
+                let mut source: [*const i8; 2] = [std::ptr::null(); 2];
+
+                source[0] = file.0.as_ptr();
+
+                let node_name = file.1;
+                let mut target_path = target_node_subdir.clone();
+                target_path.push(node_name);
+
+                let target_path = CString::new(target_path.to_str().unwrap()).unwrap();
+
+                unsafe {
+                    rrd_get_context();
+                    rrd_clear_error();
+                    let res = rrd_create_r2(
+                        target_path.as_ptr(),
+                        RRD_STEP_SIZE as u64,
+                        0,
+                        0,
+                        source.as_mut_ptr(),
+                        std::ptr::null(),
+                        RRD_STORAGE_DEF.len() as i32,
+                        RRD_STORAGE_DEF.map(|v| v.as_ptr()).as_mut_ptr(),
+                    );
+                    if res != 0 {
+                        bail!(
+                            "RRD create Error: {}",
+                            CStr::from_ptr(rrd_get_error()).to_string_lossy()
+                        );
+                    }
+                }
+            }
+            Ok(())
+        })?;
+    println!("Migrated all nodes");
+
+    Ok(())
+}
diff --git a/src/parallel_handler.rs b/src/parallel_handler.rs
new file mode 100644
index 0000000..787742a
--- /dev/null
+++ b/src/parallel_handler.rs
@@ -0,0 +1,162 @@
+//! A thread pool which run a closure in parallel.
+
+use std::sync::{Arc, Mutex};
+use std::thread::JoinHandle;
+
+use anyhow::{Error, bail, format_err};
+use crossbeam_channel::{Sender, bounded};
+
+/// A handle to send data to the worker thread (implements clone)
+pub struct SendHandle<I> {
+    input: Sender<I>,
+    abort: Arc<Mutex<Option<String>>>,
+}
+
+/// Returns the first error happened, if any
+pub fn check_abort(abort: &Mutex<Option<String>>) -> Result<(), Error> {
+    let guard = abort.lock().unwrap();
+    if let Some(err_msg) = &*guard {
+        return Err(format_err!("{}", err_msg));
+    }
+    Ok(())
+}
+
+impl<I: Send> SendHandle<I> {
+    /// Send data to the worker threads
+    pub fn send(&self, input: I) -> Result<(), Error> {
+        check_abort(&self.abort)?;
+        match self.input.send(input) {
+            Ok(()) => Ok(()),
+            Err(_) => bail!("send failed - channel closed"),
+        }
+    }
+}
+
+/// A thread pool which run the supplied closure
+///
+/// The send command sends data to the worker threads. If one handler
+/// returns an error, we mark the channel as failed and it is no
+/// longer possible to send data.
+///
+/// When done, the 'complete()' method needs to be called to check for
+/// outstanding errors.
+pub struct ParallelHandler<I> {
+    handles: Vec<JoinHandle<()>>,
+    name: String,
+    input: Option<SendHandle<I>>,
+}
+
+impl<I> Clone for SendHandle<I> {
+    fn clone(&self) -> Self {
+        Self {
+            input: self.input.clone(),
+            abort: Arc::clone(&self.abort),
+        }
+    }
+}
+
+impl<I: Send + 'static> ParallelHandler<I> {
+    /// Create a new thread pool, each thread processing incoming data
+    /// with 'handler_fn'.
+    pub fn new<F>(name: &str, threads: usize, handler_fn: F) -> Self
+    where
+        F: Fn(I) -> Result<(), Error> + Send + Clone + 'static,
+    {
+        let mut handles = Vec::new();
+        let (input_tx, input_rx) = bounded::<I>(threads);
+
+        let abort = Arc::new(Mutex::new(None));
+
+        for i in 0..threads {
+            let input_rx = input_rx.clone();
+            let abort = Arc::clone(&abort);
+            let handler_fn = handler_fn.clone();
+
+            handles.push(
+                std::thread::Builder::new()
+                    .name(format!("{} ({})", name, i))
+                    .spawn(move || {
+                        loop {
+                            let data = match input_rx.recv() {
+                                Ok(data) => data,
+                                Err(_) => return,
+                            };
+                            if let Err(err) = (handler_fn)(data) {
+                                let mut guard = abort.lock().unwrap();
+                                if guard.is_none() {
+                                    *guard = Some(err.to_string());
+                                }
+                            }
+                        }
+                    })
+                    .unwrap(),
+            );
+        }
+        Self {
+            handles,
+            name: name.to_string(),
+            input: Some(SendHandle {
+                input: input_tx,
+                abort,
+            }),
+        }
+    }
+
+    /// Returns a cloneable channel to send data to the worker threads
+    pub fn channel(&self) -> SendHandle<I> {
+        self.input.as_ref().unwrap().clone()
+    }
+
+    /// Send data to the worker threads
+    pub fn send(&self, input: I) -> Result<(), Error> {
+        self.input.as_ref().unwrap().send(input)?;
+        Ok(())
+    }
+
+    /// Wait for worker threads to complete and check for errors
+    pub fn complete(mut self) -> Result<(), Error> {
+        let input = self.input.take().unwrap();
+        let abort = Arc::clone(&input.abort);
+        check_abort(&abort)?;
+        drop(input);
+
+        let msg_list = self.join_threads();
+
+        // an error might be encountered while waiting for the join
+        check_abort(&abort)?;
+
+        if msg_list.is_empty() {
+            return Ok(());
+        }
+        Err(format_err!("{}", msg_list.join("\n")))
+    }
+
+    fn join_threads(&mut self) -> Vec<String> {
+        let mut msg_list = Vec::new();
+
+        let mut i = 0;
+        while let Some(handle) = self.handles.pop() {
+            if let Err(panic) = handle.join() {
+                if let Some(panic_msg) = panic.downcast_ref::<&str>() {
+                    msg_list.push(format!("thread {} ({i}) panicked: {panic_msg}", self.name));
+                } else if let Some(panic_msg) = panic.downcast_ref::<String>() {
+                    msg_list.push(format!("thread {} ({i}) panicked: {panic_msg}", self.name));
+                } else {
+                    msg_list.push(format!("thread {} ({i}) panicked", self.name));
+                }
+            }
+            i += 1;
+        }
+        msg_list
+    }
+}
+
+// Note: We make sure that all threads will be joined
+impl<I> Drop for ParallelHandler<I> {
+    fn drop(&mut self) {
+        drop(self.input.take());
+        while let Some(handle) = self.handles.pop() {
+            let _ = handle.join();
+        }
+    }
+}
diff --git a/wrapper.h b/wrapper.h
new file mode 100644
index 0000000..64d0aa6
--- /dev/null
+++ b/wrapper.h
@@ -0,0 +1 @@
+#include <rrd.h>
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


* [pve-devel] [PATCH cluster 1/1] status: introduce new pve9- rrd and metric format
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
                   ` (2 preceding siblings ...)
  2025-05-23 16:00 ` [pve-devel] [PATCH pve9-rrd-migration-tool 1/1] introduce rrd migration tool for pve8 -> pve9 Aaron Lauterer
@ 2025-05-23 16:00 ` Aaron Lauterer
  2025-05-23 16:37 ` [pve-devel] [PATCH common 1/4] fix error in pressure parsing Aaron Lauterer
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:00 UTC (permalink / raw)
  To: pve-devel

We add several new columns to the node and VM (guest) RRDs. See further
down for details. Additionally, we change the RRA definitions on how we
aggregate the data to match how we do it for the Proxmox Backup Server
[0].

The migration of an existing installation is handled by a dedicated
tool. Only once that has happened will we store new data in the new
format.
This leaves us with a few cases to handle:

  data recv →  |  old                       |  new
  ↓ rrd files  |                            |
 --------------|----------------------------|-----------------------------------
  none         |  check if directories exist:
               |      neither old nor new -> new
               |      new                 -> new
               |      old only            -> old
 --------------|----------------------------|-----------------------------------
  only old     |  use old file as is        |  cut new columns and use old file
 --------------|----------------------------|-----------------------------------
  new present  |  pad data to match new fmt |  use new file as is and pass data

To handle the padding and cutting of the data, we use a buffer.
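
As a minimal illustration of the padding case (the column values are
hypothetical; the actual padding strings and buffer handling are in the
patch below, and the cutting case works the other way around by truncating
the trailing columns):

    #include <stdio.h>

    int main(void) {
        char buf[128];
        const char *old_update = "1716480000:1:2:3"; /* old-format columns only */
        const char *padding = "U:U:U:U:U:U:U";       /* "U" = unknown, for the missing new columns */

        /* old-format data written into a new-format RRD file gets padded */
        snprintf(buf, sizeof(buf), "%s:%s", old_update, padding);
        printf("%s\n", buf); /* prints "1716480000:1:2:3:U:U:U:U:U:U:U" */
        return 0;
    }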

We add the following new columns:

Nodes:
* memfree
* membuffers
* memcached
* arcsize
* pressures:
  * cpu some
  * io some
  * io full
  * mem some
  * mem full

VMs:
* memhost (memory consumption of all processes in the guests cgroup, host view)
* pressures:
  * cpu some
  * cpu full
  * io some
  * io full
  * mem some
  * mem full

[0] https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/server/metric_collection/rrd.rs;h=ed39cc94ee056924b7adbc21b84c0209478bcf42;hb=dc324716a688a67d700fa133725740ac5d3795ce#l76

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/pmxcfs/status.c | 242 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 217 insertions(+), 25 deletions(-)

diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
index 3fdb179..4f258f6 100644
--- a/src/pmxcfs/status.c
+++ b/src/pmxcfs/status.c
@@ -1129,6 +1129,21 @@ kventry_hash_set(
 	return TRUE;
 }
 
+// RRAs are defined in the following way:
+//
+// RRA:CF:xff:step:rows
+// CF: AVERAGE or MAX
+// xff: 0.5
+// steps: stepsize is defined on rrd file creation! example: with 60 seconds step size:
+//	e.g. 1 => 60 sec, 30 => 1800 seconds or 30 min
+// rows: how many aggregated rows are kept, as in how far back in time we store data
+//
+// how many seconds are aggregated per RRA: steps * stepsize * rows
+// how many hours are aggregated per RRA: steps * stepsize * rows / 3600
+// how many days are aggregated per RRA: steps * stepsize * rows / 3600 / 24
+// https://oss.oetiker.ch/rrdtool/tut/rrd-beginners.en.html#Understanding_by_an_example
+
+// Time step size 60 seconds
 static const char *rrd_def_node[] = {
 	"DS:loadavg:GAUGE:120:0:U",
 	"DS:maxcpu:GAUGE:120:0:U",
@@ -1157,6 +1172,43 @@ static const char *rrd_def_node[] = {
 	NULL,
 };
 
+// Time step size 10 seconds
+static const char *rrd_def_node_pve9[] = {
+	"DS:loadavg:GAUGE:120:0:U",
+	"DS:maxcpu:GAUGE:120:0:U",
+	"DS:cpu:GAUGE:120:0:U",
+	"DS:iowait:GAUGE:120:0:U",
+	"DS:memtotal:GAUGE:120:0:U",
+	"DS:memused:GAUGE:120:0:U",
+	"DS:swaptotal:GAUGE:120:0:U",
+	"DS:swapused:GAUGE:120:0:U",
+	"DS:roottotal:GAUGE:120:0:U",
+	"DS:rootused:GAUGE:120:0:U",
+	"DS:netin:DERIVE:120:0:U",
+	"DS:netout:DERIVE:120:0:U",
+	"DS:memfree:GAUGE:120:0:U",
+	"DS:membuffers:GAUGE:120:0:U",
+	"DS:memcached:GAUGE:120:0:U",
+	"DS:arcsize:GAUGE:120:0:U",
+	"DS:pressurecpusome:GAUGE:120:0:U",
+	"DS:pressureiosome:GAUGE:120:0:U",
+	"DS:pressureiofull:GAUGE:120:0:U",
+	"DS:pressurememorysome:GAUGE:120:0:U",
+	"DS:pressurememoryfull:GAUGE:120:0:U",
+
+	"RRA:AVERAGE:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:AVERAGE:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:AVERAGE:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+
+	"RRA:MAX:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:MAX:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:MAX:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:MAX:0.5:10080:570", // 1 week * 570 => ~10 years
+	NULL,
+};
+
+// Time step size 60 seconds
 static const char *rrd_def_vm[] = {
 	"DS:maxcpu:GAUGE:120:0:U",
 	"DS:cpu:GAUGE:120:0:U",
@@ -1183,6 +1235,39 @@ static const char *rrd_def_vm[] = {
 	NULL,
 };
 
+// Time step size 60 seconds
+static const char *rrd_def_vm_pve9[] = {
+	"DS:maxcpu:GAUGE:120:0:U",
+	"DS:cpu:GAUGE:120:0:U",
+	"DS:maxmem:GAUGE:120:0:U",
+	"DS:mem:GAUGE:120:0:U",
+	"DS:maxdisk:GAUGE:120:0:U",
+	"DS:disk:GAUGE:120:0:U",
+	"DS:netin:DERIVE:120:0:U",
+	"DS:netout:DERIVE:120:0:U",
+	"DS:diskread:DERIVE:120:0:U",
+	"DS:diskwrite:DERIVE:120:0:U",
+	"DS:memhost:GAUGE:120:0:U",
+	"DS:pressurecpusome:GAUGE:120:0:U",
+	"DS:pressurecpufull:GAUGE:120:0:U",
+	"DS:pressureiosome:GAUGE:120:0:U",
+	"DS:pressureiofull:GAUGE:120:0:U",
+	"DS:pressurememorysome:GAUGE:120:0:U",
+	"DS:pressurememoryfull:GAUGE:120:0:U",
+
+	"RRA:AVERAGE:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:AVERAGE:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:AVERAGE:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+
+	"RRA:MAX:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:MAX:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:MAX:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:MAX:0.5:10080:570", // 1 week * 570 => ~10 years
+	NULL,
+};
+
+// Time step size 60 seconds
 static const char *rrd_def_storage[] = {
 	"DS:total:GAUGE:120:0:U",
 	"DS:used:GAUGE:120:0:U",
@@ -1200,6 +1285,23 @@ static const char *rrd_def_storage[] = {
 	"RRA:MAX:0.5:10080:70", // 7 day max - ony year
 	NULL,
 };
+//
+// Time step size 60 seconds
+static const char *rrd_def_storage_pve9[] = {
+	"DS:total:GAUGE:120:0:U",
+	"DS:used:GAUGE:120:0:U",
+
+	"RRA:AVERAGE:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:AVERAGE:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:AVERAGE:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:AVERAGE:0.5:10080:570", // 1 week * 570 => ~10 years
+
+	"RRA:MAX:0.5:1:1440", // 1 min * 1440 => 1 day
+	"RRA:MAX:0.5:30:1440", // 30 min * 1440 => 30 day
+	"RRA:MAX:0.5:360:1440", // 6 hours * 1440 => 360 day ~1 year
+	"RRA:MAX:0.5:10080:570", // 1 week * 570 => ~10 years
+	NULL,
+};
 
 #define RRDDIR "/var/lib/rrdcached/db"
 
@@ -1260,35 +1362,70 @@ update_rrd_data(
 	if (!rrd_format_update_buffer) {
 	    rrd_format_update_buffer = (char*)malloc(RRD_FORMAT_BUFFER_SIZE);
 	}
+	static const char* pve9_node_padding = "U:U:U:U:U:U:U:U:U";
+	static const char* pve9_vm_padding = "U:U:U:U:U:U:U";
+
+	const char *padding = NULL;
 
 	int skip = 0;
 	int data_cutoff = 0; // how many columns after initial skip should be a cut-off
 
+	// TODO drop pve2- data handling when not needed anymore
 	if (strncmp(key, "pve2-node/", 10) == 0 ||
 		strncmp(key, "pve9-node/", 10) == 0) {
 		const char *node = key + 10;
 
-		skip = 2;
-
 		if (strchr(node, '/') != NULL)
 			goto keyerror;
 
 		if (strlen(node) < 1)
 			goto keyerror;
 
-		if (strncmp(key, "pve9-node/", 10) == 0) {
-		    data_cutoff = 13;
-		}
+		filename = g_strdup_printf(RRDDIR "/pve9-node/%s", node);
+		char *filename_pve2 = g_strdup_printf(RRDDIR "/pve2-node/%s", node);
 
-		filename = g_strdup_printf(RRDDIR "/pve2-node/%s", node);
+		int use_pve2_file = 0;
 
-		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
-
-			mkdir(RRDDIR "/pve2-node", 0755);
+		// check existing rrd files and directories
+		if (g_file_test(filename, G_FILE_TEST_EXISTS)) {
+		    // new file exists, we use that
+		    // TODO: get conditions so that we do not have this empty one
+		} else if (g_file_test(filename_pve2, G_FILE_TEST_EXISTS)) {
+		    // old file exists, use that
+		    use_pve2_file = 1;
+		    filename = g_strdup_printf("%s", filename_pve2);
+		} else {
+		    // neither file exists, check for directories to decide and create file
+		    char *dir_pve2 = g_strdup_printf(RRDDIR "/pve2-node");
+		    char *dir_pve9 = g_strdup_printf(RRDDIR "/pve9-node");
+
+		    if (g_file_test(dir_pve9,G_FILE_TEST_IS_DIR)) {
+			int argcount = sizeof(rrd_def_node_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_node_pve9);
+		    } else if (g_file_test(dir_pve2, G_FILE_TEST_IS_DIR)) {
+			use_pve2_file = 1;
+			filename = g_strdup_printf("%s", filename_pve2);
 			int argcount = sizeof(rrd_def_node)/sizeof(void*) - 1;
 			create_rrd_file(filename, argcount, rrd_def_node);
+		    } else {
+			// no dir exists yet, use new pve9
+			mkdir(RRDDIR "/pve9-node", 0755);
+			int argcount = sizeof(rrd_def_node_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_node_pve9);
+		    }
+		    g_free(dir_pve2);
+		    g_free(dir_pve9);
+		}
+
+		skip = 2;
+
+		if (strncmp(key, "pve2-node/", 10) == 0 && !use_pve2_file) {
+		    padding = pve9_node_padding;
+		} else if (strncmp(key, "pve9-node/", 10) == 0 && use_pve2_file) {
+		    data_cutoff = 13;
 		}
 
+		g_free(filename_pve2);
 	} else if (strncmp(key, "pve2.3-vm/", 10) == 0 ||
 		strncmp(key, "pve9-vm/", 8) == 0) {
 
@@ -1299,27 +1436,57 @@ update_rrd_data(
 		    vmid = key + 8;
 		}
 
-		skip = 4;
-
 		if (strchr(vmid, '/') != NULL)
 			goto keyerror;
 
 		if (strlen(vmid) < 1)
 			goto keyerror;
 
-		if (strncmp(key, "pve9-vm/", 8) == 0) {
-		    data_cutoff = 11;
-		}
+		filename = g_strdup_printf(RRDDIR "/pve9-vm/%s", vmid);
+		char *filename_pve2 = g_strdup_printf(RRDDIR "/pve2-vm/%s", vmid);
 
-		filename = g_strdup_printf(RRDDIR "/%s/%s", "pve2-vm", vmid);
+		int use_pve2_file = 0;
 
-		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
-
-			mkdir(RRDDIR "/pve2-vm", 0755);
+		// check existing rrd files and directories
+		if (g_file_test(filename, G_FILE_TEST_EXISTS)) {
+		    // new file exists, we use that
+		    // TODO: get conditions so that we do not have this empty one
+		} else if (g_file_test(filename_pve2, G_FILE_TEST_EXISTS)) {
+		    // old file exists, use that
+		    use_pve2_file = 1;
+		    filename = g_strdup_printf("%s", filename_pve2);
+		} else {
+		    // neither file exists, check for directories to decide and create file
+		    char *dir_pve2 = g_strdup_printf(RRDDIR "/pve2-vm");
+		    char *dir_pve9 = g_strdup_printf(RRDDIR "/pve9-vm");
+
+		    if (g_file_test(dir_pve9,G_FILE_TEST_IS_DIR)) {
+			int argcount = sizeof(rrd_def_vm_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_vm_pve9);
+		    } else if (g_file_test(dir_pve2, G_FILE_TEST_IS_DIR)) {
+			use_pve2_file = 1;
+			filename = g_strdup_printf("%s", filename_pve2);
 			int argcount = sizeof(rrd_def_vm)/sizeof(void*) - 1;
 			create_rrd_file(filename, argcount, rrd_def_vm);
+		    } else {
+			// no dir exists yet, use new pve9
+			mkdir(RRDDIR "/pve9-vm", 0755);
+			int argcount = sizeof(rrd_def_vm_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_vm_pve9);
+		    }
+		    g_free(dir_pve2);
+		    g_free(dir_pve9);
 		}
 
+		skip = 4;
+
+		if (strncmp(key, "pve2.3-vm/", 10) == 0 && !use_pve2_file) {
+		    padding = pve9_vm_padding;
+		} else if (strncmp(key, "pve9-vm/", 8) == 0 && use_pve2_file) {
+		    data_cutoff = 11;
+		}
+
+		g_free(filename_pve2);
 	} else if (strncmp(key, "pve2-storage/", 13) == 0 ||
 		strncmp(key, "pve9-storage/", 13) == 0) {
 		const char *node = key + 13;
@@ -1339,20 +1506,43 @@ update_rrd_data(
 		if (strlen(storage) < 1)
 			goto keyerror;
 
-		filename = g_strdup_printf(RRDDIR "/pve2-storage/%s", node);
-
-		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
+		filename = g_strdup_printf(RRDDIR "/pve9-storage/%s", node);
+		char *filename_pve2 = g_strdup_printf(RRDDIR "/pve2-storage/%s", node);
 
-			mkdir(RRDDIR "/pve2-storage", 0755);
+		// check existing rrd files and directories
+		if (g_file_test(filename, G_FILE_TEST_EXISTS)) {
+		    // new file exists, we use that
+		    // TODO: get conditions so that we do not have this empty one
+		} else if (g_file_test(filename_pve2, G_FILE_TEST_EXISTS)) {
+		    // old file exists, use that
+		    filename = g_strdup_printf("%s", filename_pve2);
+		} else {
+		    // neither file exists, check for directories to decide and create file
+		    char *dir_pve2 = g_strdup_printf(RRDDIR "/pve2-storage");
+		    char *dir_pve9 = g_strdup_printf(RRDDIR "/pve9-storage");
+
+		    if (g_file_test(dir_pve9,G_FILE_TEST_IS_DIR)) {
+			int argcount = sizeof(rrd_def_storage_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_storage_pve9);
+		    } else if (g_file_test(dir_pve2, G_FILE_TEST_IS_DIR)) {
+			filename = g_strdup_printf("%s", filename_pve2);
+			int argcount = sizeof(rrd_def_storage)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_storage);
+		    } else {
+			// no dir exists yet, use new pve9
+			mkdir(RRDDIR "/pve9-storage", 0755);
 
 			char *dir = g_path_get_dirname(filename);
 			mkdir(dir, 0755);
 			g_free(dir);
 
-			int argcount = sizeof(rrd_def_storage)/sizeof(void*) - 1;
-			create_rrd_file(filename, argcount, rrd_def_storage);
+			int argcount = sizeof(rrd_def_storage_pve9)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_storage_pve9);
+		    }
+		    g_free(dir_pve2);
+		    g_free(dir_pve9);
 		}
-
+		g_free(filename_pve2);
 	} else {
 		goto keyerror;
 	}
@@ -1363,6 +1553,8 @@ update_rrd_data(
 	    const char *cut = rrd_skip_data(dp, data_cutoff);
 	    const int data_len = cut - dp - 1; // -1 to remove last colon
 	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%.*s", data_len, dp);
+	} else if (padding) {
+	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%s:%s", dp, padding);
 	} else {
 	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%s", dp);
 	}
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


* Re: [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data Aaron Lauterer
@ 2025-05-23 16:35   ` Aaron Lauterer
  2025-06-02 13:31   ` Thomas Lamprecht
  1 sibling, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:35 UTC (permalink / raw)
  To: pve-devel



On  2025-05-23  18:00, Aaron Lauterer wrote:
> For PVE9 there will be additional fields in the metrics that are
> collected. The new columns/fields are added at the end of the current
> ones. Therefore, if we get the new format, we need to cut it.
> 
> Paths to rrd filenames needed to be set manually to 'pve2-...' and will
> use the 'node' part instead of the full key, as that could also be
> 'pve9-...' which does not exists.

Since it pops up for the first time in the series here: I currently 
chose 'pve9-' as prefix for the metric keys, following what we used so 
far AFAICT -> PVE version when it was introduced.
But we could also think about changing it to something like 
'pve-{node,storage,vm}-{version}', as that could make it easier to change 
the code to also handle other new and currently unknown formats in the 
future if we always only append new columns/fields.

But I am not exactly sure how we should do the versioning, because the 
current approach in status.c is to strncmp a fixed length of the 
full key, and that would be problematic if we use the following examples:

pve-vm-9
pve-vm-10

When do we check for 8 or 9 character long strings? There might be a 
nice way to do this, as in, checking until we reach the separating '/' 
(see the sketch below).

2 digits with leading 0 could be one approach. But if we also add minor 
PVE versions, well that makes it more complicated.
Or we could switch to it being just an integer that will be increased 
when we add more data.

Just to throw out some ideas :)
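
A rough sketch of that last idea (only an illustration, not part of the 
posted patches; the helper name rrd_key_parse_version and the exact 
'pve-{type}-{version}/...' layout are assumptions here): scan up to the 
separating '/' instead of strncmp'ing a fixed-length prefix:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Split a key like "pve-vm-10/105" into type ("vm"), version (10) and
// the remainder ("105"). Returns 0 on success, -1 on a malformed key.
static int
rrd_key_parse_version(const char *key, char *type, size_t type_len,
                      int *version, const char **rest)
{
    if (strncmp(key, "pve-", 4) != 0)
        return -1;

    const char *slash = strchr(key, '/');
    if (!slash)
        return -1;

    const char *dash = NULL;
    for (const char *p = key + 4; p < slash; p++) {
        if (*p == '-')
            dash = p; // remember the last '-' before the '/'
    }
    if (!dash || dash + 1 >= slash)
        return -1;

    size_t len = (size_t)(dash - (key + 4));
    if (len == 0 || len >= type_len)
        return -1;
    memcpy(type, key + 4, len);
    type[len] = '\0';

    char *end = NULL;
    long v = strtol(dash + 1, &end, 10);
    if (end != slash || v < 0)
        return -1;

    *version = (int)v;
    *rest = slash + 1;
    return 0;
}

int main(void)
{
    char type[16];
    int version;
    const char *rest;
    // both "pve-vm-9/100" and "pve-vm-10/100" parse fine without caring
    // about the total prefix length
    if (rrd_key_parse_version("pve-vm-10/100", type, sizeof(type), &version, &rest) == 0)
        printf("type=%s version=%d rest=%s\n", type, version, rest);
    return 0;
}

That way the length of the version (or a future minor version) would not 
matter for the key matching itself.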


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH common 1/4] fix error in pressure parsing
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
                   ` (3 preceding siblings ...)
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster 1/1] status: introduce new pve9- rrd and metric format Aaron Lauterer
@ 2025-05-23 16:37 ` Aaron Lauterer
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
  2025-05-26 11:52 ` [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data DERUMIER, Alexandre via pve-devel
  6 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

From: Folke Gleumes <f.gleumes@proxmox.com>

Originally-by: Folke Gleumes <f.gleumes@proxmox.com>
[AL: rebased]
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/ProcFSTools.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
index bdddde9..f9fe3f0 100644
--- a/src/PVE/ProcFSTools.pm
+++ b/src/PVE/ProcFSTools.pm
@@ -144,7 +144,7 @@ sub parse_pressure {
 	    $res->{$1}->{avg10} = $2;
 	    $res->{$1}->{avg60} = $3;
 	    $res->{$1}->{avg300} = $4;
-	    $res->{$1}->{total} = $4;
+	    $res->{$1}->{total} = $5;
 	}
     }
     $fh->close;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
                   ` (4 preceding siblings ...)
  2025-05-23 16:37 ` [pve-devel] [PATCH common 1/4] fix error in pressure parsing Aaron Lauterer
@ 2025-05-23 16:37 ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid Aaron Lauterer
                     ` (13 more replies)
  2025-05-26 11:52 ` [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data DERUMIER, Alexandre via pve-devel
  6 siblings, 14 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

From: Folke Gleumes <f.gleumes@proxmox.com>

Originally-by: Folke Gleumes <f.gleumes@proxmox.com>
[AL: rebased on current master]
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/ProcFSTools.pm | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
index f9fe3f0..382e6c5 100644
--- a/src/PVE/ProcFSTools.pm
+++ b/src/PVE/ProcFSTools.pm
@@ -151,6 +151,28 @@ sub parse_pressure {
     return $res;
 }
 
+sub read_qemu_pressure {
+    my ($vmid) = @_;
+
+    my $res = {};
+    foreach my $type (qw(cpu memory io)) {
+	my $stats = parse_pressure("/sys/fs/cgroup/qemu.slice/$vmid.scope/$type.pressure");
+	$res->{$type} = $stats if $stats;
+    }
+    return $res;
+}
+
+sub read_lxc_pressure {
+    my ($vmid) = @_;
+
+    my $res = {};
+    foreach my $type (qw(cpu memory io)) {
+	my $stats = parse_pressure("/sys/fs/cgroup/lxc/$vmid/$type.pressure");
+	$res->{$type} = $stats if $stats;
+    }
+    return $res;
+}
+
 sub read_pressure {
     my $res = {};
     foreach my $type (qw(cpu memory io)) {
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-06-02 14:11     ` Thomas Lamprecht
  2025-05-23 16:37   ` [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo Aaron Lauterer
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/ProcFSTools.pm | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
index 382e6c5..185b2b3 100644
--- a/src/PVE/ProcFSTools.pm
+++ b/src/PVE/ProcFSTools.pm
@@ -344,6 +344,20 @@ sub read_meminfo {
     return $res;
 }
 
+# extract memory data from /proc/$pid/smaps_rollup for PID. $value is e.g. "Pss".
+sub read_smaps_rollup {
+    my ($pid, $value) = @_;
+
+    my $res = 0;
+    return $res if !$pid || !$value;
+
+    my $mem_stats = eval { PVE::Tools::file_get_contents("/proc/${pid}/smaps_rollup") };
+    if ($mem_stats && $mem_stats =~ m/^${value}:\s+(\d+)\s*kB$/m) {
+	$res = $1 * 1024;
+    }
+    return $res;
+}
+
 # memory usage of current process
 sub read_memory_usage {
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-06-02 14:07     ` Thomas Lamprecht
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 1/5] api2tools: drop old VM rrd schema Aaron Lauterer
                     ` (11 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

From: Folke Gleumes <f.gleumes@proxmox.com>

Expose buffers and cache as separate metrics instead of including them
in memfree and memused.

Originally-by: Folke Gleumes <f.gleumes@proxmox.com>
[AL: rebased and adapted to changes that happened in the meantime]
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/ProcFSTools.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
index 185b2b3..91a69be 100644
--- a/src/PVE/ProcFSTools.pm
+++ b/src/PVE/ProcFSTools.pm
@@ -303,6 +303,8 @@ sub read_meminfo {
 	memfree => 0,
 	memavailable => 0,
 	memused => 0,
+	membuffers => 0,
+	memcached => 0,
 	memshared => 0,
 	swaptotal => 0,
 	swapfree => 0,
@@ -328,6 +330,8 @@ sub read_meminfo {
     # available for a new workload, without pushing the system into swap, no amount of calculating
     # with BUFFER, CACHE, .. will get you there, only the kernel can know this.
     $res->{memused} = $res->{memtotal} - $d->{memavailable};
+    $res->{membuffers} = $d->{buffers};
+    $res->{memcached} = $d->{cached};
 
     $res->{swaptotal} = $d->{swaptotal};
     $res->{swapfree} = $d->{swapfree};
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH manager 1/5] api2tools: drop old VM rrd schema
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 2/5] pvestatd: collect and distribute new pve9- metrics Aaron Lauterer
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

pve2.3-vm was introduced with commit 3b6ad3ac back in 2013. By now
there should not be any combination of clustered nodes that still sends
the old pve2-vm variant.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/API2Tools.pm | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/PVE/API2Tools.pm b/PVE/API2Tools.pm
index a56eb732..37829983 100644
--- a/PVE/API2Tools.pm
+++ b/PVE/API2Tools.pm
@@ -90,23 +90,7 @@ sub extract_vm_stats {
 
     my $d;
 
-    if ($d = $rrd->{"pve2-vm/$vmid"}) {
-
-	$entry->{uptime} = ($d->[0] || 0) + 0;
-	$entry->{name} = $d->[1];
-	$entry->{status} = $entry->{uptime} ? 'running' : 'stopped';
-	$entry->{maxcpu} = ($d->[3] || 0) + 0;
-	$entry->{cpu} = ($d->[4] || 0) + 0;
-	$entry->{maxmem} = ($d->[5] || 0) + 0;
-	$entry->{mem} = ($d->[6] || 0) + 0;
-	$entry->{maxdisk} = ($d->[7] || 0) + 0;
-	$entry->{disk} = ($d->[8] || 0) + 0;
-	$entry->{netin} = ($d->[9] || 0) + 0;
-	$entry->{netout} = ($d->[10] || 0) + 0;
-	$entry->{diskread} = ($d->[11] || 0) + 0;
-	$entry->{diskwrite} = ($d->[12] || 0) + 0;
-
-    } elsif ($d = $rrd->{"pve2.3-vm/$vmid"}) {
+    if ($d = $rrd->{"pve2.3-vm/$vmid"}) {
 
 	$entry->{uptime} = ($d->[0] || 0) + 0;
 	$entry->{name} = $d->[1];
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH manager 2/5] pvestatd: collect and distribute new pve9- metrics
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (2 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 1/5] api2tools: drop old VM rrd schema Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 3/5] api: nodes: rrd and rrddata fetch from new pve9-node rrd files if present Aaron Lauterer
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

If we see that the migration to the new pve9- rrd format has been done
or is ongoing (new dir exists), we collect and send out the new format with additional
columns for nodes and VMs (guests).

Those are:
Nodes:
* memfree
* membuffers
* memcached
* arcsize
* pressures:
  * cpu some
  * io some
  * io full
  * mem some
  * mem full

VMs:
* memhost (memory consumption of all processes in the guest's cgroup, host view)
* pressures:
  * cpu some
  * cpu full
  * io some
  * io full
  * mem some
  * mem full

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---

Notes:
    this will automatically send the additional columns to the metric
    servers as well. Not sure if that is okay or if it could be a problem
    that we need to address or at least mention in the release notes.

 PVE/Service/pvestatd.pm | 128 +++++++++++++++++++++++++++++++---------
 1 file changed, 100 insertions(+), 28 deletions(-)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index d80c62da..eb4352fa 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -82,6 +82,16 @@ my $cached_kvm_version = '';
 my $next_flag_update_time;
 my $failed_flag_update_delay_sec = 120;
 
+# Checks if RRD files exist in the specified location.
+my $rrd_dir_exists = sub{
+    my ($location) = @_;
+    if (-d "/var/lib/rrdcached/db/${location}") {
+	return 1;
+    } else {
+	return 0;
+    }
+};
+
 sub update_supported_cpuflags {
     my $kvm_version = PVE::QemuServer::kvm_user_version();
 
@@ -173,19 +183,38 @@ sub update_node_status {
 
     my $meminfo = PVE::ProcFSTools::read_meminfo();
 
+    my $pressures = PVE::ProcFSTools::read_pressure();
+
     my $dinfo = df('/', 1);     # output is bytes
     # everything not free is considered to be used
     my $dused = $dinfo->{blocks} - $dinfo->{bfree};
 
     my $ctime = time();
 
-    my $data = $generate_rrd_string->(
-	[$uptime, $sublevel, $ctime, $avg1, $maxcpu, $stat->{cpu}, $stat->{wait},
-	 $meminfo->{memtotal}, $meminfo->{memused},
-	 $meminfo->{swaptotal}, $meminfo->{swapused},
-	 $dinfo->{blocks}, $dused, $netin, $netout]
-    );
-    PVE::Cluster::broadcast_rrd("pve2-node/$nodename", $data);
+    my $data;
+    # TODO: switch fully to pve9-node
+    if ($rrd_dir_exists->("pve9-node")) {
+	$data = $generate_rrd_string->(
+	    [$uptime, $sublevel, $ctime,  $avg1, $maxcpu, $stat->{cpu},
+		$stat->{wait}, $meminfo->{memtotal}, $meminfo->{memused},
+		$meminfo->{swaptotal}, $meminfo->{swapused}, $dinfo->{blocks},
+		$dused, $netin, $netout, $meminfo->{memavailable},
+		$meminfo->{buffers}, $meminfo->{cached}, $meminfo->{arcsize},
+		$pressures->{cpu}{some}{avg10}, $pressures->{io}{some}{avg10},
+		$pressures->{io}{full}{avg10},
+		$pressures->{memory}{some}{avg10},
+		$pressures->{memory}{full}{avg10}]
+	);
+	PVE::Cluster::broadcast_rrd("pve9-node/$nodename", $data);
+    } else {
+	$data = $generate_rrd_string->(
+	    [$uptime, $sublevel, $ctime, $avg1, $maxcpu, $stat->{cpu}, $stat->{wait},
+	     $meminfo->{memtotal}, $meminfo->{memused},
+	     $meminfo->{swaptotal}, $meminfo->{swapused},
+	     $dinfo->{blocks}, $dused, $netin, $netout]
+	);
+	PVE::Cluster::broadcast_rrd("pve2-node/$nodename", $data);
+    }
 
     my $node_metric = {
 	uptime => $uptime,
@@ -252,17 +281,39 @@ sub update_qemu_status {
 	my $data;
 	my $status = $d->{qmpstatus} || $d->{status} || 'stopped';
 	my $template = $d->{template} ? $d->{template} : "0";
-	if ($d->{pid}) { # running
-	    $data = $generate_rrd_string->(
-		[$d->{uptime}, $d->{name}, $status, $template, $ctime, $d->{cpus}, $d->{cpu},
-		 $d->{maxmem}, $d->{mem}, $d->{maxdisk}, $d->{disk},
-		 $d->{netin}, $d->{netout}, $d->{diskread}, $d->{diskwrite}]);
+
+	# TODO: switch fully to pve9-vm
+	if ($rrd_dir_exists->("pve9-vm")) {
+	    if ($d->{pid}) { # running
+		$data = $generate_rrd_string->(
+		    [$d->{uptime}, $d->{name}, $status, $template, $ctime,
+			$d->{cpus}, $d->{cpu}, $d->{maxmem}, $d->{mem},
+			$d->{maxdisk}, $d->{disk}, $d->{netin}, $d->{netout},
+			$d->{diskread}, $d->{diskwrite},$d->{memhost},
+			$d->{pressurecpusome}, $d->{pressurecpufull},
+			$d->{pressureiosome}, $d->{pressureiofull},
+			$d->{pressurememorysome}, $d->{pressurememoryfull}]);
+	    } else {
+		$data = $generate_rrd_string->(
+		    [0, $d->{name}, $status, $template, $ctime, $d->{cpus},
+			undef, $d->{maxmem}, undef, $d->{maxdisk}, $d->{disk},
+			undef, undef, undef, undef,undef, undef, undef, undef,
+			undef, undef, undef]);
+	    }
+	    PVE::Cluster::broadcast_rrd("pve9-vm/$vmid", $data);
 	} else {
-	    $data = $generate_rrd_string->(
-		[0, $d->{name}, $status, $template, $ctime, $d->{cpus}, undef,
-		 $d->{maxmem}, undef, $d->{maxdisk}, $d->{disk}, undef, undef, undef, undef]);
+	    if ($d->{pid}) { # running
+		$data = $generate_rrd_string->(
+		    [$d->{uptime}, $d->{name}, $status, $template, $ctime, $d->{cpus}, $d->{cpu},
+		     $d->{maxmem}, $d->{mem}, $d->{maxdisk}, $d->{disk},
+		     $d->{netin}, $d->{netout}, $d->{diskread}, $d->{diskwrite}]);
+	    } else {
+		$data = $generate_rrd_string->(
+		    [0, $d->{name}, $status, $template, $ctime, $d->{cpus}, undef,
+		     $d->{maxmem}, undef, $d->{maxdisk}, $d->{disk}, undef, undef, undef, undef]);
+	    }
+	    PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 	}
-	PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 
 	PVE::ExtMetric::update_all($transactions, 'qemu', $vmid, $d, $ctime, $nodename);
     }
@@ -460,20 +511,40 @@ sub update_lxc_status {
 	my $d = $vmstatus->{$vmid};
 	my $template = $d->{template} ? $d->{template} : "0";
 	my $data;
-	if ($d->{status} eq 'running') { # running
-	    $data = $generate_rrd_string->(
-		[$d->{uptime}, $d->{name}, $d->{status}, $template,
-		 $ctime, $d->{cpus}, $d->{cpu},
-		 $d->{maxmem}, $d->{mem},
-		 $d->{maxdisk}, $d->{disk},
-		 $d->{netin}, $d->{netout},
-		 $d->{diskread}, $d->{diskwrite}]);
+	if ($rrd_dir_exists->("pve9-vm")) {
+	    if ($d->{pid}) { # running
+		$data = $generate_rrd_string->(
+		    [$d->{uptime}, $d->{name}, $d->{status}, $template, $ctime,
+			$d->{cpus}, $d->{cpu}, $d->{maxmem}, $d->{mem},
+			$d->{maxdisk}, $d->{disk}, $d->{netin}, $d->{netout},
+			$d->{diskread}, $d->{diskwrite}, undef,
+			$d->{pressurecpusome}, $d->{pressurecpufull},
+			$d->{pressureiosome}, $d->{pressureiofull},
+			$d->{pressurememorysome}, $d->{pressurememoryfull}]);
+	    } else {
+		$data = $generate_rrd_string->(
+		    [0, $d->{name}, $d->{status}, $template, $ctime,
+			$d->{cpus}, undef, $d->{maxmem}, undef, $d->{maxdisk},
+			$d->{disk}, undef, undef, undef, undef, undef, undef,
+			undef, undef, undef, undef, undef]);
+	    }
+	    PVE::Cluster::broadcast_rrd("pve9-vm/$vmid", $data);
 	} else {
-	    $data = $generate_rrd_string->(
-		[0, $d->{name}, $d->{status}, $template, $ctime, $d->{cpus}, undef,
-		 $d->{maxmem}, undef, $d->{maxdisk}, $d->{disk}, undef, undef, undef, undef]);
+	    if ($d->{status} eq 'running') { # running
+		$data = $generate_rrd_string->(
+		    [$d->{uptime}, $d->{name}, $d->{status}, $template,
+		     $ctime, $d->{cpus}, $d->{cpu},
+		     $d->{maxmem}, $d->{mem},
+		     $d->{maxdisk}, $d->{disk},
+		     $d->{netin}, $d->{netout},
+		     $d->{diskread}, $d->{diskwrite}]);
+	    } else {
+		$data = $generate_rrd_string->(
+		    [0, $d->{name}, $d->{status}, $template, $ctime, $d->{cpus}, undef,
+		     $d->{maxmem}, undef, $d->{maxdisk}, $d->{disk}, undef, undef, undef, undef]);
+	    }
+	    PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 	}
-	PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 
 	PVE::ExtMetric::update_all($transactions, 'lxc', $vmid, $d, $ctime, $nodename);
     }
@@ -498,6 +569,7 @@ sub update_storage_status {
 	my $data = $generate_rrd_string->([$ctime, $d->{total}, $d->{used}]);
 
 	my $key = "pve2-storage/${nodename}/$storeid";
+	$key = "pve9-storage/${nodename}/$storeid" if $rrd_dir_exists->("pve9-storage");
 	PVE::Cluster::broadcast_rrd($key, $data);
 
 	PVE::ExtMetric::update_all($transactions, 'storage', $nodename, $storeid, $d, $ctime);
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH manager 3/5] api: nodes: rrd and rrddata fetch from new pve9-node rrd files if present
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (3 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 2/5] pvestatd: collect and distribute new pve9- metrics Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 4/5] api2tools: extract stats: handle existence of new pve9- data Aaron Lauterer
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

If the new pve9-node rrd files are present, they contain the current
data and should be used.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/API2/Nodes.pm | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
index 791d2dec..dd12c1ce 100644
--- a/PVE/API2/Nodes.pm
+++ b/PVE/API2/Nodes.pm
@@ -841,8 +841,10 @@ __PACKAGE__->register_method({
     code => sub {
 	my ($param) = @_;
 
+	my $path = "pve9-node/$param->{node}";
+	$path = "pve2-node/$param->{node}" if !-e $path;
 	return PVE::RRD::create_rrd_graph(
-	    "pve2-node/$param->{node}", $param->{timeframe},
+	    $path, $param->{timeframe},
 	    $param->{ds}, $param->{cf});
 
     }});
@@ -883,8 +885,10 @@ __PACKAGE__->register_method({
     code => sub {
 	my ($param) = @_;
 
+	my $path = "pve9-node/$param->{node}";
+	$path = "pve2-node/$param->{node}" if !-e "/var/lib/rrdcached/db/${path}";
 	return PVE::RRD::create_rrd_data(
-	    "pve2-node/$param->{node}", $param->{timeframe}, $param->{cf});
+	    $path, $param->{timeframe}, $param->{cf});
     }});
 
 __PACKAGE__->register_method({
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH manager 4/5] api2tools: extract stats: handle existence of new pve9- data
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (4 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 3/5] api: nodes: rrd and rrddata fetch from new pve9-node rrd files if present Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 5/5] ui: rrdmodels: add new columns Aaron Lauterer
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

and fall back to old pve2-{node,storage} or pve2.3-vm if not present.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/API2Tools.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/PVE/API2Tools.pm b/PVE/API2Tools.pm
index 37829983..84ba0bf4 100644
--- a/PVE/API2Tools.pm
+++ b/PVE/API2Tools.pm
@@ -49,8 +49,7 @@ sub extract_node_stats {
 	status => 'unknown',
     };
 
-    if (my $d = $rrd->{"pve2-node/$node"}) {
-
+    if (my $d = $rrd->{"pve9-node/$node"} // $rrd->{"pve2-node/$node"}) {
 	if (!$members || # no cluster
 	    ($members->{$node} && $members->{$node}->{online})) {
 	    if (!$exclude_stats) {
@@ -90,7 +89,7 @@ sub extract_vm_stats {
 
     my $d;
 
-    if ($d = $rrd->{"pve2.3-vm/$vmid"}) {
+    if ($d = $rrd->{"pve9-vm/$vmid"} // $rrd->{"pve2.3-vm/$vmid"}) {
 
 	$entry->{uptime} = ($d->[0] || 0) + 0;
 	$entry->{name} = $d->[1];
@@ -128,7 +127,8 @@ sub extract_storage_stats {
 	content => $content,
     };
 
-    if (my $d = $rrd->{"pve2-storage/$node/$storeid"}) {
+    if (my $d = $rrd->{"pve9-storage/$node/$storeid"}
+	// $rrd->{"pve2-storage/$node/$storeid"}) {
 	$entry->{maxdisk} = ($d->[1] || 0) + 0;
 	$entry->{disk} = ($d->[2] || 0) + 0;
 	$entry->{status} = 'available';
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH manager 5/5] ui: rrdmodels: add new columns
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (5 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 4/5] api2tools: extract stats: handle existence of new pve9- data Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH storage 1/1] status: rrddata: use new pve9 rrd location if file is present Aaron Lauterer
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 www/manager6/data/model/RRDModels.js | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/www/manager6/data/model/RRDModels.js b/www/manager6/data/model/RRDModels.js
index 2240dcab..5504d0af 100644
--- a/www/manager6/data/model/RRDModels.js
+++ b/www/manager6/data/model/RRDModels.js
@@ -25,6 +25,15 @@ Ext.define('pve-rrd-node', {
 	'rootused',
 	'swaptotal',
 	'swapused',
+	'memfree',
+	'membuffer',
+	'memcached',
+	'arcsize',
+	'pressurecpusome',
+	'pressureiosome',
+	'pressureiofull',
+	'pressurememorysome',
+	'pressurememoryfull',
 	{ type: 'date', dateFormat: 'timestamp', name: 'time' },
     ],
 });
@@ -48,6 +57,13 @@ Ext.define('pve-rrd-guest', {
 	'maxdisk',
 	'diskread',
 	'diskwrite',
+	'memhost',
+	'pressurecpusome',
+	'pressurecpufull',
+	'pressureiosome',
+	'pressureiofull',
+	'pressurememorysome',
+	'pressurememoryfull',
 	{ type: 'date', dateFormat: 'timestamp', name: 'time' },
     ],
 });
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH storage 1/1] status: rrddata: use new pve9 rrd location if file is present
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (6 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH manager 5/5] ui: rrdmodels: add new columns Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 1/4] metrics: add pressure to metrics Aaron Lauterer
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/API2/Storage/Status.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm
index 14915ae..11074bc 100644
--- a/src/PVE/API2/Storage/Status.pm
+++ b/src/PVE/API2/Storage/Status.pm
@@ -376,9 +376,9 @@ __PACKAGE__->register_method ({
     code => sub {
 	my ($param) = @_;
 
-	return PVE::RRD::create_rrd_data(
-	    "pve2-storage/$param->{node}/$param->{storage}",
-	    $param->{timeframe}, $param->{cf});
+	my $path = "pve9-storage/$param->{node}/$param->{storage}";
+	$path = "pve2-storage/$param->{node}/$param->{storage}" if !-e "/var/lib/rrdcached/db/${path}";
+	return PVE::RRD::create_rrd_data($path, $param->{timeframe}, $param->{cf});
     }});
 
 # makes no sense for big images and backup files (because it
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH qemu-server 1/4] metrics: add pressure to metrics
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (7 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH storage 1/1] status: rrddata: use new pve9 rrd location if file is present Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 2/4] vmstatus: add memhost for host view of vm mem consumption Aaron Lauterer
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

From: Folke Gleumes <f.gleumes@proxmox.com>

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/QemuServer.pm | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 577959a..5f36772 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2943,6 +2943,14 @@ sub vmstatus {
 	} else {
 	    $d->{cpu} = $old->{cpu};
 	}
+
+	my $pressures = PVE::ProcFSTools::read_qemu_pressure($vmid);
+	$d->{pressurecpusome} = $pressures->{cpu}{some}{avg10};
+	$d->{pressurecpufull} = $pressures->{cpu}{full}{avg10};
+	$d->{pressureiosome} = $pressures->{io}{some}{avg10};
+	$d->{pressureiofull} = $pressures->{io}{full}{avg10};
+	$d->{pressurememorysome} = $pressures->{memory}{some}{avg10};
+	$d->{pressurememoryfull} = $pressures->{memory}{full}{avg10};
     }
 
     return $res if !$full;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH qemu-server 2/4] vmstatus: add memhost for host view of vm mem consumption
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (8 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 1/4] metrics: add pressure to metrics Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 3/4] vmstatus: switch mem stat to PSS of VM cgroup Aaron Lauterer
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

The mem field itself will switch from the outside view to the "inside"
view if the VM is reporting detailed memory usage information via the
ballooning device.

Since sometimes other processes belong to a VM too, for example swtpm,
we collect all PIDs belonging to the VM cgroup and fetch their PSS data
to account for shared libraries used.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/QemuServer.pm | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 5f36772..c5eb5c1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2867,6 +2867,7 @@ sub vmstatus {
 	$d->{uptime} = 0;
 	$d->{cpu} = 0;
 	$d->{mem} = 0;
+	$d->{memhost} = 0;
 
 	$d->{netout} = 0;
 	$d->{netin} = 0;
@@ -2951,6 +2952,14 @@ sub vmstatus {
 	$d->{pressureiofull} = $pressures->{io}{full}{avg10};
 	$d->{pressurememorysome} = $pressures->{memory}{some}{avg10};
 	$d->{pressurememoryfull} = $pressures->{memory}{full}{avg10};
+
+	my $fh = IO::File->new ("/sys/fs/cgroup/qemu.slice/${vmid}.scope/cgroup.procs", "r");
+	if ($fh) {
+	    while (my $childPid = <$fh>) {
+		$d->{memhost} = $d->{memhost} + PVE::ProcFSTools::read_smaps_rollup($childPid, "Pss");
+	    }
+	}
+	close($fh);
     }
 
     return $res if !$full;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH qemu-server 3/4] vmstatus: switch mem stat to PSS of VM cgroup
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (9 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 2/4] vmstatus: add memhost for host view of vm mem consumption Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 4/4] rrddata: use new pve9 rrd location if file is present Aaron Lauterer
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Instead of RSS, let's use the same PSS values as for the specific host
view by default, in case this value is not overwritten by the balloon
info.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/QemuServer.pm | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index c5eb5c1..74850b7 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2916,10 +2916,6 @@ sub vmstatus {
 
 	$d->{uptime} = int(($uptime - $pstat->{starttime})/$cpuinfo->{user_hz});
 
-	if ($pstat->{vsize}) {
-	    $d->{mem} = int(($pstat->{rss}/$pstat->{vsize})*$d->{maxmem});
-	}
-
 	my $old = $last_proc_pid_stat->{$pid};
 	if (!$old) {
 	    $last_proc_pid_stat->{$pid} = {
@@ -2960,6 +2956,8 @@ sub vmstatus {
 	    }
 	}
 	close($fh);
+
+	$d->{mem} = $d->{memhost};
     }
 
     return $res if !$full;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH qemu-server 4/4] rrddata: use new pve9 rrd location if file is present
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (10 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 3/4] vmstatus: switch mem stat to PSS of VM cgroup Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-05-23 16:37   ` [pve-devel] [PATCH container 1/1] " Aaron Lauterer
  2025-06-02 14:39   ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Thomas Lamprecht
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 PVE/API2/Qemu.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 626cce4..233e69d 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -1572,8 +1572,10 @@ __PACKAGE__->register_method({
     code => sub {
 	my ($param) = @_;
 
+	my $path = "pve9-vm/$param->{vmid}";
+	$path = "pve2-vm/$param->{vmid}" if !-e "/var/lib/rrdcached/db/${path}";
 	return PVE::RRD::create_rrd_data(
-	    "pve2-vm/$param->{vmid}", $param->{timeframe}, $param->{cf});
+	    $path, $param->{timeframe}, $param->{cf});
     }});
 
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [pve-devel] [PATCH container 1/1] rrddata: use new pve9 rrd location if file is present
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (11 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 4/4] rrddata: use new pve9 rrd location if file is present Aaron Lauterer
@ 2025-05-23 16:37   ` Aaron Lauterer
  2025-06-02 14:39   ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Thomas Lamprecht
  13 siblings, 0 replies; 26+ messages in thread
From: Aaron Lauterer @ 2025-05-23 16:37 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
 src/PVE/API2/LXC.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 5c6ee57..65a5d4a 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -694,8 +694,10 @@ __PACKAGE__->register_method({
     code => sub {
 	my ($param) = @_;
 
+	my $path = "pve9-vm/$param->{vmid}";
+	$path = "pve2-vm/$param->{vmid}" if !-e "/var/lib/rrdcached/db/${path}";
 	return PVE::RRD::create_rrd_data(
-	    "pve2-vm/$param->{vmid}", $param->{timeframe}, $param->{cf});
+	    $path, $param->{timeframe}, $param->{cf});
     }});
 
 __PACKAGE__->register_method({
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data
  2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
                   ` (5 preceding siblings ...)
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
@ 2025-05-26 11:52 ` DERUMIER, Alexandre via pve-devel
  6 siblings, 0 replies; 26+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-05-26 11:52 UTC (permalink / raw)
  To: pve-devel; +Cc: DERUMIER, Alexandre

[-- Attachment #1: Type: message/rfc822, Size: 19800 bytes --]

From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data
Date: Mon, 26 May 2025 11:52:51 +0000
Message-ID: <3001b36bce93e12a826727156f47944aaa4c8e30.camel@groupe-cyllene.com>

Thanks Aaron for this work,

pressure is really something useful (more than the classic load
average); it could be used to evict/migrate a VM from a node when the
pressure is too high.

I was the original author of read_pressure/parse_pressure, but I never
finished the rrd integration, so many thanks!




[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data
  2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data Aaron Lauterer
  2025-05-23 16:35   ` Aaron Lauterer
@ 2025-06-02 13:31   ` Thomas Lamprecht
  1 sibling, 0 replies; 26+ messages in thread
From: Thomas Lamprecht @ 2025-06-02 13:31 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

On 2025-05-23 18:00, Aaron Lauterer wrote:
> For PVE9 there will be additional fields in the metrics that are
> collected. The new columns/fields are added at the end of the current
> ones. Therefore, if we get the new format, we need to cut it.
> 
> Paths to rrd filenames needed to be set manually to 'pve2-...' and will
> use the 'node' part instead of the full key, as that could also be
> 'pve9-...' which does not exists.

Any implications of using the node part, or is it fine here and results in
what is used now anyway? More rationale would definitely be helpful.

I did not check out the whole series closely, so just a few things I noticed
from a quick look.

> 
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
>  src/pmxcfs/status.c | 51 ++++++++++++++++++++++++++++++++++++++-------
>  src/pmxcfs/status.h |  2 ++
>  2 files changed, 46 insertions(+), 7 deletions(-)
> 
> diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
> index 77a18d8..3fdb179 100644
> --- a/src/pmxcfs/status.c
> +++ b/src/pmxcfs/status.c
> @@ -1236,6 +1236,8 @@ rrd_skip_data(
>  	return data;
>  }
>  
> +static char* rrd_format_update_buffer = NULL;
> +
>  static void
>  update_rrd_data(
>  	const char *key,
> @@ -1255,9 +1257,15 @@ update_rrd_data(
>  
>  	char *filename = NULL;
>  
> +	if (!rrd_format_update_buffer) {
> +	    rrd_format_update_buffer = (char*)malloc(RRD_FORMAT_BUFFER_SIZE);


pmxcfs uses glib (which is not to be confused with glibc, the GNU std lib), and
while I'm not a big fan of that, it still makes sense to keep it consistent until it
gets ripped out (or rewritten in rust, ...), so please use "g_malloc0" here.

If this is a fixed buffer that is frequently used, it could also be allocated statically
there as an array.
And btw. this pointer is only safe to share because the function is called inside the
local rrdentry_hash_set, which in turn is called inside the public cfs_status_set, which
takes the global mutex; otherwise this would be rather dangerous. So noting that it is
and must be protected by that mutex would always be good for such things.

But actually, I do not think you need that buffer at all, see further below, where you
"cut it off".

> +	}
> +
>  	int skip = 0;
> +	int data_cutoff = 0; // how many columns after initial skip should be a cut-off
>  
> -	if (strncmp(key, "pve2-node/", 10) == 0) {
> +	if (strncmp(key, "pve2-node/", 10) == 0 ||
> +		strncmp(key, "pve9-node/", 10) == 0) {

please move the `) {` to a new line to make the code in the block stand out more
from the one in the if condition, and thus more readable (I know the style in
pmxcfs is not great as is, but new code can do slightly better)

I.e. something like:

if (
    strncmp(key, "pve2-node/", 10) == 0
    || strncmp(key, "pve9-node/", 10) == 0
) {
    ....

>  		const char *node = key + 10;
>  
>  		skip = 2;
> @@ -1268,7 +1276,11 @@ update_rrd_data(
>  		if (strlen(node) < 1)
>  			goto keyerror;
>  
> -		filename = g_strdup_printf(RRDDIR "/%s", key);
> +		if (strncmp(key, "pve9-node/", 10) == 0) {
> +		    data_cutoff = 13;

Do not just use seemingly random integers without a comment with a short
rationale.

> +		}
> +
> +		filename = g_strdup_printf(RRDDIR "/pve2-node/%s", node);
>  
>  		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
>  
> @@ -1277,8 +1289,15 @@ update_rrd_data(
>  			create_rrd_file(filename, argcount, rrd_def_node);
>  		}
>  
> -	} else if (strncmp(key, "pve2.3-vm/", 10) == 0) {
> -		const char *vmid = key + 10;
> +	} else if (strncmp(key, "pve2.3-vm/", 10) == 0 ||
> +		strncmp(key, "pve9-vm/", 8) == 0) {

use 9.0 and you avoid the need for the differentiation below

> +
> +		const char *vmid;
> +		if (strncmp(key, "pve2.3-vm/", 10) == 0) {
> +		    vmid = key + 10;
> +		} else {
> +		    vmid = key + 8;
> +		}
>  
>  		skip = 4;
>  
> @@ -1288,6 +1307,10 @@ update_rrd_data(
>  		if (strlen(vmid) < 1)
>  			goto keyerror;
>  
> +		if (strncmp(key, "pve9-vm/", 8) == 0) {
> +		    data_cutoff = 11;
> +		}
> +
>  		filename = g_strdup_printf(RRDDIR "/%s/%s", "pve2-vm", vmid);
>  
>  		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
> @@ -1297,7 +1320,8 @@ update_rrd_data(
>  			create_rrd_file(filename, argcount, rrd_def_vm);
>  		}
>  
> -	} else if (strncmp(key, "pve2-storage/", 13) == 0) {
> +	} else if (strncmp(key, "pve2-storage/", 13) == 0 ||
> +		strncmp(key, "pve9-storage/", 13) == 0) {
>  		const char *node = key + 13;
>  
>  		const char *storage = node;
> @@ -1315,7 +1339,7 @@ update_rrd_data(
>  		if (strlen(storage) < 1)
>  			goto keyerror;
>  
> -		filename = g_strdup_printf(RRDDIR "/%s", key);
> +		filename = g_strdup_printf(RRDDIR "/pve2-storage/%s", node);
>  
>  		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
>  
> @@ -1335,7 +1359,20 @@ update_rrd_data(
>  
>  	const char *dp = skip ? rrd_skip_data(data, skip) : data;
>  
> -	const char *update_args[] = { dp, NULL };
> +	if (data_cutoff) {
> +	    const char *cut = rrd_skip_data(dp, data_cutoff);
> +	    const int data_len = cut - dp - 1; // -1 to remove last colon
> +	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%.*s", data_len, dp);

This is inefficient in multiple ways; you already get the cut point nicely in
the cut string pointer, just write a zero there and your string is terminated.
That's how wonderful C is ;) And dangerous, but it really only gets less dangerous
by not using it at all, so no point in bending over backwards like your code does
here, as it still relies on the same underlying facts, it just copies data around
much more.

In other words, replace most of this here with:

*(cut - 1) = 0; // terminate string by replacing colon from field separator with zero.

> +	} else {
> +	    snprintf(rrd_format_update_buffer, RRD_FORMAT_BUFFER_SIZE, "%s", dp);

this branch can then be dropped completely

> +	}
> +
> +	const char *update_args[] = { rrd_format_update_buffer, NULL };

here just keep the original that passes the dp string pointer, and do not be thrown
off by dp being defined as const; that means the pointer is const, not the value it
points to.

> +
> +	// TODO: remove in non RFC, but useful for debug logging to see if data is handled correctly
> +	// cfs_message("KEY: %s", key);
> +	// cfs_message("DATA: %s", dp);
> +	// cfs_message("BUFFER: %s", rrd_format_update_buffer);

you could add a single cfs_debug statement and keep it then, but I think using gdb
and setting a breakpoint here to inspect these would actually be enough.

>  
>  	if (use_daemon) {
>  		int status;
> diff --git a/src/pmxcfs/status.h b/src/pmxcfs/status.h
> index 041cb34..1a38f43 100644
> --- a/src/pmxcfs/status.h
> +++ b/src/pmxcfs/status.h
> @@ -34,6 +34,8 @@
>  
>  #define CFS_MAX_STATUS_SIZE (32*1024)
>  
> +#define RRD_FORMAT_BUFFER_SIZE (1024 * 1024) // 1 MiB
> +
>  typedef struct cfs_clnode cfs_clnode_t;
>  typedef struct cfs_clinfo cfs_clinfo_t;
>  



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo
  2025-05-23 16:37   ` [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo Aaron Lauterer
@ 2025-06-02 14:07     ` Thomas Lamprecht
  0 siblings, 0 replies; 26+ messages in thread
From: Thomas Lamprecht @ 2025-06-02 14:07 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

On 2025-05-23 18:37, Aaron Lauterer wrote:
> From: Folke Gleumes <f.gleumes@proxmox.com>
> 
> Expose buffers and cache as separate metrics instead of including them
> in memfree and memused.
> 
> Originally-by: Folke Gleumes <f.gleumes@proxmox.com>
> [AL: rebased and adapted to changes that happened in the meantime]
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
>  src/PVE/ProcFSTools.pm | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
> index 185b2b3..91a69be 100644
> --- a/src/PVE/ProcFSTools.pm
> +++ b/src/PVE/ProcFSTools.pm
> @@ -303,6 +303,8 @@ sub read_meminfo {
>  	memfree => 0,
>  	memavailable => 0,
>  	memused => 0,
> +	membuffers => 0,
> +	memcached => 0,
>  	memshared => 0,
>  	swaptotal => 0,
>  	swapfree => 0,
> @@ -328,6 +330,8 @@ sub read_meminfo {
>      # available for a new workload, without pushing the system into swap, no amount of calculating
>      # with BUFFER, CACHE, .. will get you there, only the kernel can know this.
>      $res->{memused} = $res->{memtotal} - $d->{memavailable};
> +    $res->{membuffers} = $d->{buffers};
> +    $res->{memcached} = $d->{cached};

Note the comment above the line where you add this, which was recently added.

After reading it, would it make more sense to expose memavailable instead?

As that, e.g., also includes things like SReclaimable, which can be huge,
see [0] for some more background.

[0]: https://lore.kernel.org/all/20131107101345.14d7be90@annuminas.surriel.com/#t


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid
  2025-05-23 16:37   ` [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid Aaron Lauterer
@ 2025-06-02 14:11     ` Thomas Lamprecht
  0 siblings, 0 replies; 26+ messages in thread
From: Thomas Lamprecht @ 2025-06-02 14:11 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

A short summary about what that file contains and why it can be useful (for us)
would be great to have as context here.

On 2025-05-23 18:37, Aaron Lauterer wrote:
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
>  src/PVE/ProcFSTools.pm | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
> index 382e6c5..185b2b3 100644
> --- a/src/PVE/ProcFSTools.pm
> +++ b/src/PVE/ProcFSTools.pm
> @@ -344,6 +344,20 @@ sub read_meminfo {
>      return $res;
>  }
>  
> +# extract memory data from /proc/$pid/smaps_rollup for PID. $value is e.g. "Pss".
> +sub read_smaps_rollup {
> +    my ($pid, $value) = @_;
> +
> +    my $res = 0;
> +    return $res if !$pid || !$value;
> +
> +    my $mem_stats = eval { PVE::Tools::file_get_contents("/proc/${pid}/smaps_rollup") };
> +    if ($mem_stats && $mem_stats =~ m/^${value}:\s+(\d+)\s*kB$/m) {

Is this an efficient interface? A caller needing more than one value will
cause a full read of the file and a regex match applied to it for each value;
granted, it's not huge, but that's still not really a good excuse...

As of now this is really not better than just doing the same at the call site
directly. A method in such a generic interface should do better and return
a parsed list of values as a hash.

> +	$res = $1 * 1024;
> +    }
> +    return $res;
> +}
> +
>  # memory usage of current process
>  sub read_memory_usage {
>  



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct
  2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
                     ` (12 preceding siblings ...)
  2025-05-23 16:37   ` [pve-devel] [PATCH container 1/1] " Aaron Lauterer
@ 2025-06-02 14:39   ` Thomas Lamprecht
  13 siblings, 0 replies; 26+ messages in thread
From: Thomas Lamprecht @ 2025-06-02 14:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

On 2025-05-23 18:37, Aaron Lauterer wrote:
> From: Folke Gleumes <f.gleumes@proxmox.com>
> 
> Originally-by: Folke Gleumes <f.gleumes@proxmox.com>
> [AL: rebased on current master]
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
>  src/PVE/ProcFSTools.pm | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
> index f9fe3f0..382e6c5 100644
> --- a/src/PVE/ProcFSTools.pm
> +++ b/src/PVE/ProcFSTools.pm
> @@ -151,6 +151,28 @@ sub parse_pressure {
>      return $res;
>  }
>  
> +sub read_qemu_pressure {
> +    my ($vmid) = @_;
> +
> +    my $res = {};
> +    foreach my $type (qw(cpu memory io)) {
> +	my $stats = parse_pressure("/sys/fs/cgroup/qemu.slice/$vmid.scope/$type.pressure");

There is a SysFSTools module ;-)

But OK, the parse_pressure method is here, so fine I guess.

I'd rather have this more generic though, as you have just two copies of the
same code going on here, and nothing here is specific to qemu or lxc; every
cgroup has these.

So I'd rather switch this over to a single

sub read_cgroup_pressure {
    my ($cgroup_path) = @_;

    my $res = {};
    for my $type (qw(cpu memory io)) {
        my $stats = parse_pressure("/sys/fs/cgroup/${cgroup_path}/${type}.pressure");
        $res->{$type} = $stats if $stats;
    }
    return $res;
}

If we want to abstract away the base cgroup, that would be a job for a
pve-guest-common interface with respective implementations in pve-container
and qemu-server, but it's probably fine to just assemble the path where
needed (manager I'd figure?); in any case that's not worse than hard-coding
it in a dependency leaf node like here.
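
Call sites would then just pass the relative cgroup path, e.g. (only a
sketch, reusing the paths from the patch above):

    # qemu-server side, for a VM:
    my $pressures = PVE::ProcFSTools::read_cgroup_pressure("qemu.slice/${vmid}.scope");

    # pve-container side, for a CT:
    my $pressures = PVE::ProcFSTools::read_cgroup_pressure("lxc/${vmid}");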

> +	$res->{$type} = $stats if $stats;
> +    }
> +    return $res;
> +}
> +
> +sub read_lxc_pressure {
> +    my ($vmid) = @_;
> +
> +    my $res = {};
> +    foreach my $type (qw(cpu memory io)) {
> +	my $stats = parse_pressure("/sys/fs/cgroup/lxc/$vmid/$type.pressure");
> +	$res->{$type} = $stats if $stats;
> +    }
> +    return $res;
> +}
> +
>  sub read_pressure {
>      my $res = {};
>      foreach my $type (qw(cpu memory io)) {




end of thread

Thread overview: 26+ messages
2025-05-23 16:00 [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data Aaron Lauterer
2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 1/2] cfs status.c: drop old pve2-vm rrd schema support Aaron Lauterer
2025-05-23 16:00 ` [pve-devel] [PATCH cluster-pve8 2/2] status: handle new pve9- metrics update data Aaron Lauterer
2025-05-23 16:35   ` Aaron Lauterer
2025-06-02 13:31   ` Thomas Lamprecht
2025-05-23 16:00 ` [pve-devel] [PATCH pve9-rrd-migration-tool 1/1] introduce rrd migration tool for pve8 -> pve9 Aaron Lauterer
2025-05-23 16:00 ` [pve-devel] [PATCH cluster 1/1] status: introduce new pve9- rrd and metric format Aaron Lauterer
2025-05-23 16:37 ` [pve-devel] [PATCH common 1/4] fix error in pressure parsing Aaron Lauterer
2025-05-23 16:37 ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH common 3/4] add helper to fetch value from smaps_rollup for pid Aaron Lauterer
2025-06-02 14:11     ` Thomas Lamprecht
2025-05-23 16:37   ` [pve-devel] [PATCH common 4/4] metrics: add buffer and cache to meminfo Aaron Lauterer
2025-06-02 14:07     ` Thomas Lamprecht
2025-05-23 16:37   ` [pve-devel] [PATCH manager 1/5] api2tools: drop old VM rrd schema Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH manager 2/5] pvestatd: collect and distribute new pve9- metrics Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH manager 3/5] api: nodes: rrd and rrddata fetch from new pve9-node rrd files if present Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH manager 4/5] api2tools: extract stats: handle existence of new pve9- data Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH manager 5/5] ui: rrdmodels: add new columns Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH storage 1/1] status: rrddata: use new pve9 rrd location if file is present Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 1/4] metrics: add pressure to metrics Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 2/4] vmstatus: add memhost for host view of vm mem consumption Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 3/4] vmstatus: switch mem stat to PSS of VM cgroup Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH qemu-server 4/4] rrddata: use new pve9 rrd location if file is present Aaron Lauterer
2025-05-23 16:37   ` [pve-devel] [PATCH container 1/1] " Aaron Lauterer
2025-06-02 14:39   ` [pve-devel] [PATCH common 2/4] add functions to retrieve pressures for vm/ct Thomas Lamprecht
2025-05-26 11:52 ` [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data DERUMIER, Alexandre via pve-devel
