* [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore
@ 2022-06-01  8:12 Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-cluster 1/1] add pve2-metrics rrd (single metrics) Alexandre Derumier
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Hi,

I'm still working on VM balancing/scheduling and need some new metrics.

As suggested by Thomas, instead of sending values to a new rrd
and re-parsing that rrd later in the balancer, we now broadcast:

- last-minute and last-5-minute average cpu/ram of qemu/lxc/node
- pressure averages of cpu/mem/io for qemu/lxc/node

These values are broadcast once per minute in a single kvstore key, "balancer-stats".

The size of the broadcast "balancer-stats" value is around 500 bytes * number of VMs.

The size of the in-memory hash for the average values is around 6k (host+mem) * number of VMs.
(For the last minute, we keep the sample of each X-second iteration; for the last 5 minutes, we keep the last 5 one-minute averages.)


- last ksm value

This value is broadcast on each iteration in a separate kvstore key, "ksm".



pve-common:

Alexandre Derumier (2):
  cgroup: get_pressure_stat: use controllers in get_path
  cgroup: get_pressure_stat: add cpu full pressure

 src/PVE/CGroup.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

qemu-server:

Alexandre Derumier (3):
  vmstatus: add hostcpu value
  vmstatus: add hostmem value
  vmstatus: add pressure stats

 PVE/QemuServer.pm | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

pve-container:

Alexandre Derumier (1):
  vmstatus: add pressure stats

 src/PVE/LXC.pm | 2 ++
 1 file changed, 2 insertions(+)

pve-manager:

Alexandre Derumier (4):
  pvestatd: add broadcast_balancer_stats
  pvestatd: qemu/lxc/node : add pressure stats
  pvestatd: qemu/lxc/node : add hostcpu/hostmem average stats
  pvestatd: node : broadcast ksm value

 PVE/Service/pvestatd.pm | 154 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 153 insertions(+), 1 deletion(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-cluster 1/1] add pve2-metrics rrd (single metrics)
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-common 1/2] cgroup: get_pressure_stat: use controllers in get_path Alexandre Derumier
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

This creates one separate rrd file per metric.

allowed paths:

pve2-metrics/vms/<vmid>/<metric>
pve2-metrics/nodes/<node>/<metric>
pve2-metrics/storages/<node>/<storage>/<metric>
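The layout rules above can be sketched as a standalone check. This is only an illustration in plain C, not the code from the patch: the function name `metrics_key_is_valid` and the `MAX_PARTS` limit are hypothetical, and it avoids GLib so that no allocation needs to be freed on the error paths.

```c
#include <stddef.h>
#include <string.h>

#define METRICS_PREFIX "pve2-metrics/"
#define MAX_PARTS 5 /* hypothetical limit: one more than the deepest allowed layout */

/* Returns 1 if key matches one of the allowed layouts, 0 otherwise. */
int metrics_key_is_valid(const char *key)
{
	size_t prefix_len = strlen(METRICS_PREFIX);
	if (strncmp(key, METRICS_PREFIX, prefix_len) != 0)
		return 0;

	const char *p = key + prefix_len;
	const char *parts[MAX_PARTS];
	size_t lens[MAX_PARTS];
	int n = 0;

	/* Split on '/' in place, recording start and length of each component. */
	for (;;) {
		const char *slash = strchr(p, '/');
		size_t len = slash ? (size_t)(slash - p) : strlen(p);
		if (n == MAX_PARTS)
			return 0; /* deeper than any allowed layout */
		parts[n] = p;
		lens[n] = len;
		n++;
		if (!slash)
			break;
		p = slash + 1;
	}

	/* Empty components (e.g. "vms//cpu" or a trailing '/') are rejected. */
	for (int i = 0; i < n; i++) {
		if (lens[i] == 0)
			return 0;
	}

	int is_vm_or_node =
		(lens[0] == 3 && strncmp(parts[0], "vms", 3) == 0) ||
		(lens[0] == 5 && strncmp(parts[0], "nodes", 5) == 0);
	int is_storage = lens[0] == 8 && strncmp(parts[0], "storages", 8) == 0;

	if (is_vm_or_node)
		return n == 3; /* <type>/<id>/<metric> */
	if (is_storage)
		return n == 4; /* storages/<node>/<storage>/<metric> */
	return 0;
}
```

With this sketch, `pve2-metrics/vms/100/cpu` is accepted, while `pve2-metrics/vms/100` (no metric) and `pve2-metrics/other/a/b` (unknown type) are rejected.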

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 data/src/status.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/data/src/status.c b/data/src/status.c
index 9bceaeb..75ab89e 100644
--- a/data/src/status.c
+++ b/data/src/status.c
@@ -1097,6 +1097,23 @@ static const char *rrd_def_storage[] = {
 	NULL,
 };
 
+static const char *rrd_def_metric[] = {
+	"DS:metric:GAUGE:120:0:U",
+
+	"RRA:AVERAGE:0.5:1:70", // 1 min avg - one hour
+	"RRA:AVERAGE:0.5:30:70", // 30 min avg - one day
+	"RRA:AVERAGE:0.5:180:70", // 3 hour avg - one week
+	"RRA:AVERAGE:0.5:720:70", // 12 hour avg - one month
+	"RRA:AVERAGE:0.5:10080:70", // 7 day avg - one year
+
+	"RRA:MAX:0.5:1:70", // 1 min max - one hour
+	"RRA:MAX:0.5:30:70", // 30 min max - one day
+	"RRA:MAX:0.5:180:70",  // 3 hour max - one week
+	"RRA:MAX:0.5:720:70", // 12 hour max - one month
+	"RRA:MAX:0.5:10080:70", // 7 day max - one year
+	NULL,
+};
+
 #define RRDDIR "/var/lib/rrdcached/db"
 
 static void
@@ -1173,6 +1190,40 @@ update_rrd_data(
 			create_rrd_file(filename, argcount, rrd_def_node);
 		}
 
+	} else if ((strncmp(key, "pve2-metrics/", 13) == 0)) {
+		const char *path = key + 13;
+		char **pa = g_strsplit(path, "/", -1);
+
+		if (!pa[0] || !pa[1] || !pa[2] || strlen(pa[1]) < 1 || strlen(pa[2]) < 1) {
+			goto keyerror;
+		} else {
+			if (strcmp(pa[0], "vms") == 0 || strcmp(pa[0], "nodes") == 0) {
+				if (pa[3]) {
+					goto keyerror;
+				}
+			} else if (strcmp(pa[0], "storages") == 0) {
+				if (pa[4] || !pa[3] || strlen(pa[3]) < 1) {
+					goto keyerror;
+				}
+			} else {
+				goto keyerror;
+			}
+		}
+		g_strfreev(pa);
+
+
+		filename = g_strdup_printf(RRDDIR "/%s", key);
+
+		if (!g_file_test(filename, G_FILE_TEST_EXISTS)) {
+
+			char *dir = g_path_get_dirname(filename);
+			g_mkdir_with_parents(dir, 0755);
+			g_free(dir);
+
+			int argcount = sizeof(rrd_def_metric)/sizeof(void*) - 1;
+			create_rrd_file(filename, argcount, rrd_def_metric);
+		}
+
 	} else if ((strncmp(key, "pve2-vm/", 8) == 0) ||
 		   (strncmp(key, "pve2.3-vm/", 10) == 0)) {
 		const char *vmid;
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-common 1/2] cgroup: get_pressure_stat: use controllers in get_path
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-cluster 1/1] add pve2-metrics rrd (single metrics) Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 1/4] pvestatd: add broadcast_balancer_stats Alexandre Derumier
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 src/PVE/CGroup.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/PVE/CGroup.pm b/src/PVE/CGroup.pm
index 44b3297..d3873fd 100644
--- a/src/PVE/CGroup.pm
+++ b/src/PVE/CGroup.pm
@@ -380,7 +380,8 @@ sub get_pressure_stat {
 	},
     };
 
-    my ($path, $version) = $self->get_path(undef, 1);
+    my ($path, $version) = $self->get_any_path(1, 'cpu', 'memory', 'io');
+
     if (!defined($path)) {
 	return $res; # container or VM most likely isn't running, retrun zero stats
     } elsif ($version == 1) {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-manager 1/4] pvestatd: add broadcast_balancer_stats
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-cluster 1/1] add pve2-metrics rrd (single metrics) Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-common 1/2] cgroup: get_pressure_stat: use controllers in get_path Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 1/3] pvestatd: qemu/lxc/host : broadcast rrd pressure metrics Alexandre Derumier
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Broadcast only once per minute, since the payload consists of averaged stats.
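The once-per-minute gating can be pictured with the small sketch below. It is written in C purely for illustration (the patch itself is Perl), and the names `broadcast_gate` and `broadcast_due` are hypothetical: the caller invokes the check on every status iteration and only broadcasts when it returns 1.

```c
#include <time.h>

#define BROADCAST_INTERVAL_SEC 60

/* Remembers when the last broadcast happened (0 = never). */
struct broadcast_gate {
	time_t last;
};

/*
 * Returns 1 and records `now` if at least BROADCAST_INTERVAL_SEC seconds
 * have passed since the last broadcast; returns 0 otherwise.
 */
int broadcast_due(struct broadcast_gate *gate, time_t now)
{
	if (now >= gate->last + BROADCAST_INTERVAL_SEC) {
		gate->last = now;
		return 1;
	}
	return 0;
}
```

Because the gate starts at 0, the first iteration after daemon start always broadcasts, which matches the initial value of `$last_balancer_broadcast_time` in the patch.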

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/Service/pvestatd.pm | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index 72445ec0..984877c1 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -79,6 +79,8 @@ sub hup {
 my $cached_kvm_version = '';
 my $next_flag_update_time;
 my $failed_flag_update_delay_sec = 120;
+my $balancer_stats = {};
+my $last_balancer_broadcast_time = 0;
 
 sub update_supported_cpuflags {
     my $kvm_version = PVE::QemuServer::kvm_user_version();
@@ -491,6 +493,20 @@ sub update_sdn_status {
     }
 }
 
+sub broadcast_balancer_stats {
+
+    my $ctime = time();
+
+    if($ctime >= $last_balancer_broadcast_time + 60) {
+
+	PVE::Cluster::broadcast_node_kv(
+	    'balancer-stats',
+	    encode_json($balancer_stats),
+	);
+	$last_balancer_broadcast_time = $ctime;
+    }
+}
+
 my $broadcast_version_info_done = 0;
 my sub broadcast_version_info : prototype() {
     if (!$broadcast_version_info_done) {
@@ -507,6 +523,8 @@ sub update_status {
     # correct list in case of an unexpected crash.
     my $rpcenv = PVE::RPCEnvironment::get();
 
+    $balancer_stats = {};
+
     eval {
 	my $tlist = $rpcenv->active_workers();
 	PVE::Cluster::broadcast_tasklist($tlist);
@@ -534,6 +552,12 @@ sub update_status {
     $err = $@;
     syslog('err', "lxc status update error: $err") if $err;
 
+    eval {
+	broadcast_balancer_stats();
+    };
+    $err = $@;
+    syslog('err', "balancer stats broadcast error: $err") if $err;
+
     eval {
 	rebalance_lxc_containers();
     };
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-manager 1/3] pvestatd: qemu/lxc/host : broadcast rrd pressure metrics
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (2 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 1/4] pvestatd: add broadcast_balancer_stats Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-common 2/2] cgroup: get_pressure_stat: add cpu full pressure Alexandre Derumier
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Only the "some" values are broadcast for now; I'm not sure we need the "full" values.
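As an illustration of the filtering and naming scheme used in this patch, here is a minimal C sketch (the patch itself is Perl). The helper name `pressure_metric_name` is hypothetical; it keeps only the "some" kind, skips the "total" counter, and builds names of the form `pressure_<type>_<kind>_<avg>`.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes "pressure_<type>_<kind>_<avg>" into buf and returns 1 if the
 * value should be broadcast; returns 0 if it is filtered out or if the
 * name does not fit into buf.
 */
int pressure_metric_name(char *buf, size_t buflen,
			 const char *type, const char *kind, const char *avg)
{
	if (strcmp(kind, "some") != 0)
		return 0; /* "full" values are skipped for now */
	if (strcmp(avg, "total") == 0)
		return 0; /* "total" is a counter, not an average */

	int n = snprintf(buf, buflen, "pressure_%s_%s_%s", type, kind, avg);
	return n > 0 && (size_t)n < buflen;
}
```

For example, type `cpu`, kind `some`, and average `avg60` yield the metric name `pressure_cpu_some_avg60`.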

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/Service/pvestatd.pm | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index b1e71ec8..832d9dc5 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -132,6 +132,7 @@ sub update_node_status {
     my $stat = PVE::ProcFSTools::read_proc_stat();
     my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
     my $maxcpu = $cpuinfo->{cpus};
+    my $pressure = PVE::ProcFSTools::read_pressure();
 
     update_supported_cpuflags();
 
@@ -168,10 +169,13 @@ sub update_node_status {
 	memory => $meminfo,
 	blockstat => $dinfo,
 	nics => $netdev,
+	pressure => $pressure,
     };
     $node_metric->{cpustat}->@{qw(avg1 avg5 avg15)} = ($avg1, $avg5, $avg15);
     $node_metric->{cpustat}->{cpus} = $maxcpu;
 
+    broadcast_rrd_pressure($ctime, $node_metric, "pve2-metrics/nodes/$nodename");
+
     my $transactions = PVE::ExtMetric::transactions_start($status_cfg);
     PVE::ExtMetric::update_all($transactions, 'node', $nodename, $node_metric, $ctime);
     PVE::ExtMetric::transactions_finish($transactions);
@@ -232,12 +236,41 @@ sub update_qemu_status {
 	}
 	PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 
+	broadcast_rrd_pressure($ctime, $d, "pve2-metrics/vms/$vmid");
+
 	PVE::ExtMetric::update_all($transactions, 'qemu', $vmid, $d, $ctime, $nodename);
     }
 
     PVE::ExtMetric::transactions_finish($transactions);
 }
 
+sub broadcast_rrd_pressure {
+    my ($ctime, $d, $path) = @_;
+
+    return if !defined($d->{pressure});
+
+    my $pressure = $d->{pressure};
+
+    foreach my $type (keys %{$pressure}) {
+	my $pressuretype = $pressure->{$type};
+
+	foreach my $kind (keys %{$pressuretype}) {
+	    next if $kind ne 'some';
+	    my $pressurekind = $pressuretype->{$kind};
+
+	    foreach my $avg (keys %{$pressurekind}) {
+		next if $avg eq 'total';
+		my $value = $pressurekind->{$avg};
+		my $metric = "pressure_".$type."_".$kind."_".$avg;
+		my $data = $generate_rrd_string->([$ctime, $value]);
+		PVE::Cluster::broadcast_rrd("$path/$metric", $data);
+		$d->{$metric} = $value;
+	    }
+	}
+    }
+    delete $d->{pressure};
+}
+
 sub remove_stale_lxc_consoles {
 
     my $vmstatus = PVE::LXC::vmstatus();
@@ -441,6 +474,8 @@ sub update_lxc_status {
 	}
 	PVE::Cluster::broadcast_rrd("pve2.3-vm/$vmid", $data);
 
+	broadcast_rrd_pressure($ctime, $d, "pve2-metrics/vms/$vmid");
+
 	PVE::ExtMetric::update_all($transactions, 'lxc', $vmid, $d, $ctime, $nodename);
     }
     PVE::ExtMetric::transactions_finish($transactions);
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-common 2/2] cgroup: get_pressure_stat: add cpu full pressure
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (3 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 1/3] pvestatd: qemu/lxc/host : broadcast rrd pressure metrics Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 2/4] pvestatd: qemu/lxc/node : add pressure stats Alexandre Derumier
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

CPU "full" pressure is available since kernel 5.13:
https://lore.kernel.org/all/20210303034659.91735-2-zhouchengming@bytedance.com/T/#u

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 src/PVE/CGroup.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/PVE/CGroup.pm b/src/PVE/CGroup.pm
index d3873fd..bc5b8c8 100644
--- a/src/PVE/CGroup.pm
+++ b/src/PVE/CGroup.pm
@@ -368,7 +368,8 @@ sub get_pressure_stat {
 
     my $res = {
 	cpu => {
-	    some => { avg10 => 0, avg60 => 0, avg300 => 0 }
+	    some => { avg10 => 0, avg60 => 0, avg300 => 0 },
+	    full => { avg10 => 0, avg60 => 0, avg300 => 0 }
 	},
 	memory => {
 	    some => { avg10 => 0, avg60 => 0, avg300 => 0 },
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-manager 2/4] pvestatd: qemu/lxc/node : add pressure stats
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (4 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-common 2/2] cgroup: get_pressure_stat: add cpu full pressure Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH qemu-server 2/3] vmstatus: add hostmem value Alexandre Derumier
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/Service/pvestatd.pm | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index 984877c1..dd578d6b 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -134,6 +134,7 @@ sub update_node_status {
     my $stat = PVE::ProcFSTools::read_proc_stat();
     my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
     my $maxcpu = $cpuinfo->{cpus};
+    my $pressure = PVE::ProcFSTools::read_pressure();
 
     update_supported_cpuflags();
 
@@ -170,10 +171,13 @@ sub update_node_status {
 	memory => $meminfo,
 	blockstat => $dinfo,
 	nics => $netdev,
+	pressure => $pressure,
     };
     $node_metric->{cpustat}->@{qw(avg1 avg5 avg15)} = ($avg1, $avg5, $avg15);
     $node_metric->{cpustat}->{cpus} = $maxcpu;
 
+    compute_pressure($ctime, $node_metric, 'node', $nodename);
+
     my $transactions = PVE::ExtMetric::transactions_start($status_cfg);
     PVE::ExtMetric::update_all($transactions, 'node', $nodename, $node_metric, $ctime);
     PVE::ExtMetric::transactions_finish($transactions);
@@ -227,6 +231,8 @@ sub update_qemu_status {
 		[$d->{uptime}, $d->{name}, $status, $template, $ctime, $d->{cpus}, $d->{cpu},
 		 $d->{maxmem}, $d->{mem}, $d->{maxdisk}, $d->{disk},
 		 $d->{netin}, $d->{netout}, $d->{diskread}, $d->{diskwrite}]);
+
+	    compute_pressure($ctime, $d, 'qemu', $vmid);
 	} else {
 	    $data = $generate_rrd_string->(
 		[0, $d->{name}, $status, $template, $ctime, $d->{cpus}, undef,
@@ -436,6 +442,7 @@ sub update_lxc_status {
 		 $d->{maxdisk}, $d->{disk},
 		 $d->{netin}, $d->{netout},
 		 $d->{diskread}, $d->{diskwrite}]);
+	    compute_pressure($ctime, $d, 'lxc', $vmid);
 	} else {
 	    $data = $generate_rrd_string->(
 		[0, $d->{name}, $d->{status}, $template, $ctime, $d->{cpus}, undef,
@@ -507,6 +514,33 @@ sub broadcast_balancer_stats {
     }
 }
 
+sub compute_pressure {
+    my ($ctime, $d, $objecttype, $id) = @_;
+
+    return if !defined($d->{pressure});
+    my $pressure = $d->{pressure};
+
+    foreach my $type (keys %{$pressure}) {
+	my $pressuretype = $pressure->{$type};
+
+	foreach my $kind (keys %{$pressuretype}) {
+	    next if $kind ne 'some';
+	    my $pressurekind = $pressuretype->{$kind};
+
+	    foreach my $avg (keys %{$pressurekind}) {
+		next if $avg eq 'total';
+		my $value = $pressurekind->{$avg};
+		# for external metrics
+		my $metric = "pressure_".$type."_".$kind."_".$avg;
+		$d->{$metric} = $value;
+		next if $avg eq 'avg10';
+		$balancer_stats->{$objecttype}->{$id}->{pressure}->{$type}->{$kind}->{$avg} = $value;
+	    }
+	}
+    }
+    delete $d->{pressure};
+}
+
 my $broadcast_version_info_done = 0;
 my sub broadcast_version_info : prototype() {
     if (!$broadcast_version_info_done) {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH qemu-server 2/3] vmstatus: add hostmem value
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (5 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 2/4] pvestatd: qemu/lxc/node : add pressure stats Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 3/4] pvestatd: qemu/lxc/node : add hostcpu/hostmem average stats Alexandre Derumier
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 9441cf2..4fc183e 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2933,6 +2933,11 @@ sub vmstatus {
 	if ($pstat->{vsize}) {
 	    $d->{mem} = int(($pstat->{rss}/$pstat->{vsize})*$d->{maxmem});
 	}
+	if (defined(my $hostmemstat = $cgroups->get_memory_stat())) {
+	    $d->{hostmem} = $hostmemstat->{mem};
+	} else {
+	    $d->{hostmem} = 0;
+	}
 
 	my $old = $last_proc_pid_stat->{$pid};
 	if (!$old) {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-manager 3/4] pvestatd: qemu/lxc/node : add hostcpu/hostmem average stats
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (6 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH qemu-server 2/3] vmstatus: add hostmem value Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH qemu-server 3/3] vmstatus: add pressure stats Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 4/4] pvestatd: node : broadcast ksm value Alexandre Derumier
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

We aggregate the samples from the last X-second iterations into a 1-minute average,
and we aggregate the last five 1-minute averages into a 5-minute average.

The avgstats of a VM are reset when the VM is stopped or removed.
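The two-level aggregation described above can be sketched as follows. This is a simplified C illustration, not a translation of the Perl in the patch: the type and function names are hypothetical, the sample buffers have a fixed capacity (`MAX_SAMPLES`, an assumption), and an empty window yields 0.0 here, whereas `compute_avg` in the patch returns undef.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SAMPLES 64 /* assumed capacity, ample for one sample every few seconds */

struct sample { long time; double value; };

struct series {
	struct sample items[MAX_SAMPLES];
	size_t count;
};

/* Append a sample; when the buffer is full, the oldest sample is dropped. */
void series_add(struct series *s, long t, double value)
{
	if (s->count == MAX_SAMPLES) {
		memmove(&s->items[0], &s->items[1],
			(MAX_SAMPLES - 1) * sizeof(s->items[0]));
		s->count--;
	}
	s->items[s->count].time = t;
	s->items[s->count].value = value;
	s->count++;
}

/* Drop samples with time <= now - window and return the mean of the rest. */
double series_avg(struct series *s, long now, long window)
{
	size_t kept = 0;
	double total = 0.0;

	for (size_t i = 0; i < s->count; i++) {
		if (now - window >= s->items[i].time)
			continue; /* expired sample */
		total += s->items[i].value;
		s->items[kept++] = s->items[i];
	}
	s->count = kept;
	return kept > 0 ? total / (double)kept : 0.0;
}

struct avg_stats {
	struct series last60;  /* raw samples of the last minute */
	struct series last300; /* 1-minute averages of the last 5 minutes */
	double avg60;
	double avg300;
	long last_compute_time;
	int initialised;
};

/* Record one raw sample and recompute both averages once per minute. */
void avg_stats_update(struct avg_stats *st, long now, double value)
{
	series_add(&st->last60, now, value);
	if (!st->initialised) {
		st->last_compute_time = now;
		st->initialised = 1;
	}
	if (now >= st->last_compute_time + 60) {
		st->avg60 = series_avg(&st->last60, now, 60);
		series_add(&st->last300, now, st->avg60);
		st->avg300 = series_avg(&st->last300, now, 300);
		st->last_compute_time = now;
	}
}
```

Feeding one minute of samples with value 10 followed by one minute with value 20 gives a 1-minute average of 20 and a 5-minute average of 15, because the 5-minute figure is the mean of the stored 1-minute averages rather than of the raw samples.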

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/Service/pvestatd.pm | 94 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index dd578d6b..a33b21cd 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -81,6 +81,7 @@ my $next_flag_update_time;
 my $failed_flag_update_delay_sec = 120;
 my $balancer_stats = {};
 my $last_balancer_broadcast_time = 0;
+my $avgstats = {};
 
 sub update_supported_cpuflags {
     my $kvm_version = PVE::QemuServer::kvm_user_version();
@@ -177,6 +178,11 @@ sub update_node_status {
     $node_metric->{cpustat}->{cpus} = $maxcpu;
 
     compute_pressure($ctime, $node_metric, 'node', $nodename);
+    my $avg_metrics = {
+	cpu => $stat->{cpu},
+        mem => $meminfo->{memused},
+    };
+    compute_avg_metrics($ctime, $avg_metrics, 'node', $nodename);
 
     my $transactions = PVE::ExtMetric::transactions_start($status_cfg);
     PVE::ExtMetric::update_all($transactions, 'node', $nodename, $node_metric, $ctime);
@@ -233,6 +239,11 @@ sub update_qemu_status {
 		 $d->{netin}, $d->{netout}, $d->{diskread}, $d->{diskwrite}]);
 
 	    compute_pressure($ctime, $d, 'qemu', $vmid);
+	    my $avg_metrics = {
+		cpu => $d->{hostcpu},
+		mem => $d->{hostmem},
+	    };
+	    compute_avg_metrics($ctime, $avg_metrics, 'qemu', $vmid);
 	} else {
 	    $data = $generate_rrd_string->(
 		[0, $d->{name}, $status, $template, $ctime, $d->{cpus}, undef,
@@ -242,7 +253,7 @@ sub update_qemu_status {
 
 	PVE::ExtMetric::update_all($transactions, 'qemu', $vmid, $d, $ctime, $nodename);
     }
-
+    delete_old_qemu_avgstats($vmstatus);
     PVE::ExtMetric::transactions_finish($transactions);
 }
 
@@ -443,6 +454,11 @@ sub update_lxc_status {
 		 $d->{netin}, $d->{netout},
 		 $d->{diskread}, $d->{diskwrite}]);
 	    compute_pressure($ctime, $d, 'lxc', $vmid);
+	    my $avg_metrics = {
+		cpu => $d->{cpu},
+		mem => $d->{mem},
+	    };
+	    compute_avg_metrics($ctime, $avg_metrics, 'lxc', $vmid);
 	} else {
 	    $data = $generate_rrd_string->(
 		[0, $d->{name}, $d->{status}, $template, $ctime, $d->{cpus}, undef,
@@ -452,6 +468,7 @@ sub update_lxc_status {
 
 	PVE::ExtMetric::update_all($transactions, 'lxc', $vmid, $d, $ctime, $nodename);
     }
+    delete_old_lxc_avgstats($vmstatus);
     PVE::ExtMetric::transactions_finish($transactions);
 }
 
@@ -541,6 +558,81 @@ sub compute_pressure {
     delete $d->{pressure};
 }
 
+sub compute_avg_metrics {
+   my ($ctime, $avg_metrics, $objectype, $id) = @_;
+
+    foreach my $metric (keys %$avg_metrics) {
+	my $value = $avg_metrics->{$metric};
+	next if !defined($value);
+
+        my $stats = $avgstats->{$objectype}->{$id}->{$metric} || {};
+	$stats->{series}->{60}->{$ctime} = $value;
+	$stats->{avg60} = 0 if !defined($stats->{avg60});
+	$stats->{avg300} = 0 if !defined($stats->{avg300});
+	$stats->{last_compute_time} = $ctime if !defined($stats->{last_compute_time});
+
+	#compute avg each minute
+
+	if($stats->{last_compute_time} && $ctime >= $stats->{last_compute_time} + 60) {
+	    $stats->{avg60} = compute_avg($stats->{series}->{60}, $ctime, 60);
+	    $stats->{series}->{300}->{$ctime} = $stats->{avg60};
+	    $stats->{avg300} = compute_avg($stats->{series}->{300}, $ctime, 300);
+	    $stats->{last_compute_time} = $ctime;
+	}
+
+	$balancer_stats->{$objectype}->{$id}->{$metric}->{avg60} = $stats->{avg60};
+	$balancer_stats->{$objectype}->{$id}->{$metric}->{avg300} = $stats->{avg300};
+	$avgstats->{$objectype}->{$id}->{$metric} = $stats;
+    }
+}
+
+sub compute_avg {
+   my ($series, $ctime, $delta) = @_;
+
+    my $total = 0;
+    my $count = 0;
+    my $to_delete;
+
+    foreach my $time (keys %$series) {
+	if ($ctime - $delta >= $time) {
+	    $to_delete->{$time} = 1;
+	    next;
+	}
+	$count++;
+	$total += $series->{$time};
+    }
+
+    delete %$series{keys %{$to_delete}};
+
+    my $avg = $count > 0 ? $total / $count : undef;
+    return $avg;
+}
+
+sub delete_old_qemu_avgstats {
+    my ($vmstatus) = @_;
+
+    my $stats = $avgstats->{'qemu'};
+
+    my $to_delete;
+    #delete stats for removed vm, or stopped vm
+    foreach my $vmid (keys %$stats) {
+	$to_delete->{$vmid} = 1 if !defined($vmstatus->{$vmid}) || !$vmstatus->{$vmid}->{pid};
+    }
+    delete %$stats{keys %{$to_delete}};
+}
+
+sub delete_old_lxc_avgstats {
+    my ($vmstatus) = @_;
+
+    my $stats = $avgstats->{'lxc'};
+
+    my $to_delete;
+    #delete stats for removed ct, or stopped ct
+    foreach my $vmid (keys %$stats) {
+	$to_delete->{$vmid} = 1 if !defined($vmstatus->{$vmid}) || $vmstatus->{$vmid}->{status} ne 'running';
+    }
+    delete %$stats{keys %{$to_delete}};
+}
 my $broadcast_version_info_done = 0;
 my sub broadcast_version_info : prototype() {
     if (!$broadcast_version_info_done) {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH qemu-server 3/3] vmstatus: add pressure stats
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (7 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 3/4] pvestatd: qemu/lxc/node : add hostcpu/hostmem average stats Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  2022-06-01  8:12 ` [pve-devel] [PATCH pve-manager 4/4] pvestatd: node : broadcast ksm value Alexandre Derumier
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 4fc183e..09f3a0c 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2971,6 +2971,8 @@ sub vmstatus {
 	    $d->{cpu} = $old->{cpu};
 	    $d->{hostcpu} = $old->{hostcpu};
 	}
+
+	$d->{pressure} = $cgroups->get_pressure_stat();
     }
 
     return $res if !$full;
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [pve-devel] [PATCH pve-manager 4/4] pvestatd: node : broadcast ksm value
  2022-06-01  8:12 [pve-devel] [PATCH-SERIES common/qemu-server/lxc/manager] broadcast new metric stats in kvstore Alexandre Derumier
                   ` (8 preceding siblings ...)
  2022-06-01  8:12 ` [pve-devel] [PATCH qemu-server 3/3] vmstatus: add pressure stats Alexandre Derumier
@ 2022-06-01  8:12 ` Alexandre Derumier
  9 siblings, 0 replies; 12+ messages in thread
From: Alexandre Derumier @ 2022-06-01  8:12 UTC (permalink / raw)
  To: pve-devel; +Cc: t.lamprecht, Alexandre Derumier

Only the last value is needed for the balancing algorithm.

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/Service/pvestatd.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index a33b21cd..dfaef7ce 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -184,6 +184,8 @@ sub update_node_status {
     };
     compute_avg_metrics($ctime, $avg_metrics, 'node', $nodename);
 
+    PVE::Cluster::broadcast_node_kv("ksm", $meminfo->{memshared});
+
     my $transactions = PVE::ExtMetric::transactions_start($status_cfg);
     PVE::ExtMetric::update_all($transactions, 'node', $nodename, $node_metric, $ctime);
     PVE::ExtMetric::transactions_finish($transactions);
-- 
2.30.2




^ permalink raw reply	[flat|nested] 12+ messages in thread
