* [pve-devel] [PATCH pve-common 0/1] ProcFSTools: add read_pressure
From: Alexandre Derumier @ 2020-10-06 11:58 UTC
To: pve-devel

Hi,

I'm currently working on a VM load balancing scheduler.

This patch adds new pressure counters, which are very useful to know
whether a node is overloaded, with more granularity than the load
average.

Alexandre Derumier (1):
  ProcFSTools: add read_pressure

 src/PVE/ProcFSTools.pm | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

-- 
2.20.1
* [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Alexandre Derumier @ 2020-10-06 11:58 UTC
To: pve-devel

Read the new /proc/pressure/(cpu,memory,io) files introduced in kernel
4.20. These give more granular information than the load average.

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 src/PVE/ProcFSTools.pm | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/src/PVE/ProcFSTools.pm b/src/PVE/ProcFSTools.pm
index 7cf1472..7687c13 100644
--- a/src/PVE/ProcFSTools.pm
+++ b/src/PVE/ProcFSTools.pm
@@ -132,6 +132,24 @@ sub read_loadavg {
     return wantarray ? (0, 0, 0) : 0;
 }
 
+sub read_pressure {
+
+    my $res = {};
+    foreach my $type (qw(cpu memory io)) {
+	if (my $fh = IO::File->new ("/proc/pressure/$type", "r")) {
+	    while (defined (my $line = <$fh>)) {
+		if ($line =~ /^(some|full)\s+avg10\=(\d+\.\d+)\s+avg60\=(\d+\.\d+)\s+avg300\=(\d+\.\d+)\s+total\=(\d+)/) {
+		    $res->{$type}->{$1}->{avg10} = $2;
+		    $res->{$type}->{$1}->{avg60} = $3;
+		    $res->{$type}->{$1}->{avg300} = $4;
+		}
+	    }
+	    $fh->close;
+	}
+    }
+    return $res;
+}
+
 my $last_proc_stat;
 
 sub read_proc_stat {
-- 
2.20.1
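A usage sketch (not part of the patch; the 20% threshold is an
arbitrary illustration) of consuming the hash returned by
read_pressure:

    use PVE::ProcFSTools;

    # read_pressure() returns a nested hash:
    #   $res->{cpu|memory|io}->{some|full}->{avg10|avg60|avg300}
    my $pressure = PVE::ProcFSTools::read_pressure();

    # e.g. warn when tasks were stalled on CPU for more than 20% of
    # the last 10 seconds (the 20% threshold is illustrative only)
    my $cpu_some = $pressure->{cpu}->{some}->{avg10} // 0;
    warn "node under CPU pressure: ${cpu_some}%\n" if $cpu_some > 20;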
* Re: [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Alexandre Derumier @ 2020-10-11 8:23 UTC
To: pve-devel

Hi,

I have noticed that it's possible to get pressure info for each VM/CT
through cgroups:

/sys/fs/cgroup/unified/qemu.slice/<vmid>.scope/cpu.pressure
/sys/fs/cgroup/unified/lxc/<vmid>/cpu.pressure

Maybe it would be great to have new rrd graphs for each VM/CT?
These counters are very useful to know whether a specific VM/CT is
overloaded. A sketch of reading those files follows below.
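For illustration, a minimal sketch of reading those per-VM counters,
reusing the same regex as read_pressure (the helper name and error
handling are hypothetical):

    use IO::File;

    # sketch: read PSI data for one VM cgroup on the cgroup v2
    # unified hierarchy; returns {} if the file cannot be opened
    sub read_vm_cpu_pressure {
        my ($vmid) = @_;
        my $path = "/sys/fs/cgroup/unified/qemu.slice/$vmid.scope/cpu.pressure";
        my $res = {};
        my $fh = IO::File->new($path, "r") or return $res;
        while (defined(my $line = <$fh>)) {
            if ($line =~ /^(some|full)\s+avg10\=(\d+\.\d+)\s+avg60\=(\d+\.\d+)\s+avg300\=(\d+\.\d+)\s+total\=(\d+)/) {
                $res->{$1} = { avg10 => $2, avg60 => $3, avg300 => $4 };
            }
        }
        $fh->close;
        return $res;
    }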
* Re: [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Dietmar Maurer @ 2020-10-13 6:05 UTC
To: Proxmox VE development discussion, Alexandre Derumier

> I have noticed that it's possible to get pressure info for each VM/CT
> through cgroups:
>
> /sys/fs/cgroup/unified/qemu.slice/<vmid>.scope/cpu.pressure
> /sys/fs/cgroup/unified/lxc/<vmid>/cpu.pressure
>
> Maybe it would be great to have new rrd graphs for each VM/CT?
> These counters are very useful to know whether a specific VM/CT is
> overloaded.

I have no idea how reliable this is, because we do not use cgroups v2.
But yes, I think this would be useful.
* Re: [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Alexandre Derumier @ 2020-10-13 6:32 UTC
To: Dietmar Maurer; +Cc: Proxmox VE development discussion

>> I have no idea how reliable this is, because we do not use cgroups v2.
>> But yes, I think this would be useful.

I have tested it on a host with a lot of small VMs (something like 400
VMs on 48 cores). With that number of VMs there were a lot of context
switches, and the VMs were laggy. CPU usage was OK (maybe 40%) and the
load average was around 40, but pressure was around 20%, so it seems
more precise than the load average.

The global /proc/pressure/cpu was almost the sum of the per-VM cgroup
values in /sys/fs/cgroup/unified/qemu.slice/<vmid>.scope/cpu.pressure,
so it seems reliable. (I don't have LXC containers in production, but I
think it should behave the same.)

So yes, I think we could add them to the rrds for both host and VMs.

BTW, I'm currently playing with reading the rrd files, and I have
noticed that the lowest precision is 1 minute. As pvestatd sends values
roughly every 10s, is this 1-minute precision an average of the 6x10s
values sent by pvestatd?

I'm currently working on a proof of concept for VM balancing, but I
would like to have something like 15 min of 10s precision (90 samples
of 10s). So currently I'm getting stats every 10s manually with
PVE::API2Tools::extract_vm_stats, like the resources API. (This uses
PVE::Cluster::rrd_dump, but I don't understand the ipcc_* code. Does it
only return the currently streamed values, with the rrdcached daemon
then writing the per-minute average values to the rrd files?)

I don't know if we could have rrd files with 15 min of 10s precision
(I don't know the write-load impact on disks).
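The host-versus-cgroups comparison described above can be reproduced
with a rough sketch like this (it assumes the qemu.slice cgroup v2
layout mentioned earlier; summing avg10 percentages is only a sanity
check, not an exact identity):

    use IO::File;
    use PVE::ProcFSTools;

    # compare host CPU pressure with the sum over all running VM scopes
    my $host = PVE::ProcFSTools::read_pressure()->{cpu}->{some}->{avg10} // 0;

    my $sum = 0;
    for my $scope (glob "/sys/fs/cgroup/unified/qemu.slice/*.scope") {
        my $fh = IO::File->new("$scope/cpu.pressure", "r") or next;
        while (defined(my $line = <$fh>)) {
            $sum += $1 if $line =~ /^some\s+avg10\=(\d+\.\d+)/;
        }
        $fh->close;
    }
    print "host cpu some avg10: $host, sum over VM scopes: $sum\n";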
* Re: [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Dietmar Maurer @ 2020-10-13 7:38 UTC
To: Alexandre Derumier; +Cc: Proxmox VE development discussion

> BTW, I'm currently playing with reading the rrd files, and I have
> noticed that the lowest precision is 1 minute. As pvestatd sends values
> roughly every 10s, is this 1-minute precision an average of the 6x10s
> values sent by pvestatd?

Yes (we also store the MAX).

> I'm currently working on a proof of concept for VM balancing, but I
> would like to have something like 15 min of 10s precision (90 samples
> of 10s).

Why do you need 10s resolution? Isn't 1 min good enough?

> I don't know if we could have rrd files with 15 min of 10s precision
> (I don't know the write-load impact on disks).

We use the following RRD conf, step is 60 seconds (see
pve-cluster/src/status.c):

static const char *rrd_def_node[] = {
	"DS:loadavg:GAUGE:120:0:U",
	"DS:maxcpu:GAUGE:120:0:U",
	"DS:cpu:GAUGE:120:0:U",
	"DS:iowait:GAUGE:120:0:U",
	"DS:memtotal:GAUGE:120:0:U",
	"DS:memused:GAUGE:120:0:U",
	"DS:swaptotal:GAUGE:120:0:U",
	"DS:swapused:GAUGE:120:0:U",
	"DS:roottotal:GAUGE:120:0:U",
	"DS:rootused:GAUGE:120:0:U",
	"DS:netin:DERIVE:120:0:U",
	"DS:netout:DERIVE:120:0:U",

	"RRA:AVERAGE:0.5:1:70",     // 1 min avg - one hour
	"RRA:AVERAGE:0.5:30:70",    // 30 min avg - one day
	"RRA:AVERAGE:0.5:180:70",   // 3 hour avg - one week
	"RRA:AVERAGE:0.5:720:70",   // 12 hour avg - one month
	"RRA:AVERAGE:0.5:10080:70", // 7 day avg - one year

	"RRA:MAX:0.5:1:70",     // 1 min max - one hour
	"RRA:MAX:0.5:30:70",    // 30 min max - one day
	"RRA:MAX:0.5:180:70",   // 3 hour max - one week
	"RRA:MAX:0.5:720:70",   // 12 hour max - one month
	"RRA:MAX:0.5:10080:70", // 7 day max - one year
	NULL,
};

Also see: man rrdcreate

So no, you do not get 10s precision from RRD.
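For reference, the stored 1-minute averages can be read back with the
RRDs Perl bindings (librrds-perl); a minimal sketch, where the file
path is only an assumed example of a node RRD under
/var/lib/rrdcached/db:

    use RRDs;

    # fetch one hour of 1-minute AVERAGE samples
    # (path is an assumed example, not a guaranteed location)
    my $file = "/var/lib/rrdcached/db/pve2-node/mynode";
    my ($start, $step, $names, $data) = RRDs::fetch(
        $file, "AVERAGE", "--resolution", "60",
        "--start", "end-1h", "--end", "now");
    die "rrd fetch failed: " . RRDs::error() . "\n" if RRDs::error();

    # $names is an arrayref of DS names (loadavg, cpu, ...),
    # $data an arrayref of rows, one row per $step seconds
    for my $i (0 .. $#$data) {
        my %row;
        @row{@$names} = @{$data->[$i]};
        printf "%d loadavg=%s cpu=%s\n", $start + $i * $step,
            $row{loadavg} // 'NaN', $row{cpu} // 'NaN';
    }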
* Re: [pve-devel] [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Alexandre Derumier @ 2020-10-13 12:05 UTC
To: Dietmar Maurer; +Cc: Proxmox VE development discussion

>> Why do you need 10s resolution? Isn't 1 min good enough?

Well, if the 1 min value is an average of the 10s metrics, it's OK. I'm
currently using the 1 min and 5 min averages, so it's not a problem
with the current rrds.

Thanks for the information! (I'll send a patch to add pressure to the
rrds, and also add per-VM/CT pressure.)
* [pve-devel] applied: [PATCH pve-common 1/1] ProcFSTools: add read_pressure
From: Dietmar Maurer @ 2020-10-13 5:35 UTC
To: Proxmox VE development discussion, Alexandre Derumier

applied