* [pve-devel] [PATCH V2 pve-common 0/4] add generic CGroup module
@ 2020-11-06 8:24 Alexandre Derumier
From: Alexandre Derumier @ 2020-11-06 8:24 UTC (permalink / raw)
To: pve-devel
This moves the CGroup module from LXC to pve-common,
so it can be used in qemu-server too
(and adds cgroupv2 support for qemu).
I have also included a bugfix for cpushares with cgroup v1:
when the value is not defined, it is currently set to 100 instead of 1024.
(This can be triggered by pct --delete cpuunits.)
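To illustrate the issue, a rough sketch of the defaulting logic (not part of the patch itself):

    # cgroup v1 branch in change_cpu_shares():
    my $shares;          # undef, e.g. after deleting the cpuunits option
    $shares //= 100;     # old fallback -> much less CPU weight than intended
    # the kernel default for cpu.shares is 1024, so patch 3/4 uses:
    $shares //= 1024;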
Changelog v2:
- keep get_subdir in base class
Alexandre Derumier (4):
move PVE::LXC::CGroup to PVE::CGroup
Cgroups: remove specific lxc code
bugfix: cpushares : default value is 1024 for cgroup v1
systemd: add CPUWeight encoding
src/Makefile | 1 +
src/PVE/CGroup.pm | 551 +++++++++++++++++++++++++++++++++++++++++++++
src/PVE/Systemd.pm | 2 +-
3 files changed, 553 insertions(+), 1 deletion(-)
create mode 100644 src/PVE/CGroup.pm
--
2.20.1
* [pve-devel] [PATCH V2 pve-common 1/4] move PVE::LXC::CGroup to PVE::CGroup
@ 2020-11-06 8:24 ` Alexandre Derumier
From: Alexandre Derumier @ 2020-11-06 8:24 UTC (permalink / raw)
To: pve-devel
---
src/Makefile | 1 +
src/PVE/CGroup.pm | 582 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 583 insertions(+)
create mode 100644 src/PVE/CGroup.pm
diff --git a/src/Makefile b/src/Makefile
index 1987d0e..b2a4ac6 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -13,6 +13,7 @@ LIB_SOURCES = \
CLIHandler.pm \
CalendarEvent.pm \
CpuSet.pm \
+ CGroup.pm \
Daemon.pm \
Exception.pm \
INotify.pm \
diff --git a/src/PVE/CGroup.pm b/src/PVE/CGroup.pm
new file mode 100644
index 0000000..7e12af9
--- /dev/null
+++ b/src/PVE/CGroup.pm
@@ -0,0 +1,582 @@
+# cgroup handler
+#
+# This package should deal with figuring out the right cgroup path for a
+# container (via the command socket), reading and writing cgroup values, and
+# handling cgroup v1 & v2 differences.
+#
+# Note that the long term plan is to have resource management functions instead of
+# dealing with cgroup files on the outside.
+
+package PVE::CGroup;
+
+use strict;
+use warnings;
+
+use IO::File;
+use IO::Select;
+use POSIX qw();
+
+use PVE::ProcFSTools;
+use PVE::Tools qw(
+ file_get_contents
+ file_read_firstline
+);
+
+use PVE::LXC::Command;
+
+# We don't want to do a command socket round trip for every cgroup read/write,
+# so any cgroup function needs to have the container's path cached, so this
+# package has to be instantiated.
+#
+# LXC keeps separate paths by controller (although they're normally all the
+# same, in our case anyway), so we cache them by controller as well.
+sub new {
+ my ($class, $vmid) = @_;
+
+ my $self = { vmid => $vmid };
+
+ return bless $self, $class;
+}
+
+# Get the v1 controller list.
+#
+# Returns a set (hash mapping names to `1`) of cgroupv1 controllers and, in list
+# context, a boolean indicating whether a unified (cgroupv2) hierarchy exists.
+#
+# Deprecated: Use `get_cgroup_controllers()` instead.
+sub get_v1_controllers {
+ my $v1 = {};
+ my $v2 = 0;
+ my $data = PVE::Tools::file_get_contents('/proc/self/cgroup');
+ while ($data =~ /^\d+:([^:\n]*):.*$/gm) {
+ my $type = $1;
+ if (length($type)) {
+ $v1->{$_} = 1 foreach split(/,/, $type);
+ } else {
+ $v2 = 1;
+ }
+ }
+ return wantarray ? ($v1, $v2) : $v1;
+}
+
+# Get the set of v2 controllers from the `cgroup.controllers` file.
+my sub get_v2_controllers {
+ my $v2 = eval { file_get_contents('/sys/fs/cgroup/cgroup.controllers') }
+ || eval { file_get_contents('/sys/fs/cgroup/unified/cgroup.controllers') };
+ return undef if !defined $v2;
+
+ # It's a simple space separated list:
+ return { map { $_ => 1 } split(/\s+/, $v2) };
+}
+
+my $CGROUP_CONTROLLERS = undef;
+# Get a list of controllers enabled in each cgroup subsystem.
+#
+# This is a more complete version of `PVE::LXC::get_cgroup_subsystems`.
+#
+# Returns 2 sets (hashes mapping controller names to `1`), one for each cgroup
+# version.
+sub get_cgroup_controllers() {
+ if (!defined($CGROUP_CONTROLLERS)) {
+ my ($v1, undef) = get_v1_controllers();
+ my $v2 = get_v2_controllers();
+
+ $CGROUP_CONTROLLERS = [$v1, $v2];
+ }
+
+ return $CGROUP_CONTROLLERS->@*;
+}
+
+my $CGROUP_MODE = undef;
+# Figure out which cgroup mode we're operating under:
+#
+# Returns 1 if cgroupv1 controllers exist (hybrid or legacy mode), and 2 in a
+# cgroupv2-only environment.
+#
+# NOTE: To fully support a hybrid layout it is better to use functions like
+# `cpuset_controller_path`.
+#
+# This is a function, not a method!
+sub cgroup_mode() {
+ if (!defined($CGROUP_MODE)) {
+ my ($v1, $v2) = get_cgroup_controllers();
+ if (keys %$v1) {
+ # hybrid or legacy mode
+ $CGROUP_MODE = 1;
+ } elsif ($v2) {
+ $CGROUP_MODE = 2;
+ }
+ }
+
+ die "unknown cgroup mode\n" if !defined($CGROUP_MODE);
+ return $CGROUP_MODE;
+}
+
+my $CGROUPV2_PATH = undef;
+sub cgroupv2_base_path() {
+ if (!defined($CGROUPV2_PATH)) {
+ if (cgroup_mode() == 2) {
+ $CGROUPV2_PATH = '/sys/fs/cgroup';
+ } else {
+ $CGROUPV2_PATH = '/sys/fs/cgroup/unified';
+ }
+ }
+ return $CGROUPV2_PATH;
+}
+
+# Find a cgroup controller and return its path and version.
+#
+# LXC initializes the unified hierarchy first, so if a controller is
+# available via both we favor cgroupv2 here as well.
+#
+# Returns nothing if the controller is not available.
+sub find_cgroup_controller($) {
+ my ($controller) = @_;
+
+ my ($v1, $v2) = get_cgroup_controllers();
+
+ if (!defined($controller) || $v2->{$controller}) {
+ my $path = cgroupv2_base_path();
+ return wantarray ? ($path, 2) : $path;
+ }
+
+ if (defined($controller) && $v1->{$controller}) {
+ my $path = "/sys/fs/cgroup/$controller";
+ return wantarray ? ($path, 1) : $path;
+ }
+
+ return;
+}
+
+my $CG_PATH_CPUSET = undef;
+my $CG_VER_CPUSET = undef;
+# Find the cpuset cgroup controller.
+#
+# This is a function, not a method!
+sub cpuset_controller_path() {
+ if (!defined($CG_PATH_CPUSET)) {
+ ($CG_PATH_CPUSET, $CG_VER_CPUSET) = find_cgroup_controller('cpuset')
+ or die "failed to find cpuset controller\n";
+ }
+
+ return wantarray ? ($CG_PATH_CPUSET, $CG_VER_CPUSET) : $CG_PATH_CPUSET;
+}
+
+# Get a subdirectory (without the cgroup mount point) for a controller.
+#
+# If `$controller` is `undef`, get the unified (cgroupv2) path.
+#
+# Note that in cgroup v2, lxc uses the activated controller names
+# (`cgroup.controllers` file) as list of controllers for the unified hierarchy,
+# so this returns a result when a `controller` is provided even when using
+# a pure cgroupv2 setup.
+my sub get_subdir {
+ my ($self, $controller, $limiting) = @_;
+
+ my $entry_name = $controller || 'unified';
+ my $entry = ($self->{controllers}->{$entry_name} //= {});
+
+ my $kind = $limiting ? 'limit' : 'ns';
+ my $path = $entry->{$kind};
+
+ return $path if defined $path;
+
+ $path = PVE::LXC::Command::get_cgroup_path(
+ $self->{vmid},
+ $controller,
+ $limiting,
+ ) or return undef;
+
+ # untaint:
+ if ($path =~ /\.\./) {
+ die "lxc returned suspicious path: '$path'\n";
+ }
+ ($path) = ($path =~ /^(.*)$/s);
+
+ $entry->{$kind} = $path;
+
+ return $path;
+}
+
+# Get path and version for a controller.
+#
+# `$controller` may be `undef`, see get_subdir above for details.
+#
+# Returns either just the path, or the path and cgroup version as a tuple.
+sub get_path {
+ my ($self, $controller, $limiting) = @_;
+
+ # Find the controller before querying the lxc monitor via a socket:
+ my ($cgpath, $ver) = find_cgroup_controller($controller)
+ or return undef;
+
+ my $path = get_subdir($self, $controller, $limiting)
+ or return undef;
+
+ $path = "$cgpath/$path";
+ return wantarray ? ($path, $ver) : $path;
+}
+
+# Convenience method to get the path info of the first existing controller.
+#
+# Returns the same as `get_path`.
+sub get_any_path {
+ my ($self, $limiting, @controllers) = @_;
+
+ my ($path, $ver);
+ for my $c (@controllers) {
+ ($path, $ver) = $self->get_path($c, $limiting);
+ last if defined $path;
+ }
+ return wantarray ? ($path, $ver) : $path;
+}
+
+# Parse a 'Nested keyed' file:
+#
+# See kernel documentation `admin-guide/cgroup-v2.rst` 4.1.
+my sub parse_nested_keyed_file($) {
+ my ($data) = @_;
+ my $res = {};
+ foreach my $line (split(/\n/, $data)) {
+ my ($key, @values) = split(/\s+/, $line);
+
+ my $d = ($res->{$key} = {});
+
+ foreach my $value (@values) {
+ if (my ($key, $value) = ($value =~ /^([^=]+)=(.*)$/)) {
+ $d->{$key} = $value;
+ } else {
+ warn "bad key=value pair in nested keyed file\n";
+ }
+ }
+ }
+ return $res;
+}
+
+# Parse a 'Flat keyed' file:
+#
+# See kernel documentation `admin-guide/cgroup-v2.rst` 4.1.
+my sub parse_flat_keyed_file($) {
+ my ($data) = @_;
+ my $res = {};
+ foreach my $line (split(/\n/, $data)) {
+ if (my ($key, $value) = ($line =~ /^(\S+)\s+(.*)$/)) {
+ $res->{$key} = $value;
+ } else {
+ warn "bad 'key value' pair in flat keyed file\n";
+ }
+ }
+ return $res;
+}
+
+# Parse out 'diskread' and 'diskwrite' values from I/O stats for this container.
+sub get_io_stats {
+ my ($self) = @_;
+
+ my $res = {
+ diskread => 0,
+ diskwrite => 0,
+ };
+
+ # With cgroupv1 we have a 'blkio' controller, with cgroupv2 it's just 'io':
+ my ($path, $ver) = $self->get_any_path(1, 'io', 'blkio');
+ if (!defined($path)) {
+ # container not running
+ return undef;
+ } elsif ($ver == 2) {
+ # cgroupv2 environment, io controller enabled
+ my $io_stat = file_get_contents("$path/io.stat");
+
+ my $data = parse_nested_keyed_file($io_stat);
+ foreach my $dev (keys %$data) {
+ my $dev = $data->{$dev};
+ if (my $b = $dev->{rbytes}) {
+ $res->{diskread} += $b;
+ }
+ if (my $b = $dev->{wbytes}) {
+ $res->{diskwrite} += $b;
+ }
+ }
+
+ return $res;
+ } elsif ($ver == 1) {
+ # cgroupv1 environment:
+ my $io = file_get_contents("$path/blkio.throttle.io_service_bytes_recursive");
+ foreach my $line (split(/\n/, $io)) {
+ if (my ($type, $bytes) = ($line =~ /^\d+:\d+\s+(Read|Write)\s+(\d+)$/)) {
+ $res->{diskread} += $bytes if $type eq 'Read';
+ $res->{diskwrite} += $bytes if $type eq 'Write';
+ }
+ }
+
+ return $res;
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ # container not running
+ return undef;
+}
+
+# Read utime and stime for this container from the cpuacct cgroup.
+# Values are in milliseconds!
+sub get_cpu_stat {
+ my ($self) = @_;
+
+ my $res = {
+ utime => 0,
+ stime => 0,
+ };
+
+ my ($path, $ver) = $self->get_any_path(1, 'cpuacct', 'cpu');
+ if (!defined($path)) {
+ # container not running
+ return undef;
+ } elsif ($ver == 2) {
+ my $data = eval { file_get_contents("$path/cpu.stat") };
+
+ # or no cpu controller available:
+ return undef if !defined($data);
+
+ $data = parse_flat_keyed_file($data);
+ $res->{utime} = int($data->{user_usec} / 1000);
+ $res->{stime} = int($data->{system_usec} / 1000);
+ } elsif ($ver == 1) {
+ # cgroupv1 environment:
+ my $clock_ticks = POSIX::sysconf(&POSIX::_SC_CLK_TCK);
+ my $clk_to_usec = 1000 / $clock_ticks;
+
+ my $data = parse_flat_keyed_file(file_get_contents("$path/cpuacct.stat"));
+ $res->{utime} = int($data->{user} * $clk_to_usec);
+ $res->{stime} = int($data->{system} * $clk_to_usec);
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ return $res;
+}
+
+# Parse some memory data from `memory.stat`
+sub get_memory_stat {
+ my ($self) = @_;
+
+ my $res = {
+ mem => 0,
+ swap => 0,
+ };
+
+ my ($path, $ver) = $self->get_path('memory', 1);
+ if (!defined($path)) {
+ # container most likely isn't running
+ return undef;
+ } elsif ($ver == 2) {
+ my $mem = file_get_contents("$path/memory.current");
+ my $swap = file_get_contents("$path/memory.swap.current");
+
+ chomp ($mem, $swap);
+
+ # FIXME: For the cgv1 equivalent of `total_cache` we may need to sum up
+ # the values in `memory.stat`...
+
+ $res->{mem} = $mem;
+ $res->{swap} = $swap;
+ } elsif ($ver == 1) {
+ # cgroupv1 environment:
+ my $stat = parse_flat_keyed_file(file_get_contents("$path/memory.stat"));
+ my $mem = file_get_contents("$path/memory.usage_in_bytes");
+ my $memsw = file_get_contents("$path/memory.memsw.usage_in_bytes");
+ chomp ($mem, $memsw);
+
+ $res->{mem} = $mem - $stat->{total_cache};
+ $res->{swap} = $memsw - $mem;
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ return $res;
+}
+
+# Change the memory limit for this container.
+#
+# Dies on error (including a not-running or currently-shutting-down guest).
+sub change_memory_limit {
+ my ($self, $mem_bytes, $swap_bytes) = @_;
+
+ my ($path, $ver) = $self->get_path('memory', 1);
+ if (!defined($path)) {
+ die "trying to change memory cgroup values: container not running\n";
+ } elsif ($ver == 2) {
+ PVE::ProcFSTools::write_proc_entry("$path/memory.swap.max", $swap_bytes)
+ if defined($swap_bytes);
+ PVE::ProcFSTools::write_proc_entry("$path/memory.max", $mem_bytes)
+ if defined($mem_bytes);
+ } elsif ($ver == 1) {
+ # With cgroupv1 we cannot control memory and swap limits separately.
+ # This also means that since the two values aren't independent, we need to handle
+ # growing and shrinking separately.
+ my $path_mem = "$path/memory.limit_in_bytes";
+ my $path_memsw = "$path/memory.memsw.limit_in_bytes";
+
+ my $old_mem_bytes = file_get_contents($path_mem);
+ my $old_memsw_bytes = file_get_contents($path_memsw);
+ chomp($old_mem_bytes, $old_memsw_bytes);
+
+ $mem_bytes //= $old_mem_bytes;
+ $swap_bytes //= $old_memsw_bytes - $old_mem_bytes;
+ my $memsw_bytes = $mem_bytes + $swap_bytes;
+
+ if ($memsw_bytes > $old_memsw_bytes) {
+ # Growing the limit means growing the combined limit first, then pulling the
+ # memory limit up.
+ PVE::ProcFSTools::write_proc_entry($path_memsw, $memsw_bytes);
+ PVE::ProcFSTools::write_proc_entry($path_mem, $mem_bytes);
+ } else {
+ # Shrinking means we first need to shrink the memory-only limit, since memsw
+ # cannot be shrunk below it.
+ PVE::ProcFSTools::write_proc_entry($path_mem, $mem_bytes);
+ PVE::ProcFSTools::write_proc_entry($path_memsw, $memsw_bytes);
+ }
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ # return a truth value
+ return 1;
+}
+
+# Change the cpu quota for a container.
+#
+# Dies on error (including a not-running or currently-shutting-down guest).
+sub change_cpu_quota {
+ my ($self, $quota, $period) = @_;
+
+ die "quota without period not allowed\n" if !defined($period) && defined($quota);
+
+ my ($path, $ver) = $self->get_path('cpu', 1);
+ if (!defined($path)) {
+ die "trying to change cpu quota cgroup values: container not running\n";
+ } elsif ($ver == 2) {
+ # cgroupv2 environment, an undefined (unlimited) quota is defined as "max"
+ # in this interface:
+ $quota //= 'max'; # unlimited
+ if (defined($quota)) {
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.max", "$quota $period");
+ } else {
+ # we're allowed to only write the quota:
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.max", 'max');
+ }
+ } elsif ($ver == 1) {
+ $quota //= -1; # unlimited
+ $period //= -1;
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.cfs_period_us", $period);
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.cfs_quota_us", $quota);
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ # return a truth value
+ return 1;
+}
+
+# Change the cpu "shares" for a container.
+#
+# In cgroupv1 we used a value in `[0..500000]` with a default of 1024.
+#
+# In cgroupv2 we do not have "shares", we have "weights" in the range
+# of `[1..10000]` with a default of 100.
+#
+# Since the default values don't match when scaling linearly, we use the
+# values we get as-is and simply error for values >10000 in cgroupv2.
+#
+# It is left to the user to figure this out for now.
+#
+# Dies on error (including a not-running or currently-shutting-down guest).
+sub change_cpu_shares {
+ my ($self, $shares, $cgroupv1_default) = @_;
+
+ my ($path, $ver) = $self->get_path('cpu', 1);
+ if (!defined($path)) {
+ die "trying to change cpu shares/weight cgroup values: container not running\n";
+ } elsif ($ver == 2) {
+ # the cgroupv2 documentation defines the default as 100
+ $shares //= 100;
+ die "cpu weight (shares) must be in range [1, 10000]\n" if $shares < 1 || $shares > 10000;
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.weight", $shares);
+ } elsif ($ver == 1) {
+ $shares //= 100;
+ PVE::ProcFSTools::write_proc_entry("$path/cpu.shares", $shares // $cgroupv1_default);
+ } else {
+ die "bad cgroup version: $ver\n";
+ }
+
+ # return a truth value
+ return 1;
+}
+
+my sub v1_freeze_thaw {
+ my ($self, $controller_path, $freeze) = @_;
+ my $path = get_subdir($self, 'freezer', 1)
+ or die "trying to freeze container: container not running\n";
+ $path = "$controller_path/$path/freezer.state";
+
+ my $data = $freeze ? 'FROZEN' : 'THAWED';
+ PVE::ProcFSTools::write_proc_entry($path, $data);
+
+ # Here we just poll the freezer.state once per second.
+ while (1) {
+ my $state = file_get_contents($path);
+ chomp $state;
+ last if $state eq $data;
+ }
+}
+
+my sub v2_freeze_thaw {
+ my ($self, $controller_path, $freeze) = @_;
+ my $path = get_subdir($self, undef, 1)
+ or die "trying to freeze container: container not running\n";
+ $path = "$controller_path/$path";
+
+ my $desired_state = $freeze ? 1 : 0;
+
+ # cgroupv2 supports poll events on cgroup.events which contains the frozen
+ # state.
+ my $fh = IO::File->new("$path/cgroup.events", 'r')
+ or die "failed to open $path/cgroup.events file: $!\n";
+ my $select = IO::Select->new();
+ $select->add($fh);
+
+ PVE::ProcFSTools::write_proc_entry("$path/cgroup.freeze", $desired_state);
+ while (1) {
+ my $data = do {
+ local $/ = undef;
+ <$fh>
+ };
+ $data = parse_flat_keyed_file($data);
+ last if $data->{frozen} == $desired_state;
+ my @handles = $select->has_exception();
+ next if !@handles;
+ seek($fh, 0, 0)
+ or die "failed to rewind cgroup.events file: $!\n";
+ }
+}
+
+# Freeze or unfreeze a container.
+#
+# This will freeze the container at its outer (limiting) cgroup path. We use
+# this instead of `lxc-freeze` as `lxc-freeze` from lxc4 will not be able to
+# fetch the cgroup path from containers still running on lxc3.
+sub freeze_thaw {
+ my ($self, $freeze) = @_;
+
+ my $controller_path = find_cgroup_controller('freezer');
+ if (defined($controller_path)) {
+ return v1_freeze_thaw($self, $controller_path, $freeze);
+ } else {
+ # cgroupv2 always has a freezer, there can be both cgv1 and cgv2
+ # freezers, but we'll prefer v1 when it's available as that's what lxc
+ # does as well...
+ return v2_freeze_thaw($self, cgroupv2_base_path(), $freeze);
+ }
+}
+
+1;
--
2.20.1
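For context, a rough usage sketch of the module as it stands after this patch (the vmid and values are examples only; error handling omitted):

    use PVE::CGroup;

    my $cgroup = PVE::CGroup->new(1000);   # container vmid (example)

    # accounting helpers return undef when the container is not running
    my $cpu = $cgroup->get_cpu_stat();     # { utime => ..., stime => ... } in ms
    my $mem = $cgroup->get_memory_stat();  # { mem => ..., swap => ... } in bytes
    my $io  = $cgroup->get_io_stats();     # { diskread => ..., diskwrite => ... }

    # limit changes die when the container is not running,
    # and handle the cgroup v1/v2 differences internally
    $cgroup->change_memory_limit(2 * 1024**3, 512 * 1024**2);
    $cgroup->change_cpu_quota(50000, 100000);
    $cgroup->change_cpu_shares(1024);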
* [pve-devel] [PATCH V2 pve-common 2/4] Cgroups: remove specific lxc code
@ 2020-11-06 8:24 ` Alexandre Derumier
From: Alexandre Derumier @ 2020-11-06 8:24 UTC (permalink / raw)
To: pve-devel
---
src/PVE/CGroup.pm | 43 ++++++-------------------------------------
1 file changed, 6 insertions(+), 37 deletions(-)
diff --git a/src/PVE/CGroup.pm b/src/PVE/CGroup.pm
index 7e12af9..45b9e7c 100644
--- a/src/PVE/CGroup.pm
+++ b/src/PVE/CGroup.pm
@@ -22,8 +22,6 @@ use PVE::Tools qw(
file_read_firstline
);
-use PVE::LXC::Command;
-
# We don't want to do a command socket round trip for every cgroup read/write,
# so any cgroup function needs to have the container's path cached, so this
# package has to be instantiated.
@@ -130,6 +128,7 @@ sub cgroupv2_base_path() {
# available via both we favor cgroupv2 here as well.
#
# Returns nothing if the controller is not available.
+
sub find_cgroup_controller($) {
my ($controller) = @_;
@@ -163,39 +162,10 @@ sub cpuset_controller_path() {
}
# Get a subdirectory (without the cgroup mount point) for a controller.
-#
-# If `$controller` is `undef`, get the unified (cgroupv2) path.
-#
-# Note that in cgroup v2, lxc uses the activated controller names
-# (`cgroup.controllers` file) as list of controllers for the unified hierarchy,
-# so this returns a result when a `controller` is provided even when using
-# a pure cgroupv2 setup.
-my sub get_subdir {
+sub get_subdir {
my ($self, $controller, $limiting) = @_;
- my $entry_name = $controller || 'unified';
- my $entry = ($self->{controllers}->{$entry_name} //= {});
-
- my $kind = $limiting ? 'limit' : 'ns';
- my $path = $entry->{$kind};
-
- return $path if defined $path;
-
- $path = PVE::LXC::Command::get_cgroup_path(
- $self->{vmid},
- $controller,
- $limiting,
- ) or return undef;
-
- # untaint:
- if ($path =~ /\.\./) {
- die "lxc returned suspicious path: '$path'\n";
- }
- ($path) = ($path =~ /^(.*)$/s);
-
- $entry->{$kind} = $path;
-
- return $path;
+ die "implement in subclass";
}
# Get path and version for a controller.
@@ -205,12 +175,11 @@ my sub get_subdir {
# Returns either just the path, or the path and cgroup version as a tuple.
sub get_path {
my ($self, $controller, $limiting) = @_;
-
# Find the controller before querying the lxc monitor via a socket:
my ($cgpath, $ver) = find_cgroup_controller($controller)
or return undef;
- my $path = get_subdir($self, $controller, $limiting)
+ my $path = $self->get_subdir($controller, $limiting)
or return undef;
$path = "$cgpath/$path";
@@ -515,7 +484,7 @@ sub change_cpu_shares {
my sub v1_freeze_thaw {
my ($self, $controller_path, $freeze) = @_;
- my $path = get_subdir($self, 'freezer', 1)
+ my $path = $self->get_subdir('freezer', 1)
or die "trying to freeze container: container not running\n";
$path = "$controller_path/$path/freezer.state";
@@ -532,7 +501,7 @@ my sub v1_freeze_thaw {
my sub v2_freeze_thaw {
my ($self, $controller_path, $freeze) = @_;
- my $path = get_subdir($self, undef, 1)
+ my $path = $self->get_subdir(undef, 1)
or die "trying to freeze container: container not running\n";
$path = "$controller_path/$path";
--
2.20.1
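With get_subdir now dying in the base class, consumers are expected to subclass PVE::CGroup. Roughly how the container side might look (a sketch only; the real implementation lives in the pve-container and qemu-server patches linked later in the thread):

    package PVE::LXC::CGroup;

    use strict;
    use warnings;

    use parent 'PVE::CGroup';

    use PVE::LXC::Command;

    # resolve (and cache) the container's cgroup path via the lxc command
    # socket, like the code removed from the base class used to do
    sub get_subdir {
        my ($self, $controller, $limiting) = @_;

        my $entry_name = $controller || 'unified';
        my $entry = ($self->{controllers}->{$entry_name} //= {});
        my $kind = $limiting ? 'limit' : 'ns';

        return $entry->{$kind} //= PVE::LXC::Command::get_cgroup_path(
            $self->{vmid}, $controller, $limiting);
    }

    1;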
* [pve-devel] [PATCH V2 pve-common 3/4] bugfix: cpushares : default value is 1024 for cgroup v1
@ 2020-11-06 8:24 ` Alexandre Derumier
From: Alexandre Derumier @ 2020-11-06 8:24 UTC (permalink / raw)
To: pve-devel
Currently, if we delete cpuunits (the value becomes undef), the default is set to 100 instead of 1024.
---
src/PVE/CGroup.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/PVE/CGroup.pm b/src/PVE/CGroup.pm
index 45b9e7c..71d0846 100644
--- a/src/PVE/CGroup.pm
+++ b/src/PVE/CGroup.pm
@@ -472,7 +472,7 @@ sub change_cpu_shares {
die "cpu weight (shares) must be in range [1, 10000]\n" if $shares < 1 || $shares > 10000;
PVE::ProcFSTools::write_proc_entry("$path/cpu.weight", $shares);
} elsif ($ver == 1) {
- $shares //= 100;
+ $shares //= 1024;
PVE::ProcFSTools::write_proc_entry("$path/cpu.shares", $shares // $cgroupv1_default);
} else {
die "bad cgroup version: $ver\n";
--
2.20.1
* [pve-devel] [PATCH V2 pve-common 4/4] systemd: add CPUWeight encoding
@ 2020-11-06 8:24 ` Alexandre Derumier
From: Alexandre Derumier @ 2020-11-06 8:24 UTC (permalink / raw)
To: pve-devel
---
src/PVE/Systemd.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
index 85b35a3..ed13a60 100644
--- a/src/PVE/Systemd.pm
+++ b/src/PVE/Systemd.pm
@@ -105,7 +105,7 @@ sub enter_systemd_scope {
foreach my $key (keys %extra) {
if ($key eq 'Slice' || $key eq 'KillMode') {
push @{$properties}, [$key, $extra{$key}];
- } elsif ($key eq 'CPUShares') {
+ } elsif ($key eq 'CPUShares' || $key eq 'CPUWeight') {
push @{$properties}, [$key, dbus_uint64($extra{$key})];
} elsif ($key eq 'CPUQuota') {
push @{$properties}, ['CPUQuotaPerSecUSec',
--
2.20.1
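For illustration, a caller could then pass either property depending on the detected cgroup mode. A sketch only, assuming the usual ($unit, $description, %extra) signature of enter_systemd_scope; the scope name and values are made up:

    use PVE::Systemd;
    use PVE::CGroup;

    my %properties;
    if (PVE::CGroup::cgroup_mode() == 2) {
        $properties{CPUWeight} = 100;      # cgroupv2 weight, range [1..10000]
    } else {
        $properties{CPUShares} = 1024;     # cgroupv1 shares default
    }

    PVE::Systemd::enter_systemd_scope('qemu-1000', 'example scope', %properties);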
* [pve-devel] applied-series: [PATCH V2 pve-common 0/4] add generic CGroup module
@ 2020-11-06 8:31 ` Thomas Lamprecht
From: Thomas Lamprecht @ 2020-11-06 8:31 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
On 06.11.20 09:24, Alexandre Derumier wrote:
> This moves the CGroup module from LXC to pve-common,
> so it can be used in qemu-server too
> (and adds cgroupv2 support for qemu).
>
> I have also included a bugfix for cpushares with cgroup v1:
> when the value is not defined, it is currently set to 100 instead of 1024.
> (This can be triggered by pct --delete cpuunits.)
>
>
> Changelog v2:
>
> - keep get_subdir in base class
>
> Alexandre Derumier (4):
> move PVE::LXC::CGroup to PVE::CGroup
> Cgroups: remove specific lxc code
> bugfix: cpushares : default value is 1024 for cgroup v1
> systemd: add CPUWeight encoding
>
> src/Makefile | 1 +
> src/PVE/CGroup.pm | 551 +++++++++++++++++++++++++++++++++++++++++++++
> src/PVE/Systemd.pm | 2 +-
> 3 files changed, 553 insertions(+), 1 deletion(-)
> create mode 100644 src/PVE/CGroup.pm
>
applied series, thanks!
* Re: [pve-devel] applied-series: [PATCH V2 pve-common 0/4] add generic CGroup module
@ 2020-11-08 14:10 ` Alexandre Derumier
From: Alexandre Derumier @ 2020-11-08 14:10 UTC (permalink / raw)
To: Thomas Lamprecht; +Cc: Proxmox VE development discussion
Thanks Thomas,
don't forget the related pve-container && qemu-server patches:
https://lists.proxmox.com/pipermail/pve-devel/2020-October/045714.html
https://lists.proxmox.com/pipermail/pve-devel/2020-October/045720.html
Le ven. 6 nov. 2020 à 09:31, Thomas Lamprecht
<t.lamprecht@proxmox.com> a écrit :
>
> On 06.11.20 09:24, Alexandre Derumier wrote:
> > This moves the CGroup module from LXC to pve-common,
> > so it can be used in qemu-server too
> > (and adds cgroupv2 support for qemu).
> >
> > I have also included a bugfix for cpushares with cgroup v1:
> > when the value is not defined, it is currently set to 100 instead of 1024.
> > (This can be triggered by pct --delete cpuunits.)
> >
> >
> > Changelog v2:
> >
> > - keep get_subdir in base class
> >
> > Alexandre Derumier (4):
> > move PVE::LXC::CGroup to PVE::CGroup
> > Cgroups: remove specific lxc code
> > bugfix: cpushares : default value is 1024 for cgroup v1
> > systemd: add CPUWeight encoding
> >
> > src/Makefile | 1 +
> > src/PVE/CGroup.pm | 551 +++++++++++++++++++++++++++++++++++++++++++++
> > src/PVE/Systemd.pm | 2 +-
> > 3 files changed, 553 insertions(+), 1 deletion(-)
> > create mode 100644 src/PVE/CGroup.pm
> >
>
>
>
> applied series, thanks!
>