public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit
@ 2023-06-20 15:00 Fiona Ebner
  2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
  2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fiona Ebner @ 2023-06-20 15:00 UTC (permalink / raw)
  To: pve-devel

which combines the stop+wait logic previously present at the single
call site of wait_for_unit_removed() in QemuServer.pm. It also does a
reset-failed call first, to ensure a unit in a failed state is also
cleaned up properly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/Systemd.pm | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
index 2517d31..327106f 100644
--- a/src/PVE/Systemd.pm
+++ b/src/PVE/Systemd.pm
@@ -7,7 +7,7 @@ use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
 use Net::DBus::Callback;
 use Net::DBus::Reactor;
 
-use PVE::Tools qw(file_set_contents file_get_contents trim);
+use PVE::Tools qw(file_set_contents file_get_contents run_command trim);
 
 sub escape_unit {
     my ($val, $is_path) = @_;
@@ -167,6 +167,20 @@ sub wait_for_unit_removed($;$) {
     }, $timeout);
 }
 
+sub cleanup_transient_unit($;$) {
+    my ($unit, $timeout) = @_;
+
+    eval {
+	my %param = ( outfunc => sub {}, errfunc => sub {} );
+	# If the unit is in a failed state (e.g. after being OOM-killed), stopping is not enough.
+	run_command(['/bin/systemctl', 'reset-failed', $unit], %param);
+	run_command(['/bin/systemctl', 'stop', $unit], %param);
+    };
+
+    # Issues with the above not being fully completed are rare, but not impossible, see bug #3733.
+    wait_for_unit_removed($unit, $timeout);
+}
+
 sub read_ini {
     my ($filename) = @_;
 
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper
  2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
@ 2023-06-20 15:00 ` Fiona Ebner
  2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2023-06-20 15:00 UTC (permalink / raw)
  To: pve-devel

which also runs a reset-failed command for the unit first, to ensure
it is also cleaned up properly if in a failed state (e.g. after being
OOM-killed). Previuosly, users in that situation would only see the
less than ideal error message 'timeout waiting on systemd'.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Dependency bump for libpve-common-perl needed.

 PVE/QemuServer.pm            | 7 +------
 test/MigrationTest/Shared.pm | 2 +-
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 940cdacd..34d9b0b1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5871,12 +5871,7 @@ sub vm_start_nolock {
 
     PVE::Storage::activate_volumes($storecfg, $vollist);
 
-    eval {
-	run_command(['/bin/systemctl', 'stop', "$vmid.scope"], outfunc => sub{}, errfunc => sub{});
-    };
-    # Issues with the above 'stop' not being fully completed are extremely rare, a very low
-    # timeout should be more than enough here...
-    PVE::Systemd::wait_for_unit_removed("$vmid.scope", 20);
+    PVE::Systemd::cleanup_transient_unit("$vmid.scope", 20);
 
     my $cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{cpuunits});
 
diff --git a/test/MigrationTest/Shared.pm b/test/MigrationTest/Shared.pm
index aa7203d1..44624f7f 100644
--- a/test/MigrationTest/Shared.pm
+++ b/test/MigrationTest/Shared.pm
@@ -191,7 +191,7 @@ $storage_plugin_module->mock(
 
 our $systemd_module = Test::MockModule->new("PVE::Systemd");
 $systemd_module->mock(
-    wait_for_unit_removed => sub {
+    cleanup_transient_unit => sub {
 	return;
     },
     enter_systemd_scope => sub {
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit
  2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
  2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
@ 2023-06-22 12:56 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2023-06-22 12:56 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Am 20/06/2023 um 17:00 schrieb Fiona Ebner:
> which combines the stop+wait logic previously present at the single
> call site of wait_for_unit_removed() in QemuServer.pm. It also does a
> reset-failed call first, to ensure a unit in a failed state is also
> cleaned up properly.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  src/PVE/Systemd.pm | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
> index 2517d31..327106f 100644
> --- a/src/PVE/Systemd.pm
> +++ b/src/PVE/Systemd.pm
> @@ -7,7 +7,7 @@ use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
>  use Net::DBus::Callback;
>  use Net::DBus::Reactor;
>  
> -use PVE::Tools qw(file_set_contents file_get_contents trim);
> +use PVE::Tools qw(file_set_contents file_get_contents run_command trim);
>  
>  sub escape_unit {
>      my ($val, $is_path) = @_;
> @@ -167,6 +167,20 @@ sub wait_for_unit_removed($;$) {
>      }, $timeout);
>  }
>  
> +sub cleanup_transient_unit($;$) {
> +    my ($unit, $timeout) = @_;
> +
> +    eval {
> +	my %param = ( outfunc => sub {}, errfunc => sub {} );
> +	# If the unit is in a failed state (e.g. after being OOM-killed), stopping is not enough.
> +	run_command(['/bin/systemctl', 'reset-failed', $unit], %param);
> +	run_command(['/bin/systemctl', 'stop', $unit], %param);
> +    };
> +
> +    # Issues with the above not being fully completed are rare, but not impossible, see bug #3733.
> +    wait_for_unit_removed($unit, $timeout);
> +}
> +
>  sub read_ini {
>      my ($filename) = @_;
>  

for the record, this the same in qemu-server directly for now, as talked off list,
didn't remembered that we got a run_command there already, still big thanks for
finding this!




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-06-22 12:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal