all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit
@ 2023-06-20 15:00 Fiona Ebner
  2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
  2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fiona Ebner @ 2023-06-20 15:00 UTC (permalink / raw)
  To: pve-devel

which combines the stop+wait logic previously present at the single
call site of wait_for_unit_removed() in QemuServer.pm. It also does a
reset-failed call first, to ensure a unit in a failed state is also
cleaned up properly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/Systemd.pm | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
index 2517d31..327106f 100644
--- a/src/PVE/Systemd.pm
+++ b/src/PVE/Systemd.pm
@@ -7,7 +7,7 @@ use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
 use Net::DBus::Callback;
 use Net::DBus::Reactor;
 
-use PVE::Tools qw(file_set_contents file_get_contents trim);
+use PVE::Tools qw(file_set_contents file_get_contents run_command trim);
 
 sub escape_unit {
     my ($val, $is_path) = @_;
@@ -167,6 +167,20 @@ sub wait_for_unit_removed($;$) {
     }, $timeout);
 }
 
+sub cleanup_transient_unit($;$) {
+    my ($unit, $timeout) = @_;
+
+    eval {
+	my %param = ( outfunc => sub {}, errfunc => sub {} );
+	# If the unit is in a failed state (e.g. after being OOM-killed), stopping is not enough.
+	run_command(['/bin/systemctl', 'reset-failed', $unit], %param);
+	run_command(['/bin/systemctl', 'stop', $unit], %param);
+    };
+
+    # Issues with the above not being fully completed are rare, but not impossible, see bug #3733.
+    wait_for_unit_removed($unit, $timeout);
+}
+
 sub read_ini {
     my ($filename) = @_;
 
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper
  2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
@ 2023-06-20 15:00 ` Fiona Ebner
  2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2023-06-20 15:00 UTC (permalink / raw)
  To: pve-devel

which also runs a reset-failed command for the unit first, to ensure
it is also cleaned up properly if in a failed state (e.g. after being
OOM-killed). Previuosly, users in that situation would only see the
less than ideal error message 'timeout waiting on systemd'.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Dependency bump for libpve-common-perl needed.

 PVE/QemuServer.pm            | 7 +------
 test/MigrationTest/Shared.pm | 2 +-
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 940cdacd..34d9b0b1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5871,12 +5871,7 @@ sub vm_start_nolock {
 
     PVE::Storage::activate_volumes($storecfg, $vollist);
 
-    eval {
-	run_command(['/bin/systemctl', 'stop', "$vmid.scope"], outfunc => sub{}, errfunc => sub{});
-    };
-    # Issues with the above 'stop' not being fully completed are extremely rare, a very low
-    # timeout should be more than enough here...
-    PVE::Systemd::wait_for_unit_removed("$vmid.scope", 20);
+    PVE::Systemd::cleanup_transient_unit("$vmid.scope", 20);
 
     my $cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{cpuunits});
 
diff --git a/test/MigrationTest/Shared.pm b/test/MigrationTest/Shared.pm
index aa7203d1..44624f7f 100644
--- a/test/MigrationTest/Shared.pm
+++ b/test/MigrationTest/Shared.pm
@@ -191,7 +191,7 @@ $storage_plugin_module->mock(
 
 our $systemd_module = Test::MockModule->new("PVE::Systemd");
 $systemd_module->mock(
-    wait_for_unit_removed => sub {
+    cleanup_transient_unit => sub {
 	return;
     },
     enter_systemd_scope => sub {
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit
  2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
  2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
@ 2023-06-22 12:56 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2023-06-22 12:56 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Am 20/06/2023 um 17:00 schrieb Fiona Ebner:
> which combines the stop+wait logic previously present at the single
> call site of wait_for_unit_removed() in QemuServer.pm. It also does a
> reset-failed call first, to ensure a unit in a failed state is also
> cleaned up properly.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  src/PVE/Systemd.pm | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
> index 2517d31..327106f 100644
> --- a/src/PVE/Systemd.pm
> +++ b/src/PVE/Systemd.pm
> @@ -7,7 +7,7 @@ use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
>  use Net::DBus::Callback;
>  use Net::DBus::Reactor;
>  
> -use PVE::Tools qw(file_set_contents file_get_contents trim);
> +use PVE::Tools qw(file_set_contents file_get_contents run_command trim);
>  
>  sub escape_unit {
>      my ($val, $is_path) = @_;
> @@ -167,6 +167,20 @@ sub wait_for_unit_removed($;$) {
>      }, $timeout);
>  }
>  
> +sub cleanup_transient_unit($;$) {
> +    my ($unit, $timeout) = @_;
> +
> +    eval {
> +	my %param = ( outfunc => sub {}, errfunc => sub {} );
> +	# If the unit is in a failed state (e.g. after being OOM-killed), stopping is not enough.
> +	run_command(['/bin/systemctl', 'reset-failed', $unit], %param);
> +	run_command(['/bin/systemctl', 'stop', $unit], %param);
> +    };
> +
> +    # Issues with the above not being fully completed are rare, but not impossible, see bug #3733.
> +    wait_for_unit_removed($unit, $timeout);
> +}
> +
>  sub read_ini {
>      my ($filename) = @_;
>  

for the record, this the same in qemu-server directly for now, as talked off list,
didn't remembered that we got a run_command there already, still big thanks for
finding this!




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-06-22 12:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-20 15:00 [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Fiona Ebner
2023-06-20 15:00 ` [pve-devel] [PATCH qemu-server] vm start: switch to new cleanup_transient_unit systemd helper Fiona Ebner
2023-06-22 12:56 ` [pve-devel] [PATCH common] systemd: add helper to cleanup transient unit Thomas Lamprecht

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal