public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter
@ 2022-12-16 13:36 Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH common v3 1/5] VM start timeout config parameter in backend Daniel Tschlatscher
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

This patch series adds a an option to specify a timeout value in the
config for starting VMs.
The minimum is 0 seconds, which disables the timeout completely. The
maximum is 86400 seconds, equivalent to 1 day.

The timeout value can also be passed via the CLI, which always
overrides the setting. If no value for timeout is passed, the timeout
will either be read from the config, or if unavailable, chosen
heuristically like before, with a default of 30 seconds in most cases.

For this, a new property string called "startoptions" is added.
Currently only the VM start timeout is configurable with it.


Changes from v2
* Rebased to current repository masters
* Some minor code cleanups
* Reordered the commits and added a cover letter


pve-common:

Daniel Tschlatscher (1):
  VM start timeout config parameter

 src/PVE/JSONSchema.pm | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

qemu-server:

Daniel Tschlatscher (3):
  fix #3502: VM start timeout config parameter
  await and kill lingering KVM thread when VM start reaches timeout
  make the timeout value editable when the VM is locked

 PVE/API2/Qemu.pm          | 27 ++++++++++++++++++++++++++
 PVE/QemuServer.pm         | 41 ++++++++++++++++++++++++++++++++-------
 PVE/QemuServer/Helpers.pm |  4 ++++
 3 files changed, 65 insertions(+), 7 deletions(-)

pve-manager:

Daniel Tschlatscher (1):
  VM start Timeout "Options" parameter in the GUI

 www/manager6/qemu/Options.js | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH common v3 1/5] VM start timeout config parameter in backend
  2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
@ 2022-12-16 13:36 ` Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 2/5] expose VM start timeout config setting in API Daniel Tschlatscher
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

This allows setting the 'startoptions' property string in the config.
For now this only implements the 'timeout' parameter but should be
rather easily extensible and allow related VM start config options
to be also configurable here.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---

Changes from v2:
* Reworded the API description somewhat

 src/PVE/JSONSchema.pm | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/src/PVE/JSONSchema.pm b/src/PVE/JSONSchema.pm
index 527e409..64dc01b 100644
--- a/src/PVE/JSONSchema.pm
+++ b/src/PVE/JSONSchema.pm
@@ -640,6 +640,17 @@ sub pve_verify_startup_order {
     die "unable to parse startup options\n";
 }
 
+register_format('start-options', \&pve_verify_startup_options);
+sub pve_verify_startup_options {
+    my ($value, $noerr) = @_;
+
+    return $value if pve_parse_startup_options($value);
+
+    return undef if $noerr;
+
+    die "unable to parse vm start options\n";
+}
+
 my %bwlimit_opt = (
     optional => 1,
     type => 'number', minimum => '0',
@@ -748,6 +759,33 @@ PVE::JSONSchema::register_standard_option('pve-startup-order', {
     typetext => '[[order=]\d+] [,up=\d+] [,down=\d+] ',
 });
 
+sub pve_parse_startup_options {
+    my ($value) = @_;
+
+    return undef if !$value;
+
+    my $res = {};
+
+    foreach my $p (split(/,/, $value)) {
+	next if $p =~ m/^\s*$/;
+
+	if ($p =~ m/^timeout=(\d+)$/ && int($1) <= 86400) {
+	    $res->{timeout} = $1;
+	} else {
+	    return undef;
+	}
+    }
+
+    return $res;
+}
+
+register_standard_option('start-options', {
+    description => "Start up options for the VM. This only allows setting the VM start timeout for now, which is the maximum VM startup timeout in seconds. The maximum value for timeout is 86400, the minimum 0, which disables the timeout completely. If timeout is unset, the timeout will either be the memory of the VM in GiBs or 30, depending on which is higher. If unset and hibernated, the value will at least be 300 seconds, with hugepages at least 150 seconds.",
+    optional => 1,
+    type => 'string', format => 'start-options',
+    typetext => 'timeout=\d+',
+});
+
 register_format('pve-tfa-secret', \&pve_verify_tfa_secret);
 sub pve_verify_tfa_secret {
     my ($key, $noerr) = @_;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH qemu-server v3 2/5] expose VM start timeout config setting in API
  2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH common v3 1/5] VM start timeout config parameter in backend Daniel Tschlatscher
@ 2022-12-16 13:36 ` Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout Daniel Tschlatscher
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

This patch makes it possible to now set the starting timeout value for
a VM via the config parameter 'startoptions'.
Now, if the timeout parameter is set, it will override the heuristic
calculation of the VM start timeout.

The maximum value for the timeout parameter is 86400 seconds, which is
one day. The minimum value is 0, which disables the timeout.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---

Changes from v2:
* No changes

 PVE/API2/Qemu.pm          | 2 ++
 PVE/QemuServer.pm         | 1 +
 PVE/QemuServer/Helpers.pm | 4 ++++
 3 files changed, 7 insertions(+)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index badfc37..8375ab9 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -541,6 +541,7 @@ my $generaloptions = {
     'reboot' => 1,
     'startdate' => 1,
     'startup' => 1,
+    'startoptions' => 1,
     'tdf' => 1,
     'template' => 1,
 };
@@ -2647,6 +2648,7 @@ __PACKAGE__->register_method({
 		description => "Wait maximal timeout seconds.",
 		type => 'integer',
 		minimum => 0,
+		maximum => 86400,
 		default => 'max(30, vm memory in GiB)',
 		optional => 1,
 	    },
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a746b3d..2adbe3a 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -716,6 +716,7 @@ EODESCR
 	description => "List of host cores used to execute guest processes, for example: 0,5,8-11",
 	optional => 1,
     },
+    startoptions => get_standard_option('start-options'),
 };
 
 my $cicustom_fmt = {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index e91f906..1fa9011 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -144,6 +144,10 @@ sub version_cmp {
 
 sub config_aware_timeout {
     my ($config, $is_suspended) = @_;
+
+    my $startup = PVE::JSONSchema::pve_parse_startup_options($config->{startoptions});
+    return $startup->{timeout} if defined($startup->{timeout});
+
     my $memory = $config->{memory};
     my $timeout = 30;
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout
  2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH common v3 1/5] VM start timeout config parameter in backend Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 2/5] expose VM start timeout config setting in API Daniel Tschlatscher
@ 2022-12-16 13:36 ` Daniel Tschlatscher
  2022-12-21 11:14   ` Fabian Grünbichler
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 4/5] make the timeout value editable when the VM is locked Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH manager v3 5/5] VM start Timeout "Options" parameter in the GUI Daniel Tschlatscher
  4 siblings, 1 reply; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

In some cases the VM API start method would return before the detached
KVM process would have exited. This is especially problematic with HA,
because the HA manager would think the VM started successfully, later
see that it exited and start it again in an endless loop.

Moreover, another case exists when resuming a hibernated VM. In this
case, the qemu thread will attempt to load the whole vmstate into
memory before exiting.
Depending on vmstate size, disk read speed, and similar factors this
can take quite a while though and it is not possible to start the VM
normally during this time.

To get around this, this patch intercepts the error, looks whether a
corresponding KVM thread is still running, and waits for/kills it,
before continuing.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---

Changes from v2:
* Rebased to current master
* Changed warn to use 'log_warn' instead
* Reworded log message when waiting for lingering qemu process

 PVE/QemuServer.pm | 40 +++++++++++++++++++++++++++++++++-------
 1 file changed, 33 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 2adbe3a..f63dc3f 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5884,15 +5884,41 @@ sub vm_start_nolock {
 		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
 	    }
 
-	    my $exitcode = run_command($cmd, %run_params);
-	    if ($exitcode) {
-		if ($tpmpid) {
-		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
-		    kill 'TERM', $tpmpid;
+	    eval {
+		my $exitcode = run_command($cmd, %run_params);
+
+		if ($exitcode) {
+		    if ($tpmpid) {
+			log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
+			kill 'TERM', $tpmpid;
+		    }
+		    die "QEMU exited with code $exitcode\n";
 		}
-		die "QEMU exited with code $exitcode\n";
+	    };
+
+	    if (my $err = $@) {
+		my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
+
+		if ($pid ne "") {
+		    my $count = 0;
+		    my $timeout = 300;
+
+		    print "Waiting $timeout seconds for detached qemu process $pid to exit\n";
+		    while (($count < $timeout) &&
+			PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+			$count++;
+			sleep(1);
+		    }
+
+		    if ($count >= $timeout) {
+			log_warn "Reached timeout. Terminating now with SIGKILL\n";
+			kill(9, $pid);
+		    }
+		}
+
+		die $err;
 	    }
-	};
+	}
     };
 
     if ($conf->{hugepages}) {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH qemu-server v3 4/5] make the timeout value editable when the VM is locked
  2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
                   ` (2 preceding siblings ...)
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout Daniel Tschlatscher
@ 2022-12-16 13:36 ` Daniel Tschlatscher
  2022-12-16 13:36 ` [pve-devel] [PATCH manager v3 5/5] VM start Timeout "Options" parameter in the GUI Daniel Tschlatscher
  4 siblings, 0 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

In some cases the VM could no longer start when the timeout value was
set and afterwards, for example, hibernated. In this case the VM is
soft locked in the GUI, because the API would not allow changing the
timeout value anymore. (The only way out here would be to change the
value manually in the config)

To avoid unwanted side effects, it is possible to change the value for
the new 'startoptions' parameter, only if the VM is currently locked
with lock 'suspended'.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---

Changes from v2:
* Rebased to current master

 PVE/API2/Qemu.pm | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 8375ab9..a160440 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -11,6 +11,7 @@ use JSON;
 use URI::Escape;
 use Crypt::OpenSSL::Random;
 use Socket qw(SOCK_STREAM);
+use List::Util qw(first);
 
 use PVE::APIClient::LWP;
 use PVE::CGroup;
@@ -650,6 +651,28 @@ my $check_vm_modify_config_perm = sub {
     return 1;
 };
 
+# Certain parameter fields should still be editable and deletable when the VM is locked
+# Returns true only if all parameter fields to be edited or deleted are defined in @allowed
+sub skiplock_for_allowed_fields {
+    my ($param, @deleted) = @_;
+
+    my @allowed = qw"startoptions";
+    my $skiplock = 1;
+
+    my @to_check = @deleted;
+    for (keys %$param) {
+	push(@to_check, $_);
+    }
+
+    my $idx = 0;
+    while ($skiplock && $idx < keys @to_check) {
+	$skiplock &= defined(first { $_ eq $to_check[$idx] } @allowed);
+	$idx++;
+    }
+
+    return $skiplock;
+}
+
 __PACKAGE__->register_method({
     name => 'vmlist',
     path => '',
@@ -1598,6 +1621,8 @@ my $update_vm_api  = sub {
 	    push @delete, 'runningcpu' if $conf->{runningcpu};
 	}
 
+	$skiplock |= $conf->{lock} eq "suspended" && skiplock_for_allowed_fields($param, @delete);
+
 	PVE::QemuConfig->check_lock($conf) if !$skiplock;
 
 	foreach my $opt (keys %$revert) {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH manager v3 5/5] VM start Timeout "Options" parameter in the GUI
  2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
                   ` (3 preceding siblings ...)
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 4/5] make the timeout value editable when the VM is locked Daniel Tschlatscher
@ 2022-12-16 13:36 ` Daniel Tschlatscher
  4 siblings, 0 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-16 13:36 UTC (permalink / raw)
  To: pve-devel

This makes it possible to set the newly introduced config parameter
for timeout via the 'startoptions' property string.
For now this only implements setting the timeout value when starting
a VM, though this should be rather easily exentensible to include
other future start options parameters.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---

Changes from v2:
* No changes

 www/manager6/qemu/Options.js | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/www/manager6/qemu/Options.js b/www/manager6/qemu/Options.js
index 7b112400..7f148e18 100644
--- a/www/manager6/qemu/Options.js
+++ b/www/manager6/qemu/Options.js
@@ -76,6 +76,37 @@ Ext.define('PVE.qemu.Options', {
 			onlineHelp: 'qm_startup_and_shutdown',
 		    } : undefined,
 	    },
+	    startoptions: {
+		header: gettext('VM startup options'),
+		defaultValue: Proxmox.Utils.defaultText,
+		renderer: val => val,
+		editor: caps.vms['VM.Config.Options'] ? {
+		    xtype: 'proxmoxWindowEdit',
+		    subject: gettext('VM start timeout'),
+		    setValues: function(values) {
+			Ext.Array.each(this.query('inputpanel'), function(panel) {
+			    panel.setValues(PVE.Parser.parsePropertyString(values.startoptions));
+			});
+		    },
+		    items: {
+			xtype: 'inputpanel',
+			items: {
+			    xtype: 'proxmoxintegerfield',
+			    name: 'timeout',
+			    minValue: 0,
+			    maxValue: 86400,
+			    fieldLabel: gettext('Timeout (sec)'),
+			    emptyText: Proxmox.Utils.defaultText,
+			},
+			onGetValues: function(values) {
+			    if (values === undefined || Object.keys(values).length === 0) {
+				return { 'delete': 'startoptions' };
+			    }
+			    return { 'startoptions': PVE.Parser.printPropertyString(values) };
+			},
+		    },
+		} : undefined,
+	    },
 	    ostype: {
 		header: gettext('OS Type'),
 		editor: caps.vms['VM.Config.Options'] ? 'PVE.qemu.OSTypeEdit' : undefined,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout
  2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout Daniel Tschlatscher
@ 2022-12-21 11:14   ` Fabian Grünbichler
  2022-12-22 12:58     ` Daniel Tschlatscher
  0 siblings, 1 reply; 10+ messages in thread
From: Fabian Grünbichler @ 2022-12-21 11:14 UTC (permalink / raw)
  To: Proxmox VE development discussion

On December 16, 2022 2:36 pm, Daniel Tschlatscher wrote:
> In some cases the VM API start method would return before the detached
> KVM process would have exited. This is especially problematic with HA,
> because the HA manager would think the VM started successfully, later
> see that it exited and start it again in an endless loop.
> 
> Moreover, another case exists when resuming a hibernated VM. In this
> case, the qemu thread will attempt to load the whole vmstate into
> memory before exiting.
> Depending on vmstate size, disk read speed, and similar factors this
> can take quite a while though and it is not possible to start the VM
> normally during this time.
> 
> To get around this, this patch intercepts the error, looks whether a
> corresponding KVM thread is still running, and waits for/kills it,
> before continuing.
> 
> Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
> ---
> 
> Changes from v2:
> * Rebased to current master
> * Changed warn to use 'log_warn' instead
> * Reworded log message when waiting for lingering qemu process
> 
>  PVE/QemuServer.pm | 40 +++++++++++++++++++++++++++++++++-------
>  1 file changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 2adbe3a..f63dc3f 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5884,15 +5884,41 @@ sub vm_start_nolock {
>  		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
>  	    }
>  
> -	    my $exitcode = run_command($cmd, %run_params);
> -	    if ($exitcode) {
> -		if ($tpmpid) {
> -		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
> -		    kill 'TERM', $tpmpid;
> +	    eval {
> +		my $exitcode = run_command($cmd, %run_params);
> +
> +		if ($exitcode) {
> +		    if ($tpmpid) {
> +			log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup
error\n";

this warn -> log_warn change kind of slipped in, it's not really part of this
patch?

> +			kill 'TERM', $tpmpid;
> +		    }
> +		    die "QEMU exited with code $exitcode\n";
>  		}
> -		die "QEMU exited with code $exitcode\n";
> +	    };
> +
> +	    if (my $err = $@) {
> +		my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
> +
> +		if ($pid ne "") {

can be combined:
if (my $pid = ...) {

}

(empty string evaluates to false in perl ;))

> +		    my $count = 0;
> +		    my $timeout = 300;
> +
> +		    print "Waiting $timeout seconds for detached qemu process $pid to exit\n";
> +		    while (($count < $timeout) &&
> +			PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
> +			$count++;
> +			sleep(1);
> +		    }
> +

either here

> +		    if ($count >= $timeout) {
> +			log_warn "Reached timeout. Terminating now with SIGKILL\n";

or here, recheck that VM is still running and still has the same PID, and log
accordingly instead of KILLing if not..

the same is also true in _do_vm_stop

> +			kill(9, $pid);
> +		    }
> +		}
> +
> +		die $err;
>  	    }
> -	};
> +	}
>      };
>  
>      if ($conf->{hugepages}) {
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout
  2022-12-21 11:14   ` Fabian Grünbichler
@ 2022-12-22 12:58     ` Daniel Tschlatscher
  2022-12-22 13:20       ` Fabian Grünbichler
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-22 12:58 UTC (permalink / raw)
  To: pve-devel



On 12/21/22 12:14, Fabian Grünbichler wrote:
> On December 16, 2022 2:36 pm, Daniel Tschlatscher wrote:
>> In some cases the VM API start method would return before the detached
>> KVM process would have exited. This is especially problematic with HA,
>> because the HA manager would think the VM started successfully, later
>> see that it exited and start it again in an endless loop.
>>
>> Moreover, another case exists when resuming a hibernated VM. In this
>> case, the qemu thread will attempt to load the whole vmstate into
>> memory before exiting.
>> Depending on vmstate size, disk read speed, and similar factors this
>> can take quite a while though and it is not possible to start the VM
>> normally during this time.
>>
>> To get around this, this patch intercepts the error, looks whether a
>> corresponding KVM thread is still running, and waits for/kills it,
>> before continuing.
>>
>> Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
>> ---
>>
>> Changes from v2:
>> * Rebased to current master
>> * Changed warn to use 'log_warn' instead
>> * Reworded log message when waiting for lingering qemu process
>>
>>  PVE/QemuServer.pm | 40 +++++++++++++++++++++++++++++++++-------
>>  1 file changed, 33 insertions(+), 7 deletions(-)
>>
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index 2adbe3a..f63dc3f 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -5884,15 +5884,41 @@ sub vm_start_nolock {
>>  		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
>>  	    }
>>  
>> -	    my $exitcode = run_command($cmd, %run_params);
>> -	    if ($exitcode) {
>> -		if ($tpmpid) {
>> -		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
>> -		    kill 'TERM', $tpmpid;
>> +	    eval {
>> +		my $exitcode = run_command($cmd, %run_params);
>> +
>> +		if ($exitcode) {
>> +		    if ($tpmpid) {
>> +			log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup
> error\n";
> 
> this warn -> log_warn change kind of slipped in, it's not really part of this
> patch?

Because I changed this line anyway, I changed it to log_warn as it is
imported already and, as I understood, the preferable alternative
to calling 'warn'.
Sourcing this in it's own patch seems overkill to me, or would you
rather suggest something like this should be handled in, e.g. a
file-encompassing refactoring?

> 
>> +			kill 'TERM', $tpmpid;
>> +		    }
>> +		    die "QEMU exited with code $exitcode\n";
>>  		}
>> -		die "QEMU exited with code $exitcode\n";
>> +	    };
>> +
>> +	    if (my $err = $@) {
>> +		my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
>> +
>> +		if ($pid ne "") {
> 
> can be combined:
> if (my $pid = ...) {
> 
> }
> 
> (empty string evaluates to false in perl ;))

Thanks for the input!

> 
>> +		    my $count = 0;
>> +		    my $timeout = 300;
>> +
>> +		    print "Waiting $timeout seconds for detached qemu process $pid to exit\n";
>> +		    while (($count < $timeout) &&
>> +			PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
>> +			$count++;
>> +			sleep(1);
>> +		    }
>> +
> 
> either here
> 
>> +		    if ($count >= $timeout) {
>> +			log_warn "Reached timeout. Terminating now with SIGKILL\n";
> 
> or here, recheck that VM is still running and still has the same PID, and log
> accordingly instead of KILLing if not..
> 
> the same is also true in _do_vm_stop

Alright, I will look into it

> 
>> +			kill(9, $pid);
>> +		    }
>> +		}
>> +
>> +		die $err;
>>  	    }
>> -	};
>> +	}
>>      };
>>  
>>      if ($conf->{hugepages}) {
>> -- 
>> 2.30.2
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
>>
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout
  2022-12-22 12:58     ` Daniel Tschlatscher
@ 2022-12-22 13:20       ` Fabian Grünbichler
  2022-12-22 14:22         ` Daniel Tschlatscher
  0 siblings, 1 reply; 10+ messages in thread
From: Fabian Grünbichler @ 2022-12-22 13:20 UTC (permalink / raw)
  To: Proxmox VE development discussion

On December 22, 2022 1:58 pm, Daniel Tschlatscher wrote:

>>>  
>>> -	    my $exitcode = run_command($cmd, %run_params);
>>> -	    if ($exitcode) {
>>> -		if ($tpmpid) {
>>> -		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
>>> -		    kill 'TERM', $tpmpid;
>>> +	    eval {
>>> +		my $exitcode = run_command($cmd, %run_params);
>>> +
>>> +		if ($exitcode) {
>>> +		    if ($tpmpid) {
>>> +			log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup
>> error\n";
>> 
>> this warn -> log_warn change kind of slipped in, it's not really part of this
>> patch?
> 
> Because I changed this line anyway, I changed it to log_warn as it is
> imported already and, as I understood, the preferable alternative
> to calling 'warn'.
> Sourcing this in it's own patch seems overkill to me, or would you
> rather suggest something like this should be handled in, e.g. a
> file-encompassing refactoring?

ideally it could be sent as cleanup patch up-front (then it can be applied even
if the rest needs another round ;)) or at least mentioned somewhere (e.g., in
the patch notes). seemingly unrelated changes in a patch always make me wary that
the patch was generated from some unclean tree/more or less than intended was
`git add`ed. in this case my guess was that you just changed that (wrapped) call
site to match your newly introduced ones, but it could also have been an
unintentional search+replace result, for example, so I'd rather ask :)




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout
  2022-12-22 13:20       ` Fabian Grünbichler
@ 2022-12-22 14:22         ` Daniel Tschlatscher
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Tschlatscher @ 2022-12-22 14:22 UTC (permalink / raw)
  To: pve-devel



On 12/22/22 14:20, Fabian Grünbichler wrote:
> On December 22, 2022 1:58 pm, Daniel Tschlatscher wrote:
> 
>>>>  
>>>> -	    my $exitcode = run_command($cmd, %run_params);
>>>> -	    if ($exitcode) {
>>>> -		if ($tpmpid) {
>>>> -		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
>>>> -		    kill 'TERM', $tpmpid;
>>>> +	    eval {
>>>> +		my $exitcode = run_command($cmd, %run_params);
>>>> +
>>>> +		if ($exitcode) {
>>>> +		    if ($tpmpid) {
>>>> +			log_warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup
>>> error\n";
>>>
>>> this warn -> log_warn change kind of slipped in, it's not really part of this
>>> patch?
>>
>> Because I changed this line anyway, I changed it to log_warn as it is
>> imported already and, as I understood, the preferable alternative
>> to calling 'warn'.
>> Sourcing this in it's own patch seems overkill to me, or would you
>> rather suggest something like this should be handled in, e.g. a
>> file-encompassing refactoring?
> 
> ideally it could be sent as cleanup patch up-front (then it can be applied even
> if the rest needs another round ;)) or at least mentioned somewhere (e.g., in
> the patch notes). seemingly unrelated changes in a patch always make me wary that
> the patch was generated from some unclean tree/more or less than intended was
> `git add`ed. in this case my guess was that you just changed that (wrapped) call
> site to match your newly introduced ones, but it could also have been an
> unintentional search+replace result, for example, so I'd rather ask :)

As the v2 still used the 'warn' calls, I mentioned it under the "Changes
from v2" section. 'Changed warn to use 'log_warn' instead'. (Albeit
wasn't at my best with this wording here)

> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-12-22 14:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-16 13:36 [pve-devel] [PATCH common/qemu-server/manager v3] fix #3502: VM start timeout config parameter Daniel Tschlatscher
2022-12-16 13:36 ` [pve-devel] [PATCH common v3 1/5] VM start timeout config parameter in backend Daniel Tschlatscher
2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 2/5] expose VM start timeout config setting in API Daniel Tschlatscher
2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 3/5] await and kill lingering KVM thread when VM start reaches timeout Daniel Tschlatscher
2022-12-21 11:14   ` Fabian Grünbichler
2022-12-22 12:58     ` Daniel Tschlatscher
2022-12-22 13:20       ` Fabian Grünbichler
2022-12-22 14:22         ` Daniel Tschlatscher
2022-12-16 13:36 ` [pve-devel] [PATCH qemu-server v3 4/5] make the timeout value editable when the VM is locked Daniel Tschlatscher
2022-12-16 13:36 ` [pve-devel] [PATCH manager v3 5/5] VM start Timeout "Options" parameter in the GUI Daniel Tschlatscher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal