* [pve-devel] [PATCH installer v2 1/6] low-level: initialize UI backend for 'dump-env' subcommand too
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
@ 2024-02-13 15:13 ` Christoph Heiss
2024-02-13 15:13 ` [pve-devel] [PATCH installer v2 2/6] sys: command: factor out kill() + waitpid() from run_command() Christoph Heiss
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:13 UTC (permalink / raw)
To: pve-devel
Some detection routines might try to log things and call some
Proxmox::Ui functions all the way down, so just initialize it with the
stdio backend to avoid errors.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* no changes
proxmox-low-level-installer | 1 +
1 file changed, 1 insertion(+)
diff --git a/proxmox-low-level-installer b/proxmox-low-level-installer
index d127a40..2848295 100755
--- a/proxmox-low-level-installer
+++ b/proxmox-low-level-installer
@@ -91,6 +91,7 @@ Proxmox::Log::init("/tmp/install-low-level-${cmd}.log");
my $env = Proxmox::Install::ISOEnv::get();
if ($cmd eq 'dump-env') {
+ Proxmox::UI::init_stdio({}, $env);
my $out_dir = $env->{locations}->{run};
make_path($out_dir);
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH installer v2 2/6] sys: command: factor out kill() + waitpid() from run_command()
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
2024-02-13 15:13 ` [pve-devel] [PATCH installer v2 1/6] low-level: initialize UI backend for 'dump-env' subcommand too Christoph Heiss
@ 2024-02-13 15:13 ` Christoph Heiss
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 3/6] sys: command: handle EINTR in run_command() Christoph Heiss
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:13 UTC (permalink / raw)
To: pve-devel
This moves the kill() + waitpid() combo into a separate subroutine,
avoiding open-coding that sequence. wait_for_process() also handles
properly unkillable process (e.g. in D-state) and avoids completely
locking up the installer in such cases. See [0].
For the latter case, a timeout exists (with a default of 5 seconds) in
which to wait for the process to exit after sending an optional
TERM/KILL signal.
Also while at it, add a few basic tests for run_command().
[0] https://lists.proxmox.com/pipermail/pve-devel/2024-February/061697.html
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* new patch
Proxmox/Sys/Command.pm | 60 +++++++++++++++++++++++++++++++++++++-----
test/Makefile | 5 +++-
test/run-command.pl | 35 ++++++++++++++++++++++++
3 files changed, 92 insertions(+), 8 deletions(-)
create mode 100755 test/run-command.pl
diff --git a/Proxmox/Sys/Command.pm b/Proxmox/Sys/Command.pm
index c3e24b3..e64e0ee 100644
--- a/Proxmox/Sys/Command.pm
+++ b/Proxmox/Sys/Command.pm
@@ -33,12 +33,55 @@ my sub cmd2string {
return join (' ', $quoted_args->@*);
}
+# Safely for the (sub-)process specified by $pid to exit, using a timeout.
+#
+# When kill => 1 is set, at first a TERM-signal is sent to the process before
+# checking if it exited.
+# If that fails, KILL is sent to process and then up to timeout => $timeout
+# seconds (default: 5) are waited for the process to exit.
+#
+# On sucess, the exitcode of the process is returned, otherwise `undef` (aka.
+# the process was unkillable).
+my sub wait_for_process {
+ my ($pid, %params) = @_;
+
+ kill('TERM', $pid) if $params{kill};
+
+ my $terminated = waitpid($pid, WNOHANG);
+ return $? if $terminated > 0;
+
+ kill('KILL', $pid) if $params{kill};
+
+ my $timeout = $params{timeout} // 5;
+ for (1 .. $timeout) {
+ $terminated = waitpid($pid, WNOHANG);
+ return $? if $terminated > 0;
+ sleep(1);
+ }
+
+ log_warn("failed to kill child pid $pid, probably stuck in D-state?\n");
+
+ # We tried our best, better let the child hang in the back then completely
+ # blocking installer progress .. it's a rather short-lived environment anyway
+}
+
sub syscmd {
my ($cmd) = @_;
return run_command($cmd, undef, undef, 1);
}
+# Runs a command an a subprocess, properly handling IO via piping, cleaning up and passing back the
+# exit code.
+#
+# If $cmd contains a pipe |, the command will be executed inside a bash shell.
+# If $cmd contains 'chpasswd', the input will be specially quoted for that purpose.
+#
+# Arguments:
+# * $cmd - The command to run, either a single string or array with individual arguments
+# * $func - Logging subroutine to call, receives both stdout and stderr
+# * $input - Stdin contents for the spawned subprocess
+# * $noout - Whether to append any process output to the return value
sub run_command {
my ($cmd, $func, $input, $noout) = @_;
@@ -104,8 +147,7 @@ sub run_command {
my $count = sysread ($h, $buf, 4096);
if (!defined ($count)) {
my $err = $!;
- kill (9, $pid);
- waitpid ($pid, 0);
+ wait_for_process($pid, kill => 1);
die "command '$cmd' failed: $err";
}
$select->remove($h) if !$count;
@@ -128,15 +170,19 @@ sub run_command {
&$func($logout) if $func;
- my $rv = waitpid ($pid, 0);
+ my $ec = wait_for_process($pid);
- return $? if $noout; # behave like standard system();
+ # behave like standard system(); returns -1 in case of errors too
+ return ($ec // -1) if $noout;
- if ($? == -1) {
+ if (!defined($ec)) {
+ # Don't fail completely here to let the install continue
+ warn "command '$cmdstr' failed to exit properly\n";
+ } elsif ($ec == -1) {
croak "command '$cmdstr' failed to execute\n";
- } elsif (my $sig = ($? & 127)) {
+ } elsif (my $sig = ($ec & 127)) {
croak "command '$cmdstr' failed - got signal $sig\n";
- } elsif (my $exitcode = ($? >> 8)) {
+ } elsif (my $exitcode = ($ec >> 8)) {
croak "command '$cmdstr' failed with exit code $exitcode";
}
diff --git a/test/Makefile b/test/Makefile
index fb80fc4..ae80a94 100644
--- a/test/Makefile
+++ b/test/Makefile
@@ -3,8 +3,11 @@ all:
export PERLLIB=..
.PHONY: check
-check: test-zfs-arc-max
+check: test-zfs-arc-max test-run-command
.PHONY: test-zfs-arc-max
test-zfs-arc-max:
./zfs-arc-max.pl
+
+test-run-command:
+ ./run-command.pl
diff --git a/test/run-command.pl b/test/run-command.pl
new file mode 100755
index 0000000..7d5805e
--- /dev/null
+++ b/test/run-command.pl
@@ -0,0 +1,35 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+
+use File::Temp;
+use Test::More;
+
+use Proxmox::Sys::Command qw(run_command CMD_FINISHED);
+use Proxmox::Sys::File qw(file_read_all);
+use Proxmox::UI;
+
+my $log_file = File::Temp->new();
+Proxmox::Log::init($log_file->filename);
+
+Proxmox::UI::init_stdio();
+
+is(run_command('echo test'), "test\n", 'basic usage');
+
+is(run_command('echo test', undef, undef, 1), 0, 'system()-mode');
+
+my $ret = run_command('bash -c "echo test; sleep 1000; echo test"', sub {
+ my $line = shift;
+ is($line, 'test', 'using CMD_FINISHED - produced correct log line');
+
+ return CMD_FINISHED;
+});
+is($ret, '', 'using CMD_FINISHED');
+
+# Check the log for errors/warnings
+my $log = file_read_all($log_file->filename);
+ok($log !~ m/(WARN|ERROR): /, 'no warnings or errors logged');
+print $log if $log =~ m/(WARN|ERROR): /;
+
+done_testing();
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH installer v2 3/6] sys: command: handle EINTR in run_command()
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
2024-02-13 15:13 ` [pve-devel] [PATCH installer v2 1/6] low-level: initialize UI backend for 'dump-env' subcommand too Christoph Heiss
2024-02-13 15:13 ` [pve-devel] [PATCH installer v2 2/6] sys: command: factor out kill() + waitpid() from run_command() Christoph Heiss
@ 2024-02-13 15:14 ` Christoph Heiss
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 4/6] sys: command: allow terminating the process early from log subroutine Christoph Heiss
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:14 UTC (permalink / raw)
To: pve-devel
Previously, the I/O loop would continue endlessly until the subprocess
exited.
This explicit handling allows run_command() to be used with e.g.
alarm().
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* new patch
Proxmox/Sys/Command.pm | 9 ++++++++-
test/run-command.pl | 11 +++++++++++
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/Proxmox/Sys/Command.pm b/Proxmox/Sys/Command.pm
index e64e0ee..6389b17 100644
--- a/Proxmox/Sys/Command.pm
+++ b/Proxmox/Sys/Command.pm
@@ -134,10 +134,17 @@ sub run_command {
$select->add($error);
my ($ostream, $logout) = ('', '', '');
+ my $caught_sig;
while ($select->count) {
my @handles = $select->can_read (0.2);
+ # If we catch a signal, stop processing & clean up
+ if ($!{EINTR}) {
+ $caught_sig = 1;
+ last;
+ }
+
Proxmox::UI::process_events();
next if !scalar (@handles); # timeout
@@ -170,7 +177,7 @@ sub run_command {
&$func($logout) if $func;
- my $ec = wait_for_process($pid);
+ my $ec = wait_for_process($pid, kill => $caught_sig);
# behave like standard system(); returns -1 in case of errors too
return ($ec // -1) if $noout;
diff --git a/test/run-command.pl b/test/run-command.pl
index 7d5805e..19bdce0 100755
--- a/test/run-command.pl
+++ b/test/run-command.pl
@@ -27,6 +27,17 @@ my $ret = run_command('bash -c "echo test; sleep 1000; echo test"', sub {
});
is($ret, '', 'using CMD_FINISHED');
+# https://bugzilla.proxmox.com/show_bug.cgi?id=4872
+my $prev;
+eval {
+ local $SIG{ALRM} = sub { die "timed out!\n" };
+ $prev = alarm(1);
+ $ret = run_command('sleep 5');
+};
+alarm($prev);
+
+is($@, "timed out!\n", 'SIGALRM interaction');
+
# Check the log for errors/warnings
my $log = file_read_all($log_file->filename);
ok($log !~ m/(WARN|ERROR): /, 'no warnings or errors logged');
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH installer v2 4/6] sys: command: allow terminating the process early from log subroutine
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
` (2 preceding siblings ...)
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 3/6] sys: command: handle EINTR in run_command() Christoph Heiss
@ 2024-02-13 15:14 ` Christoph Heiss
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 5/6] sys: command: add option to not print process output to stdout Christoph Heiss
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:14 UTC (permalink / raw)
To: pve-devel
If the logging subroutine $func returns CMD_FINISHED after processing a
line, the running subprocess is killed early.
This mechanism can be used when e.g. only a certain part of the output
of a (long-running) command is needed, avoiding the extra time it would
take the command to finish properly.
This is done in a entirely backwards-compatible way, i.e. existing
usages don't need any modification.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* introduced constant for return value (thanks Thomas!)
* added documentation
Proxmox/Sys/Command.pm | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/Proxmox/Sys/Command.pm b/Proxmox/Sys/Command.pm
index 6389b17..36e99c1 100644
--- a/Proxmox/Sys/Command.pm
+++ b/Proxmox/Sys/Command.pm
@@ -15,7 +15,9 @@ use Proxmox::Log;
use Proxmox::UI;
use base qw(Exporter);
-our @EXPORT_OK = qw(run_command syscmd);
+our @EXPORT_OK = qw(run_command syscmd CMD_FINISHED);
+
+use constant CMD_FINISHED => 1;
my sub shellquote {
my $str = shift;
@@ -79,7 +81,8 @@ sub syscmd {
#
# Arguments:
# * $cmd - The command to run, either a single string or array with individual arguments
-# * $func - Logging subroutine to call, receives both stdout and stderr
+# * $func - Logging subroutine to call, receives both stdout and stderr. Might return CMD_FINISHED
+# to exit early and ignore the rest of the process output.
# * $input - Stdin contents for the spawned subprocess
# * $noout - Whether to append any process output to the return value
sub run_command {
@@ -163,7 +166,13 @@ sub run_command {
$logout .= $buf;
while ($logout =~ s/^([^\010\r\n]*)(\r|\n|(\010)+|\r\n)//s) {
my $line = $1;
- $func->($line) if $func;
+ if ($func) {
+ my $ret = $func->($line);
+ if (defined($ret) && $ret == CMD_FINISHED) {
+ wait_for_process($pid, kill => 1);
+ return $ostream;
+ }
+ };
}
} elsif ($h eq $error) {
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH installer v2 5/6] sys: command: add option to not print process output to stdout
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
` (3 preceding siblings ...)
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 4/6] sys: command: allow terminating the process early from log subroutine Christoph Heiss
@ 2024-02-13 15:14 ` Christoph Heiss
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 6/6] fix #4872: run env: use run_command() for country detection Christoph Heiss
2024-02-23 15:17 ` [pve-devel] applied-esries: [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in " Thomas Lamprecht
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:14 UTC (permalink / raw)
To: pve-devel
If $noprint is set, the output of the command won't be printed to stdout
of the parent process.
Fully backwards-compatible again, only takes effect if the new argument
is actually specified.
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* added parameter documentation
Proxmox/Sys/Command.pm | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/Proxmox/Sys/Command.pm b/Proxmox/Sys/Command.pm
index 36e99c1..d145483 100644
--- a/Proxmox/Sys/Command.pm
+++ b/Proxmox/Sys/Command.pm
@@ -85,8 +85,9 @@ sub syscmd {
# to exit early and ignore the rest of the process output.
# * $input - Stdin contents for the spawned subprocess
# * $noout - Whether to append any process output to the return value
+# * $noprint - Whether to print any process output to the parents stdout
sub run_command {
- my ($cmd, $func, $input, $noout) = @_;
+ my ($cmd, $func, $input, $noout, $noprint) = @_;
my $cmdstr;
if (!ref($cmd)) {
@@ -178,7 +179,7 @@ sub run_command {
} elsif ($h eq $error) {
$ostream .= $buf if !($noout || $func);
}
- print $buf;
+ print $buf if !$noprint;
STDOUT->flush();
log_info($buf);
}
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH installer v2 6/6] fix #4872: run env: use run_command() for country detection
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
` (4 preceding siblings ...)
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 5/6] sys: command: add option to not print process output to stdout Christoph Heiss
@ 2024-02-13 15:14 ` Christoph Heiss
2024-02-23 15:17 ` [pve-devel] applied-esries: [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in " Thomas Lamprecht
6 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2024-02-13 15:14 UTC (permalink / raw)
To: pve-devel
This fixes a rather longstanding issue [0][1] with the country
detection, in that it might get completely stuck and thus hangs the
installation.
This is due how Perl, signals and line reading interacts.
A minimal reproducer, how the installer currently works, looks like
this:
```
#!/usr/bin/env perl
use strict;
use warnings;
open (my $fh, '-|', 'sleep', '1000') or die;
my $prev = alarm(2);
eval {
local $SIG{ALRM} = sub { die "timed out!\n" };
my $line;
while (defined ($line = <$fh>)) {
print "line: $line";
}
};
alarm($prev);
close($fh);
```
One might expect that this times out after 2 seconds, as specified in
`alarm(2)`. The thruth is that `$line = <$fh>` apparently prevents the
signal to go through. This then causes the installer to hang there
indefinitely, if `traceroute` never progresses - which seems to happen
on lots of (weird) networks, as evidently can be seen in the forum [1].
Proxmox::Sys::Command::run_command() handles of these weird cases, takes
care of the nitty-gritty details and - most importantly - interacts
properly with SIGALRM, so just use that instead.
This _should_ really fix that issue, but reproducing it 1:1 as part of
the installation process is _very_ hard, basically pure luck. But
rewriting the reproducer using run_command (in the exact same way that
this patch rewrites detect_country_tracing_to()) fixes the issue there,
so it's the best we can probably do.
NB: This causes that the traceroute command is now printed to the log
(as run_command() logs that by default), which we could also hide e.g.
through another parameter if wanted.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4872
[1] https://forum.proxmox.com/threads/proxmox-installation-trying-to-detect-country.134301/
Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
Changes since v1:
* use new CMD_FINISHED constant
Proxmox/Install/RunEnv.pm | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/Proxmox/Install/RunEnv.pm b/Proxmox/Install/RunEnv.pm
index c393f67..d7ccea6 100644
--- a/Proxmox/Install/RunEnv.pm
+++ b/Proxmox/Install/RunEnv.pm
@@ -7,6 +7,7 @@ use Carp;
use JSON qw(from_json to_json);
use Proxmox::Log;
+use Proxmox::Sys::Command qw(run_command CMD_FINISHED);
use Proxmox::Sys::File qw(file_read_firstline);
use Proxmox::Sys::Block;
use Proxmox::Sys::Net;
@@ -188,35 +189,36 @@ my sub detect_country_tracing_to : prototype($$) {
my ($ipver, $destination) = @_;
print STDERR "trying to detect country...\n";
- open(my $TRACEROUTE_FH, '-|', 'traceroute', "-$ipver", '-N', '1', '-q', '1', '-n', $destination)
- or return undef;
+ my $traceroute_cmd = ['traceroute', "-$ipver", '-N', '1', '-q', '1', '-n', $destination];
my $geoip_bin = ($ipver == 6) ? 'geoiplookup6' : 'geoiplookup';
my $country;
-
- my $previous_alarm = alarm (10);
+ my $previous_alarm;
eval {
local $SIG{ALRM} = sub { die "timed out!\n" };
- my $line;
- while (defined ($line = <$TRACEROUTE_FH>)) {
+ $previous_alarm = alarm (10);
+
+ run_command($traceroute_cmd, sub {
+ my $line = shift;
+
log_debug("DC TRACEROUTE: $line");
if ($line =~ m/^\s*\d+\s+(\S+)\s/) {
my $geoip = qx/$geoip_bin $1/;
log_debug("DC GEOIP: $geoip");
+
if ($geoip =~ m/GeoIP Country Edition:\s*([A-Z]+),/) {
$country = lc ($1);
log_info("DC FOUND: $country\n");
- last;
+ return CMD_FINISHED;
}
}
- }
+ }, undef, undef, 1);
+
};
my $err = $@;
alarm ($previous_alarm);
- close($TRACEROUTE_FH);
-
if ($err) {
die "unable to detect country - $err\n";
} elsif ($country) {
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] applied-esries: [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection
2024-02-13 15:13 [pve-devel] [PATCH installer v2 0/6] fix #4872: properly timeout `traceroute` command in country detection Christoph Heiss
` (5 preceding siblings ...)
2024-02-13 15:14 ` [pve-devel] [PATCH installer v2 6/6] fix #4872: run env: use run_command() for country detection Christoph Heiss
@ 2024-02-23 15:17 ` Thomas Lamprecht
6 siblings, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2024-02-23 15:17 UTC (permalink / raw)
To: Proxmox VE development discussion, Christoph Heiss
Am 13/02/2024 um 16:13 schrieb Christoph Heiss:
> For all the details, see patch #6.
>
> TL;DR: SIGALRM does not interrupt line reading using <>, causing the
> installer to hang on country detection. Fix it by using
> Proxmox::Sys::Command::run_command(), which properly interacts with
> SIGALRM.
>
> Patch #1 is a rather mundane fix for some niche cases, #2 is a small
> refactoring as proposed by Thomas and #3 might warrant some more
> thought. #4 & #5 are preparatory and to not alter existing behaviour.
>
> v1: https://lists.proxmox.com/pipermail/pve-devel/2024-February/061677.html
>
> Changes since v1:
> * new patches #2 (refactoring) and #3 (EINTR handling)
> * introduce CMD_FINISHED constant for run_command()
> * added function documentation
>
> Christoph Heiss (6):
> low-level: initialize UI backend for 'dump-env' subcommand too
> sys: command: factor out kill() + waitpid() from run_command()
> sys: command: handle EINTR in run_command()
> sys: command: allow terminating the process early from log subroutine
> sys: command: add option to not print process output to stdout
> fix #4872: run env: use run_command() for country detection
>
> Proxmox/Install/RunEnv.pm | 22 +++++-----
> Proxmox/Sys/Command.pm | 85 ++++++++++++++++++++++++++++++++-----
> proxmox-low-level-installer | 1 +
> test/Makefile | 5 ++-
> test/run-command.pl | 46 ++++++++++++++++++++
> 5 files changed, 137 insertions(+), 22 deletions(-)
> create mode 100755 test/run-command.pl
>
> --
> 2.43.0
>
applied series, with a small follow-up to that tries to make wait_for_process
a bit more forgiving towards processes that need a bit of time for a graceful
exit, thanks!
^ permalink raw reply [flat|nested] 8+ messages in thread