From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id D0A9567700 for ; Thu, 27 Aug 2020 10:44:38 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id BA2DE15A32 for ; Thu, 27 Aug 2020 10:44:08 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [212.186.127.180]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 81E7515964 for ; Thu, 27 Aug 2020 10:44:06 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 4264D448D8 for ; Thu, 27 Aug 2020 10:44:06 +0200 (CEST) Date: Thu, 27 Aug 2020 10:44:04 +0200 From: Wolfgang Bumiller To: Thomas Lamprecht Cc: Proxmox VE development discussion , Fabian Ebner Message-ID: <20200827084404.o2aizoasd6r7xl4d@olga.proxmox.com> References: <20200819103037.15143-1-f.ebner@proxmox.com> <1ef68f7f-437a-a160-05e2-f3b111ece024@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1ef68f7f-437a-a160-05e2-f3b111ece024@proxmox.com> User-Agent: NeoMutt/20180716 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.019 Adjusted score from AWL reputation of From: address KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_DNSWL_MED -2.3 Sender listed at https://www.dnswl.org/, medium trust SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [lxc.pm, command.pm] Subject: Re: [pve-devel] [RFC container] Improve feedback for startup X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Aug 2020 08:44:38 -0000 On Thu, Aug 20, 2020 at 11:36:39AM +0200, Thomas Lamprecht wrote: > On 19.08.20 12:30, Fabian Ebner wrote: > > Since it was necessary to switch to 'Type=Simple' in the systemd > > service (see 545d6f0a13ac2bf3a8d3f224c19c0e0def12116d ), > > 'systemctl start pve-container@ID' would not wait for the 'lxc-start' > > command anymore. Thus every container start was reported as a success > > and the 'post-start' hook would trigger immediately after the > > 'systemctl start' command. > > > > Use 'lxc-monitor' to get the necessary information and detect > > startup failure and only run the 'post-start' hookscript after > > the container is effectively running. If something goes wrong > > with the monitor, fall back to the old behavior. > > > > Signed-off-by: Fabian Ebner > > --- > > src/PVE/LXC.pm | 36 +++++++++++++++++++++++++++++++++++- > > 1 file changed, 35 insertions(+), 1 deletion(-) > > > > appreciate the effort! > We could also directly connect to /run/lxc/var/lib/lxc/monitor-fifo (or the abstract > unix socket, but not much gained/difference here) of the lxc-monitord which publishes > all state changes and unpack the new state [0] directly. > > [0] https://github.com/lxc/lxc/blob/8bdacc22a48f9c09902a1d2febd71439cb38c082/src/lxc/state.h#L10 > > @Wolfgang, what do you think? Just tested adding a state client to our Command.pm directly, seems to work, so we would depend neither on lxc-monitor nor lxc-monitord. Example & code follow below. The only issue with it is that we'd need to retry connecting to the command socket a few times since we don't know when it becomes available, but that shouldn't be too bad IMO. But first some inline comments to the original patch. > > diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm > > index db5b8ca..35dc54c 100644 > > --- a/src/PVE/LXC.pm > > +++ b/src/PVE/LXC.pm > > @@ -2191,10 +2191,44 @@ sub vm_start { > > > > PVE::Storage::activate_volumes($storage_cfg, $vollist); > > > > + my $monitor_pid = open(my $monitor_fh, '-|', "/usr/bin/lxc-monitor -n $vmid") > > + or warn "could not open pipe to lxc-monitor\n"; > > + > > my $cmd = ['systemctl', 'start', "pve-container\@$vmid"]; > > > > PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1); > > - eval { PVE::Tools::run_command($cmd); }; > > + eval { > > + PVE::Tools::run_command($cmd); > > + > > + my $success; > > + if ($monitor_pid) { > > + eval { > > + local $SIG{ALRM} = sub { die "got timeout\n" }; > > + alarm(10); # 'STARTING' should appear quickly We have a `run_with_timeout` helper in Tools which is more robust than this. There can be all sorts of interactions between alarms, `eval` and the rest of the code. For one, this overrides a previous alarm without restoring it. > > + > > + while (my $line = <$monitor_fh>) { > > + if ($line =~ m/^'$vmid' changed state to \[([A-Z]*)\]$/) { > > + my $status = $1; > > + alarm(0); > > + $success = 1 if $status eq 'RUNNING'; > > + $success = 0 if $status eq 'ABORTING' > > + || $status eq 'STOPPING' > > + || $status eq 'STOPPED'; > > + if (defined($success)) { > > + kill('KILL', $monitor_pid); > > + waitpid($monitor_pid, 0); > > + } > > + } else { > > + die "unexpected output from lxc-monitor: $line\n"; > > + } > > + } > > + }; > > + warn "Problem with lxc-monitor: $@" if $@; Secondly, if we reach this part successfully, the alarm can still trigger *HERE* which is already outside the `eval`, where we'd throw back the `got timeout` message instead of the `lxc-start failed ...` message, which could get potentially confusing. We do have a 2nd eval around in this code, but that's a little "far". Eg. the 'unexpected output from lxc-monitor' error might get swallowed if it triggers and the alarm goes off before the above `warn` line, replacing the current contents of '$@'... interrupts are pure pain. > > + alarm(0); > > + } > > + die "'lxc-start' failed for container '$vmid'\n" > > + if defined($success) && !$success; > > + }; > > if (my $err = $@) { > > unlink $skiplock_flag_fn; > > die $err; > > Usage example: use PVE::LXC::Command; my $sock = PVE::LXC::Command::get_state_client(404); die "not running\n" if !defined($sock); while (1) { my ($type, $name, $value) = PVE::LXC::Command::read_lxc_message($sock); last if !defined($type); print("$name: $type => $value\n"); } Patch for Command.pm: ---8<--- >From 6ac578ef889a3a9c8aefc4f05215b4ec66049546 Mon Sep 17 00:00:00 2001 From: Wolfgang Bumiller Date: Thu, 27 Aug 2020 10:31:06 +0200 Subject: [PATCH container] command: add state client functions Signed-off-by: Wolfgang Bumiller --- src/PVE/LXC/Command.pm | 91 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) diff --git a/src/PVE/LXC/Command.pm b/src/PVE/LXC/Command.pm index beed890..6df767d 100644 --- a/src/PVE/LXC/Command.pm +++ b/src/PVE/LXC/Command.pm @@ -11,20 +11,36 @@ use warnings; use IO::Socket::UNIX; use Socket qw(SOCK_STREAM SOL_SOCKET SO_PASSCRED); +use POSIX qw(NAME_MAX); use base 'Exporter'; use constant { + LXC_CMD_GET_STATE => 3, LXC_CMD_GET_CGROUP => 6, + LXC_CMD_ADD_STATE_CLIENT => 10, LXC_CMD_FREEZE => 15, LXC_CMD_UNFREEZE => 16, LXC_CMD_GET_LIMITING_CGROUP => 19, }; +use constant { + STATE_STOPPED => 0, + STATE_STARTING => 1, + STATE_RUNNING => 2, + STATE_STOPPING => 3, + STATE_ABORTING => 4, + STATE_FREEZING => 5, + STATE_FROZEN => 6, + STATE_THAWED => 7, + MAX_STATE => 8, +}; + our @EXPORT_OK = qw( raw_command_transaction simple_command get_cgroup_path + get_state_client ); # Get the command socket for a container. @@ -81,6 +97,33 @@ my sub _unpack_lxc_cmd_rsp($) { return ($ret, $len); } +my $LXC_MSG_SIZE = length(pack('I! Z'.(NAME_MAX+1).' x![I] I', 0, "", 0)); +# Unpack an lxc_msg struct. +my sub _unpack_lxc_msg($) { + my ($packet) = @_; + + # struct lxc_msg { + # lxc_msg_type_t type; + # char name[NAME_MAX+1]; + # int value; + # }; + + my ($type, $name, $value) = unpack('I!Z'.(NAME_MAX+1).'I!', $packet); + + if ($type == 0) { + $type = 'STATE'; + } elsif ($type == 1) { + $type = 'PRIORITY'; + } elsif ($type == 2) { + $type = 'EXITCODE'; + } else { + warn "unsupported lxc message type $type received\n"; + $type = undef; + } + + return ($type, $name, $value); +} + # Send a complete packet: my sub _do_send($$) { my ($sock, $data) = @_; @@ -206,4 +249,52 @@ sub unfreeze($$) { return $res; } +# Add this command socket as a state client. +# +# Currently all states are observed. +# +# Returns undef if the container is not running, dies on errors. +sub get_state_client($) { + my ($vmid) = @_; + + my $socket = _get_command_socket($vmid) + or return undef; + + # For now we want all states (except 'reboots', since we would never see those, reboots would + # use a value of '2' for STATE_RUNNING) + my $states = pack('I!', 2) x MAX_STATE; + + my ($res, undef) = raw_command_transaction($socket, LXC_CMD_ADD_STATE_CLIENT, $states); + if ($res != MAX_STATE) { + die "container is currently in unexpected state $res\n"; + } + + return $socket; +} + +# Read an lxc message from a socket. +# +# Returns undef on EOF (if lxc exits). +# Otherwise returns a (type, vmid, value) tuple. +# +# The returned 'type' currently can be 'STATE', 'PRIORITY' or 'EXITSTATUS'. +sub read_lxc_message($) { + my ($socket) = @_; + + my $msg; + my $got = recv($socket, $msg, $LXC_MSG_SIZE, 0) + // die "failed to read from state socket: $!\n"; + + if (length($msg) == 0) { + return undef; + } + + die "short read on state socket ($LXC_MSG_SIZE != ".length($msg).")\n" + if length($msg) != $LXC_MSG_SIZE; + + my ($type, $name, $value) = _unpack_lxc_msg($msg); + + return ($type, $name, $value); +} + 1; -- 2.20.1