From: Thomas Lamprecht
To: Proxmox VE development discussion, Fabian Ebner, Wolfgang Bumiller
Date: Thu, 20 Aug 2020 11:36:39 +0200
Subject: Re: [pve-devel] [RFC container] Improve feedback for startup
Message-ID: <1ef68f7f-437a-a160-05e2-f3b111ece024@proxmox.com>
In-Reply-To: <20200819103037.15143-1-f.ebner@proxmox.com>
References: <20200819103037.15143-1-f.ebner@proxmox.com>

On 19.08.20 12:30, Fabian Ebner wrote:
> Since it was necessary to switch to 'Type=Simple' in the systemd
> service (see 545d6f0a13ac2bf3a8d3f224c19c0e0def12116d),
> 'systemctl start pve-container@ID' would not wait for the 'lxc-start'
> command anymore. Thus every container start was reported as a success
> and the 'post-start' hook would trigger immediately after the
> 'systemctl start' command.
>
> Use 'lxc-monitor' to get the necessary information and detect
> startup failure and only run the 'post-start' hookscript after
> the container is effectively running. If something goes wrong
> with the monitor, fall back to the old behavior.
>
> Signed-off-by: Fabian Ebner
> ---
>  src/PVE/LXC.pm | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
>

Appreciate the effort!

We could also connect directly to /run/lxc/var/lib/lxc/monitor-fifo (or the
abstract unix socket, but there is not much gained/difference here) of
lxc-monitord, which publishes all state changes, and unpack the new
state [0] directly.

[0] https://github.com/lxc/lxc/blob/8bdacc22a48f9c09902a1d2febd71439cb38c082/src/lxc/state.h#L10

@Wolfgang, what do you think?
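
To make the "unpack the new state" part a bit more concrete, here is a
rough, untested-against-a-real-monitor sketch. It assumes that each message
is a raw 'struct lxc_msg' (an int type, a NAME_MAX + 1 byte name, an int
value) and that the value indexes the lxc_state_t enum from [0]; the helper
name parse_lxc_msg is made up for illustration, and how the bytes are read
from lxc-monitord is left out on purpose.

```perl
use strict;
use warnings;

# ASSUMPTION: layout of 'struct lxc_msg' as in LXC's src/lxc/monitor.h:
#   int type; char name[NAME_MAX + 1]; int value;
# With NAME_MAX = 255 on Linux this is 4 + 256 + 4 = 264 bytes.
my $LXC_MSG_TEMPLATE = 'i Z256 i';
my $LXC_MSG_SIZE = length(pack($LXC_MSG_TEMPLATE, 0, '', 0));

# ASSUMPTION: state names in the order of the lxc_state_t enum, see [0].
my @LXC_STATES = qw(
    STOPPED STARTING RUNNING STOPPING ABORTING FREEZING FROZEN THAWED
);

# Hypothetical helper: decode one raw message into (name, state).
# Returns an empty list for anything that is not a valid state change.
sub parse_lxc_msg {
    my ($buf) = @_;

    return if length($buf) != $LXC_MSG_SIZE;

    my ($type, $name, $value) = unpack($LXC_MSG_TEMPLATE, $buf);

    return if $type != 0; # 0 is assumed to be lxc_msg_state
    return if $value < 0 || $value > $#LXC_STATES;

    return ($name, $LXC_STATES[$value]);
}

# Synthetic message for container '100', so no lxc-monitord is needed here.
my ($name, $state) = parse_lxc_msg(pack($LXC_MSG_TEMPLATE, 0, '100', 2));
print "$name changed state to $state\n";
```

Compared to matching the textual output of 'lxc-monitor', this would avoid
depending on the exact wording of its messages, at the cost of depending on
the struct layout instead.
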

> diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
> index db5b8ca..35dc54c 100644
> --- a/src/PVE/LXC.pm
> +++ b/src/PVE/LXC.pm
> @@ -2191,10 +2191,44 @@ sub vm_start {
>
>      PVE::Storage::activate_volumes($storage_cfg, $vollist);
>
> +    my $monitor_pid = open(my $monitor_fh, '-|', "/usr/bin/lxc-monitor -n $vmid")
> +        or warn "could not open pipe to lxc-monitor\n";
> +
>      my $cmd = ['systemctl', 'start', "pve-container\@$vmid"];
>
>      PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
> -    eval { PVE::Tools::run_command($cmd); };
> +    eval {
> +        PVE::Tools::run_command($cmd);
> +
> +        my $success;
> +        if ($monitor_pid) {
> +            eval {
> +                local $SIG{ALRM} = sub { die "got timeout\n" };
> +                alarm(10); # 'STARTING' should appear quickly
> +
> +                while (my $line = <$monitor_fh>) {
> +                    if ($line =~ m/^'$vmid' changed state to \[([A-Z]*)\]$/) {
> +                        my $status = $1;
> +                        alarm(0);
> +                        $success = 1 if $status eq 'RUNNING';
> +                        $success = 0 if $status eq 'ABORTING'
> +                            || $status eq 'STOPPING'
> +                            || $status eq 'STOPPED';
> +                        if (defined($success)) {
> +                            kill('KILL', $monitor_pid);
> +                            waitpid($monitor_pid, 0);
> +                        }
> +                    } else {
> +                        die "unexpected output from lxc-monitor: $line\n";
> +                    }
> +                }
> +            };
> +            warn "Problem with lxc-monitor: $@" if $@;
> +            alarm(0);
> +        }
> +        die "'lxc-start' failed for container '$vmid'\n"
> +            if defined($success) && !$success;
> +    };
>      if (my $err = $@) {
>          unlink $skiplock_flag_fn;
>          die $err;
>