From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabian Ebner
To: pve-devel@lists.proxmox.com
Date: Wed, 19 Aug 2020 12:30:37 +0200
Message-Id: <20200819103037.15143-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [RFC container] Improve feedback for startup
List-Id: Proxmox VE development discussion

Since it was necessary to switch to 'Type=Simple' in the systemd
service (see 545d6f0a13ac2bf3a8d3f224c19c0e0def12116d),
'systemctl start pve-container@ID' would not wait for the 'lxc-start'
command anymore. Thus every container start was reported as a success
and the 'post-start' hook would trigger immediately after the
'systemctl start' command.

Use 'lxc-monitor' to get the necessary information and detect
startup failure and only run the 'post-start' hookscript after
the container is effectively running. If something goes wrong
with the monitor, fall back to the old behavior.

Signed-off-by: Fabian Ebner
---
 src/PVE/LXC.pm | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
index db5b8ca..35dc54c 100644
--- a/src/PVE/LXC.pm
+++ b/src/PVE/LXC.pm
@@ -2191,10 +2191,44 @@ sub vm_start {
 
     PVE::Storage::activate_volumes($storage_cfg, $vollist);
 
+    my $monitor_pid = open(my $monitor_fh, '-|', "/usr/bin/lxc-monitor -n $vmid")
+	or warn "could not open pipe to lxc-monitor\n";
+
     my $cmd = ['systemctl', 'start', "pve-container\@$vmid"];
 
     PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
-    eval { PVE::Tools::run_command($cmd); };
+    eval {
+	PVE::Tools::run_command($cmd);
+
+	my $success;
+	if ($monitor_pid) {
+	    eval {
+		local $SIG{ALRM} = sub { die "got timeout\n" };
+		alarm(10); # 'STARTING' should appear quickly
+
+		while (my $line = <$monitor_fh>) {
+		    if ($line =~ m/^'$vmid' changed state to \[([A-Z]*)\]$/) {
+			my $status = $1;
+			alarm(0);
+			$success = 1 if $status eq 'RUNNING';
+			$success = 0 if $status eq 'ABORTING'
+			    || $status eq 'STOPPING'
+			    || $status eq 'STOPPED';
+			if (defined($success)) {
+			    kill('KILL', $monitor_pid);
+			    waitpid($monitor_pid, 0);
+			}
+		    } else {
+			die "unexpected output from lxc-monitor: $line\n";
+		    }
+		}
+	    };
+	    warn "Problem with lxc-monitor: $@" if $@;
+	    alarm(0);
+	}
+	die "'lxc-start' failed for container '$vmid'\n"
+	    if defined($success) && !$success;
+    };
     if (my $err = $@) {
 	unlink $skiplock_flag_fn;
 	die $err;
-- 
2.20.1
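
For illustration only, the state handling from the hunk above can be
looked at in isolation. The sketch below is not part of the patch:
the helper name classify_monitor_line is hypothetical, and the line
format "'ID' changed state to [STATE]" is assumed from the regex used
in the patch rather than from lxc-monitor documentation. It returns 1
for RUNNING, 0 for ABORTING/STOPPING/STOPPED, undef for any other
state (such as STARTING), and dies on lines that do not match.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: classify a single line
# of lxc-monitor output for the container with ID $vmid.
sub classify_monitor_line {
    my ($vmid, $line) = @_;
    chomp $line;

    # Line format assumed from the regex in the patch above.
    if ($line =~ m/^'\Q$vmid\E' changed state to \[([A-Z]*)\]$/) {
        my $status = $1;
        return 1 if $status eq 'RUNNING';
        return 0 if $status eq 'ABORTING'
            || $status eq 'STOPPING'
            || $status eq 'STOPPED';
        return undef; # intermediate state, keep reading
    }

    die "unexpected output from lxc-monitor: $line\n";
}
```

A caller would keep reading lines until the helper returns a defined
value, treating 0 as a failed 'lxc-start' and 1 as the point at which
the 'post-start' hookscript may run.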