From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id AD8A466CB1 for ; Tue, 28 Jul 2020 15:08:56 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 9B23425368 for ; Tue, 28 Jul 2020 15:08:26 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [212.186.127.180]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 5125025358 for ; Tue, 28 Jul 2020 15:08:25 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 17C6543390; Tue, 28 Jul 2020 15:08:25 +0200 (CEST) To: Wolfgang Bumiller Cc: pbs-devel@lists.proxmox.com References: <20200727065449.12717-1-d.csapak@proxmox.com> <20200727065449.12717-2-d.csapak@proxmox.com> <20200728121600.4bks5oz3iieqgqdl@olga.proxmox.com> From: Dominik Csapak Message-ID: <4528fda8-89b5-1852-3c7a-6c6bf0f5f02f@proxmox.com> Date: Tue, 28 Jul 2020 15:08:20 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:79.0) Gecko/20100101 Thunderbird/79.0 MIME-Version: 1.0 In-Reply-To: <20200728121600.4bks5oz3iieqgqdl@olga.proxmox.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-SPAM-LEVEL: Spam detection results: 0 AWL 1.142 Adjusted score from AWL reputation of From: address KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -0.951 Looks like a legit reply (A) RCVD_IN_DNSWL_MED -2.3 Sender listed at https://www.dnswl.org/, medium trust SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [node.rs] Subject: Re: [pbs-devel] [PATCH proxmox-backup 2/2] api2/node/termproxy: fix zombies on worker abort X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2020 13:08:56 -0000 On 7/28/20 2:16 PM, Wolfgang Bumiller wrote: > On Mon, Jul 27, 2020 at 08:54:49AM +0200, Dominik Csapak wrote: >> tokios kill_on_drop sometimes leaves zombies around, especially >> when there is not another tokio::process::Command spawned after >> >> so instead of relying on the 'kill_on_drop' feature, we explicitly >> kill the child on a worker abort. to be able to do this >> we have to use 'tokio::select' instead of 'futures::select' since >> the latter requires the future to be fused, which consumes the >> child handle, leaving us no possibility to kill it after fusing. >> (tokio::select does not need the futures to be fused, so we >> can reuse the child future after the select again) >> >> Signed-off-by: Dominik Csapak >> --- >> src/api2/node.rs | 27 +++++++++++++++++---------- >> 1 file changed, 17 insertions(+), 10 deletions(-) >> >> diff --git a/src/api2/node.rs b/src/api2/node.rs >> index e6ecc85..011717f 100644 >> --- a/src/api2/node.rs >> +++ b/src/api2/node.rs >> @@ -2,10 +2,7 @@ use std::net::TcpListener; >> use std::os::unix::io::AsRawFd; >> >> use anyhow::{bail, format_err, Error}; >> -use futures::{ >> - future::{FutureExt, TryFutureExt}, >> - select, >> -}; >> +use futures::future::{FutureExt, TryFutureExt}; >> use hyper::body::Body; >> use hyper::http::request::Parts; >> use hyper::upgrade::Upgraded; >> @@ -170,7 +167,6 @@ async fn termproxy( >> let mut cmd = tokio::process::Command::new("/usr/bin/termproxy"); >> >> cmd.args(&arguments) >> - .kill_on_drop(true) >> .stdout(std::process::Stdio::piped()) >> .stderr(std::process::Stdio::piped()); >> >> @@ -197,8 +193,9 @@ async fn termproxy( >> Ok::<(), Error>(()) >> }; >> >> - select!{ >> - res = child.fuse() => { >> + let mut needs_kill = false; >> + let res = tokio::select!{ >> + res = &mut child => { >> let exit_code = res?; >> if !exit_code.success() { >> match exit_code.code() { >> @@ -208,10 +205,20 @@ async fn termproxy( >> } >> Ok(()) >> }, >> - res = stdout_fut.fuse() => res, >> - res = stderr_fut.fuse() => res, >> - res = worker.abort_future().fuse() => res.map_err(Error::from), >> + res = stdout_fut => res, >> + res = stderr_fut => res, >> + res = worker.abort_future() => { >> + needs_kill = true; >> + res.map_err(Error::from) >> + } >> + }; >> + >> + if needs_kill { > > I think we should log if res is an `Err` at this point, as this would > indicate an *error* with the `abort_future`, which IMO shouldn't happen > but also not quitely discarded which the two `?` below would do (since > they return on error). keeping the results is good imo, but i'd rather still return the result from the future above and log if the kill/await here fails? e.g, if let Err(err) = child.kill() { //log } else if let Err(err) = child.await { //log } res ? > >> + child.kill()?; >> + child.await?; >> } >> + >> + res >> }, >> )?; >> >> -- >> 2.20.1