From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 7BC4966D01 for ; Tue, 28 Jul 2020 14:16:36 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 6B37124DB5 for ; Tue, 28 Jul 2020 14:16:06 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [212.186.127.180]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 5417324DAA for ; Tue, 28 Jul 2020 14:16:05 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 2000E433C4 for ; Tue, 28 Jul 2020 14:16:05 +0200 (CEST) Date: Tue, 28 Jul 2020 14:16:00 +0200 From: Wolfgang Bumiller To: Dominik Csapak Cc: pbs-devel@lists.proxmox.com Message-ID: <20200728121600.4bks5oz3iieqgqdl@olga.proxmox.com> References: <20200727065449.12717-1-d.csapak@proxmox.com> <20200727065449.12717-2-d.csapak@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200727065449.12717-2-d.csapak@proxmox.com> User-Agent: NeoMutt/20180716 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.029 Adjusted score from AWL reputation of From: address KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_DNSWL_MED -2.3 Sender listed at https://www.dnswl.org/, medium trust SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [node.rs] Subject: Re: [pbs-devel] [PATCH proxmox-backup 2/2] api2/node/termproxy: fix zombies on worker abort X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Jul 2020 12:16:36 -0000 On Mon, Jul 27, 2020 at 08:54:49AM +0200, Dominik Csapak wrote: > tokios kill_on_drop sometimes leaves zombies around, especially > when there is not another tokio::process::Command spawned after > > so instead of relying on the 'kill_on_drop' feature, we explicitly > kill the child on a worker abort. to be able to do this > we have to use 'tokio::select' instead of 'futures::select' since > the latter requires the future to be fused, which consumes the > child handle, leaving us no possibility to kill it after fusing. > (tokio::select does not need the futures to be fused, so we > can reuse the child future after the select again) > > Signed-off-by: Dominik Csapak > --- > src/api2/node.rs | 27 +++++++++++++++++---------- > 1 file changed, 17 insertions(+), 10 deletions(-) > > diff --git a/src/api2/node.rs b/src/api2/node.rs > index e6ecc85..011717f 100644 > --- a/src/api2/node.rs > +++ b/src/api2/node.rs > @@ -2,10 +2,7 @@ use std::net::TcpListener; > use std::os::unix::io::AsRawFd; > > use anyhow::{bail, format_err, Error}; > -use futures::{ > - future::{FutureExt, TryFutureExt}, > - select, > -}; > +use futures::future::{FutureExt, TryFutureExt}; > use hyper::body::Body; > use hyper::http::request::Parts; > use hyper::upgrade::Upgraded; > @@ -170,7 +167,6 @@ async fn termproxy( > let mut cmd = tokio::process::Command::new("/usr/bin/termproxy"); > > cmd.args(&arguments) > - .kill_on_drop(true) > .stdout(std::process::Stdio::piped()) > .stderr(std::process::Stdio::piped()); > > @@ -197,8 +193,9 @@ async fn termproxy( > Ok::<(), Error>(()) > }; > > - select!{ > - res = child.fuse() => { > + let mut needs_kill = false; > + let res = tokio::select!{ > + res = &mut child => { > let exit_code = res?; > if !exit_code.success() { > match exit_code.code() { > @@ -208,10 +205,20 @@ async fn termproxy( > } > Ok(()) > }, > - res = stdout_fut.fuse() => res, > - res = stderr_fut.fuse() => res, > - res = worker.abort_future().fuse() => res.map_err(Error::from), > + res = stdout_fut => res, > + res = stderr_fut => res, > + res = worker.abort_future() => { > + needs_kill = true; > + res.map_err(Error::from) > + } > + }; > + > + if needs_kill { I think we should log if res is an `Err` at this point, as this would indicate an *error* with the `abort_future`, which IMO shouldn't happen but also not quitely discarded which the two `?` below would do (since they return on error). > + child.kill()?; > + child.await?; > } > + > + res > }, > )?; > > -- > 2.20.1