From: Fabian Ebner
To: pbs-devel@lists.proxmox.com, d.csapak@proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox-backup] tools/daemon: improve reload behaviour
Date: Wed, 16 Dec 2020 11:21:03 +0100
Message-ID: <2c132f6f-e8c0-5d5f-e53d-d5e32fe156f7@proxmox.com>
In-Reply-To: <20201216081209.6997-1-d.csapak@proxmox.com>
References: <20201216081209.6997-1-d.csapak@proxmox.com>
List-Id: Proxmox Backup Server development discussion

This fixes the postinst problem for me. I let dpkg -i run in a loop 50
times and it no longer got stuck.

Tested-By: Fabian Ebner

On 16.12.20 at 09:12, Dominik Csapak wrote:
> It seems that sometimes the child process signal gets handled
> before the parent process signal. Systemd then ignores the
> child's signal (finished reloading) and only afterwards goes into
> the reloading state because of the parent. That reload will never finish.
>
> Instead, wait for the state to change to 'reloading' after sending
> that signal in the parent, and only fork afterwards. This way
> we ensure that systemd knows about the reloading before actually trying
> to do it.
>
> Signed-off-by: Dominik Csapak
> ---
> this all goes away with systemd's notify barrier, hopefully...
>
>  src/tools/daemon.rs | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/tools/daemon.rs b/src/tools/daemon.rs
> index 6bb4a41b..2aa52772 100644
> --- a/src/tools/daemon.rs
> +++ b/src/tools/daemon.rs
> @@ -291,6 +291,7 @@ where
>          if let Err(e) = systemd_notify(SystemdNotify::Reloading) {
>              log::error!("failed to notify systemd about the state change: {}", e);
>          }
> +        wait_service_is_active_or_reloading(service_name, true).await?;
>          if let Err(e) = reloader.take().unwrap().fork_restart() {
>              log::error!("error during reload: {}", e);
>              let _ = systemd_notify(SystemdNotify::Status("error during reload".to_string()));
> @@ -305,7 +306,7 @@ where
>
>      // FIXME: this is a hack, replace with sd_notify_barrier when available
>      if server::is_reload_request() {
> -        wait_service_is_active(service_name).await?;
> +        wait_service_is_active_or_reloading(service_name, false).await?;
>      }
>
>      log::info!("daemon shut down...");
> @@ -313,7 +314,7 @@ where
>  }
>
>  // hack, do not use if unsure!
> -async fn wait_service_is_active(service: &str) -> Result<(), Error> {
> +async fn wait_service_is_active_or_reloading(service: &str, wait_for_reload: bool) -> Result<(), Error> {
>      tokio::time::delay_for(std::time::Duration::new(1, 0)).await;
>      loop {
>          let text = match tokio::process::Command::new("systemctl")
> @@ -328,7 +329,8 @@ async fn wait_service_is_active(service: &str) -> Result<(), Error> {
>              Err(err) => bail!("executing 'systemctl is-active' failed - {}", err),
>          };
>
> -        if text.trim().trim_start() != "reloading" {
> +        let is_reload = text.trim().trim_start() == "reloading";
> +        if is_reload == wait_for_reload {
>              return Ok(());
>          }
>          tokio::time::delay_for(std::time::Duration::new(5, 0)).await;
>
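
For reference, the condition the patched loop checks can be looked at in isolation. The following is only a minimal sketch, not the code from daemon.rs: `should_stop_waiting` and `poll_until` are hypothetical names, the state is supplied by a closure instead of calling `systemctl is-active`, it uses blocking `std::thread::sleep` rather than tokio, and the `max_attempts` bound is an assumption added here (the patched function loops without a limit).

```rust
use std::time::Duration;

/// Hypothetical helper: returns true once the observed state matches what the
/// caller is waiting for. `state_output` stands in for the stdout of
/// `systemctl is-active <service>`.
fn should_stop_waiting(state_output: &str, wait_for_reload: bool) -> bool {
    let is_reload = state_output.trim() == "reloading";
    // wait_for_reload == true:  stop as soon as the state IS "reloading"
    // wait_for_reload == false: stop as soon as the state is NOT "reloading"
    is_reload == wait_for_reload
}

/// Hypothetical helper: polls `query_state` until the condition holds.
/// The `max_attempts` bound is an assumption of this sketch.
fn poll_until<F>(
    mut query_state: F,
    wait_for_reload: bool,
    max_attempts: usize,
    interval: Duration,
) -> Result<usize, String>
where
    F: FnMut() -> String,
{
    for attempt in 1..=max_attempts {
        if should_stop_waiting(&query_state(), wait_for_reload) {
            return Ok(attempt);
        }
        std::thread::sleep(interval);
    }
    Err(format!("state did not change after {} attempts", max_attempts))
}

fn main() {
    // Simulated sequence of states, made up for this example.
    let mut states = vec!["active\n", "active\n", "reloading\n"].into_iter();
    let result = poll_until(
        || states.next().unwrap_or("reloading\n").to_string(),
        true, // parent side: wait until systemd reports "reloading"
        10,
        Duration::from_millis(1),
    );
    println!("{:?}", result);
}
```

With `wait_for_reload` set to true this corresponds to the new call before `fork_restart()`, and with false to the existing wait at shutdown.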