From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 1F89E71632
 for <pbs-devel@lists.proxmox.com>; Thu, 12 May 2022 12:04:37 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id EF2982574B
 for <pbs-devel@lists.proxmox.com>; Thu, 12 May 2022 12:04:06 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id E04742573F
 for <pbs-devel@lists.proxmox.com>; Thu, 12 May 2022 12:04:05 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id AC80143487
 for <pbs-devel@lists.proxmox.com>; Thu, 12 May 2022 12:04:05 +0200 (CEST)
Message-ID: <6a79b760-7f99-2dac-fbd6-365f4cc799c3@proxmox.com>
Date: Thu, 12 May 2022 12:04:04 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:101.0) Gecko/20100101
 Thunderbird/101.0
Content-Language: en-GB
To: Proxmox Backup Server development discussion
 <pbs-devel@lists.proxmox.com>, Fabian Ebner <f.ebner@proxmox.com>
References: <20220504113324.70300-1-f.ebner@proxmox.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
In-Reply-To: <20220504113324.70300-1-f.ebner@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: [pbs-devel] applied: [PATCH/RFC proxmox-backup] rest server:
 daemon: update PID file before sending MAINPID notification
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Thu, 12 May 2022 10:04:37 -0000

On 5/4/22 at 13:33, Fabian Ebner wrote:
> There is a race upon reload, where it can happen that:
> 1. systemd forks off /bin/kill -HUP $MAINPID
> 2. Current instance forks off new one and notifies systemd with the
>    new MAINPID.
> 3. systemd sets new MAINPID.
> 4. systemd receives SIGCHLD for the kill process (which is the current
>    control process for the service) and reads the PID of the old
>    instance from the PID file, resetting MAINPID to the PID of the old
>    instance.
> 5. Old instance exits.
> 6. systemd receives SIGCHLD for the old instance, reads the PID of the
>    old instance from the PID file once more. systemd sees that the
>    MAINPID matches the child PID and considers the service exited.
> 7. systemd receives notification from the new PID and is confused.
>    The service won't get active, because the notification wasn't
>    handled.
> 
> To fix it, update the PID file before sending the MAINPID
> notification, similar to what a comment in systemd's
> src/core/service.c suggests:
>> /* Forking services may occasionally move to a new PID.
>>  * As long as they update the PID file before exiting the old
>>  * PID, they're fine. */
> but for our Type=notify "before sending the notification" rather than
> "before exiting", because otherwise, the mix-up in 4. could still
> happen (although it might not actually be problematic without the
> mix-up in 6., it still seems better to avoid).
> 
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
> 
> An alternative would be to not tell systemd about the PIDFile at all,
> but there are two small downsides:
> * The PID file isn't cleaned up automatically when the service exits.
> * Having the PID file updated before sending the MAINPID notification
>   feels a bit cleaner (even if the PID file is not used by systemd).
> 
>  .../examples/minimal-rest-server.rs           | 18 +++++++----
>  proxmox-rest-server/src/daemon.rs             | 16 ++++++++--
>  src/bin/proxmox-backup-api.rs                 | 30 +++++++++--------
>  src/bin/proxmox-backup-proxy.rs               | 32 +++++++++++--------
>  4 files changed, 60 insertions(+), 36 deletions(-)
> 
>

applied, thanks!
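
For anyone reading this in the archive, the ordering described in the
commit message can be sketched roughly as below. This is only an
illustration, not the code from the patch: `write_pid_file`,
`announce_new_main_pid` and the `notify` closure are hypothetical names,
and the write-to-temp-then-rename step is an assumption about how one
might replace the file atomically. The `MAINPID=<pid>` string is the
message format documented in sd_notify(3).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: replace the PID file by writing a temporary
/// file next to it and renaming it over the old one, so a reader never
/// sees a half-written file.
fn write_pid_file(pid_file: &Path, pid: u32) -> io::Result<()> {
    let tmp = pid_file.with_extension("tmp");
    fs::write(&tmp, format!("{}\n", pid))?;
    fs::rename(&tmp, pid_file)
}

/// Hand over to the new instance: update the PID file first, and only
/// then send the MAINPID notification.
///
/// `notify` is a stand-in for whatever actually delivers the
/// sd_notify(3) message to systemd (assumption, not a real API).
fn announce_new_main_pid<F>(pid_file: &Path, new_pid: u32, mut notify: F) -> io::Result<()>
where
    F: FnMut(&str) -> io::Result<()>,
{
    // Step 1: the PID file already names the new instance ...
    write_pid_file(pid_file, new_pid)?;
    // Step 2: ... before systemd is told about the new MAINPID, so a
    // later re-read of the PID file cannot reset it to the old PID.
    notify(&format!("MAINPID={}", new_pid))
}
```

With this order, whenever systemd re-reads the PID file after it has
seen the notification (steps 4 and 6 in the quoted list), it finds the
PID of the new instance rather than the old one.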