Date: Fri, 8 Jul 2022 09:09:20 +0200
From: Thomas Lamprecht
To: Mark Schouten, Proxmox Backup Server development discussion
Subject: Re: [pbs-devel] Scheduler causing connectivity issues?
List-Id: Proxmox Backup Server development discussion

Hi,

On 07/07/2022 17:49, Mark Schouten wrote:
> We’re getting complaints that one of our PBS’es is periodically unreachable. After investigation if the network might be at fault (even though it’s handling about 5.5Gbit at night), we found that PBS is piling up waiting connections every minute, on the minute, as you can see below. You see the output of `date`, combined with `ss -np | grep -c 8007`, the number of active connections.
>
> At first I thought that pvestatd was ddossing PBS, but pvestatd seems to run more often than once in a minute.
>
> So stracing the API process, I found that that process is also just waiting for something; must be the proxy-process.
>
> grepping for ‘minute’ in the code, I stumbled upon the function `next_minute` in ./src/bin/proxmox-backup-proxy.rs. I’m not quite sure if I understand it correctly, but it seems that every minute, the scheduler is going to try and find out if it should be doing something.
>
> Drilling down on that in my strace-foo, I think I see quite some read/write/rename actions on jobstate-files. Which leads me to conclude that the proxy process is waiting for the scheduler..
>
> This is just guess-work, but you guys can surely find out better what’s going on than me.
>
> This PBS is running with 45 users and 67 datastores.
>
> Hope you guys can find something.. If I need to debug anything, let me know!

Thanks for the info, this already helps quite a bit. We'll look into it
and re-check with you if we need more info.

cheers,
Thomas