From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1694e06d-5105-18e7-0224-1942bd1e86a7@unix-scripts.info>
Date: Wed, 26 Apr 2023 16:17:04 +0200
From: Laurent CARON
To: pve-user@lists.proxmox.com
Subject: Re: [PVE-User] Peak load at 7.30AM...

Hi,

What about smbstatus? No clients accessing the shares? No automated AV scan, ...?

On 26/04/2023 at 12:48, Marco Gaiarin wrote:
> Situation: a Debian stretch VM, mostly a Samba server for 150+ clients, on a
> pair of physical servers; the VM gets replicated between the two servers every 30
> minutes.
>
> Very frequently at 7.30 the VM hits a high load peak and becomes mostly
> unresponsive; after fiddling a bit, I've added a watchdog load limit, and
> the VM now reboots.
>
> Looking at the logs, it seems to be caused by the 7.30 replica:
>
> Apr 26 07:30:12 vdmsv1 qemu-ga: info: guest-ping called
> Apr 26 07:30:13 vdmsv1 qemu-ga: info: guest-fsfreeze called
>
> and after some time (6 to 8 minutes) the watchdog resets it:
>
> Apr 26 07:36:11 vdmsv1 watchdog[2525]: loadavg 57 33 14 is higher than the given threshold 80 32 16!
> Apr 26 07:36:11 vdmsv1 watchdog[2525]: shutting down the system because of error 253 = 'load average too high'
>
>
> Some notes:
>
> 1) the replica runs every 30 minutes; none of the other 47 replicas of the day
> seems sufficient to trigger a reboot.
>
> 2) the physical server seems totally unaffected during the peak (no
> high iodelay, no noticeable load/CPU...).
>
> 3) at 7.30 there are no users (they arrive around 8.15).
>
>
> I have some hypotheses; for example, Debian by default rotates logs at 6.30,
> but judging by the file dates the logs are completely rotated before 7.00, so
> in that case the 7.00 replica would have been the one to trigger the reboot...
>
>
> Does anyone have a hint on how I can debug this?
>
>
> Thanks.
>
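A side note on the watchdog line quoted above: the three load values are compared against the three thresholds one by one, and as far as I know the watchdog daemon reacts as soon as any one of them is exceeded (that behaviour is my assumption here, worth checking against your watchdog.conf). A minimal Python sketch to see which window actually tripped; the function name exceeded_windows is just made up for illustration:

```python
import re

# Log line copied verbatim from the quoted message.
LINE = ("Apr 26 07:36:11 vdmsv1 watchdog[2525]: loadavg 57 33 14 "
        "is higher than the given threshold 80 32 16!")

# Three load averages followed by three thresholds, in 1/5/15 minute order.
PATTERN = re.compile(
    r"loadavg (\d+) (\d+) (\d+) is higher than the given threshold "
    r"(\d+) (\d+) (\d+)"
)


def exceeded_windows(line):
    """Return (window_minutes, load, limit) for every window over its limit."""
    match = PATTERN.search(line)
    if match is None:
        return []
    numbers = [int(group) for group in match.groups()]
    loads, limits = numbers[:3], numbers[3:]
    windows = (1, 5, 15)
    return [
        (window, load, limit)
        for window, load, limit in zip(windows, loads, limits)
        if load > limit
    ]


print(exceeded_windows(LINE))  # prints [(5, 33, 32)]
```

So only the 5-minute average (33 against a limit of 32) is over its threshold; the 1-minute value of 57 is still below 80. That would fit a load that keeps climbing for several minutes after the guest-fsfreeze at 07:30:13 rather than a short spike.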