public inbox for pve-user@lists.proxmox.com
* [PVE-User] Peak load at 7.30AM...
@ 2023-04-26 10:48 Marco Gaiarin
  2023-04-26 14:17 ` Laurent CARON
  2023-04-27 12:32 ` Marco Gaiarin
  0 siblings, 2 replies; 3+ messages in thread
From: Marco Gaiarin @ 2023-04-26 10:48 UTC (permalink / raw)
  To: pve-user


Situation: a Debian stretch VM, mostly a Samba server for 150+ clients, on a
pair of physical servers; the VM is replicated between the two servers every
30 minutes.

Very frequently at 7.30 the VM hits a high load peak and becomes mostly
unresponsive; after fiddling a bit, I've added a watchdog load limit, and
the VM now reboots.

Looking at the logs, it seems to be caused by the 7.30 replica:

	Apr 26 07:30:12 vdmsv1 qemu-ga: info: guest-ping called
	Apr 26 07:30:13 vdmsv1 qemu-ga: info: guest-fsfreeze called

and after some time (6 to 8 minutes) the watchdog resets it:

	Apr 26 07:36:11 vdmsv1 watchdog[2525]: loadavg 57 33 14 is higher than the given threshold 80 32 16!
	Apr 26 07:36:11 vdmsv1 watchdog[2525]: shutting down the system because of error 253 = 'load average too high'
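
To read the watchdog line above, here is a minimal sketch. It assumes the
watchdog daemon compares the 1-, 5- and 15-minute load averages each against
its own limit and acts as soon as any one of them is exceeded. The function
name is hypothetical; the numbers are copied from the log.

```python
import re

def exceeded_windows(log_line):
    """Return the load-average windows whose value is above its threshold."""
    # Expected shape: "loadavg A B C is higher than the given threshold X Y Z!"
    m = re.search(
        r"loadavg (\d+) (\d+) (\d+) .*threshold (\d+) (\d+) (\d+)", log_line
    )
    if m is None:
        raise ValueError("not a watchdog loadavg line")
    values = [int(v) for v in m.groups()]
    loads, limits = values[:3], values[3:]
    names = ("1min", "5min", "15min")
    return [n for n, load, limit in zip(names, loads, limits) if load > limit]

line = ("Apr 26 07:36:11 vdmsv1 watchdog[2525]: loadavg 57 33 14 "
        "is higher than the given threshold 80 32 16!")
print(exceeded_windows(line))  # ['5min']
```

Only the 5-minute value (33 against a limit of 32) is over its threshold
here, which is consistent with the load having been high for several minutes
before the reset.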


Some notes:

1) the replica runs every 30 minutes; none of the other 47 replicas of the
  day seems sufficient to trigger a reboot.

2) the physical server seems totally unaffected during the peak (no high
  iodelay, no noticeable load/CPU...).

3) at 7.30 there are no users (they arrive around 8.15).


I'm forming some hypotheses; for example, Debian by default rotates logs at
6.30, but looking at the file dates the logs are completely rotated before
7.00, so if log rotation were the cause, the 7.00 replica would have been
the one to trigger the reboot...


Does anyone have a hint on how I can debug this?
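
One possible angle, as a sketch only: the Linux load average counts tasks in
uninterruptible sleep (state 'D', usually waiting on I/O) as well as runnable
ones, so listing the 'D' tasks around 7.30 can show what is stuck. The helper
names and the sample lines below are made up for illustration.

```python
def parse_stat(stat_text):
    """Extract (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    pid = int(stat_text[:start])
    comm = stat_text[start + 1:end]
    state = stat_text[end + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(stat_texts):
    """Keep only tasks in state 'D' (uninterruptible sleep)."""
    parsed = (parse_stat(t) for t in stat_texts)
    return [(pid, comm) for pid, comm, state in parsed if state == "D"]

# Made-up sample lines, truncated after a few fields for brevity.
samples = [
    "2525 (watchdog) S 1 2525",
    "3101 (find) D 3090 3090",
    "3102 (my (odd) name) D 3090 3090",
]
print(blocked_tasks(samples))
```

On the guest, the input would be the contents of each /proc/<pid>/stat file,
sampled every few seconds while the load is climbing.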


Thanks.

-- 
  E allora osservi gli altri giocare
  e` un gioco strano devi imparare			(E. Bennato)






* Re: [PVE-User] Peak load at 7.30AM...
  2023-04-26 10:48 [PVE-User] Peak load at 7.30AM Marco Gaiarin
@ 2023-04-26 14:17 ` Laurent CARON
  2023-04-27 12:32 ` Marco Gaiarin
  1 sibling, 0 replies; 3+ messages in thread
From: Laurent CARON @ 2023-04-26 14:17 UTC (permalink / raw)
  To: pve-user

Hi,

What does smbstatus show? Are any clients accessing shares at that time?
Is there an automated AV scan or something similar running?






* Re: [PVE-User] Peak load at 7.30AM...
  2023-04-26 10:48 [PVE-User] Peak load at 7.30AM Marco Gaiarin
  2023-04-26 14:17 ` Laurent CARON
@ 2023-04-27 12:32 ` Marco Gaiarin
  1 sibling, 0 replies; 3+ messages in thread
From: Marco Gaiarin @ 2023-04-27 12:32 UTC (permalink / raw)
  To: Marco Gaiarin; +Cc: pve-user


> I'm forming some hypotheses; for example, Debian by default rotates logs at
> 6.30, but looking at the file dates the logs are completely rotated before
> 7.00, so if log rotation were the cause, the 7.00 replica would have been
> the one to trigger the reboot...

OK, found.

It was indeed the daily tasks; not logrotate, but a task/script that runs a
series of 'find' commands on a virtual disk hosted on a slow ZFS pool (see my
other post).

It seems that a 'findstorm' on that volume can send the load sky-high...


For now, I've added a 'sleep 1' between the find calls; I don't need the
script to finish at a precise time.
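
The same throttling idea, sketched in Python rather than in the original
script (which is not shown in this thread). The paths and find arguments are
hypothetical, and the demo at the end is a dry run that records the calls
instead of executing them.

```python
import subprocess
import time

def run_throttled(commands, pause=1.0, runner=subprocess.run, sleeper=time.sleep):
    """Run each command in turn, sleeping `pause` seconds between them."""
    results = []
    for i, cmd in enumerate(commands):
        if i > 0:
            sleeper(pause)  # pause between calls, not before the first one
        results.append(runner(cmd))
    return results

# Hypothetical find calls; the real script's paths and tests are unknown.
FIND_COMMANDS = [
    ["find", "/srv/share/a", "-type", "f", "-mtime", "+30"],
    ["find", "/srv/share/b", "-type", "f", "-mtime", "+30"],
]

# Dry run: record what would happen instead of touching the disk.
calls = []
run_throttled(
    FIND_COMMANDS,
    pause=1.0,
    runner=calls.append,
    sleeper=lambda s: calls.append(f"sleep {s}"),
)
print(calls)
```

Calling run_throttled(FIND_COMMANDS) with the default runner and sleeper
would execute the commands for real, one second apart. The pause only spreads
the I/O out over time; it does not make the underlying pool any faster.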

-- 
  La condivisione è il segreto di tutto...		(Roberto Colonello)





