From: Roland
To: Proxmox VE user list, Adam Weremczuk
Subject: Re: [PVE-User] Proxmox VM hard resets
Date: Tue, 17 Jan 2023 18:45:11 +0100
Message-ID: <8d7aca90-efda-2a8f-9ca2-68792fe258cc@web.de>
In-Reply-To: <4ac6cb85-bf4e-ff81-e120-7365be0f1c10@matrixscience.com>
References: <4ac6cb85-bf4e-ff81-e120-7365be0f1c10@matrixscience.com>
List-Id: Proxmox VE user list

Can you reproduce this with a Debian 11 or Ubuntu 22 VM (create some
load there)?

I think this is not a Proxmox problem that can be solved at the
Proxmox/VM guest level. See, for example:

https://www.theregister.com/2017/11/28/stunning_antistun_vm_stun_problem_fix/

Roland

On 17.01.23 at 16:04, Adam Weremczuk wrote:
> Hi all,
>
> My environment is quite unusual as I run PVE 7.2-11 as a VM on VMware
> 7.0.2. It runs several LXC containers and generally things are working
> fine.
>
> Recently the Proxmox VM (called "jaguar") started resetting itself
> (and all containers) shortly after Altaro VM Backup kicked off a
> scheduled VM backup over the network.
> Each time a hard reset was requested by the OS itself (Proxmox
> hypervisor).
>
> The time of the "stun/unstun" operation seems to be causing the issue
> here i.e.
> usually the stun/unstun operation should take a very short
> amount of time, however, in my case, depending on the load on both the
> hypervisor and the guest VM (nested hypervisor), that time can vary
> and take a bit longer, snippet below from various stun/unstun operations:
>
> 2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
> 2023-01-12T23:01:12.848Z| vcpu-0| | I005: CPT: vm was stunned for 14942070 us
> 2023-01-12T23:11:35.984Z| vcpu-0| opID=1487b0d5| I005: CPT: vm was stunned for 277986 us
> 2023-01-12T23:11:39.431Z| vcpu-0| | I005: CPT: vm was stunned for 122089 us
>
> As you can see the stun time is different between each disk, now what
> I think that is happening here is depending on the stun/unstun time of
> the VM (virtualized hypervisor), the virtualized hypervisor watchdog
> is noticing that the OS is being frozen for a X amount time and
> issuing a hard reset. I guess when the stun time is over 30 sec, the
> guest OS is issuing a hard reset.
>
> 2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
> 2023-01-12T23:00:55.407Z| vcpu-0| | I005: SnapshotVMXTakeSnapshotWork: Transition to mode 1.
> 2023-01-12T23:00:55.407Z| vcpu-0| | I005: SnapshotVMXTakeSnapshotComplete: Done with snapshot 'ALTAROTEMPSNAPSHOTDONOTDELETE463b73a7-f363-4daf-acf3-b0322fe84429': 95
> 2023-01-12T23:00:55.407Z| vcpu-0| | I005: VigorTransport_ServerSendResponse opID=1487b008 seq=887616: Completed Snapshot request.
> 2023-01-12T23:00:55.409Z| vcpu-8| | I005: HBACommon: First write on scsi0:0.fileName='/vmfs/volumes/61364720-e494cfe4-6cff-b083fed97d91/jaguar/jaguar-000001.vmdk'
> 2023-01-12T23:00:55.409Z| vcpu-8| | I005: DDB: "longContentID" = "08bf301ae8e75c151d2f273571a4ea9f" (was "2a6fd4c33a60f8d724ccc100a666f0d7")
> 2023-01-12T23:00:57.906Z| vcpu-8| | I005: DISKLIB-CHAIN : DiskChainUpdateContentID: old=0xa666f0d7, new=0x71a4ea9f (08bf301ae8e75c151d2f273571a4ea9f)
> 2023-01-12T23:00:57.906Z| vcpu-9| | I005: Chipset: The guest has requested that the virtual machine be hard reset.
>
> I'm struggling to establish how the watchdog timer (or equivalent) is
> configured :( Maybe increasing its trigger time would solve the issue?
>
> Any other ideas / similar experiences?
>
> Regards,
> Adam
>
>
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
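
The stun durations quoted above are logged in microseconds, so
32142467 us is roughly 32 seconds. To check whether resets line up with
long stuns, a minimal Python sketch like the one below could scan the
log lines and flag the slow ones. The 30-second threshold is only the
guess from the quoted message, not a confirmed watchdog setting, and
the names STUN_RE, find_long_stuns and threshold_s are made up for this
illustration.

```python
import re

# Matches log lines of the form quoted in this thread, e.g.
# 2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
STUN_RE = re.compile(r"^(?P<ts>\S+?)\|.*CPT: vm was stunned for (?P<us>\d+) us")


def find_long_stuns(lines, threshold_s=30.0):
    """Return (timestamp, seconds) for every stun longer than threshold_s.

    threshold_s defaults to 30 s, which is an assumption taken from the
    guess in the quoted message, not a documented limit.
    """
    hits = []
    for line in lines:
        m = STUN_RE.search(line)
        if m is None:
            continue  # not a stun line
        seconds = int(m.group("us")) / 1_000_000  # microseconds to seconds
        if seconds > threshold_s:
            hits.append((m.group("ts"), seconds))
    return hits


# The four stun lines from the quoted message.
sample = [
    "2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us",
    "2023-01-12T23:01:12.848Z| vcpu-0| | I005: CPT: vm was stunned for 14942070 us",
    "2023-01-12T23:11:35.984Z| vcpu-0| opID=1487b0d5| I005: CPT: vm was stunned for 277986 us",
    "2023-01-12T23:11:39.431Z| vcpu-0| | I005: CPT: vm was stunned for 122089 us",
]

for ts, seconds in find_long_stuns(sample):
    print(f"{ts}: stunned for {seconds:.1f} s")
```

With the four quoted lines, only the 23:00:55.407Z stun crosses the
assumed threshold, and that is the one followed by the hard reset at
23:00:57.906Z. To run it against a real log, pass an open file object
instead of the sample list.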