From: Adam Weremczuk <adamw@matrixscience.com>
To: pve-user@lists.proxmox.com
Subject: [PVE-User] Proxmox VM hard resets
Date: Tue, 17 Jan 2023 15:04:36 +0000
Message-ID: <4ac6cb85-bf4e-ff81-e120-7365be0f1c10@matrixscience.com>
Hi all,
My environment is quite unusual as I run PVE 7.2-11 as a VM on VMware
7.0.2. It runs several LXC containers and generally things are working fine.
Recently the Proxmox VM (called "jaguar") started resetting itself (and
all containers) shortly after Altaro VM Backup kicked off a scheduled VM
backup over the network.
Each time, the hard reset was requested by the guest OS itself (the
Proxmox hypervisor).
The duration of the "stun/unstun" operation seems to be causing the
issue. Normally a stun/unstun takes very little time, but in my case,
depending on the load on both the hypervisor and the guest VM (the
nested hypervisor), it varies and can take considerably longer. Here are
the stun times from several stun/unstun operations:
2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
2023-01-12T23:01:12.848Z| vcpu-0| | I005: CPT: vm was stunned for 14942070 us
2023-01-12T23:11:35.984Z| vcpu-0| opID=1487b0d5| I005: CPT: vm was stunned for 277986 us
2023-01-12T23:11:39.431Z| vcpu-0| | I005: CPT: vm was stunned for 122089 us
As you can see, the stun time differs from one disk to the next. What I
think is happening is that the watchdog in the virtualized hypervisor
notices that the OS has been frozen for some amount of time and issues a
hard reset. My guess is that the guest OS issues the hard reset when the
stun time exceeds 30 seconds (32142467 us is about 32 seconds).
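To check the 30-second guess against a whole log rather than a few hand-picked lines, something like the following could be used. This is only a rough sketch: the function name and the default threshold are my own choices, and the pattern is based solely on the "vm was stunned for N us" lines quoted above.

```python
import re

# Timestamp, then anything on the same line up to the stun message.
# \s+ between words tolerates the line wrapping added by mail clients.
STUN_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T[\d:.]+Z)\|[^\n]*?vm was\s+stunned\s+for\s+(\d+)\s+us"
)


def long_stuns(log_text, threshold_s=30.0):
    """Return (timestamp, stun_seconds) for every stun longer than threshold_s.

    threshold_s is an assumption (my 30 s guess), not a documented limit.
    """
    hits = []
    for ts, micros in STUN_RE.findall(log_text):
        seconds = int(micros) / 1_000_000  # the log reports microseconds
        if seconds > threshold_s:
            hits.append((ts, seconds))
    return hits


# The four stun lines quoted above.
SAMPLE = """\
2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
2023-01-12T23:01:12.848Z| vcpu-0| | I005: CPT: vm was stunned for 14942070 us
2023-01-12T23:11:35.984Z| vcpu-0| opID=1487b0d5| I005: CPT: vm was stunned for 277986 us
2023-01-12T23:11:39.431Z| vcpu-0| | I005: CPT: vm was stunned for 122089 us
"""

if __name__ == "__main__":
    for ts, seconds in long_stuns(SAMPLE):
        print(f"{ts}: stunned for {seconds:.1f} s")
```

With the four lines above, only the 23:00:55 stun is over the threshold; the roughly 15-second stun at 23:01:12 is not.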
2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
2023-01-12T23:00:55.407Z| vcpu-0| | I005: SnapshotVMXTakeSnapshotWork: Transition to mode 1.
2023-01-12T23:00:55.407Z| vcpu-0| | I005: SnapshotVMXTakeSnapshotComplete: Done with snapshot 'ALTAROTEMPSNAPSHOTDONOTDELETE463b73a7-f363-4daf-acf3-b0322fe84429': 95
2023-01-12T23:00:55.407Z| vcpu-0| | I005: VigorTransport_ServerSendResponse opID=1487b008 seq=887616: Completed Snapshot request.
2023-01-12T23:00:55.409Z| vcpu-8| | I005: HBACommon: First write on scsi0:0.fileName='/vmfs/volumes/61364720-e494cfe4-6cff-b083fed97d91/jaguar/jaguar-000001.vmdk'
2023-01-12T23:00:55.409Z| vcpu-8| | I005: DDB: "longContentID" = "08bf301ae8e75c151d2f273571a4ea9f" (was "2a6fd4c33a60f8d724ccc100a666f0d7")
2023-01-12T23:00:57.906Z| vcpu-8| | I005: DISKLIB-CHAIN : DiskChainUpdateContentID: old=0xa666f0d7, new=0x71a4ea9f (08bf301ae8e75c151d2f273571a4ea9f)
2023-01-12T23:00:57.906Z| vcpu-9| | I005: Chipset: The guest has requested that the virtual machine be hard reset.
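In the excerpt above the reset request is logged about 2.5 seconds after the 32-second stun is reported (23:00:57.906 versus 23:00:55.407). To see whether every reset follows a long stun this closely, the two messages can be paired up. This is a sketch under assumptions: the function names are made up, the patterns come only from the lines quoted here, and I am assuming the "was stunned" message is written at the moment the VM resumes.

```python
import re
from datetime import datetime

TS = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)"
# \s+ between words tolerates line wrapping added by mail clients.
STUN_RE = re.compile(TS + r"\|[^\n]*?vm was\s+stunned\s+for\s+(\d+)\s+us")
RESET_RE = re.compile(
    TS + r"\|[^\n]*?guest has\s+requested\s+that\s+the\s+virtual"
    r"\s+machine\s+be\s+hard\s+reset"
)


def parse_ts(ts):
    """Parse a log timestamp such as 2023-01-12T23:00:55.407Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def resets_after_stuns(log_text):
    """For each hard-reset request, find the most recent stun before it.

    Returns a list of (reset_timestamp, stun_seconds, gap_seconds), where
    gap_seconds is the time between the stun message and the reset request.
    """
    stuns = [
        (parse_ts(ts), int(micros) / 1_000_000)
        for ts, micros in STUN_RE.findall(log_text)
    ]
    results = []
    for reset_ts in RESET_RE.findall(log_text):
        reset_time = parse_ts(reset_ts)
        earlier = [s for s in stuns if s[0] <= reset_time]
        if not earlier:
            continue  # reset with no preceding stun in this log
        stun_end, stun_seconds = max(earlier, key=lambda s: s[0])
        gap = (reset_time - stun_end).total_seconds()
        results.append((reset_ts, stun_seconds, gap))
    return results


# First and last line of the excerpt above.
EXCERPT = """\
2023-01-12T23:00:55.407Z| vcpu-0| | I005: CPT: vm was stunned for 32142467 us
2023-01-12T23:00:57.906Z| vcpu-9| | I005: Chipset: The guest has requested that the virtual machine be hard reset.
"""

if __name__ == "__main__":
    for reset_ts, stun_s, gap_s in resets_after_stuns(EXCERPT):
        print(f"reset at {reset_ts}: {gap_s:.3f} s after a {stun_s:.1f} s stun")
```

A short gap after a long stun on every reset would support the watchdog theory; resets that follow only short stuns would argue against it.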
I'm struggling to establish how the watchdog timer (or equivalent) is
configured :( Maybe increasing its trigger time would solve the issue?
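For reference, this is roughly how I would dump the settings that look like candidates inside the Proxmox guest. It is a sketch only: the paths are standard Linux procfs/sysfs entries, but some of them exist only when the matching kernel option or a watchdog device is present, and I have not confirmed that any of them is what actually triggers the reset. The function name and the choice of entries are mine.

```python
from pathlib import Path

# Candidate settings, relative to the filesystem root. Which of these (if
# any) is responsible for the hard reset is an open question.
CANDIDATES = {
    "kernel.watchdog": "proc/sys/kernel/watchdog",
    "kernel.watchdog_thresh": "proc/sys/kernel/watchdog_thresh",
    "kernel.softlockup_panic": "proc/sys/kernel/softlockup_panic",
    "kernel.hung_task_timeout_secs": "proc/sys/kernel/hung_task_timeout_secs",
    "kernel.panic": "proc/sys/kernel/panic",
    "watchdog0.identity": "sys/class/watchdog/watchdog0/identity",
    "watchdog0.timeout": "sys/class/watchdog/watchdog0/timeout",
}


def read_watchdog_settings(root="/"):
    """Return {name: value}, with None for entries that cannot be read."""
    settings = {}
    for name, rel_path in CANDIDATES.items():
        try:
            settings[name] = (Path(root) / rel_path).read_text().strip()
        except OSError:
            settings[name] = None  # missing entry or no permission
    return settings


if __name__ == "__main__":
    for name, value in read_watchdog_settings().items():
        print(f"{name} = {value if value is not None else '(not available)'}")
```

Entries that come back as not available simply mean that mechanism is not present on the system, which already rules it out.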
Any other ideas / similar experiences?
Regards,
Adam