From: Alwin Antreich via pve-user <pve-user@lists.proxmox.com>
To: "Proxmox VE user list" <pve-user@lists.proxmox.com>
Cc: Alwin Antreich <alwin@antreich.com>
Subject: Re: [PVE-User] VMs With Multiple Interfaces Rebooting
Date: Wed, 27 Nov 2024 09:38:59 +0000
Message-ID: <mailman.759.1732700348.391.pve-user@lists.proxmox.com>
In-Reply-To: <CA+U74VNt=fNn2vmy3JuqNOFG8DbHWjf7HxTu3MsN7S62FFMwBw@mail.gmail.com>
Message-ID: <64bc4bad4c8528beaf44558880c9723751431d16@antreich.com>
Hi JR,
November 25, 2024 at 4:08 PM, "JR Richardson" <jmr.richardson@gmail.com> wrote:
>
> >
> > Super stable environment for many years through software and hardware
> > upgrades, few issues to speak of, then without warning one of my
> > hypervisors in 3 node group crashed with a memory dimm error, cluster
> > HA took over and restarted the VMs on the other two nodes in the group
> > as expected. The problem quickly materialized as the VMs started
> > rebooting quickly, a lot of network issues and notice of migration
> > pending. I could not lockdown exactly what the root cause was. Notable
> > This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling?
> >
> CRS option is set to basic, not dynamic.
OK, basic. What I meant was: is rebalancing active? :)
>
> 2024-11-21T18:37:38.248094-06:00 vvepve13 pve-ha-lrm[4337]: <root@pam>
> end task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root@pam:
> OK
> 2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service
> status vm:13101 started
> 2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm:
> ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret
> == 0' failed.
This doesn't look good. I'd assume this is VM 13101, which failed to start and was consequently moved to the other remaining node (and vice versa).
But this doesn't explain the WHY. You will need to look further into the logs to see what else transpired during this time.
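A minimal sketch of one way to narrow that log search, assuming the node's log has been exported to plain text with one entry per line and a leading ISO 8601 timestamp, as in the excerpt quoted above. The helper name lines_near, the 30-second window and the first sample line are made up for illustration; this is not a Proxmox tool.

```python
from datetime import datetime, timedelta


def lines_near(lines, center, window_seconds=60):
    """Return the log lines whose leading timestamp lies within
    window_seconds of the ISO 8601 timestamp given in center."""
    center_ts = datetime.fromisoformat(center)
    window = timedelta(seconds=window_seconds)
    selected = []
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            ts = datetime.fromisoformat(stamp)
            close_enough = abs(ts - center_ts) <= window
        except (ValueError, TypeError):
            # No parsable timestamp, or a timestamp without a UTC offset
            # that cannot be compared with center: skip the line.
            continue
        if close_enough:
            selected.append(line)
    return selected


# The first line is invented; the other two are taken from the quoted
# excerpt, joined back into single lines.
sample = [
    "2024-11-21T18:30:00.000000-06:00 vvepve13 example[1]: hypothetical earlier entry",
    "2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service status vm:13101 started",
    "2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm: ../accel/kvm/kvm-all.c:1836: "
    "kvm_irqchip_commit_routes: Assertion `ret == 0' failed.",
]

# Keep only the entries within 30 seconds of the failed assertion.
nearby = lines_near(sample, "2024-11-21T18:37:44.256824-06:00", window_seconds=30)
for entry in nearby:
    print(entry)
```

Running the same filter over the logs of all three nodes, centered on the time of the assertion, would put the HA, network and QEMU messages from that moment side by side.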
Cheers,
Alwin