From: "JR Richardson"
Date: Fri, 22 Nov 2024 10:59:03 -0600
Subject: Re: [PVE-User] VMs With Multiple Interfaces Rebooting

Hi Mark,

Found this error during log review:

"vvepve13 pvestatd[1468]: VM 13113 qmp command failed - VM
13113 qmp command 'query-proxmox-support' failed - unable to connect to VM 13113 qmp socket - timeout after 51 retries"

HA was sending a shutdown to the VM after not being able to verify that the VM was running. I initially thought this was networking related, but as I investigate further, this seems like a bug in 'qm'. That is strange, since I have been running on this version for months, doing migrations and spinning up new VMs without any issues.

Thanks
JR

Hi JR,

What do you mean by "reboot"? Does the VM crash, so that it is powered down from an HA point of view and started back up? Or does the VM OS reboot cleanly?

Mark Schouten

> On 22 Nov 2024, at 07:18, JR Richardson wrote:
>
> Hey Folks,
>
> Just wanted to share an experience I recently had. Cluster parameters:
> 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage.
> Server Specs:
> CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets)
> Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z)
> Manager Version pve-manager/8.2.4/faa83925c9641325
>
> Super stable environment for many years through software and hardware
> upgrades, few issues to speak of. Then, without warning, one of my
> hypervisors in the 3-node group crashed with a memory DIMM error. Cluster
> HA took over and restarted the VMs on the other two nodes in the group
> as expected. The problem soon materialized as the VMs started
> rebooting quickly, with a lot of network issues and notices of pending
> migration. I could not pin down exactly what the root cause was. Notably,
> these particular VMs all have multiple network interfaces. After
> several hours of not being able to get the current VMs stable, I tried
> spinning up new VMs, to no avail; the reboots persisted on the new VMs.
> This seemed to only affect the VMs that were on the hypervisor that
> failed; all other VMs across the cluster were fine.
>
> I have not installed any third-party monitoring software. I found a few
> posts in the forum about it, but that was not my issue.
>
> In an act of desperation, I performed a dist-upgrade and this solved
> the issue straight away.
> Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z)
> Manager Version pve-manager/8.3.0/c1689ccb1065a83b
>
> Hope this was helpful and if there are any ideas on why this happened,
> I welcome any responses.
>
> Thanks.
>
> JR
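
For anyone reviewing logs for the same symptom, here is a minimal Python sketch that counts pvestatd "qmp command failed" entries per VM, so it is easier to see which guests HA could not verify. The regular expression is modeled only on the single log line quoted in this thread; the function name count_qmp_failures and the assumption that other failures use the same wording are mine, not anything confirmed by Proxmox.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern modeled on the one pvestatd line quoted in this thread (assumption:
# other qmp failures are logged with the same wording).
QMP_FAILURE = re.compile(
    r"pvestatd\[\d+\]: VM (?P<vmid>\d+) qmp command failed - "
    r"VM \d+ qmp command '(?P<command>[^']+)' failed - (?P<reason>.+)$"
)


def count_qmp_failures(lines: Iterable[str]) -> Counter:
    """Count failed qmp commands, keyed by (VM ID, command name)."""
    counts: Counter = Counter()
    for line in lines:
        match = QMP_FAILURE.search(line.strip())
        if match:
            counts[(int(match["vmid"]), match["command"])] += 1
    return counts


# Demo using the log line from this message plus one unrelated line.
sample_log = [
    "vvepve13 pvestatd[1468]: VM 13113 qmp command failed - VM 13113 qmp "
    "command 'query-proxmox-support' failed - unable to connect to VM 13113 "
    "qmp socket - timeout after 51 retries",
    "vvepve13 some-other-daemon[1]: unrelated message",
]
for (vmid, command), n in count_qmp_failures(sample_log).most_common():
    print(f"VM {vmid}: '{command}' failed {n} time(s)")
```

Feeding it the full log from each node in the affected HA group should show whether the failures cluster on the VMs that were restarted after the hypervisor crash.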