From: Gilberto Ferreira
Date: Thu, 25 Mar 2021 16:16:11 -0300
To: jameslipski, Proxmox VE user list
Subject: Re: [PVE-User] Not sure if this is a corosync issue.

Nice! Keep us posted.
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram

On Thu, Mar 25, 2021 at 16:13, jameslipski via pve-user wrote:
>
> ---------- Forwarded message ----------
> From: jameslipski
> To: Proxmox VE user list
> Date: Thu, 25 Mar 2021 19:12:38 +0000
> Subject: Re: [PVE-User] Not sure if this is a corosync issue.
>
> Hi,
>
> Alright, I'll try. Since these nodes are in production, it might be a while until I get a chance to.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Thursday, March 25, 2021 2:58 PM, Gilberto Ferreira wrote:
>
> > Hi
> > You should carefully consider updating to a newer version.
> >
> > -------------------------------------------------------------------
> >
> > Gilberto Nunes Ferreira
> > (47) 99676-7530 - Whatsapp / Telegram
> >
> > On Thu, Mar 25, 2021 at 15:56, jameslipski via pve-user
> > pve-user@lists.proxmox.com wrote:
> >
> > > ---------- Forwarded message ----------
> > > From: jameslipski jameslipski@protonmail.com
> > > To: Proxmox VE user list pve-user@lists.proxmox.com
> > > Date: Thu, 25 Mar 2021 18:56:10 +0000
> > > Subject: Re: [PVE-User] Not sure if this is a corosync issue.
> > > Hello,
> > > All nodes are running the same version 6.0-4.
> > > pveversion -v shows:
> > > proxmox-ve: 6.0-2 (running kernel: 5.0.15-1-pve)
> > > pve-manager: 6.0-4 (running version: 6.0-4/2a719255)
> > > pve-kernel-5.0: 6.0-5
> > > pve-kernel-helper: 6.0-5
> > > pve-kernel-5.0.15-1-pve: 5.0.15-1
> > > ceph: 14.2.2-pve1
> > > ceph-fuse: 14.2.2-pve1
> > > corosync: 3.0.2-pve2
> > > criu: 3.11-3
> > > glusterfs-client: 5.5-3
> > > ksm-control-daemon: 1.3-1
> > > libjs-extjs: 6.0.1-10
> > > libknet1: 1.10-pve1
> > > libpve-access-control: 6.0-2
> > > libpve-apiclient-perl: 3.0-2
> > > libpve-common-perl: 6.0-2
> > > libpve-guest-common-perl: 3.0-1
> > > libpve-http-server-perl: 3.0-2
> > > libpve-storage-perl: 6.0-5
> > > libqb0: 1.0.5-1
> > > lvm2: 2.03.02-pve3
> > > lxc-pve: 3.1.0-61
> > > lxcfs: 3.0.3-pve60
> > > novnc-pve: 1.0.0-60
> > > proxmox-mini-journalreader: 1.1-1
> > > proxmox-widget-toolkit: 2.0-5
> > > pve-cluster: 6.0-4
> > > pve-container: 3.0-3
> > > pve-docs: 6.0-4
> > > pve-edk2-firmware: 2.20190614-1
> > > pve-firewall: 4.0-5
> > > pve-firmware: 3.0-2
> > > pve-ha-manager: 3.0-2
> > > pve-i18n: 2.0-2
> > > pve-qemu-kvm: 4.0.0-3
> > > pve-xtermjs: 3.13.2-1
> > > qemu-server: 6.0-5
> > > smartmontools: 7.0-pve2
> > > spiceterm: 3.1-1
> > > vncterm: 1.6-1
> > > zfsutils-linux: 0.8.1-pve1
> > >
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On Thursday, March 25, 2021 2:26 PM, Gilberto Ferreira gilberto.nunes32@gmail.com wrote:
> > >
> > > > What PVE version?
> > > > Is this an upgrade from a previous PVE version?
> > > >
> > > > Gilberto Nunes Ferreira
> > > > (47) 99676-7530 - Whatsapp / Telegram
> > > > On Thu, Mar 25, 2021 at 15:19, jameslipski via pve-user
> > > > pve-user@lists.proxmox.com wrote:
> > > >
> > > > > ---------- Forwarded message ----------
> > > > > From: jameslipski jameslipski@protonmail.com
> > > > > To: Proxmox VE user list pve-user@lists.proxmox.com
> > > > > Date: Thu, 25 Mar 2021 18:02:25 +0000
> > > > > Subject: Not sure if this is a corosync issue.
> > > > > Greetings,
> > > > > Today, one of my nodes seems to have rebooted randomly (the node in question has been in a production environment for several months, with no issues since it was added to the cluster). During my investigation, the following is what I see before the crash; unfortunately, I'm having some trouble deciphering it:
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [status] crit: rrdentry_hash_set: assertion 'data[len-1] == 0' failed
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [dcdb] crit: cpg_dispatch failed: 2
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [dcdb] crit: cpg_leave failed: 2
> > > > > Mar 25 12:54:54 node09 systemd[1]: corosync.service: Main process exited, code=killed, status=11/SEGV
> > > > > Mar 25 12:54:54 node09 systemd[1]: corosync.service: Failed with result 'signal'.
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [confdb] crit: cmap_dispatch failed: 2
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [quorum] crit: quorum_dispatch failed: 2
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [status] notice: node lost quorum
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [status] crit: cpg_dispatch failed: 2
> > > > > Mar 25 12:54:54 node09 pmxcfs[5419]: [status] crit: cpg_leave failed: 2
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [quorum] crit: can't initialize service
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [confdb] crit: can't initialize service
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [dcdb] notice: start cluster connection
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [dcdb] crit: can't initialize service
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [status] notice: start cluster connection
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:54:55 node09 pmxcfs[5419]: [status] crit: can't initialize service
> > > > > Mar 25 12:54:56 node09 pve-ha-crm[2161]: status change slave => wait_for_quorum
> > > > > Mar 25 12:55:00 node09 systemd[1]: Starting Proxmox VE replication runner...
> > > > > Mar 25 12:55:00 node09 pve-ha-lrm[2169]: lost lock 'ha_agent_node09_lock - cfs lock update failed - Permission denied
> > > > > Mar 25 12:55:01 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:01 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:01 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:01 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:01 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:01 node09 CRON[2755555]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> > > > > Mar 25 12:55:02 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:03 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:04 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:05 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:05 node09 pve-ha-lrm[2169]: status change active => lost_agent_lock
> > > > > Mar 25 12:55:06 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:07 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:07 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:07 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:07 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:07 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:08 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:09 node09 pvesr[2755547]: trying to acquire cfs lock 'file-replication_cfg' ...
> > > > > Mar 25 12:55:10 node09 pvesr[2755547]: error with cfs lock 'file-replication_cfg': no quorum!
> > > > > Mar 25 12:55:10 node09 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a
> > > > > Mar 25 12:55:10 node09 systemd[1]: pvesr.service: Failed with result 'exit-code'.
> > > > > Mar 25 12:55:10 node09 systemd[1]: Failed to start Proxmox VE replication runner.
> > > > > Mar 25 12:55:13 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:13 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:13 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:13 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:19 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:19 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:19 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:19 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:25 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:25 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:25 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:25 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:31 node09 pmxcfs[5419]: [quorum] crit: quorum_initialize failed: 2
> > > > > Mar 25 12:55:31 node09 pmxcfs[5419]: [confdb] crit: cmap_initialize failed: 2
> > > > > Mar 25 12:55:31 node09 pmxcfs[5419]: [dcdb] crit: cpg_initialize failed: 2
> > > > > Mar 25 12:55:31 node09 pmxcfs[5419]: [status] crit: cpg_initialize failed: 2
> > > > > I see that corosync experienced the following:
> > > > > Mar 25 12:54:54 node09 systemd[1]: corosync.service: Main process exited, code=killed, status=11/SEGV
> > > > > Mar 25 12:54:54 node09 systemd[1]: corosync.service: Failed with result 'signal'.
> > > > > and I'm not too sure why. Also not sure if that alone took down the system. Any help is much appreciated. If any additional information is needed, please let us know. Thank you.
> > > > >
> > > > > pve-user mailing list
> > > > > pve-user@lists.proxmox.com
> > > > > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
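
For anyone sifting through a similar log, here is a minimal sketch (not from the thread) of how the systemd "Main process exited" lines quoted above could be pulled out of a larger syslog excerpt. The function name `find_service_exits` and the regular expression are illustrative assumptions; the pattern only targets the line layout shown in this message and may need adjusting for other syslog formats.

```python
import re

# Assumed pattern, written only against the line layout quoted in this thread, e.g.
#   "Mar 25 12:54:54 node09 systemd[1]: corosync.service: Main process exited, code=killed, status=11/SEGV"
EXIT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\ssystemd\[1\]:\s"
    r"(?P<unit>\S+): Main process exited, "
    r"code=(?P<code>\w+), status=(?P<status>\S+)"
)


def find_service_exits(lines):
    """Return (timestamp, unit, code, status) for each 'Main process exited' line."""
    hits = []
    for line in lines:
        m = EXIT_RE.match(line.strip())
        if m:
            hits.append((m["ts"], m["unit"], m["code"], m["status"]))
    return hits


# Three lines copied from the log in this thread.
sample = [
    "Mar 25 12:54:54 node09 pmxcfs[5419]: [dcdb] crit: cpg_dispatch failed: 2",
    "Mar 25 12:54:54 node09 systemd[1]: corosync.service: Main process exited, code=killed, status=11/SEGV",
    "Mar 25 12:55:10 node09 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a",
]

for hit in find_service_exits(sample):
    print(hit)
```

On the sample, this separates the corosync exit (killed by signal 11, a segmentation fault) from the later pvesr exit, which appears only after quorum was lost. It says nothing about why corosync crashed; it only helps order the service failures in time.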