From: Benjamin Hofer
Date: Wed, 31 May 2023 14:28:34 +0200
To: pve-user@lists.proxmox.com
Subject: [PVE-User] Kernel panics when using OpenVSwitch bridges (Proxmox VE 7.3)
Hello community,

we're using OpenVSwitch bridges on a production Proxmox 7.3 cluster with 4 nodes (different hardware). Some weeks ago, one of the cluster nodes suddenly rebooted. Further analysis showed kernel panics / CPU stalls that appear to be related to OpenVSwitch.

Digging deeper, we found that we can reliably reproduce the OVS-related kernel panic just by restarting running LXC containers that have network interfaces. The behaviour was reproducible on all our nodes. Since the nodes differ considerably in their hardware, we assume the cause is a software-related (OVS) bug. The kernel panics do NOT occur after switching a node to Linux bridges. See the kernel log extract attached.

Has anyone seen similar behaviour? What are your experiences with Linux bridges compared to OVS bridges regarding network performance? We could do without the OVS features, but we must be able to rely on good enough performance.

System:
pveversion: pve-manager/7.3-6/723bb6ec (running kernel: 5.15.102-1-pve)
OVS 2.15.0

Thank you in advance.

All the best
Benjamin
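
P.S. For anyone who wants to compare setups: our bridge configuration in /etc/network/interfaces follows the usual Proxmox OVS layout and looks roughly like the sketch below. Interface names and addresses here are placeholders, not our real values:

    auto eno1
    iface eno1 inet manual
        ovs_type OVSPort
        ovs_bridge vmbr0

    auto vmbr0
    iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        ovs_type OVSBridge
        ovs_ports eno1

The Linux-bridge equivalent we switched to for testing (same placeholders):

    auto vmbr0
    iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0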
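
And the reproduction itself, reduced to a minimal sketch. The container ID (101) and the loop count are arbitrary examples; for us, restarting any running LXC container that has a network interface on an OVS bridge is enough:

    # stop/start a running container in a loop until the node panics
    for i in $(seq 1 20); do
        pct stop 101
        pct start 101
        sleep 5
    done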