From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
From: Benjamin Hofer <benjamin@gridscale.io>
Date: Wed, 31 May 2023 14:28:34 +0200
Message-ID: <CAD=jCXP3x_PrZRx_kSXQy7_YMTwsD3H0_Ja8s3C3J+h3GHSmrg@mail.gmail.com>
To: pve-user@lists.proxmox.com
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
Subject: [PVE-User] Kernel panics when using OpenVSwitch bridges (Proxmox VE
 7.3)
X-BeenThere: pve-user@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE user list <pve-user.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-user>, 
 <mailto:pve-user-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-user/>
List-Post: <mailto:pve-user@lists.proxmox.com>
List-Help: <mailto:pve-user-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user>, 
 <mailto:pve-user-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 31 May 2023 12:29:26 -0000

Hello community,

we're using Open vSwitch (OVS) bridges on a production Proxmox VE 7.3
cluster with 4 nodes (different hardware). Some weeks ago, one of the
cluster nodes rebooted unexpectedly. Further analysis showed kernel panics /
CPU stalls that seem to be related to OVS. After more analysis, we found
that we can reliably reproduce the OVS-related kernel panic just by
restarting running LXC containers that have network interfaces. The
behaviour could be reproduced on all our nodes. As these nodes differ
considerably in their hardware specifications, we assume that it's caused
by a software (OVS) bug.
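
A minimal sketch of such a reproduction loop, for anyone who wants to try
this on a test node. It assumes the containers are restarted with the
standard `pct stop` / `pct start` commands; the original procedure is not
spelled out beyond "restarting running LXC containers", so the command
choice, the function names and the container IDs below are illustrative
assumptions only.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def run(cmd: Sequence[str]) -> int:
    """Run a command on the node and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def restart_cycle(vmid: int) -> List[List[str]]:
    """Commands for one stop/start cycle of a container (assumed procedure)."""
    return [["pct", "stop", str(vmid)], ["pct", "start", str(vmid)]]


def restart_loop(
    vmids: Sequence[int],
    rounds: int,
    runner: Callable[[Sequence[str]], int] = run,
    pause: float = 0.0,
) -> int:
    """Restart every container `rounds` times.

    Returns the number of commands that exited with a non-zero code.
    A kernel panic would of course end the script along with the node.
    """
    failures = 0
    for _ in range(rounds):
        for vmid in vmids:
            for cmd in restart_cycle(vmid):
                if runner(cmd) != 0:
                    failures += 1
            # Give the container's network interface time to come up.
            time.sleep(pause)
    return failures
```

On a node, something like `restart_loop([101, 102], rounds=20, pause=5.0)`
would cycle two containers (101 and 102 are hypothetical IDs) while the
kernel log is watched in a second terminal.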

The kernel panics do NOT occur after switching a node to Linux bridges.

See kernel log extract attached.

Has anyone seen similar behaviour?
What is your experience with Linux bridges compared to OVS bridges
regarding network performance?

We could do without the OVS features, but we depend on adequate network
performance.

System:
pveversion: pve-manager/7.3-6/723bb6ec (running kernel: 5.15.102-1-pve)
OVS 2.15.0

Thank you in advance.

All the best
Benjamin