public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH pve-kernel master 1/2] backport fix for network crashes when exiting ovs-tcpdump
@ 2025-01-24 10:21 Friedrich Weber
  2025-01-24 10:21 ` [pve-devel] [PATCH pve-kernel bookworm-6.8 2/2] " Friedrich Weber
  2025-01-24 12:42 ` [pve-devel] applied-series: [PATCH pve-kernel master 1/2] " Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Friedrich Weber @ 2025-01-24 10:21 UTC (permalink / raw)
  To: pve-devel

A user in enterprise support reported a low (<1/50) chance of network
crash (with soft lockups reported) when exiting ovs-tcpdump (a tcpdump
wrapper provided by Open vSwitch) which they could only resolve by
rebooting the host.

After reporting the issue upstream with a reproducer [1], an OVS
developer submitted a kernel patch which is now included 6.13 and some
stable kernels. With this patch, the reproducer does not seem to
trigger the issue anymore. Hence, backport the patch.

[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---

Notes:
    As the patch is included in stable kernels, it might also eventually
    appear in Ubuntu kernels, so not sure if it makes more sense to
    backport or to wait until then.

 ...lockup-on-tx-to-unregistering-netdev.patch | 77 +++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 patches/kernel/0016-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch

diff --git a/patches/kernel/0016-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch b/patches/kernel/0016-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
new file mode 100644
index 0000000..94c65ee
--- /dev/null
+++ b/patches/kernel/0016-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
@@ -0,0 +1,77 @@
+From 179893798c518361c20cb11b5b0c71cecf019ad4 Mon Sep 17 00:00:00 2001
+From: Ilya Maximets <i.maximets@ovn.org>
+Date: Thu, 9 Jan 2025 13:21:24 +0100
+Subject: [PATCH] openvswitch: fix lockup on tx to unregistering netdev with
+ carrier
+
+Commit in a fixes tag attempted to fix the issue in the following
+sequence of calls:
+
+    do_output
+    -> ovs_vport_send
+       -> dev_queue_xmit
+          -> __dev_queue_xmit
+             -> netdev_core_pick_tx
+                -> skb_tx_hash
+
+When device is unregistering, the 'dev->real_num_tx_queues' goes to
+zero and the 'while (unlikely(hash >= qcount))' loop inside the
+'skb_tx_hash' becomes infinite, locking up the core forever.
+
+But unfortunately, checking just the carrier status is not enough to
+fix the issue, because some devices may still be in unregistering
+state while reporting carrier status OK.
+
+One example of such device is a net/dummy.  It sets carrier ON
+on start, but it doesn't implement .ndo_stop to set the carrier off.
+And it makes sense, because dummy doesn't really have a carrier.
+Therefore, while this device is unregistering, it's still easy to hit
+the infinite loop in the skb_tx_hash() from the OVS datapath.  There
+might be other drivers that do the same, but dummy by itself is
+important for the OVS ecosystem, because it is frequently used as a
+packet sink for tcpdump while debugging OVS deployments.  And when the
+issue is hit, the only way to recover is to reboot.
+
+Fix that by also checking if the device is running.  The running
+state is handled by the net core during unregistering, so it covers
+unregistering case better, and we don't really need to send packets
+to devices that are not running anyway.
+
+While only checking the running state might be enough, the carrier
+check is preserved.  The running and the carrier states seem disjoined
+throughout the code and different drivers.  And other core functions
+like __dev_direct_xmit() check both before attempting to transmit
+a packet.  So, it seems safer to check both flags in OVS as well.
+
+Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
+Reported-by: Friedrich Weber <f.weber@proxmox.com>
+Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
+Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
+Tested-by: Friedrich Weber <f.weber@proxmox.com>
+Reviewed-by: Aaron Conole <aconole@redhat.com>
+Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 47e55e4b410f7d552e43011baa5be1aab4093990)
+Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
+---
+ net/openvswitch/actions.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
+index 101f9a23792c..896f50387347 100644
+--- a/net/openvswitch/actions.c
++++ b/net/openvswitch/actions.c
+@@ -930,7 +930,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
+ {
+ 	struct vport *vport = ovs_vport_rcu(dp, out_port);
+ 
+-	if (likely(vport && netif_carrier_ok(vport->dev))) {
++	if (likely(vport &&
++		   netif_running(vport->dev) &&
++		   netif_carrier_ok(vport->dev))) {
+ 		u16 mru = OVS_CB(skb)->mru;
+ 		u32 cutlen = OVS_CB(skb)->cutlen;
+ 
+-- 
+2.39.5
+
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [pve-devel] [PATCH pve-kernel bookworm-6.8 2/2] backport fix for network crashes when exiting ovs-tcpdump
  2025-01-24 10:21 [pve-devel] [PATCH pve-kernel master 1/2] backport fix for network crashes when exiting ovs-tcpdump Friedrich Weber
@ 2025-01-24 10:21 ` Friedrich Weber
  2025-01-24 12:42 ` [pve-devel] applied-series: [PATCH pve-kernel master 1/2] " Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Friedrich Weber @ 2025-01-24 10:21 UTC (permalink / raw)
  To: pve-devel

A user in enterprise support reported a low (<1/50) chance of network
crash (with soft lockups reported) when exiting ovs-tcpdump (a tcpdump
wrapper provided by Open vSwitch) which they could only resolve by
rebooting the host.

After reporting the issue upstream with a reproducer [1], an OVS
developer submitted a kernel patch which is now included 6.13 and some
stable kernels. With this patch, the reproducer does not seem to
trigger the issue anymore. Hence, backport the patch.

[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---

Notes:
    As the patch is included in stable kernels, it might also eventually
    appear in Ubuntu kernels, so not sure if it makes more sense to
    backport or to wait until then.

 ...lockup-on-tx-to-unregistering-netdev.patch | 77 +++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 patches/kernel/0019-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch

diff --git a/patches/kernel/0019-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch b/patches/kernel/0019-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
new file mode 100644
index 0000000..f08f31f
--- /dev/null
+++ b/patches/kernel/0019-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
@@ -0,0 +1,77 @@
+From 48ebfb955228ae4804c2c666a7b200bd38f330b0 Mon Sep 17 00:00:00 2001
+From: Ilya Maximets <i.maximets@ovn.org>
+Date: Thu, 9 Jan 2025 13:21:24 +0100
+Subject: [PATCH] openvswitch: fix lockup on tx to unregistering netdev with
+ carrier
+
+Commit in a fixes tag attempted to fix the issue in the following
+sequence of calls:
+
+    do_output
+    -> ovs_vport_send
+       -> dev_queue_xmit
+          -> __dev_queue_xmit
+             -> netdev_core_pick_tx
+                -> skb_tx_hash
+
+When device is unregistering, the 'dev->real_num_tx_queues' goes to
+zero and the 'while (unlikely(hash >= qcount))' loop inside the
+'skb_tx_hash' becomes infinite, locking up the core forever.
+
+But unfortunately, checking just the carrier status is not enough to
+fix the issue, because some devices may still be in unregistering
+state while reporting carrier status OK.
+
+One example of such device is a net/dummy.  It sets carrier ON
+on start, but it doesn't implement .ndo_stop to set the carrier off.
+And it makes sense, because dummy doesn't really have a carrier.
+Therefore, while this device is unregistering, it's still easy to hit
+the infinite loop in the skb_tx_hash() from the OVS datapath.  There
+might be other drivers that do the same, but dummy by itself is
+important for the OVS ecosystem, because it is frequently used as a
+packet sink for tcpdump while debugging OVS deployments.  And when the
+issue is hit, the only way to recover is to reboot.
+
+Fix that by also checking if the device is running.  The running
+state is handled by the net core during unregistering, so it covers
+unregistering case better, and we don't really need to send packets
+to devices that are not running anyway.
+
+While only checking the running state might be enough, the carrier
+check is preserved.  The running and the carrier states seem disjoined
+throughout the code and different drivers.  And other core functions
+like __dev_direct_xmit() check both before attempting to transmit
+a packet.  So, it seems safer to check both flags in OVS as well.
+
+Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
+Reported-by: Friedrich Weber <f.weber@proxmox.com>
+Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
+Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
+Tested-by: Friedrich Weber <f.weber@proxmox.com>
+Reviewed-by: Aaron Conole <aconole@redhat.com>
+Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 47e55e4b410f7d552e43011baa5be1aab4093990)
+Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
+---
+ net/openvswitch/actions.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
+index 964225580824..0c0e4fb58bf4 100644
+--- a/net/openvswitch/actions.c
++++ b/net/openvswitch/actions.c
+@@ -925,7 +925,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
+ {
+ 	struct vport *vport = ovs_vport_rcu(dp, out_port);
+ 
+-	if (likely(vport && netif_carrier_ok(vport->dev))) {
++	if (likely(vport &&
++		   netif_running(vport->dev) &&
++		   netif_carrier_ok(vport->dev))) {
+ 		u16 mru = OVS_CB(skb)->mru;
+ 		u32 cutlen = OVS_CB(skb)->cutlen;
+ 
+-- 
+2.39.5
+
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [pve-devel] applied-series: [PATCH pve-kernel master 1/2] backport fix for network crashes when exiting ovs-tcpdump
  2025-01-24 10:21 [pve-devel] [PATCH pve-kernel master 1/2] backport fix for network crashes when exiting ovs-tcpdump Friedrich Weber
  2025-01-24 10:21 ` [pve-devel] [PATCH pve-kernel bookworm-6.8 2/2] " Friedrich Weber
@ 2025-01-24 12:42 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2025-01-24 12:42 UTC (permalink / raw)
  To: Proxmox VE development discussion, Friedrich Weber

Am 24.01.25 um 11:21 schrieb Friedrich Weber:
> A user in enterprise support reported a low (<1/50) chance of network
> crash (with soft lockups reported) when exiting ovs-tcpdump (a tcpdump
> wrapper provided by Open vSwitch) which they could only resolve by
> rebooting the host.
> 
> After reporting the issue upstream with a reproducer [1], an OVS
> developer submitted a kernel patch which is now included 6.13 and some
> stable kernels. With this patch, the reproducer does not seem to
> trigger the issue anymore. Hence, backport the patch.
> 
> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
> 
> Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
> ---
> 
> Notes:
>     As the patch is included in stable kernels, it might also eventually
>     appear in Ubuntu kernels, so not sure if it makes more sense to
>     backport or to wait until then.
> 
>  ...lockup-on-tx-to-unregistering-netdev.patch | 77 +++++++++++++++++++
>  1 file changed, 77 insertions(+)
>  create mode 100644 patches/kernel/0016-openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
> 
>

applied both patches, thanks!


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-01-24 12:42 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-24 10:21 [pve-devel] [PATCH pve-kernel master 1/2] backport fix for network crashes when exiting ovs-tcpdump Friedrich Weber
2025-01-24 10:21 ` [pve-devel] [PATCH pve-kernel bookworm-6.8 2/2] " Friedrich Weber
2025-01-24 12:42 ` [pve-devel] applied-series: [PATCH pve-kernel master 1/2] " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal