From: Gabriel Goller <g.goller@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH pve-kernel 2/5] kernel: backport: netfilter: nft_set_rbtree: continue traversal if element is inactive
Date: Thu, 11 Sep 2025 12:05:43 +0200
Message-ID: <20250911100555.63174-3-g.goller@proxmox.com>
In-Reply-To: <20250911100555.63174-1-g.goller@proxmox.com>

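When the rbtree lookup finds a candidate range start, only record it as
the interval if the element is active and not expired. Traversal then
continues past inactive elements instead of ending on one and returning
a false negative.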

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
---
 ...t_rbtree-continue-traversal-if-eleme.patch | 88 +++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 patches/kernel/0015-netfilter-nft_set_rbtree-continue-traversal-if-eleme.patch

diff --git a/patches/kernel/0015-netfilter-nft_set_rbtree-continue-traversal-if-eleme.patch b/patches/kernel/0015-netfilter-nft_set_rbtree-continue-traversal-if-eleme.patch
new file mode 100644
index 000000000000..9e4d4d687003
--- /dev/null
+++ b/patches/kernel/0015-netfilter-nft_set_rbtree-continue-traversal-if-eleme.patch
@@ -0,0 +1,88 @@
+From 2af0ed300431a3c5675cd6a7219424430fa9651b Mon Sep 17 00:00:00 2001
+From: Gabriel Goller <g.goller@proxmox.com>
+Date: Wed, 10 Sep 2025 12:08:56 +0200
+Subject: [PATCH 2/5] netfilter: nft_set_rbtree: continue traversal if element
+ is inactive
+
+When the rbtree lookup function finds a match in the rbtree, it sets the
+range start interval to a potentially inactive element.
+
+Then, after tree lookup, if the matching element is inactive, it returns
+NULL and suppresses a matching result.
+
+This is wrong and leads to false negative matches when a transaction has
+already entered the commit phase.
+
+cpu0					cpu1
+  has added new elements to clone
+  has marked elements as being
+  inactive in new generation
+					perform lookup in the set
+  enters commit phase:
+I) increments the genbit
+					A) observes new genbit
+					B) finds matching range
+					C) returns no match: found
+					range invalid in new generation
+II) removes old elements from the tree
+					D) New nft_lookup happening now
+					  will find matching element,
+					  because it is no longer
+					  obscured by old, inactive one.
+
+Consider a packet P matching range r1-r2:
+
+cpu0 processes the following transaction:
+1. remove r1-r2
+2. add r1-r3
+
+P is contained in both ranges. Therefore, cpu1 should always find a match
+for P.  Due to the above race, this is not the case:
+
+cpu1 does find r1-r2, but then ignores it due to the genbit indicating
+the range has been removed.  It does NOT test for further matches.
+
+The situation persists for all lookups until cpu0 reaches II), after
+which the r1-r3 range start node is tested for the first time.
+
+Move the "interval start is valid" check ahead so that tree traversal
+continues if the starting interval is not valid in this generation.
+
+Thanks to Stefan Hanreich for providing an initial reproducer for this
+bug.
+
+Reported-by: Stefan Hanreich <s.hanreich@proxmox.com>
+Fixes: c1eda3c6394f ("netfilter: nft_rbtree: ignore inactive matching element with no descendants")
+Signed-off-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
+---
+ net/netfilter/nft_set_rbtree.c | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
+index 2e8ef16ff191..c4eb94258e24 100644
+--- a/net/netfilter/nft_set_rbtree.c
++++ b/net/netfilter/nft_set_rbtree.c
+@@ -77,7 +77,9 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
+ 			    nft_rbtree_interval_end(rbe) &&
+ 			    nft_rbtree_interval_start(interval))
+ 				continue;
+-			interval = rbe;
++			if (nft_set_elem_active(&rbe->ext, genmask) &&
++			    !nft_rbtree_elem_expired(rbe))
++				interval = rbe;
+ 		} else if (d > 0)
+ 			parent = rcu_dereference_raw(parent->rb_right);
+ 		else {
+@@ -103,8 +105,6 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
+ 	}
+ 
+ 	if (set->flags & NFT_SET_INTERVAL && interval != NULL &&
+-	    nft_set_elem_active(&interval->ext, genmask) &&
+-	    !nft_rbtree_elem_expired(interval) &&
+ 	    nft_rbtree_interval_start(interval)) {
+ 		*ext = &interval->ext;
+ 		return true;
+-- 
+2.47.3
+
-- 
2.47.3
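
For readers who want to see the race without the kernel context, the change
can be modeled in a few lines of userspace C. This is only a sketch under
stated assumptions: `struct elem`, `model_lookup()`, `demo()` and
`check_model()` are hypothetical names, the tree is wired by hand so that the
inactive r1 start node is visited last, and expiry, the end-interval
de-duplication check, RCU and rebalancing are left out. The `fixed` flag
switches between checking activity after the traversal (old behaviour) and
during it (patched behaviour).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct nft_rbtree_elem. */
struct elem {
	int key;
	bool is_end;              /* interval end marker, else interval start */
	bool active;              /* valid in the generation the lookup sees */
	const struct elem *left;  /* larger keys, mirroring the kernel's ordering */
	const struct elem *right; /* smaller keys */
};

/*
 * Model of the interval lookup. With fixed == false the activity check
 * happens only after the traversal (old behaviour); with fixed == true
 * an inactive element never becomes the candidate (patched behaviour).
 */
bool model_lookup(const struct elem *root, int key, bool fixed)
{
	const struct elem *interval = NULL;
	const struct elem *n = root;

	while (n) {
		if (n->key < key) {
			/* Element below the key: candidate range start. */
			if (!fixed || n->active)
				interval = n;
			n = n->left;
		} else if (n->key > key) {
			n = n->right;
		} else {
			/* Exact key match, reduced to the bare minimum. */
			if (n->active)
				return !n->is_end;
			n = n->left;
		}
	}

	if (!interval || interval->is_end)
		return false;
	/* Old code: reject an inactive candidate without looking further. */
	return fixed || interval->active;
}

/* Transaction from the commit message: remove 10-20, add 10-30; P = 15. */
bool demo(bool fixed)
{
	const struct elem r1_old = { .key = 10, .active = false };
	const struct elem r1_new = { .key = 10, .active = true, .left = &r1_old };
	const struct elem r3_end = { .key = 30, .is_end = true, .active = true };
	const struct elem r2_old = { .key = 20, .is_end = true, .active = false,
				     .left = &r3_end, .right = &r1_new };

	return model_lookup(&r2_old, 15, fixed);
}

/* Self-check: the old logic misses P, the patched logic finds it. */
void check_model(void)
{
	assert(!demo(false));
	assert(demo(true));
}
```

Calling `check_model()` from a small `main()` exercises both variants. In the
old variant the traversal ends with the inactive r1 start as the candidate and
the final check discards it; in the patched variant the inactive node is
skipped and the active r1 start from the new range is kept.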


