From: Gabriel Goller <g.goller@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH kernel 0/5] backport nftables atomicity fix
Date: Thu, 11 Sep 2025 12:05:41 +0200
Message-ID: <20250911100555.63174-1-g.goller@proxmox.com>
Stefan Hanreich discovered an nftables bug that breaks atomicity when
updating certain sets: while a set is being updated, packets sometimes
slip through even though both the existing and the incoming rules deny them.
A full reproducer is available at [0].
More information can be found in the following commit messages.
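To illustrate the kind of window described above, here is a toy model in
Python. It is not the kernel code: the per-element generation tag, the
sequence counter and the names ToySet, lookup_once and lookup_retry are
illustrative assumptions, loosely inspired by the "genbit" and "base_seq"
wording in the patch titles. It only shows how a lookup that straddles a
commit can miss an address that both the old and the new set contain, and
how retrying after a sequence change avoids that miss.

```python
import ipaddress


def ip(addr):
    """Convert a dotted-quad string to an int for range comparisons."""
    return int(ipaddress.IPv4Address(addr))


class ToySet:
    """Toy interval set; every element is tagged with the generation it belongs to."""

    def __init__(self, ranges):
        self.gen = 0  # doubles as the sequence counter in this model
        self.elems = [(ip(lo), ip(hi), self.gen) for lo, hi in ranges]

    def commit(self, ranges):
        """Flush + re-add as one transaction: new elements live in the next generation."""
        self.gen += 1
        self.elems = [(ip(lo), ip(hi), self.gen) for lo, hi in ranges]

    def lookup_once(self, key, between=None):
        gen = self.gen  # step 1: snapshot the generation
        if between is not None:
            between()  # models a commit landing mid-lookup
        # step 2: only elements active in the snapshotted generation match
        return any(lo <= key <= hi and g == gen for lo, hi, g in self.elems)

    def lookup_retry(self, key, between=None):
        """Repeat the lookup if the sequence counter moved while it ran."""
        while True:
            seq = self.gen
            hit = self.lookup_once(key, between)
            between = None  # the modelled commit happens only once
            if hit or seq == self.gen:
                return hit


RANGES = [("0.0.0.0", "192.0.2.19"), ("192.0.2.21", "255.255.255.255")]

if __name__ == "__main__":
    s = ToySet(RANGES)
    spoofed = ip("192.0.2.30")
    # Single-pass lookup racing with a commit of identical content: misses.
    print(s.lookup_once(spoofed, between=lambda: s.commit(RANGES)))
    # Retrying lookup under the same race: still matches.
    print(s.lookup_retry(spoofed, between=lambda: s.commit(RANGES)))
```

In this model the old and the new set contents are identical, so any miss is
purely an artifact of the lookup racing with the commit.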
The upstream series has not been applied yet, but is available here:
https://lore.kernel.org/netfilter-devel/20250910080227.11174-1-fw@strlen.de/
nftables has changed quite a bit since 6.14, so the backport was a bit tricky -- a
few Tested-by's would be nice :). If anyone needs help reproducing this or
wants a pre-built kernel with the fix, feel free to reach out!
Thanks to Stefan Hanreich for identifying the bug and providing a minimal
reproducer, and to Florian Westphal for the quick fix.
[0]:
Initial network setup:
ip netns add east
ip netns add west
ip link add east type veth peer name west
ip link set east netns east
ip link set west netns west
ip netns exec east ip a a 192.0.2.20/24 dev east
ip netns exec west ip link add br0 type bridge
ip netns exec west ip a a 192.0.2.10/24 dev br0
ip netns exec west ip link set west master br0
ip netns exec east ip link set up east
ip netns exec west ip link set up west
ip netns exec west ip link set up br0
Initial nft ruleset in network namespace 'west':
table bridge west {
    set east-ip-nomatch {
        type ipv4_addr
        flags interval;
        elements = { 0.0.0.0-192.0.2.19, 192.0.2.21-255.255.255.255 }
    }
    chain block-spoofed {
        type filter hook prerouting priority filter; policy accept;
        ip saddr @east-ip-nomatch drop
    }
}
This should block all traffic on the bridge br0 that does not have
192.0.2.20 as its source IP address. But when continuously flushing and
re-creating the east-ip-nomatch set via the following command:
$ while true; do ip netns exec west nft -j -f update_set.json; done;
# update_set.json
{
  "nftables": [
    {
      "add": {
        "set": {
          "family": "bridge",
          "table": "west",
          "name": "east-ip-nomatch",
          "type": "ipv4_addr",
          "flags": [
            "interval"
          ]
        }
      }
    },
    {
      "flush": {
        "set": {
          "family": "bridge",
          "table": "west",
          "name": "east-ip-nomatch"
        }
      }
    },
    {
      "add": {
        "element": {
          "family": "bridge",
          "table": "west",
          "name": "east-ip-nomatch",
          "elem": [
            {
              "range": ["0.0.0.0", "192.0.2.19"]
            },
            {
              "range": ["192.0.2.21", "255.255.255.255"]
            }
          ]
        }
      }
    }
  ]
}
And then continuously sending ICMP packets from east to west, e.g. via scapy:
$ ip netns exec east python3 -c 'from scapy.all import send, Ether, IP, ICMP; send(IP(src="192.0.2.30", dst="192.0.2.10")/ICMP(id=2222, seq=42), count=1000000, inter=0.001)'
Some of them pass through, as is visible via tcpdump (sometimes tcpdump
has to be terminated before the packets become visible, since its
buffers do not get flushed immediately):
$ ip netns exec west tcpdump -envi br0 icmp
tcpdump: listening on br0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:11:10.008758 06:a4:e8:d4:db:20 > 8a:88:57:79:f6:97, ethertype IPv4 (0x0800), length 42: (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.0.2.30 > 192.0.2.10: ICMP echo request, id 2222, seq 42, length 8
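The two ranges used in the set and in update_set.json above are simply the
complement of the single allowed address. A minimal Python sketch for
regenerating that file for a different address might look like the following.
The helper names nomatch_ranges, is_blocked and build_update are made up for
this example; the JSON layout is copied from the reproducer, not derived from
any other nft documentation.

```python
import ipaddress
import json

FIRST = ipaddress.IPv4Address("0.0.0.0")
LAST = ipaddress.IPv4Address("255.255.255.255")


def nomatch_ranges(allowed):
    """Inclusive (start, end) ranges covering every IPv4 address except `allowed`."""
    addr = ipaddress.IPv4Address(allowed)
    ranges = []
    if addr > FIRST:
        ranges.append((FIRST, addr - 1))
    if addr < LAST:
        ranges.append((addr + 1, LAST))
    return ranges


def is_blocked(src, ranges):
    """True if `src` falls into one of the ranges, i.e. the rule would drop it."""
    addr = ipaddress.IPv4Address(src)
    return any(start <= addr <= end for start, end in ranges)


def build_update(allowed, family="bridge", table="west", name="east-ip-nomatch"):
    """Build the add-set / flush-set / add-element document used by the reproducer."""
    ref = {"family": family, "table": table, "name": name}
    elems = [{"range": [str(s), str(e)]} for s, e in nomatch_ranges(allowed)]
    return {
        "nftables": [
            {"add": {"set": {**ref, "type": "ipv4_addr", "flags": ["interval"]}}},
            {"flush": {"set": dict(ref)}},
            {"add": {"element": {**ref, "elem": elems}}},
        ]
    }


if __name__ == "__main__":
    # Write the output to update_set.json and feed it to `nft -j -f` as above.
    print(json.dumps(build_update("192.0.2.20"), indent=2))
```

With 192.0.2.20 as the allowed address, is_blocked reports the spoofed source
192.0.2.30 from the scapy command as covered by the set, so every such packet
showing up in tcpdump is one that slipped through.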
pve-kernel:
Gabriel Goller (5):
kernel: backport: netfilter: nft_set_pipapo: don't check genbit from
packetpath lookups
kernel: backport: netfilter: nft_set_rbtree: continue traversal if
element is inactive
kernel: backport: netfilter: nf_tables: place base_seq in struct net
kernel: backport: netfilter: nf_tables: make nft_set_do_lookup
available unconditionally
kernel: backport: netfilter: nf_tables: restart set lookup on base_seq
change
...t_pipapo-don-t-check-genbit-from-pac.patch | 160 +++++++++
...t_rbtree-continue-traversal-if-eleme.patch | 88 +++++
..._tables-place-base_seq-in-struct-net.patch | 310 ++++++++++++++++++
...les-make-nft_set_do_lookup-available.patch | 86 +++++
...les-restart-set-lookup-on-base_seq-c.patch | 148 +++++++++
5 files changed, 792 insertions(+)
create mode 100644 patches/kernel/0014-netfilter-nft_set_pipapo-don-t-check-genbit-from-pac.patch
create mode 100644 patches/kernel/0015-netfilter-nft_set_rbtree-continue-traversal-if-eleme.patch
create mode 100644 patches/kernel/0016-netfilter-nf_tables-place-base_seq-in-struct-net.patch
create mode 100644 patches/kernel/0017-netfilter-nf_tables-make-nft_set_do_lookup-available.patch
create mode 100644 patches/kernel/0018-netfilter-nf_tables-restart-set-lookup-on-base_seq-c.patch
Summary over all repositories:
5 files changed, 792 insertions(+), 0 deletions(-)
--
Generated by git-murpp 0.8.0