* [PATCH frr 1/2] frr: backport #21166 and #21958, fixing EVPN IPv4 routes with IPv6 nexthop
2026-05-15 15:23 [PATCH frr 0/2] Fix leaked EVPN routes having wrong nexthop on IPv4 via IPv6 routes Gabriel Goller
@ 2026-05-15 15:23 ` Gabriel Goller
2026-05-15 16:06 ` Gabriel Goller
2026-05-15 15:23 ` [PATCH frr 2/2] bump to version 10.6.1-1+pve2 Gabriel Goller
2026-05-16 23:59 ` [PATCH frr 0/2] Fix leaked EVPN routes having wrong nexthop on IPv4 via IPv6 routes Thomas Lamprecht
From: Gabriel Goller @ 2026-05-15 15:23 UTC
To: pve-devel
When leaking EVPN routes with an IPv4 prefix and an IPv6 nexthop (e.g.
with IPv6 VTEPs), the routes in the destination VRF end up with a
nexthop of 0.0.0.0. This happens because the EVPN VRF import code in
bgpd sets the BGP_ATTR_NEXT_HOP flag, so only the legacy BGP NEXT_HOP
attribute is consulted and not the MP (multiprotocol BGP) nexthop, which
is the one that holds the IPv6 address. bgpd therefore uses the unset
IPv4 nexthop (0.0.0.0) and sends it to zebra. An earlier upstream
change [1] cleaned up the BGP_ATTR_NEXT_HOP handling but did not cover
this case, so the remaining fix was submitted upstream as well [2].
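To illustrate the mechanism, here is a minimal, self-contained sketch.
It is not bgpd code: struct demo_attr, the DEMO_* constants and all
demo_* functions are hypothetical stand-ins that only model the idea
that a set NEXT_HOP flag takes precedence over the MP nexthop length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration, NOT the bgpd definitions. */
#define DEMO_FLAG_NEXT_HOP (1u << 0) /* models BGP_ATTR_NEXT_HOP being set */
#define DEMO_NHLEN_IPV4 4
#define DEMO_NHLEN_IPV6_GLOBAL 16

struct demo_attr {
	uint32_t flag;
	uint32_t nexthop_v4;    /* legacy NEXT_HOP; 0 models 0.0.0.0 */
	uint32_t mp_nexthop_v4; /* IPv4 address carried in the MP nexthop */
	uint8_t mp_nexthop_len; /* length of the MP nexthop in bytes */
};

/* True if the MP nexthop length says "IPv6 global address". */
static bool demo_mp_nexthop_is_v6(const struct demo_attr *a)
{
	assert(a != NULL);
	return a->mp_nexthop_len == DEMO_NHLEN_IPV6_GLOBAL;
}

/* Models the address-family check: a set NEXT_HOP flag wins, and the
 * MP nexthop length is only looked at when the flag is clear. */
static bool demo_nexthop_afi_is_v6(const struct demo_attr *a)
{
	assert(a != NULL);
	if (a->flag & DEMO_FLAG_NEXT_HOP)
		return false;
	return demo_mp_nexthop_is_v6(a);
}

/* Old import behaviour: always copy and always set the flag. */
static void demo_import_old(struct demo_attr *a)
{
	a->nexthop_v4 = a->mp_nexthop_v4;
	a->flag |= DEMO_FLAG_NEXT_HOP;
}

/* Fixed import behaviour: leave IPv6 MP nexthops untouched. */
static void demo_import_fixed(struct demo_attr *a)
{
	if (!demo_mp_nexthop_is_v6(a)) {
		a->nexthop_v4 = a->mp_nexthop_v4;
		a->flag |= DEMO_FLAG_NEXT_HOP;
	}
}
```

For an attribute with mp_nexthop_len set to DEMO_NHLEN_IPV6_GLOBAL,
demo_import_old() leaves nexthop_v4 at 0 and makes
demo_nexthop_afi_is_v6() return false, while demo_import_fixed() keeps
the route recognisable as having an IPv6 nexthop.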
[1]: https://github.com/FRRouting/frr/pull/21166
[2]: https://github.com/FRRouting/frr/pull/21958
Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
---
debian/patches/series | 2 +
...R_NEXT_HOP-flag-handling-in-bgp_attr.patch | 149 ++++++++++++++++++
...v6-nexthops-when-importing-EVPN-IPv4.patch | 107 +++++++++++++
3 files changed, 258 insertions(+)
create mode 100644 debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
create mode 100644 debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
diff --git a/debian/patches/series b/debian/patches/series
index fed297922f2d..51b5fe2f29f4 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,6 +2,8 @@ upstream/0001-bgpd-fix-EVPN-VRF-auto-RT-deletion-collision.patch
upstream/0002-bgpd-export-local-rt2-mac-ip-entries-to-unicast.patch
upstream/0003-bgpd-do-not-add-local-vtep-as-remote.patch
upstream/0004-topotests-add-bgp_evpn_rt2_local_leak.patch
+upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
+upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
pve/0001-enable-bgp-bfd-daemons.patch
pve/0002-bgpd-add-an-option-for-RT-auto-derivation-to-force-A.patch
pve/0003-tests-add-bgp-evpn-autort-test.patch
diff --git a/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch b/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
new file mode 100644
index 000000000000..290afb92eb17
--- /dev/null
+++ b/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
@@ -0,0 +1,149 @@
+From c8bf184649db651a7cee4e9509ace06aeabf79d6 Mon Sep 17 00:00:00 2001
+From: Enke Chen <enchen@paloaltonetworks.com>
+Date: Mon, 16 Mar 2026 14:27:07 -0700
+Subject: [PATCH 1/2] bgpd: fix BGP_ATTR_NEXT_HOP flag handling in
+ bgp_attr_default_set()
+
+bgp_attr_default_set() unconditionally set the BGP_ATTR_NEXT_HOP flag
+on every call, even though attr.nexthop (the IPv4 address field) is
+all-zeros and not yet assigned. This flag is used by
+BGP_ATTR_NEXTHOP_AFI_IP6 to distinguish IPv4 vs IPv6 nexthops, so
+having it always set caused non-IPv4 routes to be misidentified.
+Callers were working around this by manually calling UNSET_FLAG for
+non-IPv4 cases, which was fragile and error-prone.
+
+Remove the unconditional flag from bgp_attr_default_set() and enforce
+the invariant that BGP_ATTR_NEXT_HOP is set where and only where
+attr.nexthop is assigned as an actual IPv4 nexthop:
+
+- bgp_evpn_vtep_ip_to_attr_nh(): set the flag alongside attr->nexthop
+ for IPv4 VTEPs, covering all EVPN call sites through this helper.
+- bgp_evpn_fill_rmac_nh_to_attr(): set the flag in both IPv4 nexthop
+ assignment paths (anycast-IP and PIP).
+- bgp_static_update(): set the flag explicitly for AFI_IP; remove the
+ UNSET_FLAG workaround from the else branch.
+- bgp_redistribute_add(): set the flag in all three IPv4 nexthop cases
+ (NEXTHOP_TYPE_IFINDEX/IPv4, NEXTHOP_TYPE_IPV4[_IFINDEX],
+ NEXTHOP_TYPE_BLACKHOLE/IPv4); remove the blanket UNSET_FLAG workaround.
+- subgroup_default_originate(): set the flag for the IPv4
+ default-originate path.
+
+Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
+---
+ bgpd/bgp_attr.c | 1 -
+ bgpd/bgp_evpn.c | 2 ++
+ bgpd/bgp_evpn_mh.c | 1 +
+ bgpd/bgp_route.c | 10 ++++++----
+ bgpd/bgp_updgrp_adv.c | 2 ++
+ 5 files changed, 11 insertions(+), 5 deletions(-)
+
+diff --git a/bgpd/bgp_attr.c b/bgpd/bgp_attr.c
+index 09d4948ab866..afe23a07a054 100644
+--- a/bgpd/bgp_attr.c
++++ b/bgpd/bgp_attr.c
+@@ -1396,7 +1396,6 @@ struct attr *bgp_attr_default_set(struct attr *attr, struct bgp *bgp,
+ attr->tag = 0;
+ attr->label_index = BGP_INVALID_LABEL_INDEX;
+ attr->label = MPLS_INVALID_LABEL;
+- bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ attr->mp_nexthop_len = IPV6_MAX_BYTELEN;
+ attr->local_pref = bgp->default_local_pref;
+
+diff --git a/bgpd/bgp_evpn.c b/bgpd/bgp_evpn.c
+index 8e3569b54419..0b0eb1d623cd 100644
+--- a/bgpd/bgp_evpn.c
++++ b/bgpd/bgp_evpn.c
+@@ -8429,6 +8429,7 @@ void bgp_evpn_fill_rmac_nh_to_attr(struct bgp *bgp_vrf, struct attr *attr, struc
+ attr->nexthop = bgp_vrf->originator_ip.ipaddr_v4;
+ attr->mp_nexthop_global_in = bgp_vrf->originator_ip.ipaddr_v4;
+ attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
++ bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ } else {
+ IPV6_ADDR_COPY(&attr->mp_nexthop_global, &bgp_vrf->originator_ip.ipaddr_v6);
+ attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL;
+@@ -8449,6 +8450,7 @@ void bgp_evpn_fill_rmac_nh_to_attr(struct bgp *bgp_vrf, struct attr *attr, struc
+ if (bgp_vrf->evpn_info->pip_ip.ipaddr_v4.s_addr != INADDR_ANY) {
+ attr->nexthop = bgp_vrf->evpn_info->pip_ip.ipaddr_v4;
+ attr->mp_nexthop_global_in = bgp_vrf->evpn_info->pip_ip.ipaddr_v4;
++ bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ } else if (bgp_vrf->evpn_info->pip_ip.ipaddr_v4.s_addr == INADDR_ANY) {
+ if (bgp_debug_zebra(NULL))
+ zlog_debug("VRF %s evp %pFX advertise-pip primary ip is not configured",
+diff --git a/bgpd/bgp_evpn_mh.c b/bgpd/bgp_evpn_mh.c
+index f79b65c69a97..fa3e60dde759 100644
+--- a/bgpd/bgp_evpn_mh.c
++++ b/bgpd/bgp_evpn_mh.c
+@@ -100,6 +100,7 @@ void bgp_evpn_vtep_ip_to_attr_nh(const struct ipaddr *vtep_ip, struct attr *attr
+ attr->nexthop = vtep_ip->ipaddr_v4;
+ attr->mp_nexthop_global_in = vtep_ip->ipaddr_v4;
+ attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
++ bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ } else if (IS_IPADDR_V6(vtep_ip)) {
+ IPV6_ADDR_COPY(&attr->mp_nexthop_global, &vtep_ip->ipaddr_v6);
+ attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL;
+diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
+index ddbd24d9aafb..0a7fb527dce7 100644
+--- a/bgpd/bgp_route.c
++++ b/bgpd/bgp_route.c
+@@ -8267,8 +8267,10 @@ void bgp_static_update(struct bgp *bgp, const struct prefix *p,
+
+ bgp_attr_default_set(&attr, bgp, BGP_ORIGIN_IGP);
+
+- if (afi == AFI_IP)
++ if (afi == AFI_IP) {
+ nh_length = IPV4_MAX_BYTELEN;
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
++ }
+
+ /* NHC */
+ nhc = XCALLOC(MTYPE_BGP_NHC, sizeof(struct bgp_nhc));
+@@ -10575,9 +10577,6 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ */
+ assert(attr.aspath);
+
+- if (p->family == AF_INET6)
+- UNSET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+-
+ switch (nhtype) {
+ case NEXTHOP_TYPE_IFINDEX:
+ switch (p->family) {
+@@ -10585,6 +10584,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ attr.nexthop.s_addr = INADDR_ANY;
+ attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ attr.mp_nexthop_global_in.s_addr = INADDR_ANY;
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ break;
+ case AF_INET6:
+ memset(&attr.mp_nexthop_global, 0,
+@@ -10598,6 +10598,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ attr.nexthop = nexthop->ipv4;
+ attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ attr.mp_nexthop_global_in = nexthop->ipv4;
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ break;
+ case NEXTHOP_TYPE_IPV6:
+ case NEXTHOP_TYPE_IPV6_IFINDEX:
+@@ -10610,6 +10611,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ attr.nexthop.s_addr = INADDR_ANY;
+ attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ attr.mp_nexthop_global_in.s_addr = INADDR_ANY;
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ break;
+ case AF_INET6:
+ memset(&attr.mp_nexthop_global, 0,
+diff --git a/bgpd/bgp_updgrp_adv.c b/bgpd/bgp_updgrp_adv.c
+index 07b532e2324c..9947948c995e 100644
+--- a/bgpd/bgp_updgrp_adv.c
++++ b/bgpd/bgp_updgrp_adv.c
+@@ -987,6 +987,8 @@ void subgroup_default_originate(struct update_subgroup *subgrp, bool withdraw)
+ if (peer->shared_network
+ && !IN6_IS_ADDR_UNSPECIFIED(&peer->nexthop.v6_local))
+ attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL;
++ } else {
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ }
+
+ if (peer->default_rmap[afi][safi].name) {
+--
+2.47.3
+
diff --git a/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch b/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
new file mode 100644
index 000000000000..ffa78c29f30d
--- /dev/null
+++ b/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
@@ -0,0 +1,107 @@
+From f512fac23368ddc1be4cdab95601410d907a8e92 Mon Sep 17 00:00:00 2001
+From: Gabriel Goller <g.goller@proxmox.com>
+Date: Fri, 15 May 2026 16:04:25 +0200
+Subject: [PATCH 2/2] bgpd: preserve IPv6 nexthops when importing EVPN IPv4
+ routes
+
+When importing an EVPN route into a VRF unicast table,
+install_evpn_route_entry_in_vrf() converted every imported IPv4 route
+into a route with the legacy IPv4 NEXT_HOP attribute set:
+
+ attr.nexthop = attr.mp_nexthop_global_in;
+ SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+
+This is only valid when the imported EVPN nexthop is IPv4. With IPv6
+VTEPs we can get IPv4 prefixes with IPv6 nexthops and the route already
+has the real nexthop encoded in the MP nexthop fields. In that case
+setting BGP_ATTR_NEXT_HOP creates an inconsistent attribute: the route
+has an IPv6 MP nexthop, but is also marked as having a classic IPv4
+NEXT_HOP.
+
+This breaks code that uses BGP_ATTR_NEXTHOP_AFI_IP6() to determine
+the nexthop address family. BGP_ATTR_NEXTHOP_AFI_IP6() sees
+BGP_ATTR_NEXT_HOP and thinks this is a IPv4 route with a IPv4 nexthop
+even though mp_nexthop_len indicates an IPv6 nexthop. The result is that
+VRF import/leak drops the IPv6 nexthop and sends a 0.0.0.0 nexthop to
+zebra.
+
+Fix this by only assigning attr.nexthop and setting BGP_ATTR_NEXT_HOP
+when the imported EVPN route does not have an IPv6 MP nexthop. EVPN IPv4
+routes with IPv6 nexthops are left as MP-nexthop routes.
+
+This is related to the previous BGP_ATTR_NEXT_HOP cleanup (#21166) and
+was probably missed there.
+
+Also make the nexthop-change detection handle this case by comparing the
+MP IPv6 nexthop for IPv4 routes that carry one.
+
+Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
+---
+ bgpd/bgp_evpn.c | 36 ++++++++++++++++++++++--------------
+ 1 file changed, 22 insertions(+), 14 deletions(-)
+
+diff --git a/bgpd/bgp_evpn.c b/bgpd/bgp_evpn.c
+index 0b0eb1d623cd..b1de8948e4d3 100644
+--- a/bgpd/bgp_evpn.c
++++ b/bgpd/bgp_evpn.c
+@@ -3215,11 +3215,11 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ } else
+ return 0;
+
+- /* EVPN routes currently only support a IPv4 next hop which corresponds
+- * to the remote VTEP. When importing into a VRF, if it is IPv6 host
+- * or prefix route, we have to convert the next hop to an IPv4-mapped
+- * address for the rest of the code to flow through. In the case of IPv4,
+- * make sure to set the flag for next hop attribute.
++ /* EVPN routes may carry either an IPv4 or IPv6 next hop corresponding
++ * to the remote VTEP. When importing into a VRF, IPv6 host/prefix routes
++ * use an IPv6 MP nexthop. For IPv4 routes, set the legacy NEXT_HOP
++ * attribute only when the imported nexthop is IPv4; IPv6 nexthops are
++ * preserved as MP nexthops.
+ */
+ attr = *parent_pi->attr;
+ bre = bgp_attr_get_evpn_overlay(&attr);
+@@ -3245,11 +3245,13 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+ }
+ } else {
+- if (afi == AFI_IP6)
++ if (afi == AFI_IP) {
++ if (!BGP_ATTR_MP_NEXTHOP_LEN_IP6(&attr)) {
++ attr.nexthop = attr.mp_nexthop_global_in;
++ bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
++ }
++ } else if (afi == AFI_IP6) {
+ evpn_convert_nexthop_to_ipv6(&attr);
+- else {
+- attr.nexthop = attr.mp_nexthop_global_in;
+- SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+ }
+ }
+
+@@ -3287,11 +3289,17 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ bgp_path_info_restore(dest, pi);
+
+ /* Mark if nexthop has changed. */
+- if ((afi == AFI_IP
+- && !IPV4_ADDR_SAME(&pi->attr->nexthop, &attr_new->nexthop))
+- || (afi == AFI_IP6
+- && !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
+- &attr_new->mp_nexthop_global)))
++ if (afi == AFI_IP) {
++ bool old_v6nh = BGP_ATTR_MP_NEXTHOP_LEN_IP6(pi->attr);
++ bool new_v6nh = BGP_ATTR_MP_NEXTHOP_LEN_IP6(attr_new);
++
++ if (old_v6nh != new_v6nh ||
++ (old_v6nh && !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
++ &attr_new->mp_nexthop_global)) ||
++ (!old_v6nh && !IPV4_ADDR_SAME(&pi->attr->nexthop, &attr_new->nexthop)))
++ SET_FLAG(pi->flags, BGP_PATH_IGP_CHANGED);
++ } else if (afi == AFI_IP6 && !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
++ &attr_new->mp_nexthop_global))
+ SET_FLAG(pi->flags, BGP_PATH_IGP_CHANGED);
+
+ bgp_path_info_set_flag(dest, pi, BGP_PATH_ATTR_CHANGED);
+--
+2.47.3
+
--
2.47.3