From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gabriel Goller
To: pve-devel@lists.proxmox.com
Subject: [PATCH frr 1/2] frr: backport #21166 and #21958, fixing EVPN IPv4 routes with IPv6 nexthop
Date: Fri, 15 May 2026 17:23:56 +0200
Message-ID: <20260515152400.726794-2-g.goller@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260515152400.726794-1-g.goller@proxmox.com>
References: <20260515152400.726794-1-g.goller@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

When leaking EVPN routes with an IPv4 prefix and an IPv6 nexthop (e.g.
on IPv6 VTEPs), the routes in the destination VRF end up with a nexthop
of 0.0.0.0. This is because the EVPN AF in bgpd sets the
BGP_ATTR_NEXT_HOP flag, which means only the legacy BGP next-hop
attribute is checked and not the BGP MP (multiprotocol, BGP-4) next-hop,
which is the one that contains the IPv6 address. So bgpd just makes up
an IPv4 address and sends it to zebra.

Some changes were made in a previous commit [1], but this particular
issue wasn't fixed there, so I upstreamed the change [2].

[1]: https://github.com/FRRouting/frr/pull/21166
[2]: https://github.com/FRRouting/frr/pull/21958

Signed-off-by: Gabriel Goller
---
 debian/patches/series                         |   2 +
 ...R_NEXT_HOP-flag-handling-in-bgp_attr.patch | 149 ++++++++++++++++++
 ...v6-nexthops-when-importing-EVPN-IPv4.patch | 107 +++++++++++++
 3 files changed, 258 insertions(+)
 create mode 100644 debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
 create mode 100644 debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch

diff --git a/debian/patches/series b/debian/patches/series
index fed297922f2d..51b5fe2f29f4 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,6 +2,8 @@ upstream/0001-bgpd-fix-EVPN-VRF-auto-RT-deletion-collision.patch
 upstream/0002-bgpd-export-local-rt2-mac-ip-entries-to-unicast.patch
 upstream/0003-bgpd-do-not-add-local-vtep-as-remote.patch
 upstream/0004-topotests-add-bgp_evpn_rt2_local_leak.patch
+upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
+upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
 pve/0001-enable-bgp-bfd-daemons.patch
 pve/0002-bgpd-add-an-option-for-RT-auto-derivation-to-force-A.patch
 pve/0003-tests-add-bgp-evpn-autort-test.patch
diff --git a/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch b/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
new file mode 100644
index 000000000000..290afb92eb17
--- /dev/null
+++ b/debian/patches/upstream/0005-bgpd-fix-BGP_ATTR_NEXT_HOP-flag-handling-in-bgp_attr.patch
@@ -0,0 +1,149 @@
+From c8bf184649db651a7cee4e9509ace06aeabf79d6 Mon Sep 17 00:00:00 2001
+From: Enke Chen
+Date: Mon, 16 Mar 2026 14:27:07 -0700
+Subject: [PATCH 1/2] bgpd: fix BGP_ATTR_NEXT_HOP flag handling in
+ bgp_attr_default_set()
+
+bgp_attr_default_set() unconditionally set the BGP_ATTR_NEXT_HOP flag
+on every call, even though attr.nexthop (the IPv4 address field) is
+all-zeros and not yet assigned. This flag is used by
+BGP_ATTR_NEXTHOP_AFI_IP6 to distinguish IPv4 vs IPv6 nexthops, so
+having it always set caused non-IPv4 routes to be misidentified.
+Callers were working around this by manually calling UNSET_FLAG for
+non-IPv4 cases, which was fragile and error-prone.
+
+Remove the unconditional flag from bgp_attr_default_set() and enforce
+the invariant that BGP_ATTR_NEXT_HOP is set where and only where
+attr.nexthop is assigned as an actual IPv4 nexthop:
+
+- bgp_evpn_vtep_ip_to_attr_nh(): set the flag alongside attr->nexthop
+  for IPv4 VTEPs, covering all EVPN call sites through this helper.
+- bgp_evpn_fill_rmac_nh_to_attr(): set the flag in both IPv4 nexthop
+  assignment paths (anycast-IP and PIP).
+- bgp_static_update(): set the flag explicitly for AFI_IP; remove the
+  UNSET_FLAG workaround from the else branch.
+- bgp_redistribute_add(): set the flag in all three IPv4 nexthop cases
+  (NEXTHOP_TYPE_IFINDEX/IPv4, NEXTHOP_TYPE_IPV4[_IFINDEX],
+  NEXTHOP_TYPE_BLACKHOLE/IPv4); remove the blanket UNSET_FLAG workaround.
+- subgroup_default_originate(): set the flag for the IPv4
+  default-originate path.
+
+Signed-off-by: Enke Chen
+---
+ bgpd/bgp_attr.c       |  1 -
+ bgpd/bgp_evpn.c       |  2 ++
+ bgpd/bgp_evpn_mh.c    |  1 +
+ bgpd/bgp_route.c      | 10 ++++++----
+ bgpd/bgp_updgrp_adv.c |  2 ++
+ 5 files changed, 11 insertions(+), 5 deletions(-)
+
+diff --git a/bgpd/bgp_attr.c b/bgpd/bgp_attr.c
+index 09d4948ab866..afe23a07a054 100644
+--- a/bgpd/bgp_attr.c
++++ b/bgpd/bgp_attr.c
+@@ -1396,7 +1396,6 @@ struct attr *bgp_attr_default_set(struct attr *attr, struct bgp *bgp,
+ 	attr->tag = 0;
+ 	attr->label_index = BGP_INVALID_LABEL_INDEX;
+ 	attr->label = MPLS_INVALID_LABEL;
+-	bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ 	attr->mp_nexthop_len = IPV6_MAX_BYTELEN;
+ 	attr->local_pref = bgp->default_local_pref;
+ 
+diff --git a/bgpd/bgp_evpn.c b/bgpd/bgp_evpn.c
+index 8e3569b54419..0b0eb1d623cd 100644
+--- a/bgpd/bgp_evpn.c
++++ b/bgpd/bgp_evpn.c
+@@ -8429,6 +8429,7 @@ void bgp_evpn_fill_rmac_nh_to_attr(struct bgp *bgp_vrf, struct attr *attr, struc
+ 		attr->nexthop = bgp_vrf->originator_ip.ipaddr_v4;
+ 		attr->mp_nexthop_global_in = bgp_vrf->originator_ip.ipaddr_v4;
+ 		attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
++		bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ 	} else {
+ 		IPV6_ADDR_COPY(&attr->mp_nexthop_global, &bgp_vrf->originator_ip.ipaddr_v6);
+ 		attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL;
+@@ -8449,6 +8450,7 @@ void bgp_evpn_fill_rmac_nh_to_attr(struct bgp *bgp_vrf, struct attr *attr, struc
+ 		if (bgp_vrf->evpn_info->pip_ip.ipaddr_v4.s_addr != INADDR_ANY) {
+ 			attr->nexthop = bgp_vrf->evpn_info->pip_ip.ipaddr_v4;
+ 			attr->mp_nexthop_global_in = bgp_vrf->evpn_info->pip_ip.ipaddr_v4;
++			bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ 		} else if (bgp_vrf->evpn_info->pip_ip.ipaddr_v4.s_addr == INADDR_ANY) {
+ 			if (bgp_debug_zebra(NULL))
+ 				zlog_debug("VRF %s evp %pFX advertise-pip primary ip is not configured",
+diff --git a/bgpd/bgp_evpn_mh.c b/bgpd/bgp_evpn_mh.c
+index f79b65c69a97..fa3e60dde759 100644
+--- a/bgpd/bgp_evpn_mh.c
++++ b/bgpd/bgp_evpn_mh.c
+@@ -100,6 +100,7 @@ void bgp_evpn_vtep_ip_to_attr_nh(const struct ipaddr *vtep_ip, struct attr *attr
+ 		attr->nexthop = vtep_ip->ipaddr_v4;
+ 		attr->mp_nexthop_global_in = vtep_ip->ipaddr_v4;
+ 		attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
++		bgp_attr_set(attr, BGP_ATTR_NEXT_HOP);
+ 	} else if (IS_IPADDR_V6(vtep_ip)) {
+ 		IPV6_ADDR_COPY(&attr->mp_nexthop_global, &vtep_ip->ipaddr_v6);
+ 		attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL;
+diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
+index ddbd24d9aafb..0a7fb527dce7 100644
+--- a/bgpd/bgp_route.c
++++ b/bgpd/bgp_route.c
+@@ -8267,8 +8267,10 @@ void bgp_static_update(struct bgp *bgp, const struct prefix *p,
+ 
+ 	bgp_attr_default_set(&attr, bgp, BGP_ORIGIN_IGP);
+ 
+-	if (afi == AFI_IP)
++	if (afi == AFI_IP) {
+ 		nh_length = IPV4_MAX_BYTELEN;
++		bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
++	}
+ 
+ 	/* NHC */
+ 	nhc = XCALLOC(MTYPE_BGP_NHC, sizeof(struct bgp_nhc));
+@@ -10575,9 +10577,6 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ 	 */
+ 	assert(attr.aspath);
+ 
+-	if (p->family == AF_INET6)
+-		UNSET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+-
+ 	switch (nhtype) {
+ 	case NEXTHOP_TYPE_IFINDEX:
+ 		switch (p->family) {
+@@ -10585,6 +10584,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ 			attr.nexthop.s_addr = INADDR_ANY;
+ 			attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ 			attr.mp_nexthop_global_in.s_addr = INADDR_ANY;
++			bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ 			break;
+ 		case AF_INET6:
+ 			memset(&attr.mp_nexthop_global, 0,
+@@ -10598,6 +10598,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ 		attr.nexthop = nexthop->ipv4;
+ 		attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ 		attr.mp_nexthop_global_in = nexthop->ipv4;
++		bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ 		break;
+ 	case NEXTHOP_TYPE_IPV6:
+ 	case NEXTHOP_TYPE_IPV6_IFINDEX:
+@@ -10610,6 +10611,7 @@ void bgp_redistribute_add(struct bgp *bgp, struct prefix *p,
+ 			attr.nexthop.s_addr = INADDR_ANY;
+ 			attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV4;
+ 			attr.mp_nexthop_global_in.s_addr = INADDR_ANY;
++			bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ 			break;
+ 		case AF_INET6:
+ 			memset(&attr.mp_nexthop_global, 0,
+diff --git a/bgpd/bgp_updgrp_adv.c b/bgpd/bgp_updgrp_adv.c
+index 07b532e2324c..9947948c995e 100644
+--- a/bgpd/bgp_updgrp_adv.c
++++ b/bgpd/bgp_updgrp_adv.c
+@@ -987,6 +987,8 @@ void subgroup_default_originate(struct update_subgroup *subgrp, bool withdraw)
+ 		if (peer->shared_network
+ 		    && !IN6_IS_ADDR_UNSPECIFIED(&peer->nexthop.v6_local))
+ 			attr.mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL;
++	} else {
++		bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
+ 	}
+ 
+ 	if (peer->default_rmap[afi][safi].name) {
+-- 
+2.47.3
+
diff --git a/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch b/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
new file mode 100644
index 000000000000..ffa78c29f30d
--- /dev/null
+++ b/debian/patches/upstream/0006-bgpd-preserve-IPv6-nexthops-when-importing-EVPN-IPv4.patch
@@ -0,0 +1,107 @@
+From f512fac23368ddc1be4cdab95601410d907a8e92 Mon Sep 17 00:00:00 2001
+From: Gabriel Goller
+Date: Fri, 15 May 2026 16:04:25 +0200
+Subject: [PATCH 2/2] bgpd: preserve IPv6 nexthops when importing EVPN IPv4
+ routes
+
+When importing an EVPN route into a VRF unicast table,
+install_evpn_route_entry_in_vrf() converted every imported IPv4 route
+into a route with the legacy IPv4 NEXT_HOP attribute set:
+
+    attr.nexthop = attr.mp_nexthop_global_in;
+    SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+
+This is only valid when the imported EVPN nexthop is IPv4. With IPv6
+VTEPs we can get IPv4 prefixes with IPv6 nexthops and the route already
+has the real nexthop encoded in the MP nexthop fields. In that case
+setting BGP_ATTR_NEXT_HOP creates an inconsistent attribute: the route
+has an IPv6 MP nexthop, but is also marked as having a classic IPv4
+NEXT_HOP.
+
+This breaks code that uses BGP_ATTR_NEXTHOP_AFI_IP6() to determine
+the nexthop address family. BGP_ATTR_NEXTHOP_AFI_IP6() sees
+BGP_ATTR_NEXT_HOP and thinks this is a IPv4 route with a IPv4 nexthop
+even though mp_nexthop_len indicates an IPv6 nexthop. The result is that
+VRF import/leak drops the IPv6 nexthop and sends a 0.0.0.0 nexthop to
+zebra.
+
+Fix this by only assigning attr.nexthop and setting BGP_ATTR_NEXT_HOP
+when the imported EVPN route does not have an IPv6 MP nexthop. EVPN IPv4
+routes with IPv6 nexthops are left as MP-nexthop routes.
+
+This is related to the previous BGP_ATTR_NEXT_HOP cleanup (#21166) and
+was probably missed there.
+
+Also make the nexthop-change detection handle this case by comparing the
+MP IPv6 nexthop for IPv4 routes that carry one.
+
+Signed-off-by: Gabriel Goller
+---
+ bgpd/bgp_evpn.c | 36 ++++++++++++++++++++++--------------
+ 1 file changed, 22 insertions(+), 14 deletions(-)
+
+diff --git a/bgpd/bgp_evpn.c b/bgpd/bgp_evpn.c
+index 0b0eb1d623cd..b1de8948e4d3 100644
+--- a/bgpd/bgp_evpn.c
++++ b/bgpd/bgp_evpn.c
+@@ -3215,11 +3215,11 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ 	} else
+ 		return 0;
+ 
+-	/* EVPN routes currently only support a IPv4 next hop which corresponds
+-	 * to the remote VTEP. When importing into a VRF, if it is IPv6 host
+-	 * or prefix route, we have to convert the next hop to an IPv4-mapped
+-	 * address for the rest of the code to flow through. In the case of IPv4,
+-	 * make sure to set the flag for next hop attribute.
++	/* EVPN routes may carry either an IPv4 or IPv6 next hop corresponding
++	 * to the remote VTEP. When importing into a VRF, IPv6 host/prefix routes
++	 * use an IPv6 MP nexthop. For IPv4 routes, set the legacy NEXT_HOP
++	 * attribute only when the imported nexthop is IPv4; IPv6 nexthops are
++	 * preserved as MP nexthops.
+ 	 */
+ 	attr = *parent_pi->attr;
+ 	bre = bgp_attr_get_evpn_overlay(&attr);
+@@ -3245,11 +3245,13 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ 			SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+ 		}
+ 	} else {
+-		if (afi == AFI_IP6)
++		if (afi == AFI_IP) {
++			if (!BGP_ATTR_MP_NEXTHOP_LEN_IP6(&attr)) {
++				attr.nexthop = attr.mp_nexthop_global_in;
++				bgp_attr_set(&attr, BGP_ATTR_NEXT_HOP);
++			}
++		} else if (afi == AFI_IP6) {
+ 			evpn_convert_nexthop_to_ipv6(&attr);
+-		else {
+-			attr.nexthop = attr.mp_nexthop_global_in;
+-			SET_FLAG(attr.flag, ATTR_FLAG_BIT(BGP_ATTR_NEXT_HOP));
+ 		}
+ 	}
+ 
+@@ -3287,11 +3289,17 @@ static int install_evpn_route_entry_in_vrf(struct bgp *bgp_vrf,
+ 		bgp_path_info_restore(dest, pi);
+ 
+ 		/* Mark if nexthop has changed. */
+-		if ((afi == AFI_IP
+-		     && !IPV4_ADDR_SAME(&pi->attr->nexthop, &attr_new->nexthop))
+-		    || (afi == AFI_IP6
+-			&& !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
+-					   &attr_new->mp_nexthop_global)))
++		if (afi == AFI_IP) {
++			bool old_v6nh = BGP_ATTR_MP_NEXTHOP_LEN_IP6(pi->attr);
++			bool new_v6nh = BGP_ATTR_MP_NEXTHOP_LEN_IP6(attr_new);
++
++			if (old_v6nh != new_v6nh ||
++			    (old_v6nh && !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
++							 &attr_new->mp_nexthop_global)) ||
++			    (!old_v6nh && !IPV4_ADDR_SAME(&pi->attr->nexthop, &attr_new->nexthop)))
++				SET_FLAG(pi->flags, BGP_PATH_IGP_CHANGED);
++		} else if (afi == AFI_IP6 && !IPV6_ADDR_SAME(&pi->attr->mp_nexthop_global,
++							     &attr_new->mp_nexthop_global))
+ 			SET_FLAG(pi->flags, BGP_PATH_IGP_CHANGED);
+ 
+ 		bgp_path_info_set_flag(dest, pi, BGP_PATH_ATTR_CHANGED);
+-- 
+2.47.3
+
-- 
2.47.3
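
Illustrative note (not part of the patch): the flag interaction described
in the commit messages can be sketched as a small standalone C model. All
names here (toy_attr, TOY_FLAG_NEXT_HOP, toy_nexthop_is_v6, ...) are
hypothetical stand-ins and not FRR definitions. The family check only
assumes the behaviour stated in the commit message, namely that a set
legacy NEXT_HOP flag takes precedence over the MP nexthop length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name below is hypothetical, not an FRR definition. */
#define TOY_FLAG_NEXT_HOP 0x1u /* stands in for the BGP_ATTR_NEXT_HOP bit */
#define TOY_NHLEN_IPV4 4
#define TOY_NHLEN_IPV6_GLOBAL 16

struct toy_attr {
	uint32_t flag;                 /* attribute presence bits */
	uint32_t nexthop;              /* legacy IPv4 NEXT_HOP, 0 means 0.0.0.0 */
	uint32_t mp_nexthop_global_in; /* IPv4 MP nexthop, 0 when unused */
	uint8_t mp_nexthop_len;        /* length of the MP nexthop in bytes */
};

/* Assumed shape of the family check: the legacy flag wins over the MP length. */
static bool toy_nexthop_is_v6(const struct toy_attr *attr)
{
	if (attr->flag & TOY_FLAG_NEXT_HOP)
		return false;
	return attr->mp_nexthop_len == TOY_NHLEN_IPV6_GLOBAL;
}

/* Import of an IPv4 prefix before the fix: always mark a legacy nexthop. */
static void toy_import_ipv4_prefix_old(struct toy_attr *attr)
{
	attr->nexthop = attr->mp_nexthop_global_in;
	attr->flag |= TOY_FLAG_NEXT_HOP;
}

/* Import after the fix: leave routes with an IPv6 MP nexthop untouched. */
static void toy_import_ipv4_prefix_new(struct toy_attr *attr)
{
	if (attr->mp_nexthop_len != TOY_NHLEN_IPV6_GLOBAL) {
		attr->nexthop = attr->mp_nexthop_global_in;
		attr->flag |= TOY_FLAG_NEXT_HOP;
	}
}

/* Returns 0 when the toy model behaves as the commit message describes. */
int toy_demo(void)
{
	/* IPv4 prefix learned via an IPv6 VTEP: only the MP nexthop is valid. */
	struct toy_attr old_attr = { .mp_nexthop_len = TOY_NHLEN_IPV6_GLOBAL };
	struct toy_attr new_attr = old_attr;

	toy_import_ipv4_prefix_old(&old_attr);
	toy_import_ipv4_prefix_new(&new_attr);

	/* Old path: misclassified as IPv4 with a 0.0.0.0 nexthop. */
	assert(!toy_nexthop_is_v6(&old_attr));
	assert(old_attr.nexthop == 0);

	/* New path: still recognised as carrying an IPv6 nexthop. */
	assert(toy_nexthop_is_v6(&new_attr));
	return 0;
}
```

Calling toy_demo() from a test harness exercises both paths. In this model
a route whose MP nexthop is IPv4 is handled identically by both import
functions, which matches the intent that only the IPv6-nexthop case
changes.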