From: Thomas Lamprecht
To: Proxmox VE user list, Stoiko Ivanov, Mark Schouten
Date: Tue, 29 Jun 2021 16:14:50 +0200
Subject: Re: [PVE-User] Proxmox VE 7.0 (beta) released!
In-Reply-To: <20210629153111.2a0fbc28@rosa.proxmox.com>

On 29.06.21 15:31, Stoiko Ivanov wrote:
> On Tue, 29 Jun 2021 14:04:05 +0200
> Mark Schouten wrote:
>
>> Hi,
>>
>> On 29-06-2021 at 12:31, Thomas Lamprecht wrote:
>>>> I do not completely understand why that fixes it though. Commenting
>>>> out MACAddressPolicy=persistent helps, but why?
>>>>
>>>
>>> Because duplicate MAC addresses are not ideal, to say the least?
>>
>> That I understand. :)
>>
>> But, the cluster interface works when bridge_vlan_aware is off,
>> regardless of the MacAddressPolicy setting.
>>
>
> We managed to find a reproducer - my current guess is that it might have
> something to do with intel NIC drivers or some changes in ifupdown2 (or
> udev, or in their interaction ;) - Sadly if tcpdump fixes the issues, it
> makes debugging quite hard :)

The issue is that the kernel has always (since close to forever) cleared
promisc mode on a bridge port in the `br_manage_promisc` function when
there was either no port, or exactly one port, with flood or learning
enabled.

Further, toggling VLAN-aware calls the aforementioned `br_manage_promisc`
from `br_vlan_filter_toggle`.

So, why does this break now? I really do not think it's due to some
driver-specific stuff; that's not impossible, but the following sounds
like a better explanation for the "why now":

Previously the MAC address of the bridge was the same as the one of the
single port, so it didn't matter too much whether promisc was enabled on
the single port itself; the bridge could accept the packets.
But now, with the systemd default MACAddressPolicy "persistent" also
applying to bridges, the bridge gets a different MAC than the port, which
means that promisc being disabled on that port matters quite a bit more.

So turning vlan-aware on "breaks" it by accident, as a br_manage_promisc
call is then made at a time where the "clear promisc for port" logic
triggers, so it is rather a side effect than a real cause.

I'm quite tempted to drop the br_auto_port special case for the
single-port case in the kernel as a fix, but I need to think about this,
and will probably send that to LKML first to poke for some comments...
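
To make the sequence above a bit more concrete, here is a minimal Python sketch of the reasoning. It is a toy model under stated assumptions, not the kernel code: the `Port` and `Bridge` classes, the `manage_promisc` and `port_accepts` helpers, the interface name and the MAC addresses are all made up for illustration. It also assumes that ports are simply put into promisc mode whenever VLAN filtering is off or the bridge itself is promisc, and that a non-promisc port only lets through frames addressed to its own MAC, which ignores everything else a real NIC filter or the bridge code may do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    name: str
    mac: str
    auto: bool = True       # flood or learning enabled (what br_auto_port checks)
    promisc: bool = False


@dataclass
class Bridge:
    mac: str
    vlan_aware: bool = False
    promisc: bool = False   # promisc flag on the bridge device itself
    ports: List[Port] = field(default_factory=list)


def manage_promisc(br: Bridge) -> None:
    """Simplified model of the decision described for br_manage_promisc."""
    auto_cnt = sum(1 for p in br.ports if p.auto)
    # Assumption: with VLAN filtering off (or a promisc bridge) every port
    # is simply put into promisc mode.
    set_all = br.promisc or not br.vlan_aware
    for p in br.ports:
        if set_all:
            p.promisc = True
        elif auto_cnt == 0 or (auto_cnt == 1 and p.auto):
            # The "no port or exactly one auto port" special case.
            p.promisc = False
        else:
            p.promisc = True


def port_accepts(port: Port, dst_mac: str) -> bool:
    """Toy filter: a non-promisc port only accepts frames for its own MAC."""
    return port.promisc or dst_mac.lower() == port.mac.lower()


if __name__ == "__main__":
    nic_mac = "aa:aa:aa:aa:aa:01"  # hypothetical addresses

    # Old behaviour: the bridge inherits the MAC of its single port.
    old = Bridge(mac=nic_mac, vlan_aware=True, ports=[Port("eno1", nic_mac)])
    manage_promisc(old)
    print("same MAC, vlan-aware on:", port_accepts(old.ports[0], old.mac))

    # New behaviour: MACAddressPolicy=persistent gives the bridge its own MAC.
    new = Bridge(mac="aa:aa:aa:aa:aa:02", vlan_aware=True,
                 ports=[Port("eno1", nic_mac)])
    manage_promisc(new)
    print("different MAC, vlan-aware on:", port_accepts(new.ports[0], new.mac))

    new.vlan_aware = False
    manage_promisc(new)
    print("different MAC, vlan-aware off:", port_accepts(new.ports[0], new.mac))
```

In this model only the middle case drops traffic addressed to the bridge: a single auto port, vlan-aware on, and a bridge MAC that differs from the port MAC. That matches the observation that the setup works with vlan-aware off regardless of the MAC policy.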