From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <josef@oderland.se>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id D64F7755C4
 for <pve-devel@lists.proxmox.com>; Wed, 13 Oct 2021 15:53:48 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id C8559127B2
 for <pve-devel@lists.proxmox.com>; Wed, 13 Oct 2021 15:53:48 +0200 (CEST)
Received: from office.oderland.com (office.oderland.com [91.201.60.5])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 98674127A5
 for <pve-devel@lists.proxmox.com>; Wed, 13 Oct 2021 15:53:47 +0200 (CEST)
Received: from [193.180.18.161] (port=38884 helo=[10.137.0.14])
 by office.oderland.com with esmtpsa (TLS1.2) tls
 TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2)
 (envelope-from <josef@oderland.se>) id 1maehR-002J4F-31
 for pve-devel@lists.proxmox.com; Wed, 13 Oct 2021 15:53:41 +0200
Message-ID: <a8c84731-a2cd-1b63-48ef-0c8fa72c8818@oderland.se>
Date: Wed, 13 Oct 2021 15:53:40 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:93.0) Gecko/20100101
 Thunderbird/93.0
Content-Language: en-US
To: pve-devel@lists.proxmox.com
References: <2b417bee43cb4484bcba66afc6076113@velartis.at>
 <093EC041-0E5D-41F2-99C9-CF8A5E767313@marinov.us>
 <c7e6443e731f430380d57f465d10dadc@velartis.at>
 <4F0DFA30-F1ED-4322-857A-4F4C24B463FE@marinov.us>
 <1FAB115F-FD40-41E1-AC81-A781DA29B378@marinov.us>
 <190901a568da4ce3a4553e6d929e6828@velartis.at>
 <04e7ef9a-2054-d929-fd1d-cf5f63047816@oderland.se>
 <1b968b67edec4c3783db1ee568372e65@velartis.at>
From: Josef Johansson <josef@oderland.se>
In-Reply-To: <1b968b67edec4c3783db1ee568372e65@velartis.at>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - office.oderland.com
X-AntiAbuse: Original Domain - lists.proxmox.com
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - oderland.se
X-Get-Message-Sender-Via: office.oderland.com: authenticated_id:
 josjoh@oderland.se
X-Authenticated-Sender: office.oderland.com: josjoh@oderland.se
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.812 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_ASCII_DIVIDERS        0.8 Spam that uses ascii formatting tricks
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [proxmox.com]
Subject: Re: [pve-devel] BUG in vlan aware bridge
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 13 Oct 2021 13:53:48 -0000


Kind regards
Josef Johansson

On 10/13/21 15:47, VELARTIS Philipp Dürhammer wrote:
>>> As a data point, I could ping fine from an MTU 1500 host, over MTU 9000 VLAN-aware bridges with firewalls, to another MTU 1500 host.
>>> As you would expect, the packet is reassembled on MTU 9000 links and fragmented again on MTU 1500 devices.
> So you did a ping with -s 2000 (or bigger) and your tap device is vlan tagged from the vm where you ping?
Oh right, I have to test that properly. I have it in a lab and will get
back to you once I've tested it.
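
For what it's worth, the test I have in mind looks roughly like the
sketch below. It only prints the commands rather than running them.
capture_plan is just a helper name made up here, and tap100i0 and
192.0.2.10 are placeholders for the VM's tap device and the ping target,
so adjust them to the real setup. The tcpdump lines are meant for the
host, the ping line for inside the VM.

```shell
#!/bin/sh
# Sketch only: print the capture and ping commands for the test.
# capture_plan is a hypothetical helper; device names and the target
# address are placeholders, not values taken from this thread.
capture_plan() (
    target=$1
    shift
    for dev in "$@"; do
        # The filter matches ICMP both untagged and inside an 802.1Q tag.
        echo "tcpdump -e -n -i $dev 'icmp or (vlan and icmp)'"
    done
    # 2000-byte payload, as used earlier in the thread (run inside the VM).
    echo "ping -c 3 -s 2000 $target"
)

capture_plan 192.0.2.10 tap100i0 vmbr0 bond0
```

Capturing on the tap, the bridge and the bond at the same time should
show at which hop the packet stops being fragmented.
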
> -----Original Message-----
> From: pve-devel <pve-devel-bounces@lists.proxmox.com> On Behalf Of Josef Johansson
> Sent: Wednesday, 13 October 2021 13:37
> To: pve-devel@lists.proxmox.com
> Subject: Re: [pve-devel] BUG in vlan aware bridge
>
> Hi,
>
> AFAIK it's netfilter that reassembles (defragments) the packets so that it can firewall them.
>
> If you specify
>
> iptables -t raw -I PREROUTING -s 77.244.240.131 -j NOTRACK
>
> iptables -t raw -I PREROUTING -s 37.16.72.52 -j NOTRACK
>
> you should be able to make it ignore your packets.
>
>
> As a data point, I could ping fine from an MTU 1500 host, over MTU 9000 VLAN-aware bridges with firewalls, to another MTU 1500 host.
>
> As you would expect, the packet is reassembled on MTU 9000 links and fragmented again on MTU 1500 devices.
>
> Kind regards
> Josef Johansson
>
> On 10/13/21 11:22, VELARTIS Philipp Dürhammer wrote:
>> Hi,
>>
>>
>> Yes, I think it has nothing to do with the bonds, but rather with the VLAN-aware bridge interface.
>>
>> I see this with ping -s 1500
>>
>> On tap interface: 
>> 11:19:35.141414 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 39999, offset 0, flags [+], proto ICMP (1), length 1500)
>>     37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4, length 1480
>> 11:19:35.141430 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype IPv4 (0x0800), length 562: (tos 0x0, ttl 64, id 39999, offset 1480, flags [none], proto ICMP (1), length 548)
>>     37.16.72.52 > 77.244.240.131: ip-proto-1
>>
>> On vmbr0:
>> 11:19:35.141442 62:47:e0:fe:f9:31 > 54:e0:32:27:6e:50, ethertype 802.1Q (0x8100), length 2046: vlan 350, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 39999, offset 0, flags [none], proto ICMP (1), length 2028)
>>     37.16.72.52 > 77.244.240.131: ICMP echo request, id 2182, seq 4, length 2008
>>
>> On bond0 it's gone.
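
A quick sanity check on the sizes in the trace above, as a sketch only.
It assumes a 20-byte IPv4 header without options and an 8-byte ICMP
header; frag_plan is a made-up helper name. Under those assumptions, the
two fragments seen on the tap (IP lengths 1500 and 548, offsets 0 and
1480) and the single 2028-byte packet on vmbr0 are consistent with a
2000-byte ICMP payload being split for MTU 1500.

```shell
#!/bin/sh
# Sketch: how an ICMP echo with a given payload splits into IPv4 fragments.
# Assumptions: 20-byte IPv4 header (no options), 8-byte ICMP header.
# frag_plan is a hypothetical helper, not part of any tool.
frag_plan() (
    payload=$1                      # bytes given to "ping -s"
    mtu=$2                          # MTU of the outgoing device
    ip_hdr=20
    icmp_hdr=8
    remaining=$((payload + icmp_hdr))   # IP payload still to be sent
    # Every fragment except the last carries a multiple of 8 bytes,
    # because fragment offsets are counted in 8-byte units.
    max_data=$(( (mtu - ip_hdr) / 8 * 8 ))
    offset=0
    while [ "$remaining" -gt 0 ]; do
        data=$remaining
        more=0
        if [ "$data" -gt "$max_data" ]; then
            data=$max_data
            more=1
        fi
        echo "offset=$offset ip_len=$((data + ip_hdr)) more_fragments=$more"
        offset=$((offset + data))
        remaining=$((remaining - data))
    done
)

frag_plan 2000 1500
```

The reassembled packet is then 20 + 8 + 2000 = 2028 bytes, which cannot
leave a 1500 MTU device unless something fragments it again.
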
>>
>> But who is normally in charge of fragmenting the packets? The bridge itself? Netfilter?
>>
>> -----Original Message-----
>> From: pve-devel <pve-devel-bounces@lists.proxmox.com> On Behalf Of Stoyan Marinov
>> Sent: Wednesday, 13 October 2021 00:46
>> To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>
>> OK, I have just verified it has nothing to do with bonds. I get the same behavior with a VLAN-aware bridge and bridge-nf-call-iptables=1, with a regular eth0 as part of the bridge. Packets arrive fragmented on the tap, are reassembled by netfilter, and are then re-injected into the bridge reassembled (full size).
>>
>> I did have limited success by setting net.bridge.bridge-nf-filter-vlan-tagged to 1. Now packets seem to get fragmented on the way out and back in, but there are still issues:
>>
>> 1. I'm testing with ping -s 2000 (MTU 1500 everywhere) to an external box. I do see reply packets arrive on the VM NIC, but ping doesn't see them. I haven't analyzed this much further.
>> 2. While watching with tcpdump (inside the VM) I notice "ip reassembly time exceeded" messages being generated by the VM.
>>
>> I'll try to investigate a bit further tomorrow.
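
To compare hosts, it may help to record the two bridge netfilter
settings mentioned above. A minimal sketch, assuming the br_netfilter
module exposes them under /proc/sys/net/bridge (they are reported as
unavailable otherwise); show_bridge_nf and its optional directory
argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the bridge netfilter knobs discussed in this thread.
# show_bridge_nf is a hypothetical helper; the optional directory
# argument exists only so the function can be tried against a copy.
show_bridge_nf() (
    dir=${1:-/proc/sys/net/bridge}
    for knob in bridge-nf-call-iptables bridge-nf-filter-vlan-tagged; do
        if [ -r "$dir/$knob" ]; then
            echo "$knob=$(cat "$dir/$knob")"
        else
            # Typically means br_netfilter is not loaded on this host.
            echo "$knob=unavailable"
        fi
    done
)

show_bridge_nf
```
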
>>
>>> On 12 Oct 2021, at 11:26 PM, Stoyan Marinov <stoyan@marinov.us> wrote:
>>>
>>> That's an interesting observation. Now that I think about it, it could be caused by bonding and not the underlying device. When I tested this (about a year ago) I was using bonding on the mlx adapters and not using bonding on the Intel ones.
>>>
>>>> On 12 Oct 2021, at 3:36 PM, VELARTIS Philipp Dürhammer <p.duerhammer@velartis.at> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We use HP servers with Intel cards or the standard HP NIC (I think
>>>> also Intel).
>>>>
>>>> Also, I see that I made a mistake:
>>>>
>>>> Setup working:
>>>> tapX (UNtagged) <- -> vmbr0 <- - > bond0
>>>>
>>>> is correct. (Before, I had written it as tagged too.)
>>>>
>>>> It should be:
>>>>
>>>> Setup not working:
>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>
>>>> Setup working:
>>>> tapX (untagged) <- -> vmbr0 <- - > bond0
>>>>
>>>> Setup also working:
>>>> tapX < - - > vmbr0v350 < -- > bond0.350 < -- > bond0
>>>>
>>>> -----Original Message-----
>>>> From: pve-devel <pve-devel-bounces@lists.proxmox.com> On Behalf Of Stoyan Marinov
>>>> Sent: Tuesday, 12 October 2021 13:16
>>>> To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
>>>> Subject: Re: [pve-devel] BUG in vlan aware bridge
>>>>
>>>> I'm having the very same issue with Mellanox Ethernet adapters. I don't see this behavior with Intel NICs. What network cards do you have?
>>>>
>>>>> On 12 Oct 2021, at 1:48 PM, VELARTIS Philipp Dürhammer <p.duerhammer@velartis.at> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have been experimenting for days because we are seeing strange packet loss.
>>>>> I can finally report the following (Linux 5.11.22-4-pve, Proxmox 7, all devices MTU 1500):
>>>>>
>>>>> Packets larger than 1500 bytes work fine without VLAN, but as soon as they are tagged they are dropped by the bond device.
>>>>> Netfilter (set to 1) always reassembles the packets when they arrive at a bridge. But they don't get fragmented again if they are VLAN tagged, so the bond device drops them. If the bridge is NOT VLAN-aware, they do get fragmented again and it works well.
>>>>>
>>>>> Setup not working:
>>>>>
>>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>>
>>>>> Setup working:
>>>>>
>>>>> tapX (tagged) <- -> vmbr0 <- - > bond0
>>>>>
>>>>> Setup also working:
>>>>>
>>>>> tapX < - - > vmbr0v350 < -- > bond0.350 < -- > bond0
>>>>>
>>>>> Do you have any idea where to look? I don't understand who is in
>>>>> charge of fragmenting packets again once they have been reassembled
>>>>> by netfilter (and why it is not working with VLAN-aware bridges).
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> pve-devel mailing list
>>>>> pve-devel@lists.proxmox.com
>>>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>>>>