From: Fiona Ebner
Date: Fri, 26 Jun 2026 10:34:10 +0200
Subject: Re: [PATCH qemu-server] fix #7627: net: virtio: disable host_tunnel feature again with 11.0+pve1
To: Fabian Grünbichler, pve-devel@lists.proxmox.com
References: <20260603152127.901085-1-f.ebner@proxmox.com> <2d28bb02-bf70-4f9a-a91b-b5c8162527d6@proxmox.com> <1781614973.8wdi1mzrwu.astroid@yuna.none>
In-Reply-To: <1781614973.8wdi1mzrwu.astroid@yuna.none>

On 16.06.26 at 3:02 PM, Fabian Grünbichler wrote:
> On June 9, 2026 11:32 am, Fiona Ebner wrote:
>> On 03.06.26 at 5:21 PM, Fiona Ebner wrote:
>>> QEMU machine version 10.2 started exposing the new features
>>> VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO_CSUM
>>> VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO
>>> VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM
>>> VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO
>>>
>>> but the host tunnel one causes issues with certain guest network
>>> configurations, in particular when using VXLAN [0][1][2][3] when the
>>> traffic goes over a physical NIC, at least when the NIC does not have
>>> support for these features itself.
>>>
>>> The negotiation in QEMU does not consider the physical NIC, it just
>>> looks whether the vhost-net device and the guest both support it and
>>> then turns on the feature for the tap device. However, it seems like
>>> the tap device does not itself add the inner TCP checksums for the
>>> encapsulated traffic. It's not entirely clear yet if this is a kernel
>>> issue or if the common configuration with bridged tap interface
>>> going to physical NIC is not supported in this configuration without
>>> some additional tweaks. When the traffic does not go via a physical
>>> NIC, it seems to work (i.e. both source and target VM on the same
>>> host).
>>>
>>> For now, disable this advanced host tunnel feature again, until the
>>> issue can be properly diagnosed and fixed (if there is a fix to be
>>> made).
>>> If users do require the feature again, it can be exposed via
>>> the schema as CLI-only and maybe in the UI as an advanced
>>> configuration option.
>>>
>>> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=7627
>>> [1]: https://forum.proxmox.com/threads/183494/post-855144
>>> [2]: https://forum.proxmox.com/threads/182328/post-854627
>>> [3]: https://forum.proxmox.com/threads/183963/#post-855737
>>>
>>> Signed-off-by: Fiona Ebner
>>> ---
>>>
>>> Many thanks to Stefan and Gabriel for discussions and continuing to
>>> analyze the issue! For now, let's make a stop-gap fix and turn the
>>> problematic host tunnel feature back off. I will also send a mail
>>> upstream asking about the issue, but not today, as I have to leave.
>>
>> There is a patch now [0], but since the issue was in the virtio-net
>> driver, the fix will need to be rolled out to guest kernels, which we
>> don't have control over. While there is an easy workaround with pinning
>> the machine version to 10.1 for affected guests, I still wonder if we
>> should go for disabling the feature by default with 11.0+pve1 for now,
>> to avoid more people running into the regression? Maybe re-enabling it
>> with the next major PVE release next summer?
>
> sounds sensible - do we want to offer an escape hatch for setups with
> fixed guest kernels that want to benefit performance-wise?

I wrote this in the commit message and wanted to wait for actual user
requests, but okay, I'll send a v2 with such an option. Need to re-roll
for updating the commit message with the current information anyways.
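
A minimal sketch of the version-gated default with an escape hatch, as
discussed above. It is written in Python for illustration only and is not
the qemu-server (Perl) implementation. The function names, the `override`
parameter and the version-string parsing are all hypothetical; the only
facts taken from this thread are that machine version 10.2 started exposing
the feature and that the proposal disables it by default from 11.0+pve1.
That the default stays on for versions in between is one reading of the
thread, not a confirmed detail.

```python
from typing import Optional, Tuple

# (major, minor, pve_revision); e.g. "11.0+pve1" -> (11, 0, 1).
MachineVersion = Tuple[int, int, int]

# From the thread: 10.2 started exposing the feature, and the proposal
# disables it by default starting with 11.0+pve1.
EXPOSED_SINCE: MachineVersion = (10, 2, 0)
DISABLED_BY_DEFAULT_SINCE: MachineVersion = (11, 0, 1)


def parse_machine_version(text: str) -> MachineVersion:
    """Parse a bare version such as '10.2' or '11.0+pve1' (assumed format)."""
    base, _, pve = text.partition("+pve")
    major, minor = base.split(".")[:2]
    return int(major), int(minor), int(pve) if pve else 0


def host_tunnel_enabled(version_text: str,
                        override: Optional[bool] = None) -> bool:
    """Decide whether the host tunnel feature is on for a machine version.

    `override` stands in for the hypothetical escape-hatch option; None
    means "use the version-dependent default".
    """
    version = parse_machine_version(version_text)
    if version < EXPOSED_SINCE:
        # Older machine versions never exposed the feature.
        return False
    if override is not None:
        return override
    # Default: on for 10.2 up to (but excluding) 11.0+pve1, off afterwards.
    return version < DISABLED_BY_DEFAULT_SINCE
```

With this shape, pinning an affected guest to 10.1 keeps the feature off,
existing 10.2 and 11.0 machines keep their current behaviour, and a setup
with a fixed guest kernel could opt back in on 11.0+pve1 via the override.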