Date: Tue, 10 Feb 2026 10:56:19 +0100
From: Stefan Hanreich <s.hanreich@proxmox.com>
To: Maurice Klein, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH container 1/1] Signed-off-by: Maurice Klein

On 2/6/26 12:21 PM, Maurice Klein wrote:
[snip]
> I think I didn't explain that properly.
> Basically the whole idea is to have a gateway IP like 192.0.2.1/32 on
> the PVE host on that bridge and not have a /24 or so route then.

Those are just local to the node for routing; the /24 wouldn't get
announced - only the /32 routes for the additional IPs. But I guess with
that setup you could do without it as well. It shouldn't be an issue to
create a 'subnet' as a /32, give the PVE host its only IP as the gateway
IP and configure it that way. Layer-2 zones (VLAN, QinQ, VXLAN), for
instance, don't even need a subnet configured at all - so I don't see a
blocker there.
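Roughly, the node-local part I have in mind boils down to something like
the following (the bridge name vnet0 and the guest address are just
placeholders):

    # gateway IP as a /32 on the vnet bridge
    ip address add 192.0.2.1/32 dev vnet0
    # per-guest /32 host route, added when the guest NIC gets plugged here
    ip route add 203.0.113.10/32 dev vnet0

So the only routes that ever exist for the vnet are host routes in the
default table.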
> Guests then also have addresses, whatever they might look like.
> For example a guest could have 1.1.1.1/32, but usually always /32,
> although I guess for some use cases it could be beneficial to be able
> to have a guest that gets more than a /32, but let's put that aside
> for now.

Would be quite interesting for IPv6, actually.

> Now there is no need/reason to define which subnet a guest is on and
> no need for it to be in the same subnet as the host.
>
> The guest would configure its IP statically inside, and it would
> usually be a /32.

Yeah, most implementations I've seen usually have a 'cluster IP' that is
in an RFC 1918 range for local cluster communication. With containers
that works a lot more easily, since you can control their network
configuration, whereas with VMs you cannot and would need to update the
configuration on every move - or use the same subnet across every node
instead of having a dedicated subnet per node/rack.

[snip]

> Now the biggest thing this enables us to do: in PVE clusters, if we
> build for example an iBGP full mesh, the routes get shared.
> There could be any topology now and routing would adapt.
> Just as an example - while that is not a great topology, it can
> illustrate the point:
>
>       GW-1        GW-2
>         | \        / |
>         |  \      /  |
>         |   \    /   |
>        pve1--pve3
>            \      /
>             \    /
>              pve2
>
> Any PVE node can fail and everything would still be reachable.
> The shortest path will always be chosen.
> Any link can fail.
> Any gateway can fail.
> Even multiple links failing is OK.
> No chance for loops, because every link is p2p.
> Much like the full-mesh Ceph setup with OSPF or OpenFabric.
>
> That can be achieved with EVPN/VXLAN, anycast gateway and multiple
> exit nodes.
> The problem is the complexity, and by giving routes bigger than a /24
> to the gateways they will not always use the optimal path, thus
> increasing latency and putting unnecessary routing load on hosts where
> the VM isn't living right now.
> And all that to have one L2 domain, which often brings more
> disadvantages than advantages.
>
> I hope I explained it well now; if not, feel free to ask anything. I
> could also provide some bigger documentation with screenshots of
> everything.

Yes, that makes sense. The way I described it in my previous mail should
work like that, since it decouples the IP configuration + route creation
(which would then be handled by the zone / vnet) from the announcement
of that route (which would be handled by fabrics).

As a start we could just utilize the default routing table. I'm planning
on adding VRF + Route Redistribution + Route Map support mid-term, so
the new zone could then profit from those without having to implement
anything of the sort for now.
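Just for reference, the FRR config that fabrics would eventually
generate for this could boil down to something like the following (ASN
and neighbor addresses are of course just placeholders):

    router bgp 65000
     neighbor 10.0.0.2 remote-as 65000
     neighbor 10.0.0.3 remote-as 65000
     address-family ipv4 unicast
      redistribute kernel route-map only-host-routes
     exit-address-family
    !
    ip prefix-list host-routes seq 10 permit 0.0.0.0/0 ge 32
    !
    route-map only-host-routes permit 10
     match ip address prefix-list host-routes

That way only the /32 host routes from the default table make it into
the iBGP mesh, which should be all this zone needs for starting out.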
The timing is a bit awkward, since I'm still working on implementing
several features that this plugin would benefit from quite heavily, and
I don't want to do any duplicate work or code ourselves into a corner by
basically implementing all that functionality specific to this plugin
only, and then having to migrate everything over while maintaining
backwards compatibility.

[snip]

> I also feel like it would make sense in the network device, since it
> is part of the specific configuration for that VM, but I get why you
> are reluctant about that.
> This honestly makes me reconsider the SDN approach a little bit.
> I have an idea here that could be workable.
> What if we add a field not called 'guest IP', but instead call it
> 'routes'?
> Essentially that is what it is, and it might have extra use cases
> apart from what I'm trying to achieve.
> That way, for this use case, you can use those fields to add the
> needed /32 host routes.
> It wouldn't be specific to the SDN feature we build.
> The SDN feature could then be more about configuring the bridge with
> the right addresses and features, and enable us to later distribute
> the routes via BGP and other means.
> I looked into the hotplug scenarios as well and that way those would
> be solved.

Yeah, I think the VM configuration is the best bet. It should be tied to
the network device imo, so I guess adding a property that allows
configuring a CIDR there should be fine for starting out. Adding the
route would then be handled by the respective tap_plug / veth_create
functions in pve-network and the new zone plugin.
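Just to sketch how that could look in the guest config - 'routes' is
only a stand-in name here, and the MAC / address are made up:

    # hypothetical new property on the network device, name/format TBD
    net0: virtio=BC:24:11:00:00:01,bridge=vnet0,routes=203.0.113.10/32

veth_create / tap_plug would then install the corresponding /32 host
route on the node when the NIC gets plugged there.

[snip]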