From: Stefan Hanreich <s.hanreich@proxmox.com>
To: Maurice Klein <klein@aetherus.de>, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH container 1/1] Signed-off-by: Maurice Klein <klein@aetherus.de>
Date: Tue, 10 Feb 2026 10:56:19 +0100 [thread overview]
Message-ID: <fd5722e0-2c29-4c58-9f0f-c50b523e6989@proxmox.com> (raw)
In-Reply-To: <a2110fae-4877-49c7-91d8-62364b78f3f9@aetherus.de>
On 2/6/26 12:21 PM, Maurice Klein wrote:
[snip]
> I don't think I explained that properly.
> Basically the whole idea is to have a gateway IP like 192.0.2.1/32 on
> the pve host on that bridge, and not have a /24 or similar route.
Those are just local to the node for routing, the /24 wouldn't get
announced - only the /32 routes for the additional IPs. But I guess with
that setup you could do without it as well. It shouldn't be an issue to
create a 'subnet' as /32 and then give the PVE host the only IP as
gateway IP and configure it that way. Layer-2 zones for instance (VLAN,
QinQ, VXLAN) don't even need a subnet configured at all - so I don't see
a blocker there.
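The node-local setup discussed above could be sketched roughly like this with ip(8). This is only an illustration of the idea, not Proxmox code: the bridge name (vmbr1) and the guest address (192.0.2.10) are hypothetical, only 192.0.2.1 comes from the mail, and the commands need root on the node.

```shell
# Gateway IP as a /32 on the bridge - note that, unlike a /24, this
# creates no connected route for the whole subnet.
ip addr add 192.0.2.1/32 dev vmbr1

# Per-guest /32 host route pointing at the bridge; these host routes
# are what would later get announced, not any larger prefix.
ip route add 192.0.2.10/32 dev vmbr1
```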
> Guests then also have addresses, whatever they might look like.
> For example a guest could have 1.1.1.1/32, but usually always a /32,
> although I guess for some use cases it could be beneficial for a guest
> to get more than a /32, but let's put that aside for now.
That would be quite interesting for IPv6, actually.
> Now there is no need/reason to define which subnet a guest is on, and
> no need for it to be in the same subnet as the host.
>
> The guest would configure its IP statically inside, and it would
> usually be a /32.
Yeah, most implementations I've seen usually have a 'cluster IP' in
RFC 1918 range for local cluster communication. With containers that is
a lot easier, since you can control their network configuration, whereas
with VMs you cannot and would need to update the configuration on every
move - or use the same subnet across every node instead of having a
dedicated subnet per node/rack.
[snip]
> Now the biggest thing this enables us to do: in pve clusters, if we
> build for example an iBGP full mesh, the routes get shared.
> There could be any topology now, and routing would adapt.
> Just as an example: while this is an ugly topology, it can illustrate
> the point:
>
> GW-1      GW-2
>  |  \    /  |
>  |   \  /   |
>  |    \/    |
>  |    /\    |
>  |   /  \   |
> pve1------pve3
>    \      /
>     \    /
>      pve2
>
> Any pve can fail and everything would still be reachable.
> The shortest path is always chosen.
> Any link can fail.
> Any gateway can fail.
> Even multiple links failing is okay.
> No chance for loops, because every link is point-to-point.
> Much like the full-mesh Ceph setup with OSPF or OpenFabric.
>
> That can be achieved with EVPN/VXLAN, an anycast gateway and multiple
> exit nodes.
> The problem is the complexity, and that by giving routes bigger than
> a /24 to the gateways, they will not always use the optimal path, thus
> increasing latency and putting unnecessary routing load on hosts where
> the VM isn't living right now.
> And all that just to have one L2 domain, which often brings more
> disadvantages than advantages.
>
> I hope I explained it well now; if not, feel free to ask anything. I
> could also provide some more extensive documentation with screenshots
> of everything.
Yes, that makes sense. The way I described it in my previous mail should
work like that, since it decouples the IP configuration + route creation
(which would then be handled by the zone / vnet) from the announcement
of that route (which would be handled by fabrics). As a start we could
just utilize the default routing table. I'm planning on adding VRF +
route redistribution + route map support mid-term, so the new zone could
then profit from those without having to implement anything of the sort
for now. The timing is a bit awkward, since I'm still working on several
features that this plugin would benefit quite heavily from, and I don't
want to do duplicate work or code ourselves into a corner by
implementing all that functionality specific to this one plugin and
then having to migrate everything over while maintaining backwards
compatibility.
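To make the "announcement handled by fabrics" part concrete, announcing only the per-guest host routes over an iBGP mesh could look roughly like the FRR fragment below. This is a hypothetical sketch, not Proxmox's fabrics implementation: the AS number (65000) and neighbor address (10.0.0.2) are invented, and the prefix-list simply restricts kernel-route redistribution to /32s so the /24 never leaves the node.

```shell
# Hypothetical FRR configuration fragment (requires a running FRR with
# bgpd enabled); applies via vtysh on the node.
vtysh <<'EOF'
configure terminal
 ip prefix-list HOST-ROUTES seq 10 permit 0.0.0.0/0 ge 32
 route-map REDIST-HOSTS permit 10
  match ip address prefix-list HOST-ROUTES
 exit
 router bgp 65000
  neighbor 10.0.0.2 remote-as 65000
  address-family ipv4 unicast
   redistribute kernel route-map REDIST-HOSTS
  exit-address-family
EOF
```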
[snip]
> I also feel like it would make sense on the network device, since it
> is part of the specific configuration for that VM, but I get why you
> are reluctant about that.
> This honestly makes me reconsider the SDN approach a little bit.
> I have an idea here that could be workable:
> what if we add a field not called guest IP, but instead call it
> routes?
> Essentially that is what it is, and it might have extra use cases
> apart from what I'm trying to achieve.
> That way, for this use case, you can use those fields to add the
> needed /32 host routes.
> It wouldn't be specific to the SDN feature we build.
> The SDN feature could then be more about configuring the bridge with
> the right addresses and features, and enable us to later distribute
> the routes via BGP and other ways.
> I looked into the hotplug scenarios as well, and that way those would
> be solved.
Yeah, I think the VM configuration is the best bet. It should be tied to
the network device imo, so I guess adding a property that allows
configuring a CIDR there should be fine for starting out. The route
itself would then be added by the respective tap_plug / veth_create
functions in pve-network and the new zone plugin.
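In effect, the tap_plug / veth_create hook would end up doing something like the following when the NIC is plugged or unplugged. The interface name (tap100i0) and guest CIDR (192.0.2.10/32) are hypothetical, and the real implementation would of course live in pve-network's Perl code rather than a shell snippet.

```shell
# Illustrative only - requires root and an existing tap device.
TAP=tap100i0
GUEST_IP=192.0.2.10/32

# On plug / hotplug: (re)install the /32 host route via the tap device.
ip route replace "$GUEST_IP" dev "$TAP"

# On unplug: remove it again so the route doesn't go stale.
# ip route del "$GUEST_IP" dev "$TAP"
```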
[snip]
Thread overview: 12+ messages
[not found] <20260109121049.70740-1-klein@aetherus.de>
2026-01-09 12:10 ` Maurice Klein via pve-devel
[not found] ` <20260109121049.70740-2-klein@aetherus.de>
2026-01-19 8:37 ` Maurice Klein via pve-devel
2026-01-19 14:35 ` Stefan Hanreich
2026-01-21 19:04 ` Maurice Klein via pve-devel
[not found] ` <d18928a0-6ab0-4e90-ad3a-0674bbdedb72@aetherus.de>
2026-01-27 10:02 ` Stefan Hanreich
2026-01-27 10:37 ` Maurice Klein via pve-devel
[not found] ` <321bd4ff-f147-4329-9788-50061d569fa6@aetherus.de>
2026-01-29 12:20 ` Stefan Hanreich
2026-02-01 14:32 ` Maurice Klein
2026-02-06 8:23 ` Stefan Hanreich
2026-02-06 11:22 ` Maurice Klein
2026-02-10 9:56 ` Stefan Hanreich [this message]
[not found] <20260109124514.72991-1-klein@aetherus.de>
2026-01-09 12:45 ` Maurice Klein via pve-devel