From: Stefan Hanreich <s.hanreich@proxmox.com>
To: Maurice Klein <klein@aetherus.de>, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH container 1/1] Signed-off-by: Maurice Klein <klein@aetherus.de>
Date: Tue, 10 Feb 2026 10:56:19 +0100 [thread overview]
Message-ID: <fd5722e0-2c29-4c58-9f0f-c50b523e6989@proxmox.com> (raw)
In-Reply-To: <a2110fae-4877-49c7-91d8-62364b78f3f9@aetherus.de>
On 2/6/26 12:21 PM, Maurice Klein wrote:
[snip]
> I don't think I explained that properly.
> Basically the whole idea is to have a gateway IP like 192.0.2.1/32 on
> the PVE host on that bridge, and not have a /24 or similar route.
Those are just local to the node for routing, the /24 wouldn't get
announced - only the /32 routes for the additional IPs. But I guess with
that setup you could do without it as well. It shouldn't be an issue to
create a 'subnet' as /32 and then give the PVE host the only IP as
gateway IP and configure it that way. Layer-2 zones for instance (VLAN,
QinQ, VXLAN) don't even need a subnet configured at all - so I don't see
a blocker there.
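For illustration, such a setup would boil down to something like the
following host/guest state (addresses made up; just a sketch of the
resulting configuration, not of the implementation):

```shell
# On the PVE host: gateway /32 on the vnet bridge, one /32 host route per guest
ip addr add 192.0.2.1/32 dev vnet0
ip route add 203.0.113.10/32 dev vnet0

# Inside the guest: /32 address, device route to the gateway, default via it.
# 'onlink' is needed because the gateway lies outside the guest's own prefix.
ip addr add 203.0.113.10/32 dev eth0
ip route add 192.0.2.1/32 dev eth0
ip route add default via 192.0.2.1 dev eth0 onlink
```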
> Guests then also have addresses whatever they might look like.
> For example a guest could have 1.1.1.1/32 - usually always a /32,
> although I guess for some use cases it could be beneficial for a guest
> to get more than a /32, but let's put that aside for now.
That would be quite interesting for IPv6, actually.
> Now there is no need/reason to define which subnet a guest is on, and
> no need for it to be in the same subnet as the host.
>
> The guest would configure its IP statically inside, and it would
> usually be a /32.
Yeah, most implementations I've seen usually have a 'cluster IP' in RFC
1918 range for local cluster communication. With containers that is a
lot easier, since you can control their network configuration, whereas
with VMs you cannot and would need to update the configuration on every
move - or use the same subnet across every node instead of having a
dedicated subnet per node/rack.
[snip]
> Now the biggest thing this enables us to do: in PVE clusters, if we
> build for example an iBGP full mesh, the routes get shared.
> There could be any topology now and routing would adapt.
> Just as an example - while this is a poor topology, it can illustrate
> the point:
>
> GW-1      GW-2
>  | \      / |
>  |  \    /  |
>  |   \  /   |
> pve1----pve3
>    \      /
>     \    /
>      pve2
>
> Any PVE node can fail and everything would still be reachable.
> The shortest path will always be chosen.
> Any link can fail.
> Any gateway can fail.
> Even multiple links failing is OK.
> No chance for loops, because every link is p2p.
> Much like the full-mesh Ceph setup with OSPF or OpenFabric.
>
> That can be achieved with EVPN/VXLAN, an anycast gateway and multiple
> exit nodes.
> The problem is the complexity, and by giving routes bigger than a /24
> to gateways they will not always use the optimal path, thus increasing
> latency and putting unnecessary routing load on hosts where the VM
> isn't currently living.
> And all that to have one L2 domain, which often brings more
> disadvantages than advantages.
>
> I hope I explained it well now; if not, feel free to ask anything. I
> could also provide some bigger documentation with screenshots of
> everything.
Yes, that makes sense. The way I described it in my previous mail should
work like that, since it decouples the IP configuration + route creation
(which would then be handled by the zone / vnet) from the announcement
of that route (which would be handled by fabrics). As a start we could
just utilize the default routing table. I'm planning on adding VRF +
route redistribution + route-map support mid-term, so the new zone could
then profit from those without having to implement anything of the sort
for now. The timing is a bit awkward, since I'm still working on
implementing several features that this plugin would benefit quite
heavily from. I don't want to do duplicate work or code ourselves into a
corner by implementing all that functionality specific to this plugin,
only to migrate everything over later while maintaining backwards
compatibility.
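Just to make the iBGP full-mesh part a bit more concrete, per node this
could look roughly like the following FRR sketch - the ASN, neighbor
addresses and route-map name are all made up, and the actual integration
would of course go through the fabrics/controller code:

```
router bgp 65000
 bgp router-id 10.0.0.1
 neighbor 10.0.0.2 remote-as 65000
 neighbor 10.0.0.3 remote-as 65000
 address-family ipv4 unicast
  redistribute kernel route-map host-routes-only
 exit-address-family
!
! only announce /32 host routes, never broad aggregates
ip prefix-list host-routes seq 10 permit 0.0.0.0/0 ge 32
!
route-map host-routes-only permit 10
 match ip address prefix-list host-routes
```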
[snip]
> I also feel like it would make sense on the network device, since it
> is configuration specific to that VM, but I get why you are reluctant.
> This honestly makes me reconsider the SDN approach a little bit.
> I have an idea here that could be workable:
> What if we add a field not called guest IP, but instead called routes?
> Essentially that is what it is, and it might have extra use cases
> apart from what I'm trying to achieve.
> That way, for this use case, you can use those fields to add the
> needed /32 host routes.
> It wouldn't be specific to the SDN feature we build.
> The SDN feature could then be more about configuring the bridge with
> the right addresses and features, and enable us to later distribute
> the routes via BGP and other means.
> I looked into the hotplug scenarios as well, and that way those would
> be solved.
Yeah, I think VM configuration is the best bet. It should be tied to the
network device imo, so adding a property there that allows configuring a
CIDR should be fine for starting out. Adding the route would then be
handled by the respective tap_plug / veth_create functions in
pve-network and the new zone plugin.
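As an aside, the /32-vs-/24 latency argument from earlier is just
longest-prefix match; a toy lookup (hypothetical addresses and node
names) shows the host route always winning over a broad aggregate:

```python
import ipaddress

# Toy routing table: a broad /24 announced by an exit node, plus a /32
# host route announced by the node where the guest actually lives.
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "pve1",
    ipaddress.ip_network("192.0.2.50/32"): "pve3",
}

def lookup(dst, table):
    """Return the next hop for dst via longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    best = max(
        (net for net in table if addr in net),
        key=lambda net: net.prefixlen,
        default=None,
    )
    return table[best] if best is not None else None

print(lookup("192.0.2.50", routes))  # pve3: the /32 host route wins
print(lookup("192.0.2.60", routes))  # pve1: falls back to the /24
```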
[snip]