[pdm-devel] RFC: Synchronizing configuration changes across remotes
From: Stefan Hanreich @ 2025-01-30 15:48 UTC
To: pdm-devel

I'm currently working on the SDN integration, and for that I need a way to deploy SDN configuration changes to multiple remotes simultaneously. In general I will need to do the following:

* Create / Update / Delete some parts of the SDN configuration of multiple remotes, preferably synchronized across the remotes.
* Apply the new SDN configuration (possibly opt-in) for all/some nodes in multiple remotes.

During this operation it would make sense to make sure that there are no pending changes in the SDN configuration, so users do not accidentally apply unrelated changes via PDM. We also need to prevent any concurrent SDN configuration changes for the same reason - so we don't apply any unrelated configuration.

The question is: Do we also want to be able to prevent concurrent changes across multiple remotes, or are we fine with only preventing concurrent changes on a per-remote level? With network configuration affecting more than one remote, I think it would be better to synchronize changes across remotes, since applying the configuration to only one remote often doesn't make sense, and a failure to apply the configuration could affect the other remotes.

The two options I see, depending on the answer to that question:

* introduce some form of lock that prevents any changes to the SDN configuration from other sources
* do something based on the current digest functionality

With the lock-based approach, the general process for making changes to the SDN configuration would look as follows (a rough code sketch of this flow follows further below):

* check for pending changes, and if there are none: lock the SDN configuration (atomically, in one API call)
* make the changes to the SDN configuration
* apply the SDN configuration changes
* release the lock
* in case of errors, roll back the configuration changes and then release all locks

I currently gravitate towards the lock-based approach for the following reasons:

* It enables us to synchronize changes across multiple remotes, as opposed to a digest-based approach.
* It's a lot more ergonomic for developers, since you simply acquire/release the lock. With a digest-based approach, modifications that require multiple API calls need to acquire a new digest every time and track it across multiple API calls. With SDN specifically, we need to provide and check the digest when applying the configuration as well.
* It is simply easier to prevent concurrent changes in the first place rather than reacting to them. If they cannot occur, then rolling back is easier and less error-prone, since the developer can assume that nothing changed in the previously handled remotes either.

The downsides of this approach that I can see:

* It requires sweeping changes to basically the whole SDN API, and keeping backwards compatibility is harder.
* Also, many API endpoints in PVE already provide the digest functionality, so it would be a lot easier to retro-fit that for use with PDM, possibly requiring no changes at all.
* In case of failures on the PDM side it is harder to recover, since it requires manual intervention (removing the lock manually).
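
To make the lock-based flow a bit more concrete, here is a minimal sketch of how the PDM side could orchestrate it across remotes. It is Rust, but the trait and all method names (acquire_sdn_lock, update_sdn_config, ...) are made up for illustration; it only shows the intended acquire / edit / apply / release sequence with a best-effort rollback on error:

// Rough sketch only: `RemoteClient` and its methods are hypothetical
// stand-ins for whatever the PDM client API will actually provide.
type Error = String;

struct Change; // placeholder for a single SDN configuration change

struct LockCookie(String);

trait RemoteClient {
    /// Fails if the remote has pending SDN changes or is already locked.
    fn acquire_sdn_lock(&self) -> Result<LockCookie, Error>;
    fn update_sdn_config(&self, cookie: &LockCookie, changes: &[Change]) -> Result<(), Error>;
    fn apply_sdn_config(&self, cookie: &LockCookie) -> Result<(), Error>;
    fn rollback_sdn_config(&self, cookie: &LockCookie) -> Result<(), Error>;
    fn release_sdn_lock(&self, cookie: &LockCookie) -> Result<(), Error>;
}

fn deploy<R: RemoteClient>(remotes: &[R], changes: &[Change]) -> Result<(), Error> {
    // 1. Lock all remotes up front so nothing else can modify the SDN config.
    let mut locked: Vec<(&R, LockCookie)> = Vec::new();
    for remote in remotes {
        match remote.acquire_sdn_lock() {
            Ok(cookie) => locked.push((remote, cookie)),
            Err(err) => {
                release_all(&locked);
                return Err(err);
            }
        }
    }

    // 2. Write and apply the changes; on the first failure roll everything back.
    for (i, (remote, cookie)) in locked.iter().enumerate() {
        let result = remote
            .update_sdn_config(cookie, changes)
            .and_then(|_| remote.apply_sdn_config(cookie));
        if let Err(err) = result {
            for (r, c) in &locked[..=i] {
                let _ = r.rollback_sdn_config(c); // best effort
            }
            release_all(&locked);
            return Err(err);
        }
    }

    // 3. Everything worked, release the locks.
    release_all(&locked);
    Ok(())
}

fn release_all<R: RemoteClient>(locked: &[(&R, LockCookie)]) {
    for (remote, cookie) in locked {
        let _ = remote.release_sdn_lock(cookie);
    }
}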

For single configuration files the digest-based approach could work quite well in cases where we don't need to synchronize changes across multiple remotes. But for SDN the digest-based approach is a bit more complicated: we currently generate digests for each section in the configuration file, instead of for the configuration file as a whole. This would be relatively easy to add, though. The second problem is that the configuration is split across multiple files, so we'd need to either look at all digests of all configuration files in all API calls or check a 'global' SDN configuration digest on every call. Again, certainly solvable, but it also requires some work.
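
Just to illustrate the 'global' digest idea: the sketch below computes one digest over all SDN configuration files, which a client would then have to send back with every modifying call. This is an assumption-laden example (the file list, the use of the sha2/hex crates, and both function names are made up for illustration), not how the existing per-section digests work:

// Sketch only: file list and hashing scheme are assumptions for illustration.
use sha2::{Digest, Sha256};
use std::fs;

const SDN_CONFIG_FILES: &[&str] = &[
    "/etc/pve/sdn/zones.cfg",
    "/etc/pve/sdn/vnets.cfg",
    "/etc/pve/sdn/subnets.cfg",
    "/etc/pve/sdn/controllers.cfg",
];

/// One digest over all SDN config files; a missing file contributes empty content.
fn global_sdn_digest() -> String {
    let mut hasher = Sha256::new();
    for path in SDN_CONFIG_FILES {
        hasher.update(path.as_bytes()); // include the name so a rename changes the digest
        hasher.update(fs::read(path).unwrap_or_default());
    }
    hex::encode(hasher.finalize())
}

/// Every modifying call would first compare the digest the client sent along.
fn check_digest(expected: &str) -> Result<(), String> {
    if global_sdn_digest() != expected {
        return Err("SDN configuration changed since the digest was read".into());
    }
    Ok(())
}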

Since even with our best efforts we will run into situations where the lock doesn't get properly released, a simple escape hatch to unlock the SDN config should be provided (like qm unlock). One such scenario would be PDM losing connectivity to one of the remotes while holding the lock; there's not really anything we can do about that.

Since we will probably need some form of this for other configuration files as well, I wanted to ask for your input. I think this concept could be applied generally to configuration changes that need to be made synchronized across multiple remotes (syncing firewall configuration comes to mind). This is just a rough draft of how this could work and I probably overlooked some edge cases. I'm happy about any input or alternative ideas!


Re: [pdm-devel] RFC: Synchronizing configuration changes across remotes
From: Thomas Lamprecht @ 2025-02-03 17:02 UTC
To: Proxmox Datacenter Manager development discussion, Stefan Hanreich

On 30.01.25 at 16:48, Stefan Hanreich wrote:
> During this operation it would make sense to make sure that there are no pending changes in the SDN configuration, so users do not accidentally apply unrelated changes via PDM. We also need to prevent any concurrent SDN configuration changes for the same reason - so we don't apply any unrelated configuration.
>
> The question is: Do we also want to be able to prevent concurrent changes across multiple remotes, or are we fine with only preventing concurrent changes on a per-remote level? With network configuration affecting more than one remote, I think it would be better to synchronize changes across remotes, since applying the configuration to only one remote often doesn't make sense, and a failure to apply the configuration could affect the other remotes.

To answer yes to your specific question and really mean it (as in: actually safe, generically reliable, and maybe even atomic) would mean that we need to add a cluster layer with an algorithm that actually guarantees consensus across all remotes, i.e. Paxos, like corosync uses, or Raft, or something like that (depending on the exact properties wanted). All of these are very costly and do not scale well at all, so interpreting your question rather narrowly I'd have to answer with a strong no; requiring that would severely limit PDM in its usefulness and, IME, bring major complexity and headaches along with it. And since rolling out the network to FRR and whatnot else has tons of side effects, doing all this consensus work would quite definitely be rather useless anyway.

But, squinting a bit more and interpreting the question not as adding ways to do things in guaranteed lockstep through an FSM distributed over all remotes, but rather as adding some way to ensure one can do various edits without others being able to make any modifications during that sequence, it should be doable. Especially if we transparently shift the responsibility for cleaning things up, if anything goes wrong, to the user (with some methods to empower them to do so, documentation- and tooling-wise).

> I currently gravitate towards the lock-based approach for the following reasons:

Yeah, a digest is not giving you anything here, at least for anything that consists of more than one change; and adding a dedicated central API endpoint for every variant of batch update we might need seems neither scalable nor like good API design.

> * It enables us to synchronize changes across multiple remotes, as opposed to a digest-based approach.
> * It's a lot more ergonomic for developers, since you simply acquire/release the lock. With a digest-based approach, modifications that require multiple API calls need to acquire a new digest every time and track it across multiple API calls. With SDN specifically, we need to provide and check the digest when applying the configuration as well.
> * It is simply easier to prevent concurrent changes in the first place rather than reacting to them.
> If they cannot occur, then rolling back is easier and less error-prone, since the developer can assume that nothing changed in the previously handled remotes either.
>
> The downsides of this approach that I can see:
> * It requires sweeping changes to basically the whole SDN API, and keeping backwards compatibility is harder.

Does it really require sweeping changes? I'd think modifications are already hedging against concurrent access now, so this should not mean we change to a completely new edit paradigm here.

My thought when we talked was to go roughly for: add a new endpoint that 1) ensures basic healthiness and 2) registers a lock for the whole, or potentially only some parts, of the SDN stack. This should work by returning a random lock-cookie string to be used by subsequent calls to do various updates in one go, while ensuring nothing else can do so or steal our lock. Then check this lock centrally on any write-config and we should be basically done, I think?

A slightly more elaborate variant might be to also split the edit step, i.e.:

1. check all remotes and get the lock
2. extend the config(s) with a section (or a separate ".new" config) for pending changes, and write all new changes to that
3. commit the pending sections or the .new config file

With that you would have the smallest possibility of failure due to unrelated node/connection hiccups, and you reduce the time gap for actually activating the changes. If something is off, an admin could even apply these manually, directly on the cluster/nodes.
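
To make that a bit more tangible, here is a very rough sketch of how the lock-cookie plus pending-section idea could look on the remote side. It is written in Rust purely for illustration, with invented names and an in-memory stand-in for the config files; the real thing would live in the PVE SDN code and persist both the lock and the pending section on disk:

// Abstract sketch of the lock-cookie + pending (".new") section idea; all names
// are invented for illustration, this is not the existing PVE SDN implementation.
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Default)]
struct SdnConfig {
    active: HashMap<String, String>,  // committed sections
    pending: HashMap<String, String>, // staged (".new") sections, not yet applied
    lock: Option<String>,             // currently held lock-cookie, if any
}

impl SdnConfig {
    /// 1) Basic healthiness/pending check, then hand out a lock-cookie.
    fn acquire_lock(&mut self) -> Result<String, String> {
        if self.lock.is_some() {
            return Err("SDN configuration is already locked".into());
        }
        if !self.pending.is_empty() {
            return Err("there are pending SDN changes".into());
        }
        // Not actually random - a real implementation would use a proper RNG.
        let nanos = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_nanos();
        let cookie = format!("{nanos:x}");
        self.lock = Some(cookie.clone());
        Ok(cookie)
    }

    fn check_lock(&self, cookie: &str) -> Result<(), String> {
        match &self.lock {
            Some(held) if held.as_str() == cookie => Ok(()),
            Some(_) => Err("SDN configuration is locked by someone else".into()),
            None => Err("SDN configuration is not locked".into()),
        }
    }

    /// 2) Writes only touch the pending section and require the cookie.
    fn stage_change(&mut self, cookie: &str, section: &str, value: &str) -> Result<(), String> {
        self.check_lock(cookie)?;
        self.pending.insert(section.to_string(), value.to_string());
        Ok(())
    }

    /// 3) Commit the pending section and release the lock in one go.
    fn commit(&mut self, cookie: &str) -> Result<(), String> {
        self.check_lock(cookie)?;
        let staged: Vec<(String, String)> = self.pending.drain().collect();
        self.active.extend(staged);
        self.lock = None;
        Ok(())
    }
}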

> * Also, many API endpoints in PVE already provide the digest functionality, so it would be a lot easier to retro-fit that for use with PDM, possibly requiring no changes at all.

Digests should be able to co-exist: if the config is unlocked and the digest is the same, then the edit is generally safe.

> * In case of failures on the PDM side it is harder to recover, since it requires manual intervention (removing the lock manually).

Well, a partially rolled out SDN update might always be (relatively) hard to recover from; which approach would avoid that (and not require Paxos- or Raft-level guarantees)?

> For single configuration files the digest-based approach could work quite well in cases where we don't need to synchronize changes across multiple remotes. But for SDN the digest-based approach is a bit more complicated: we currently generate digests for each section in the configuration file, instead of for the configuration file as a whole. This would be relatively easy to add, though. The second problem is that the configuration is split across multiple files, so we'd need to either look at all digests of all configuration files in all API calls or check a 'global' SDN configuration digest on every call. Again, certainly solvable, but it also requires some work.

FWIW, we already have pmxcfs-backed domain locks, which I added for the HA stack back in the day. These make it relatively cheap to take a lock that only one pmxcfs instance (i.e., one node) at a time can hold. Pair that with some local lock (e.g., a flock; in single-process, many-threads Rust land it could be an even cheaper mutex) and you can quite simply and not too expensively lock edits - and I'd figure SDN modifications don't happen at such a high frequency that the performance of such locking would become a problem.

> Since even with our best efforts we will run into situations where the lock doesn't get properly released, a simple escape hatch to unlock the SDN config should be provided (like qm unlock). One such scenario would be PDM losing connectivity to one of the remotes while holding the lock; there's not really anything we can do about that.

With the variant that allows committing a change separately, the lock could be released along with that commit (only if there is no error, naturally). That should avoid most problematic situations where an admin cannot be sure what to do, as otherwise the new config, or pending section, would still exist and can help to interpret the status quo of the network and the best course of action.

> Since we will probably need some form of this for other configuration files as well, I wanted to ask for your input. I think this concept could be applied generally to configuration changes that need to be made synchronized across multiple remotes (syncing firewall configuration comes to mind). This is just a rough draft of how this could work and I probably overlooked some edge cases. I'm happy about any input or alternative ideas!

There are further details to flesh out here, but in any case I think we should really focus on SDN and avoid an overly generic solution, tailoring it to the specific SDN use case(s) at hand instead. The firewall can IMO be placed under the SDN umbrella and might be fine using similar, maybe even exactly the same, mechanics; but I would not concentrate on building something generic here now. If we can re-use it later, great, but SDN is quite specific: not a lot of things depend on rolling out changes in (best-effort) lockstep to ensure the end result is actually a functioning thing. Meaning, most other things should not require such inter-remote synchronization building blocks in the first place; we need to make sure we don't reach for that hammer too often. As I recently replied to Dominik's bulk-action mail: PDM should be minimal in the state and configs it manages itself for any remote-management feature, else things will get complex and coupled way too fast.

Re: [pdm-devel] RFC: Synchronizing configuration changes across remotes
From: Stefan Hanreich @ 2025-02-04 10:34 UTC
To: Thomas Lamprecht, Proxmox Datacenter Manager development discussion

On 2/3/25 18:02, Thomas Lamprecht wrote:
> Yeah, a digest is not giving you anything here, at least for anything that consists of more than one change; and adding a dedicated central API endpoint for every variant of batch update we might need seems neither scalable nor like good API design.

Yes, although I had considered adding an endpoint for getting / setting the whole SDN configuration at some point, I scrapped that since it's unnecessary for what I'm currently implementing (adding single zones / vnets / ...).

> Does it really require sweeping changes? I'd think modifications are already hedging against concurrent access now, so this should not mean we change to a completely new edit paradigm here.

We'd at least have to touch every non-read request in SDN to check for the global lock - but yes, the wording is a bit overly dramatic. We already have an existing lock_sdn_config, so adding another layer of locking there shouldn't be an issue. If I decide to go for the .new config route described below, this will be a bit more involved, though.

> My thought when we talked was to go roughly for: add a new endpoint that 1) ensures basic healthiness and 2) registers a lock for the whole, or potentially only some parts, of the SDN stack. This should work by returning a random lock-cookie string to be used by subsequent calls to do various updates in one go, while ensuring nothing else can do so or steal our lock. Then check this lock centrally on any write-config and we should be basically done, I think?

That is basically what I envisioned as the implementation for the lock, too.

> A slightly more elaborate variant might be to also split the edit step, i.e.:
>
> 1. check all remotes and get the lock
> 2. extend the config(s) with a section (or a separate ".new" config) for pending changes, and write all new changes to that
> 3. commit the pending sections or the .new config file
>
> With that you would have the smallest possibility of failure due to unrelated node/connection hiccups, and you reduce the time gap for actually activating the changes. If something is off, an admin could even apply these manually, directly on the cluster/nodes.

This sounds like an even better idea; I'll look into how I could implement that. As a first step, I think I'll simply go for the lock-cookie approach, since we can always implement this more elaborate approach on top of it.

>> * In case of failures on the PDM side it is harder to recover, since it requires manual intervention (removing the lock manually).
>
> Well, a partially rolled out SDN update might always be (relatively) hard to recover from; which approach would avoid that (and not require Paxos- or Raft-level guarantees)?

One idea that came to my mind was an automatic rollback after a timeout if some health check on the PVE side fails, similar to changing the resolution in a graphics driver.
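
Roughly what I have in mind, as an illustrative sketch only (the health check, the config snapshot handling and all names are placeholders, nothing like this exists yet):

// Sketch of the "confirm within a timeout or roll back" idea; everything here
// is a placeholder for illustration, not existing PVE/PDM code.
use std::thread::sleep;
use std::time::{Duration, Instant};

struct AppliedConfig {
    previous: String, // snapshot of the old configuration, restored on rollback
}

// Stand-in for whatever health check makes sense (e.g. PDM reconfirming the change).
fn change_confirmed() -> bool {
    false
}

fn apply_config(_new_config: &str) -> AppliedConfig {
    AppliedConfig { previous: String::from("<old config snapshot>") }
}

fn rollback(applied: &AppliedConfig) {
    println!("rolling back to: {}", applied.previous);
}

/// Apply a new config; if it is not confirmed before the timeout, roll back.
fn apply_with_watchdog(new_config: &str, timeout: Duration) {
    let applied = apply_config(new_config);
    let deadline = Instant::now() + timeout;

    while Instant::now() < deadline {
        if change_confirmed() {
            return; // keep the new configuration
        }
        sleep(Duration::from_secs(1));
    }
    rollback(&applied); // never confirmed in time
}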

> FWIW, we already have pmxcfs-backed domain locks, which I added for the HA stack back in the day. These make it relatively cheap to take a lock that only one pmxcfs instance (i.e., one node) at a time can hold. Pair that with some local lock (e.g., a flock; in single-process, many-threads Rust land it could be an even cheaper mutex) and you can quite simply and not too expensively lock edits - and I'd figure SDN modifications don't happen at such a high frequency that the performance of such locking would become a problem.

I'll look into those - thanks for the pointer.