public inbox for pdm-devel@lists.proxmox.com
* [pdm-devel] RFC: Synchronizing configuration changes across remotes
@ 2025-01-30 15:48 Stefan Hanreich
  2025-02-03 17:02 ` Thomas Lamprecht
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Hanreich @ 2025-01-30 15:48 UTC (permalink / raw)
  To: pdm-devel

I'm currently working on the SDN integration and for that I need a way
to deploy SDN configuration changes to multiple remotes
simultaneously.

In general I will need to do the following:

* Create / Update / Delete some parts of the SDN configuration of
multiple remotes, preferably synchronized across the remotes.
* Apply the new SDN configuration (possibly opt-in) for all/some nodes
in multiple remotes

During this operation it would make sense to ensure that there are
no pending changes in the SDN configuration, so users do not
accidentally apply unrelated changes via PDM. We also need to prevent
any concurrent SDN configuration changes for the same reason - so we
don't apply any unrelated configuration.

The question is: Do we also want to be able to prevent concurrent
changes across multiple remotes, or are we fine with only preventing
concurrent changes on a remote level? With network configuration
affecting more than one remote, I think it would be better to
synchronize changes across remotes since oftentimes applying the
configuration to only one remote doesn't really make sense and the
failure to apply configuration could affect the other remote.

The two options I see, depending on the answer to that question:
* introducing some form of lock that prevents any changes to the SDN
configuration from other sources
* do something based on the current digest functionality

The general process for making changes to the SDN configuration would
look as follows with the lock-based approach:
* check for pending changes and, if there are none, lock the SDN
configuration (atomically, in one API call)
* make the changes to the SDN configuration
* apply the SDN configuration changes
* release the lock
* in case of errors, roll back the configuration changes and then
release all locks


I currently gravitate towards the lock-based approach due to the
following reasons:
* It enables us to synchronize changes across multiple remotes - as
compared to a digest based approach.
* It's a lot more ergonomic for developers, since you simply
acquire/release the lock. With a digest-based approach, modifications
that require multiple API calls need to acquire a new digest
every time and track it across multiple API calls. With SDN specifically,
when applying the configuration, we need to provide and check the digest
as well.
* It is just easier to prevent concurrent changes in the first place
rather than reacting to them. If they cannot occur, rolling back
is easier and less error-prone, since the developer can assume nothing
changed in the previously handled remotes either.

The downsides of this approach I can see:
* It requires sweeping changes to basically the whole SDN API, and
keeping backwards compatibility is harder.
* Many API endpoints in PVE already provide the digest
functionality, so it would be a lot easier to retrofit that for use
with PDM, possibly requiring no changes at all.
* In case of failures on the PDM side it is harder to recover, since
it requires manual intervention (removing the lock manually).

For single configuration files the digest-based approach could work
quite well in cases where we don't need to synchronize changes across
multiple remotes. But for SDN the digest-based approach is a bit more
complicated: We currently generate digests for each section in the
configuration file, instead of for the configuration file as a whole.
This would be relatively easy to add though. The second problem is
that the configuration is split across multiple files, so we'd need to
either look at all digests of all configuration files in all API calls
or check a 'global' SDN configuration digest on every call. Again,
certainly solvable but also requires some work.

Since even with our best effort we will run into situations where the
lock doesn't get properly released, a simple escape hatch to unlock
the SDN config should be provided (like qm unlock). One such scenario
would be PDM losing connectivity to one of the remotes while holding
the lock; there's not really anything we can do there.


Since we will probably need some form of this for other
configuration files as well, I wanted to ask for your input. I think
this concept could be applied generally to configuration changes that
need to be synchronized across multiple remotes (syncing firewall
configuration comes to mind). This is just a rough draft of how this
could work and I probably overlooked some edge cases. I'm happy for any
input or alternative ideas!


_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel



* Re: [pdm-devel] RFC: Synchronizing configuration changes across remotes
  2025-01-30 15:48 [pdm-devel] RFC: Synchronizing configuration changes across remotes Stefan Hanreich
@ 2025-02-03 17:02 ` Thomas Lamprecht
  2025-02-04 10:34   ` Stefan Hanreich
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Lamprecht @ 2025-02-03 17:02 UTC (permalink / raw)
  To: Proxmox Datacenter Manager development discussion, Stefan Hanreich

Am 30.01.25 um 16:48 schrieb Stefan Hanreich:
> During this operation it would make sense to make sure that there are
> no pending changes in the SDN configuration, so users do not
> accidentally apply unrelated changes via PDM. We also need to prevent
> any concurrent SDN configuration changes for the same reason - so we
> don't apply any unrelated configuration.
> 
> The question is: Do we also want to be able to prevent concurrent
> changes across multiple remotes, or are we fine with only preventing
> concurrent changes on a remote level? With network configuration
> affecting more than one remote, I think it would be better to
> synchronize changes across remotes since oftentimes applying the
> configuration to only one remote doesn't really make sense and the
> failure to apply configuration could affect the other remote.

To answer yes to your specific question and really mean it (as in:
actually safe, generically reliable, and maybe even atomic) would mean
that we need to add a cluster layer with an algorithm that actually
ensures consensus over all remotes, i.e., Paxos, like corosync uses, or
Raft, or something like that (depending on the exact properties wanted).
All of these are very costly and do not scale well at all, so
interpreting your question rather narrowly I'd have to answer with a
strong no; requiring that would severely limit PDM in its usefulness and
IME bring major complexity and headaches along with it.  And as rolling
out the network to FRR and whatnot else has tons of side effects, doing
all this consensus work would quite definitively be rather useless
anyway.

But, squinting a bit more and interpreting the question not to mean
that we should add ways of doing things in guaranteed lockstep through
an FSM distributed over all remotes, but rather that one adds some way
to ensure one can do various edits without others being able to make any
modifications during that sequence, it should be doable.
Especially if we transparently shift the responsibility for cleaning
things up if anything goes wrong to the user (with some methods to
empower them to do so, documentation- and tooling-wise).

> I currently gravitate towards the lock-based approach due to the
> following reasons:

Yeah, digest is not giving you anything here, at least for anything that
consists of more than one change; and adding a dedicated central API
endpoint for every variant of batch update we might need seems neither
scalable nor like good API design.

> * It enables us to synchronize changes across multiple remotes - as
> compared to a digest based approach.
> * It's a lot more ergonomic for developers, since you simply
> acquire/release the lock. With a digest-based approach, modifications
> that require multiple API calls need to acquire a new digest
> every time and track it across multiple API calls. With SDN specifically,
> when applying the configuration, we need to provide and check the digest
> as well.
> * It is just easier to prevent concurrent changes in the first place
> rather than reacting to them. If they cannot occur, rolling back
> is easier and less error-prone, since the developer can assume nothing
> changed in the previously handled remotes either.
> 
> The downsides of this approach I can see:
> * It requires sweeping changes to basically the whole SDN API, and
> keeping backwards compatibility is harder.

Does it really require sweeping changes? I'd think modifications are
already hedging against concurrent access now, so this should not mean
we change to a completely new edit paradigm here.

My thought when we talked was to go roughly for:
Add a new endpoint that 1) ensures basic healthiness and 2) registers a
lock for the whole, or potentially only some parts, of the SDN stack.
This should work by returning a random lock-cookie string to be used by
subsequent calls to do various updates in one go while ensuring nothing
else can do so or just steal our lock.  Then check this lock centrally
on any write-config and be basically done, I think?

A slightly more elaborate variant might be to also split the edit step,
i.e.
1. check all remotes and get lock
2. extend the config(s) with a section (or a separate ".new" config) for
   pending changes, write all new changes to that.
3. commit the pending sections or .new config file.

With that you would have the smallest possibility of failure due to
unrelated node/connection hiccups and reduce the time gap for actually
activating the changes. If something is off, an admin could even
manually apply these directly on the cluster/nodes.

> * Many API endpoints in PVE already provide the digest
> functionality, so it would be a lot easier to retrofit that for use
> with PDM, possibly requiring no changes at all.

Digest should be able to co-exist: if the config is unlocked and the
digest is the same, then the edit is generally safe.

> * In case of failures on the PDM side it is harder to recover, since
> it requires manual intervention (removing the lock manually).

Well, a partially rolled out SDN update might always be (relatively)
hard to recover from; which approach would avoid that (and not require
paxos, or raft level guarantees)?

> For single configuration files the digest-based approach could work
> quite well in cases where we don't need to synchronize changes across
> multiple remotes. But for SDN the digest-based approach is a bit more
> complicated: We currently generate digests for each section in the
> configuration file, instead of for the configuration file as a whole.
> This would be relatively easy to add though. The second problem is
> that the configuration is split across multiple files, so we'd need to
> either look at all digests of all configuration files in all API calls
> or check a 'global' SDN configuration digest on every call. Again,
> certainly solvable but also requires some work.

FWIW, we already got pmxcfs-backed domain locks, which I added for the
HA stack back in the day. These allow one to relatively cheaply take a
lock that only one pmxcfs instance (i.e., one node) at a time can hold.
Pair that with some local lock (e.g., flock; in single-process,
many-threads Rust land it could be an even cheaper mutex) and you can
quite simply and not too expensively lock edits – and I'd figure SDN
modifications do not have _that_ high a frequency for the performance
of such locking to become a problem.

> Since even with our best effort we will run into situations where the
> lock doesn't get properly released, a simple escape hatch to unlock
> the SDN config should be provided (like qm unlock). One such scenario
> would be PDM losing connectivity to one of the remotes while holding
> the lock; there's not really anything we can do there.

With the variant that allows separately committing a change, the lock
could be released along with that (only if there is no error, naturally);
that should avoid most problematic situations where an admin cannot be
sure what to do. Otherwise the new config, or pending section, would
still exist and could help to interpret the status quo of the network
and the best course of action.
 
> Since we probably need some form of doing this with other
> configuration files as well, I wanted to ask for your input. I think
> this concept could be applied generally to configuration changes that
> need to be made synchronized across multiple remotes (syncing firewall
> configuration comes to mind). This is just a rough draft on how this
> could work and I probably oversaw some edge-cases. I'm happy for any
> input or alternative ideas!

There are further details to flesh out here, but in any way I think that
we really should focus on SDN here and avoid some overly generic
solution, rather tailoring it to the specific SDN use case(s) at hand.
The firewall can IMO be placed under the SDN umbrella and might be fine
to use similar mechanics, maybe even exactly the same, but I would not
concentrate on building something generic here now; if we can re-use
that then great, but SDN is quite specific: not a lot of things depend
on rolling out changes in (best-effort) lockstep to ensure the end
result is actually a functioning thing. Meaning, most other things
should not require any such inter-remote synchronization building blocks
in the first place; we need to ensure we do not use that hammer too
often. As recently replied to Dominik's bulk-action mail: the PDM should
be minimal in the state and configs it manages itself for some
remote-management feature, else things will get complex and coupled way
too fast.




* Re: [pdm-devel] RFC: Synchronizing configuration changes across remotes
  2025-02-03 17:02 ` Thomas Lamprecht
@ 2025-02-04 10:34   ` Stefan Hanreich
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Hanreich @ 2025-02-04 10:34 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Datacenter Manager development discussion

On 2/3/25 18:02, Thomas Lamprecht wrote:
> Yeah, digest is not giving you anything here, at least for anything that
> consists of more than one change; and adding a dedicated central API
> endpoint for every variant of batch update we might need seems neither
> scalable nor like good API design.

Yes, although I had considered adding an endpoint for getting / setting
the whole SDN configuration at some point, I scrapped that since it's
unnecessary for what I'm currently implementing (adding single
zones / vnets / ...).

> Does it really require sweeping changes? I'd think modifications are
> already hedging against concurrent access now, so this should not mean
> we change to a completely new edit paradigm here.

We'd at least have to touch every non-read request in SDN to check for
the global lock - but yes, the wording is a bit overly dramatic. We
already have an existing lock_sdn_config, so adding another layer of
locking there shouldn't be an issue. If I decide to go for the .new
config route described below, this will be a bit more involved though.

> My thought when we talked was to go roughly for:
> Add a new endpoint that 1) ensures basic healthiness and 2) registers a
> lock for the whole, or potentially only some parts, of the SDN stack.
> This should work by returning a random lock-cookie string to be used by
> subsequent calls to do various updates in one go while ensuring nothing
> else can do so or just steal our lock.  Then check this lock centrally
> on any write-config and be basically done, I think?

That was basically what I envisioned as the implementation for the
lock too.

> A slightly more elaborate variant might be to also split the edit step,
> i.e.
> 1. check all remotes and get lock
> 2. extend the config(s) with a section (or a separate ".new" config) for
>    pending changes, write all new changes to that.
> 3. commit the pending sections or .new config file.
> 
> With that you would have the smallest possibility of failure due to
> unrelated node/connection hiccups and reduce the time gap for actually
> activating the changes. If something is off, an admin could even
> manually apply these directly on the cluster/nodes.

This sounds like an even better idea, I'll look into how I could
implement that. As a first step, I think I'll simply go for the
lock-cookie approach, since we can always implement this more elaborate
approach on top of that.

>> * In case of failures on the PDM side it is harder to recover, since
>> it requires manual intervention (removing the lock manually).
> 
> Well, a partially rolled out SDN update might always be (relatively)
> hard to recover from; which approach would avoid that (and not require
> paxos, or raft level guarantees)?

One idea that came to my mind was automatic rollback after a timeout if
some health check on the PVE side fails, similar to when you change
resolution in a graphics driver.

> FWIW, we already got pmxcfs-backed domain locks, which I added for the
> HA stack back in the day. These allow one to relatively cheaply take a
> lock that only one pmxcfs instance (i.e., one node) at a time can hold.
> Pair that with some local lock (e.g., flock; in single-process,
> many-threads Rust land it could be an even cheaper mutex) and you can
> quite simply and not too expensively lock edits – and I'd figure SDN
> modifications do not have _that_ high a frequency for the performance
> of such locking to become a problem.

I'll look into those - thanks for the pointer.



