Re: [RFC datacenter-manager/proxmox 0/7] inject application context via rpcenv for easier integration testing

all lists on lists.proxmox.com
 help / color / mirror / Atom feed

From: Robert Obkircher <r.obkircher@proxmox.com>
To: pdm-devel@lists.proxmox.com
Subject: Re: [RFC datacenter-manager/proxmox 0/7] inject application context via rpcenv for easier integration testing
Date: Tue, 3 Feb 2026 12:02:23 +0100	[thread overview]
Message-ID: <f760f217-41a5-4773-99b2-fe2f1967e193@proxmox.com> (raw)
In-Reply-To: <20260129134418.307552-1-l.wagner@proxmox.com>


On 1/29/26 14:43, Lukas Wagner wrote:
> TL;DR: Inject essential runtime config, client factory, etc. as an application
> context object and allow to retrieve this object easily in an API handler via
> rpcenv. This allows us directly call API handler implementations from
> integration tests. The aim is to make it easier to write good tests on a larger
> level.
>
> ## Introduction
>
> Considering the different stages of automated software testing, unit testing
> (test small software components in isolation) integration testing (test
> multiple components to together, specifically their interactions) and
> end-to-end testing (test the entire application in a context as close to
> production as possible), I've found that the middle one, integration testing is
> a very challenging one in our stack.
>
> I've found that the challenges with integration testing mostly stem from the
> following:
>
>    - "hardcoded" (as in, determined by some constant or literally
>      hard-coded) assumptions about storage paths (config, state, caches) and
>      users/permissions. This makes it challenging to call into the component
>      from a test running as normal user, e.g. from a regular `cargo test`.
>  
>    - use of global/static instances (examples: Worker task context,
>     proxmox-product-config, client factory in PDM...) in application code and
>     shared crates. While this can be okay for things that are truly global (e.g.
>     logging), it often hinders testing and  especially test isolation due to hidden
>     dependencies between test cases and some potential internal state of the global
>     instance. This is one of the common causes of flaky tests. Also, the setup of
>     these global instances in test cases is always a bit awkward, since the order
>     of test execution is not defined (and might actually run in parallel), so some
>     kind of synchronization between the test cases is necessary. Furthermore,
>     certain test cases might require a *different* setup for these global instances
>     (example: client factory that should produce a different kind of mocked PVE
>     client); but since we usually put these in OnceLocks, one has to put these
>     tests into a separate test binary.
>
>   - tight coupling between major subsystems (e.g. one part calling into the
>     other without any clear boundary or abstraction via callbacks or traits)
>
>
> With regards to PDM, there are a couple of things that need considering when testing:
>   - Filesystem access to config, state and caches
>   - Interactions with remotes via their API
>
> I feel like if we find a sensible way to abstract these for tests, we can
> resonably cover a huge amount of the code in the backend.
>
> ## Implementation
>
> The core idea I want to discuss in this RFC is to provide a context/application
> object ('struct PdmApplication') and "injecting" it into out API handlers via
> the already existing rpcenv variable.
>
> This new context object would give access to:
>   - base paths for caches, config, state
>   - file permissions, user/group
>   - client factory
>   - depending on what we need, potentially other things
>
> So, for instance, instead of calling into the `connection` module to create a
> PVE client, one could retrieve the client factory instance from the application
> object and create the application from that:
>
> So the following
>
>   let client = connection::make_pve_client(...)
>
> becomes
>
>   // app is of type `PdmApplication`
>   let client = app.client_factory().make_pve_client(...)
>
> And
>
>   let config_path = configdir!("foo.cfg")
>
> becomes
>
>   let config_path = app.config_path().join("foo.cfg")
>
> NOTE: The exact structure of this object and what kind of methods it provides
> is not fixed yet.
>
>
> ## Where does `PdmApplication` come from?
>
> For the regular API daemon, the PdmApplication context object is set up during
> the daemon startup routine. 
>
> A handle to it is then associated with the REST server and stored in the
> RpcEnvironment that is passed to the API handlers. In it's current form this
> entails some ugly `dyn Any` hacks, but I think it's not too bad. Any API
> handler can retrieve the application object from rpcenv, e.g. like this:
>
>     // API handler implementation
>     async fn apt_get_changelog(
>         remote: String,
>         node: String,
>         options: APTGetChangelogOptions,
>         rpcenv: &mut dyn RpcEnvironment,
>     ) -> Result<String, Error> {
>         // `get_application` simply does the downcast from `Any` to the correct type.
>         // RpcEnvironment has to hold it as `Any` since the `PdmApplication` type is 
>         // of course specific to the application itself.
>         // 
>         // NOTE: app is of type Arc<PdmApplication>
>         let app = context::get_application(rpcenv);
>
>         // For configs that might to be mocked for tests, the app object could
>         // also provide some form of accessor:
>         let (config, _digest) = app.remote_config().config()?;
>         let remote = get_remote(&config, &remote)?;
>
>         remote_updates::get_changelog(&app, remote, &node, options.name).await
>     }
>
> This application handle would need to be propagated down the call stack, either
> as a &PdmApplication or a Arc<PdmApplication>:
>
>     // actual implementation of the functionality somewhere else in the codebase
>     pub async fn get_changelog(
>         app: &PdmApplication,
>         remote: &Remote,
>         node: &str,
>         package: String,
>     ) -> Result<String, Error> {
>         ...
>         let client = app.client_factory().make_pve_client(remote)?;
>         ...
>     }
>
>
> Code that is executed independently from API handlers, e.g. periodic tasks,
> would receive their application handle during setup (as a clone of the
> 'original' Arc<PdmApplication> that was injected into the REST server).
>
> This general pattern of injecting application state/context via a parameter
> to the handler is something that is also common in other Rust-based
> web framworks, e.g. [actix-web] and [axum].
>
>
> ## Interaction with shared crates
>
> Quite a few of our shared crates in proxmox.git have some form of static/global
> context that has to be initialized, for instance:
>
>   - proxmox-notify
>   - proxmox-rest-server (worker task setup)
>   - proxmox-product-config
>   - proxmox-access-control
>
> I think long-term, if possible, we should try to avoid having static context in
> crates completely. For instance, at some point I'll probably refactor
> proxmox-notify so that the context is always passed along with any call to the
> crate, instead of setting it up statically. As a stop-gap, this context can
> still be static in the *application*, but with everything else I mentioned in
> this RFC so far, I think that it should in some form be derived from the
> general PdmApplication context object.
>
> One thing that could work is implementing a crate-specific context trait for the
> application object itself and then passing the application itself:
>
>     impl proxmox_notify::Context for PdmApplication {
>        ...
>     }
>
>     proxmox_notify::send(&app, &notification)?;
>
> Or, alternatively, the `app` object would be used to construct some kind of other
> context object for the crate:
>
>     struct PdmNotifyContext {
>         ...
>     }
>
>     impl proxmox_notify::Context for PdmNotifyContext {
>        ...
>     }
>
>     impl PdmNotifyContext {
>         fn from_app(app: &PdmApplication) {
>            ...
>         }
>     }
>
>     let notify_context = PdmNotifyContext::from_app(app);
>     proxmox_notify::send(&notify_context, &notification)?;
>
>
> I have not yet written any code for these shared crate setups, but I think the
> general pattern *should* work.
>
> What I'm not so sure about yet and where I have not done any experiments is the
> case where the entire API handler is defined in a shared crate. These of course
> cannot access the product-specific struct. Many of these rely on some form of
> static initialization, either directly or indirectly (e.g.
> proxmox-product-config).
>
> I'd be happy about any ideas for these cases.

I'm not very familiar with any of this, but have you considered making
dependency injection more flexible?

If API handlers can choose what they want (e.g. with a InjectedRef<T>
parameters) they never need to know about an application specific type
like PdmApplication.

Tests would then also now exactly which dependencies to construct and
they could just pass them by reference.

As far as I can tell, Actix [1] also seems to support multiple types
for app_data and Axum has a slightly different approach with the
FromRef trait used for substates [2].

[1]
https://docs.rs/actix-web/latest/actix_web/struct.Scope.html#method.app_data
[2] https://docs.rs/axum/latest/axum/extract/struct.State.html#substates

>
> Personally I'm not a big fan of defining the entire API handlers in the shared
> crate, I rather prefer having the handler definitions in the application code
> and having this handler call *into* the shared crate - I think this gives more
> flexibility in case there need to be some application-specific changes (e.g.
> additional parameters, different types, etc.). Also in this case it would be
> easy to pass along an additional context object along with the call.
>
> ## Summary
>
> Pros:
>   - We can reasonably test subsystems in isolation with the comfort of a simple `cargo test`
>   - I think this RFC shows that it's definitely possible to adapt to this new pattern
>     gradually without having to refactor huge amounts of existing code at once.
>
> Risks:
>   - Increased complexity
>     - Having to pass some kind of application handle down the callstack
>       could be considered quite invasive.
>     - Having to deal with Box'd/Arc'd trait objects (e.g. the client factory)
>       can be a bit annoying
>     - The application context object 
>
>   - While this approach does not require us to commit fully right from the start, it could
>     be quite awkward if we have "the old approach" and the "new approach" in parallel.
>     So it would be important that we agree on the sensibility of this approach and then
>     consequently use it for new features and then over time gradually introduce it
>     into the existing code.
>
>
> The patches submitted in this patch series are of course still work in
> progress; their main aim is to demonstrate that viability of the concept.
>
> I'd be very happy to get some feedback on the general idea and of course also
> the PoC implementation (but don't get too caught up in naming, missing doc
> comments, etc - as I said, everything is still work in progress).
>
> ## References:
>
> [actix-web]: https://actix.rs/docs/application/#state
> [axum]: https://docs.rs/axum/latest/axum/#sharing-state-with-handlers
>
>
> proxmox:
>
> Lukas Wagner (1):
>   router: rpc environment: allow to provide a application-specific
>     context handle via rpcenv
>
>  proxmox-rest-server/src/api_config.rs  | 10 ++++++++++
>  proxmox-rest-server/src/environment.rs |  6 +++++-
>  proxmox-router/src/cli/environment.rs  |  6 ++++++
>  proxmox-router/src/rpc_environment.rs  |  5 ++++-
>  4 files changed, 25 insertions(+), 2 deletions(-)
>
>
> proxmox-datacenter-manager:
>
> Lukas Wagner (6):
>   connection: store client factory in an Arc and add public getter
>   parallel fetcher: allow to use custom client factory
>   introduce PdmApplication struct and inject it during API server
>     startup
>   remote updates: use PdmApplication object to derive paths, permissions
>     and client factory
>   tests: add captured responses for integration tests
>   tests: add basic integration tests for the remote updates API
>
>  server/src/api/pve/mod.rs                     |  11 +-
>  server/src/api/remote_updates.rs              |  54 +-
>  server/src/bin/proxmox-datacenter-api/main.rs |   9 +-
>  .../tasks/remote_tasks.rs                     |   2 +
>  .../tasks/remote_updates.rs                   |  22 +-
>  .../bin/proxmox-datacenter-privileged-api.rs  |   7 +-
>  server/src/connection.rs                      |  29 +-
>  server/src/context.rs                         |  74 +-
>  server/src/lib.rs                             |   3 +-
>  .../src/metric_collection/collection_task.rs  |   2 +-
>  server/src/parallel_fetcher.rs                |  24 +-
>  server/src/remote_updates.rs                  |  90 ++-
>  server/src/test_support/mod.rs                |   3 +-
>  server/tests/api_responses/pve/apt_repos.json | 469 +++++++++++
>  .../tests/api_responses/pve/apt_update.json   | 638 +++++++++++++++
>  .../tests/api_responses/pve/apt_versions.json | 736 ++++++++++++++++++
>  .../api_responses/pve/node_subscription.json  |   6 +
>  server/tests/test_remote_updates.rs           | 288 +++++++
>  18 files changed, 2394 insertions(+), 73 deletions(-)
>  create mode 100644 server/tests/api_responses/pve/apt_repos.json
>  create mode 100644 server/tests/api_responses/pve/apt_update.json
>  create mode 100644 server/tests/api_responses/pve/apt_versions.json
>  create mode 100644 server/tests/api_responses/pve/node_subscription.json
>  create mode 100644 server/tests/test_remote_updates.rs
>
>
> Summary over all repositories:
>   22 files changed, 2419 insertions(+), 75 deletions(-)
>

     prev parent reply	other threads:[~2026-02-03 11:02 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-29 13:44 Lukas Wagner
2026-01-29 13:44 ` [RFC proxmox 1/1] router: rpc environment: allow to provide a application-specific context handle via rpcenv Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 1/6] connection: store client factory in an Arc and add public getter Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 2/6] parallel fetcher: allow to use custom client factory Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 3/6] introduce PdmApplication struct and inject it during API server startup Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 4/6] remote updates: use PdmApplication object to derive paths, permissions and client factory Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 5/6] tests: add captured responses for integration tests Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 6/6] tests: add basic integration tests for the remote updates API Lukas Wagner
2026-02-03 11:02 ` Robert Obkircher [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f760f217-41a5-4773-99b2-fe2f1967e193@proxmox.com \
    --to=r.obkircher@proxmox.com \
    --cc=pdm-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.

Service provided by Proxmox Server Solutions GmbH | Privacy | Legal