public inbox for pdm-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [RFC datacenter-manager/proxmox 0/7] inject application context via rpcenv for easier integration testing
@ 2026-01-29 13:44 Lukas Wagner
  2026-01-29 13:44 ` [RFC proxmox 1/1] router: rpc environment: allow to provide a application-specific context handle via rpcenv Lukas Wagner
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Lukas Wagner @ 2026-01-29 13:44 UTC (permalink / raw)
  To: pdm-devel

TL;DR: Inject essential runtime config, client factory, etc. as an application
context object and allow to retrieve this object easily in an API handler via
rpcenv. This allows us directly call API handler implementations from
integration tests. The aim is to make it easier to write good tests on a larger
level.

## Introduction

Considering the different stages of automated software testing, unit testing
(test small software components in isolation) integration testing (test
multiple components to together, specifically their interactions) and
end-to-end testing (test the entire application in a context as close to
production as possible), I've found that the middle one, integration testing is
a very challenging one in our stack.

I've found that the challenges with integration testing mostly stem from the
following:

   - "hardcoded" (as in, determined by some constant or literally
     hard-coded) assumptions about storage paths (config, state, caches) and
     users/permissions. This makes it challenging to call into the component
     from a test running as normal user, e.g. from a regular `cargo test`.
 
   - use of global/static instances (examples: Worker task context,
    proxmox-product-config, client factory in PDM...) in application code and
    shared crates. While this can be okay for things that are truly global (e.g.
    logging), it often hinders testing and  especially test isolation due to hidden
    dependencies between test cases and some potential internal state of the global
    instance. This is one of the common causes of flaky tests. Also, the setup of
    these global instances in test cases is always a bit awkward, since the order
    of test execution is not defined (and might actually run in parallel), so some
    kind of synchronization between the test cases is necessary. Furthermore,
    certain test cases might require a *different* setup for these global instances
    (example: client factory that should produce a different kind of mocked PVE
    client); but since we usually put these in OnceLocks, one has to put these
    tests into a separate test binary.

  - tight coupling between major subsystems (e.g. one part calling into the
    other without any clear boundary or abstraction via callbacks or traits)


With regards to PDM, there are a couple of things that need considering when testing:
  - Filesystem access to config, state and caches
  - Interactions with remotes via their API

I feel like if we find a sensible way to abstract these for tests, we can
resonably cover a huge amount of the code in the backend.

## Implementation

The core idea I want to discuss in this RFC is to provide a context/application
object ('struct PdmApplication') and "injecting" it into out API handlers via
the already existing rpcenv variable.

This new context object would give access to:
  - base paths for caches, config, state
  - file permissions, user/group
  - client factory
  - depending on what we need, potentially other things

So, for instance, instead of calling into the `connection` module to create a
PVE client, one could retrieve the client factory instance from the application
object and create the application from that:

So the following

  let client = connection::make_pve_client(...)

becomes

  // app is of type `PdmApplication`
  let client = app.client_factory().make_pve_client(...)

And

  let config_path = configdir!("foo.cfg")

becomes

  let config_path = app.config_path().join("foo.cfg")

NOTE: The exact structure of this object and what kind of methods it provides
is not fixed yet.


## Where does `PdmApplication` come from?

For the regular API daemon, the PdmApplication context object is set up during
the daemon startup routine. 

A handle to it is then associated with the REST server and stored in the
RpcEnvironment that is passed to the API handlers. In it's current form this
entails some ugly `dyn Any` hacks, but I think it's not too bad. Any API
handler can retrieve the application object from rpcenv, e.g. like this:

    // API handler implementation
    async fn apt_get_changelog(
        remote: String,
        node: String,
        options: APTGetChangelogOptions,
        rpcenv: &mut dyn RpcEnvironment,
    ) -> Result<String, Error> {
        // `get_application` simply does the downcast from `Any` to the correct type.
        // RpcEnvironment has to hold it as `Any` since the `PdmApplication` type is 
        // of course specific to the application itself.
        // 
        // NOTE: app is of type Arc<PdmApplication>
        let app = context::get_application(rpcenv);

        // For configs that might to be mocked for tests, the app object could
        // also provide some form of accessor:
        let (config, _digest) = app.remote_config().config()?;
        let remote = get_remote(&config, &remote)?;

        remote_updates::get_changelog(&app, remote, &node, options.name).await
    }

This application handle would need to be propagated down the call stack, either
as a &PdmApplication or a Arc<PdmApplication>:

    // actual implementation of the functionality somewhere else in the codebase
    pub async fn get_changelog(
        app: &PdmApplication,
        remote: &Remote,
        node: &str,
        package: String,
    ) -> Result<String, Error> {
        ...
        let client = app.client_factory().make_pve_client(remote)?;
        ...
    }


Code that is executed independently from API handlers, e.g. periodic tasks,
would receive their application handle during setup (as a clone of the
'original' Arc<PdmApplication> that was injected into the REST server).

This general pattern of injecting application state/context via a parameter
to the handler is something that is also common in other Rust-based
web framworks, e.g. [actix-web] and [axum].


## Interaction with shared crates

Quite a few of our shared crates in proxmox.git have some form of static/global
context that has to be initialized, for instance:

  - proxmox-notify
  - proxmox-rest-server (worker task setup)
  - proxmox-product-config
  - proxmox-access-control

I think long-term, if possible, we should try to avoid having static context in
crates completely. For instance, at some point I'll probably refactor
proxmox-notify so that the context is always passed along with any call to the
crate, instead of setting it up statically. As a stop-gap, this context can
still be static in the *application*, but with everything else I mentioned in
this RFC so far, I think that it should in some form be derived from the
general PdmApplication context object.

One thing that could work is implementing a crate-specific context trait for the
application object itself and then passing the application itself:

    impl proxmox_notify::Context for PdmApplication {
       ...
    }

    proxmox_notify::send(&app, &notification)?;

Or, alternatively, the `app` object would be used to construct some kind of other
context object for the crate:

    struct PdmNotifyContext {
        ...
    }

    impl proxmox_notify::Context for PdmNotifyContext {
       ...
    }

    impl PdmNotifyContext {
        fn from_app(app: &PdmApplication) {
           ...
        }
    }

    let notify_context = PdmNotifyContext::from_app(app);
    proxmox_notify::send(&notify_context, &notification)?;


I have not yet written any code for these shared crate setups, but I think the
general pattern *should* work.

What I'm not so sure about yet and where I have not done any experiments is the
case where the entire API handler is defined in a shared crate. These of course
cannot access the product-specific struct. Many of these rely on some form of
static initialization, either directly or indirectly (e.g.
proxmox-product-config).

I'd be happy about any ideas for these cases.

Personally I'm not a big fan of defining the entire API handlers in the shared
crate, I rather prefer having the handler definitions in the application code
and having this handler call *into* the shared crate - I think this gives more
flexibility in case there need to be some application-specific changes (e.g.
additional parameters, different types, etc.). Also in this case it would be
easy to pass along an additional context object along with the call.

## Summary

Pros:
  - We can reasonably test subsystems in isolation with the comfort of a simple `cargo test`
  - I think this RFC shows that it's definitely possible to adapt to this new pattern
    gradually without having to refactor huge amounts of existing code at once.

Risks:
  - Increased complexity
    - Having to pass some kind of application handle down the callstack
      could be considered quite invasive.
    - Having to deal with Box'd/Arc'd trait objects (e.g. the client factory)
      can be a bit annoying
    - The application context object 

  - While this approach does not require us to commit fully right from the start, it could
    be quite awkward if we have "the old approach" and the "new approach" in parallel.
    So it would be important that we agree on the sensibility of this approach and then
    consequently use it for new features and then over time gradually introduce it
    into the existing code.


The patches submitted in this patch series are of course still work in
progress; their main aim is to demonstrate that viability of the concept.

I'd be very happy to get some feedback on the general idea and of course also
the PoC implementation (but don't get too caught up in naming, missing doc
comments, etc - as I said, everything is still work in progress).

## References:

[actix-web]: https://actix.rs/docs/application/#state
[axum]: https://docs.rs/axum/latest/axum/#sharing-state-with-handlers


proxmox:

Lukas Wagner (1):
  router: rpc environment: allow to provide a application-specific
    context handle via rpcenv

 proxmox-rest-server/src/api_config.rs  | 10 ++++++++++
 proxmox-rest-server/src/environment.rs |  6 +++++-
 proxmox-router/src/cli/environment.rs  |  6 ++++++
 proxmox-router/src/rpc_environment.rs  |  5 ++++-
 4 files changed, 25 insertions(+), 2 deletions(-)


proxmox-datacenter-manager:

Lukas Wagner (6):
  connection: store client factory in an Arc and add public getter
  parallel fetcher: allow to use custom client factory
  introduce PdmApplication struct and inject it during API server
    startup
  remote updates: use PdmApplication object to derive paths, permissions
    and client factory
  tests: add captured responses for integration tests
  tests: add basic integration tests for the remote updates API

 server/src/api/pve/mod.rs                     |  11 +-
 server/src/api/remote_updates.rs              |  54 +-
 server/src/bin/proxmox-datacenter-api/main.rs |   9 +-
 .../tasks/remote_tasks.rs                     |   2 +
 .../tasks/remote_updates.rs                   |  22 +-
 .../bin/proxmox-datacenter-privileged-api.rs  |   7 +-
 server/src/connection.rs                      |  29 +-
 server/src/context.rs                         |  74 +-
 server/src/lib.rs                             |   3 +-
 .../src/metric_collection/collection_task.rs  |   2 +-
 server/src/parallel_fetcher.rs                |  24 +-
 server/src/remote_updates.rs                  |  90 ++-
 server/src/test_support/mod.rs                |   3 +-
 server/tests/api_responses/pve/apt_repos.json | 469 +++++++++++
 .../tests/api_responses/pve/apt_update.json   | 638 +++++++++++++++
 .../tests/api_responses/pve/apt_versions.json | 736 ++++++++++++++++++
 .../api_responses/pve/node_subscription.json  |   6 +
 server/tests/test_remote_updates.rs           | 288 +++++++
 18 files changed, 2394 insertions(+), 73 deletions(-)
 create mode 100644 server/tests/api_responses/pve/apt_repos.json
 create mode 100644 server/tests/api_responses/pve/apt_update.json
 create mode 100644 server/tests/api_responses/pve/apt_versions.json
 create mode 100644 server/tests/api_responses/pve/node_subscription.json
 create mode 100644 server/tests/test_remote_updates.rs


Summary over all repositories:
  22 files changed, 2419 insertions(+), 75 deletions(-)

-- 
Generated by murpp 0.9.0




^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-02-03 11:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-29 13:44 [RFC datacenter-manager/proxmox 0/7] inject application context via rpcenv for easier integration testing Lukas Wagner
2026-01-29 13:44 ` [RFC proxmox 1/1] router: rpc environment: allow to provide a application-specific context handle via rpcenv Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 1/6] connection: store client factory in an Arc and add public getter Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 2/6] parallel fetcher: allow to use custom client factory Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 3/6] introduce PdmApplication struct and inject it during API server startup Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 4/6] remote updates: use PdmApplication object to derive paths, permissions and client factory Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 5/6] tests: add captured responses for integration tests Lukas Wagner
2026-01-29 13:44 ` [RFC datacenter-manager 6/6] tests: add basic integration tests for the remote updates API Lukas Wagner
2026-02-03 11:02 ` [RFC datacenter-manager/proxmox 0/7] inject application context via rpcenv for easier integration testing Robert Obkircher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal