From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] superseded: [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead
Date: Wed, 21 Jan 2026 16:15:40 +0100
Message-ID: <62aa3252-4d4c-4c18-bcf6-83a3e621d874@proxmox.com>
In-Reply-To: <20260102160750.285157-1-s.rufinatscha@proxmox.com>

https://lore.proxmox.com/pbs-devel/20260121151408.731516-1-s.rufinatscha@proxmox.com/T/#t

On 1/2/26 5:07 PM, Samuel Rufinatscha wrote:
> Hi,
> 
> this series improves the performance of token-based API authentication
> in PBS (pbs-config) and in PDM (underlying proxmox-access-control
> crate), addressing the API token verification hotspot reported in our
> bugtracker #7017 [1].
> 
> When profiling the PBS /status endpoint with cargo flamegraph [2],
> token-based authentication showed up as a dominant hotspot via
> proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
> path from the hot section of the flamegraph. The same performance issue
> was measured for PDM [3]. PDM uses the shared proxmox-access-control
> library for token handling, which is a factored-out version of the
> token.shadow handling code from PBS.
> 
> This series fixes the immediate performance issue in both PBS
> (pbs-config) and the shared proxmox-access-control crate used by
> PDM. Ideally, PBS should eventually be refactored, in a separate
> effort, to use proxmox-access-control for token handling instead of
> its local implementation.
> 
> Problem
> 
> For token-based API requests, both PBS’s pbs-config token.shadow
> handling and PDM’s proxmox-access-control token.shadow handling
> currently:
> 
> 1. read the token.shadow file on each request
> 2. deserialize it into a HashMap<Authid, String>
> 3. run password hash verification via
>     proxmox_sys::crypt::verify_crypt_pw for the provided token secret
> 
> Under load, this results in significant CPU usage spent in repeated
> password hashing for the same token+secret pairs. The attached
> flamegraphs for PBS [2] and PDM [3] show
> proxmox_sys::crypt::verify_crypt_pw dominating the hot path.
> 
> Approach
> 
> The goal is to reduce the cost of token-based authentication while
> preserving the existing token handling semantics (including detecting
> manual edits to token.shadow) and staying consistent between PBS
> (pbs-config) and PDM (proxmox-access-control). For both, this series
> proposes to:
> 
> 1. Introduce an in-memory cache for verified token secrets and
> invalidate it through a shared ConfigVersionCache generation. Note that
> a shared generation is required to keep the privileged and unprivileged
> daemons in sync and avoid caching inconsistencies across processes.
> 2. Invalidate on token.shadow file API changes (set_secret,
> delete_secret)
> 3. Invalidate on direct/manual token.shadow file changes (mtime +
> length)
> 4. Avoid per-request file stat calls using a TTL window
> 
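The generation-based invalidation from points 1 and 2 above can be sketched roughly as follows. This is only a minimal illustration, not the code from the series: `TokenSecretCache`, `bump_generation` and the plain `AtomicUsize` are hypothetical stand-ins, and the atomic only models the process-shared generation that the series keeps in ConfigVersionCache. The real cache also compares the provided secret in constant time rather than handing out the cached value.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical cache state: verified secrets plus the generation
/// they were cached under.
struct TokenSecretCache {
    generation: usize,
    secrets: HashMap<String, String>,
}

impl TokenSecretCache {
    fn new(shared: &AtomicUsize) -> Self {
        Self {
            generation: shared.load(Ordering::Acquire),
            secrets: HashMap::new(),
        }
    }

    /// Returns a cached secret only while the cache generation still
    /// matches the shared one; a mismatch means "treat as a miss".
    fn lookup(&self, shared: &AtomicUsize, tokenid: &str) -> Option<&str> {
        if self.generation != shared.load(Ordering::Acquire) {
            return None;
        }
        self.secrets.get(tokenid).map(String::as_str)
    }

    /// Inserts a freshly verified secret. A stale cache is cleared and
    /// re-tagged with the current generation first.
    fn insert(&mut self, shared: &AtomicUsize, tokenid: &str, secret: &str) {
        let current = shared.load(Ordering::Acquire);
        if self.generation != current {
            self.secrets.clear();
            self.generation = current;
        }
        self.secrets.insert(tokenid.to_string(), secret.to_string());
    }
}

/// What set_secret/delete_secret would do after changing token.shadow
/// (assumption: a simple counter increment). Returns the new generation.
fn bump_generation(shared: &AtomicUsize) -> usize {
    shared.fetch_add(1, Ordering::AcqRel) + 1
}
```

A lookup after `bump_generation` misses for every token, so the next request falls back to the slow verification path and repopulates the cache.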
> Testing
> 
> *PBS (pbs-config)*
> 
> To verify the effect in PBS, I:
> 1. Set up a test environment based on the latest PBS ISO, installed the
>     Rust toolchain, and cloned the proxmox-backup repository for use with
>     cargo flamegraph. Reproduced bug #7017 [1] by profiling the /status
>     endpoint with token-based authentication using cargo flamegraph [2].
> 2. Built PBS with the pbs-config patches and re-ran the same workload
>     and profiling setup. Confirmed that the
>     proxmox_sys::crypt::verify_crypt_pw path no longer appears in the
>     hot section of the flamegraph. CPU usage is now dominated by TLS
>     overhead.
> 3. Functionally, I verified that:
>     * valid tokens authenticate correctly when used in API requests
>     * invalid secrets are rejected as before
>     * generating a new token secret via dashboard (create token for user,
>     regenerate existing secret) works and authenticates correctly
> 
> *PDM (proxmox-access-control)*
> 
> To verify the effect in PDM, I followed a similar testing approach.
> Instead of PBS’ /status, I profiled the /version endpoint with cargo
> flamegraph [3] and verified that the expensive hashing path disappears
> from the hot section after introducing caching.
> 
> Functionally, I verified that:
>     * valid tokens authenticate correctly when used in API requests
>     * invalid secrets are rejected as before
>     * generating a new token secret via dashboard (create token for user,
>     regenerate existing secret) works and authenticates correctly
> 
> Benchmarks:
> 
> Two different benchmarks have been run to measure caching effects
> and RwLock contention:
> 
> (1) Requests per second for PBS /status endpoint (E2E)
> 
> Benchmarked parallel token auth requests for
> /status?verbose=0 on top of the datastore lookup cache series [4]
> to check throughput impact. With datastores=1, repeat=5000, parallel=16
> this series gives ~172 req/s compared to ~65 req/s without it.
> This is a ~2.6x improvement (and aligns with the ~179 req/s from the
> previous series, which used per-process cache invalidation).
> 
> (2) RwLock contention for token create/delete under heavy load of
> token-authenticated requests
> 
> The previous version of the series compared std::sync::RwLock and
> parking_lot::RwLock contention for token create/delete under heavy
> parallel token-authenticated readers. parking_lot::RwLock was
> chosen for its added fairness guarantees.
> 
> Patch summary
> 
> pbs-config:
> 
> 0001 – pbs-config: add token.shadow generation to ConfigVersionCache
> Extends ConfigVersionCache to provide a process-shared generation
> number for token.shadow changes.
> 
> 0002 – pbs-config: cache verified API token secrets
> Adds an in-memory cache for verified, plain-text API token secrets.
> The cache is invalidated through the process-shared ConfigVersionCache
> generation number. Uses openssl’s constant-time memcmp to compare
> secrets.
> 
> 0003 – pbs-config: invalidate token-secret cache on token.shadow
> changes
> On each token verification request, stats token.shadow (mtime and
> length) and clears the cache when the file has changed.
> 
> 0004 – pbs-config: add TTL window to token-secret cache
> Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
> checks so that fs::metadata calls are not performed on each request.
> 
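The interplay of patches 0003 and 0004 (mtime + length check, gated by a TTL) can be illustrated along these lines. Again a sketch under assumptions: `FileState` and `changed` are hypothetical names, the TTL is passed in as a parameter (the series uses TOKEN_SECRET_CACHE_TTL_SECS, default 60), and the first call merely records a baseline instead of reporting a change.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant, SystemTime};

/// Hypothetical record of what the last stat of token.shadow returned.
#[derive(Default)]
struct FileState {
    last_checked: Option<Instant>,
    mtime: Option<SystemTime>,
    len: Option<u64>,
}

impl FileState {
    /// Returns Ok(true) if mtime or length differ from the previous stat,
    /// in which case the caller would clear the secret cache.
    /// Within the TTL window no stat is performed at all.
    fn changed(&mut self, path: &Path, now: Instant, ttl: Duration) -> io::Result<bool> {
        if let Some(last) = self.last_checked {
            if now.saturating_duration_since(last) < ttl {
                // Still inside the TTL window: skip fs::metadata entirely.
                return Ok(false);
            }
        }

        let meta = fs::metadata(path)?;
        let mtime = meta.modified().ok();
        let len = Some(meta.len());

        // The very first stat only establishes the baseline.
        let changed =
            self.last_checked.is_some() && (self.mtime != mtime || self.len != len);

        self.last_checked = Some(now);
        self.mtime = mtime;
        self.len = len;
        Ok(changed)
    }
}
```

The trade-off is visible in the early return: a manual edit to token.shadow is only noticed once the TTL window has elapsed, which is the delayed effect the docs patch describes.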
> proxmox-access-control:
> 
> 0005 – access-control: extend AccessControlConfig for token.shadow invalidation
> 
> Extends the AccessControlConfig trait with
> token_shadow_cache_generation() and
> increment_token_shadow_cache_generation() for
> proxmox-access-control to get the shared token.shadow generation number
> and bump it on token shadow changes.
> 
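The decoupling described for patch 0005 can be pictured like this. The cover letter does not show the signatures of token_shadow_cache_generation() and increment_token_shadow_cache_generation(), so the sketch deliberately uses a separate, hypothetical trait (`GenerationSource`) with guessed return types; it only illustrates the shape of "library asks, product provides".

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical hook a product implements so that the shared library
/// does not need to know where the generation counter lives.
trait GenerationSource {
    /// Current token.shadow generation.
    fn generation(&self) -> usize;
    /// Bump the generation after a token.shadow change; returns the new value.
    fn bump(&self) -> usize;
}

/// Product side (assumption): backed by a plain atomic here, standing in
/// for a counter in the product's ConfigVersionCache.
struct ProductGeneration {
    counter: AtomicUsize,
}

impl GenerationSource for ProductGeneration {
    fn generation(&self) -> usize {
        self.counter.load(Ordering::Acquire)
    }

    fn bump(&self) -> usize {
        self.counter.fetch_add(1, Ordering::AcqRel) + 1
    }
}

/// Library side: a cache tagged with `cached_generation` is usable only
/// while the product-provided source still reports the same value.
fn cache_is_current(source: &dyn GenerationSource, cached_generation: usize) -> bool {
    source.generation() == cached_generation
}
```

With this split, the library only depends on the trait, and each product can back it with whatever shared counter it already has.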
> 0006 – access-control: cache verified API token secrets
> Mirrors PBS PATCH 0002.
> 
> 0007 – access-control: invalidate token-secret cache on token.shadow changes
> Mirrors PBS PATCH 0003.
> 
> 0008 – access-control: add TTL window to token-secret cache
> Mirrors PBS PATCH 0004.
> 
> proxmox-datacenter-manager:
> 
> 0009 – pdm-config: add token.shadow generation to ConfigVersionCache
> Extends PDM ConfigVersionCache and implements
> token_shadow_cache_generation() and
> increment_token_shadow_cache_generation() from AccessControlConfig for
> PDM.
> 
> 0010 – docs: document API token-cache TTL effects
> Documents the effects of the TTL window on token.shadow edits.
> 
> Changes from v1 to v2:
> 
> * (refactor) Switched cache initialization to LazyLock
> * (perf) Use parking_lot::RwLock and best-effort cache access on the
>    read/refresh path (try_read/try_write) to avoid lock contention
> * (doc) Document TTL-delayed effect of manual token.shadow edits
> * (fix) Add generation guards (API_MUTATION_GENERATION +
>    FILE_GENERATION) to prevent caching across concurrent set/delete and
>    external edits
> 
> Changes from v2 to v3:
> 
> * (refactor) Replace PBS per-process cache invalidation with a
>    cross-process token.shadow generation based on PBS
>    ConfigVersionCache, ensuring cache consistency between privileged
>    and unprivileged daemons.
> * (refactor) Decouple the generation source from the
>    proxmox/proxmox-access-control cache implementation: extend
>    AccessControlConfig hooks so that products can provide the shared
>    token.shadow generation source.
> * (refactor) Extend PDM's ConfigVersionCache with
>    token_shadow_generation and introduce a
>    pdm_config::AccessControlConfig wrapper implementing the new
>    proxmox-access-control trait hooks. Switch server and CLI
>    initialization to use pdm_config::AccessControlConfig instead of
>    pdm_api_types::AccessControlConfig.
> * (refactor) Adapt generation checks around cached-secret comparison to
>    use the new shared generation source.
> * (fix/logic) cache_try_insert_secret: Update the local cache
>    generation if stale, allowing the new secret to be inserted
>    immediately
> * (refactor) Extract cache invalidation logic into an
>    invalidate_cache_state helper to reduce duplication and ensure
>    consistent state resets
> * (refactor) Simplify refresh_cache_if_file_changed: handle the
>    un-initialized/reset state and adjust the generation mismatch
>    path to ensure file metadata is always re-read.
> * (doc) Clarify TTL-delayed effects of manual token.shadow edits.
> 
> Please see the patch-specific changelogs for more details.
> 
> Thanks for considering this patch series, I look forward to your
> feedback.
> 
> Best,
> Samuel Rufinatscha
> 
> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> [2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
> [3] attachment 1794 [1]: Flamegraph PDM baseline
> [4] https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> 
> proxmox-backup:
> 
> Samuel Rufinatscha (4):
>    pbs-config: add token.shadow generation to ConfigVersionCache
>    pbs-config: cache verified API token secrets
>    pbs-config: invalidate token-secret cache on token.shadow changes
>    pbs-config: add TTL window to token secret cache
> 
>   Cargo.toml                             |   1 +
>   docs/user-management.rst               |   4 +
>   pbs-config/Cargo.toml                  |   1 +
>   pbs-config/src/config_version_cache.rs |  18 ++
>   pbs-config/src/token_shadow.rs         | 298 ++++++++++++++++++++++++-
>   5 files changed, 321 insertions(+), 1 deletion(-)
> 
> 
> proxmox:
> 
> Samuel Rufinatscha (4):
>    proxmox-access-control: extend AccessControlConfig for token.shadow
>      invalidation
>    proxmox-access-control: cache verified API token secrets
>    proxmox-access-control: invalidate token-secret cache on token.shadow
>      changes
>    proxmox-access-control: add TTL window to token secret cache
> 
>   Cargo.toml                                 |   1 +
>   proxmox-access-control/Cargo.toml          |   1 +
>   proxmox-access-control/src/init.rs         |  17 ++
>   proxmox-access-control/src/token_shadow.rs | 299 ++++++++++++++++++++-
>   4 files changed, 317 insertions(+), 1 deletion(-)
> 
> 
> proxmox-datacenter-manager:
> 
> Samuel Rufinatscha (2):
>    pdm-config: implement token.shadow generation
>    docs: document API token-cache TTL effects
> 
>   cli/admin/src/main.rs                       |  2 +-
>   docs/access-control.rst                     |  4 ++
>   lib/pdm-config/Cargo.toml                   |  1 +
>   lib/pdm-config/src/access_control_config.rs | 73 +++++++++++++++++++++
>   lib/pdm-config/src/config_version_cache.rs  | 18 +++++
>   lib/pdm-config/src/lib.rs                   |  2 +
>   server/src/acl.rs                           |  3 +-
>   7 files changed, 100 insertions(+), 3 deletions(-)
>   create mode 100644 lib/pdm-config/src/access_control_config.rs
> 
> 
> Summary over all repositories:
>    16 files changed, 738 insertions(+), 5 deletions(-)
> 




Thread overview: 28+ messages
2026-01-02 16:07 [pbs-devel] " Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 1/4] pbs-config: add token.shadow generation to ConfigVersionCache Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 13:53     ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 15:13     ` Samuel Rufinatscha
2026-01-16 15:29       ` Fabian Grünbichler
2026-01-16 15:33         ` Samuel Rufinatscha
2026-01-16 16:00       ` Fabian Grünbichler
2026-01-16 16:56         ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 3/4] pbs-config: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-20  9:21     ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 4/4] pbs-config: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 1/4] proxmox-access-control: extend AccessControlConfig for token.shadow invalidation Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 2/4] proxmox-access-control: cache verified API token secrets Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 3/4] proxmox-access-control: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 4/4] proxmox-access-control: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 1/2] pdm-config: implement token.shadow generation Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-16 16:28     ` Samuel Rufinatscha
2026-01-16 16:48       ` Shannon Sterz
2026-01-19  7:56         ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 2/2] docs: document API token-cache TTL effects Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-14 11:24     ` Samuel Rufinatscha
2026-01-21 15:15 ` Samuel Rufinatscha [this message]
