* [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead
@ 2026-01-02 16:07 Samuel Rufinatscha
From: Samuel Rufinatscha @ 2026-01-02 16:07 UTC (permalink / raw)
  To: pbs-devel

Hi,

this series improves the performance of token-based API authentication
in PBS (pbs-config) and in PDM (underlying proxmox-access-control
crate), addressing the API token verification hotspot reported in our
bugtracker #7017 [1].

When profiling the PBS /status endpoint with cargo flamegraph [2],
token-based authentication showed up as a dominant hotspot via
proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
path from the hot section of the flamegraph. The same performance issue
was measured [3] for PDM. PDM uses the shared proxmox-access-control
library for token handling, which is a factored-out version of the
token.shadow handling code from PBS.

While this series fixes the immediate performance issue both in PBS
(pbs-config) and in the shared proxmox-access-control crate used by
PDM, PBS should ideally be refactored in a separate effort to use
proxmox-access-control for token handling instead of its local
implementation.

Problem

For token-based API requests, the token.shadow handling in both PBS
(pbs-config) and PDM (proxmox-access-control) currently does the
following:

1. read the token.shadow file on each request
2. deserialize it into a HashMap<Authid, String>
3. run password hash verification via
   proxmox_sys::crypt::verify_crypt_pw for the provided token secret

Under load, this results in significant CPU usage spent in repeated
password hashing for the same token+secret pairs. The attached
flamegraphs for PBS [2] and PDM [3] show
proxmox_sys::crypt::verify_crypt_pw dominating the hot path.

Approach

The goal is to reduce the cost of token-based authentication while
preserving the existing token handling semantics (including detecting
manual edits to token.shadow) and staying consistent between PBS
(pbs-config) and PDM (proxmox-access-control). For both, this series
proposes to:

1. Introduce an in-memory cache for verified token secrets and
   invalidate it through a shared ConfigVersionCache generation. A
   shared generation is required to keep the privileged and
   unprivileged daemons in sync and to avoid caching inconsistencies
   across processes.
2. Invalidate on token.shadow changes made through the API
   (set_secret, delete_secret)
3. Invalidate on direct/manual token.shadow file changes (mtime +
   length)
4. Avoid per-request file stat calls using a TTL window
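
As a rough illustration of points 1 and 2 (not the code from the
patches), a generation-gated cache could look like the sketch below.
SecretCache, cache_hit, cache_insert and bump_generation are
hypothetical names, and a plain AtomicUsize stands in for the shared
ConfigVersionCache generation.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

/// Hypothetical cache state: verified secrets and the generation under
/// which they were verified.
pub struct SecretCache {
    generation: usize,
    secrets: HashMap<String, String>,
}

impl SecretCache {
    pub fn new() -> Self {
        Self { generation: 0, secrets: HashMap::new() }
    }
}

/// Returns true only if the cache is current and holds exactly this secret.
/// (Assumption: a real implementation would compare in constant time.)
pub fn cache_hit(
    shared: &AtomicUsize,
    cache: &RwLock<SecretCache>,
    tokenid: &str,
    secret: &str,
) -> bool {
    let current = shared.load(Ordering::Acquire);
    let guard = cache.read().unwrap();
    guard.generation == current
        && guard.secrets.get(tokenid).map(String::as_str) == Some(secret)
}

/// Remembers a secret after the slow hash verification succeeded.
pub fn cache_insert(
    shared: &AtomicUsize,
    cache: &RwLock<SecretCache>,
    tokenid: &str,
    secret: &str,
) {
    let current = shared.load(Ordering::Acquire);
    let mut guard = cache.write().unwrap();
    if guard.generation != current {
        // Entries from an older generation may be stale: drop them all.
        guard.secrets.clear();
        guard.generation = current;
    }
    guard.secrets.insert(tokenid.to_string(), secret.to_string());
}

/// Called by set_secret/delete_secret style mutations; every cache that
/// observes the new value treats its entries as invalid.
pub fn bump_generation(shared: &AtomicUsize) {
    shared.fetch_add(1, Ordering::AcqRel);
}
```

In this sketch a lookup after bump_generation always misses, so the
request falls back to the slow verification path and re-populates the
cache.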

Testing

*PBS (pbs-config)*

To verify the effect in PBS, I:
1. Set up test environment based on latest PBS ISO, installed Rust
   toolchain, cloned proxmox-backup repository to use with cargo
   flamegraph. Reproduced bug #7017 [1] by profiling the /status
   endpoint with token-based authentication using cargo flamegraph [2].
2. Built PBS with pbs-config patches and re-ran the same workload and
   profiling setup. Confirmed that
   proxmox_sys::crypt::verify_crypt_pw path no longer appears in the
   hot section of the flamegraph. CPU usage is now dominated by TLS
   overhead.
3. Functionally, I verified that:
   * valid tokens authenticate correctly when used in API requests
   * invalid secrets are rejected as before
   * generating a new token secret via dashboard (create token for user,
     regenerate existing secret) works and authenticates correctly

*PDM (proxmox-access-control)*

To verify the effect in PDM, I followed a similar testing approach.
Instead of PBS’ /status, I profiled the /version endpoint with cargo
flamegraph [3] and verified that the expensive hashing path disappears
from the hot section after introducing caching.

Functionally, I verified that:
   * valid tokens authenticate correctly when used in API requests
   * invalid secrets are rejected as before
   * generating a new token secret via dashboard (create token for user,
     regenerate existing secret) works and authenticates correctly

Benchmarks:

Two different benchmarks have been run to measure caching effects
and RwLock contention:

(1) Requests per second for PBS /status endpoint (E2E)

Benchmarked parallel token auth requests for
/status?verbose=0 on top of the datastore lookup cache series [4]
to check throughput impact. With datastores=1, repeat=5000, parallel=16
this series gives ~172 req/s compared to ~65 req/s without it.
This is a ~2.6x improvement (and aligns with the ~179 req/s from the
previous series, which used per-process cache invalidation).

(2) RwLock contention for token create/delete under heavy load of
token-authenticated requests

The previous version of the series compared std::sync::RwLock and
parking_lot::RwLock contention for token create/delete under heavy
parallel token-authenticated readers. parking_lot::RwLock has been
chosen for the added fairness guarantees.

Patch summary

pbs-config:

0001 – pbs-config: add token.shadow generation to ConfigVersionCache
Extends ConfigVersionCache to provide a process-shared generation
number for token.shadow changes.

0002 – pbs-config: cache verified API token secrets
Adds an in-memory cache for verified, plain-text API token secrets.
The cache is invalidated through the process-shared ConfigVersionCache
generation number. Uses openssl’s constant-time memcmp to compare
secrets.
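
To illustrate why the comparison is constant-time, here is a std-only
sketch of the idea. It is not the openssl routine used by the patch;
secrets_equal is a hypothetical helper, and production code should
rely on a vetted primitive, since a compiler may optimize a
hand-written loop.

```rust
/// Compares two byte strings without exiting at the first mismatch, so
/// the run time does not reveal how long the matching prefix is.
/// The length check itself is not secret-dependent in this sketch.
pub fn secrets_equal(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate all differing bits instead of branching per byte.
        diff |= x ^ y;
    }
    diff == 0
}
```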

0003 – pbs-config: invalidate token-secret cache on token.shadow
changes
On each token verification request, stats token.shadow (mtime and
length) and clears the cache when the file has changed.
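
A minimal sketch of such a check, assuming hypothetical FileStamp and
file_changed helpers (the patch's own helper names and error handling
may differ):

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical snapshot of the metadata used for change detection.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FileStamp {
    mtime: SystemTime,
    len: u64,
}

/// Reads the current mtime and length of `path`.
pub fn stamp(path: &Path) -> io::Result<FileStamp> {
    let meta = fs::metadata(path)?;
    Ok(FileStamp { mtime: meta.modified()?, len: meta.len() })
}

/// Returns true if the file differs from the last seen stamp (or was
/// never seen before) and remembers the new stamp.
pub fn file_changed(last: &mut Option<FileStamp>, path: &Path) -> io::Result<bool> {
    let current = stamp(path)?;
    let changed = *last != Some(current);
    *last = Some(current);
    Ok(changed)
}
```

A caller would clear the secret cache whenever file_changed returns
true. Comparing the length in addition to mtime helps catch edits that
land within the timestamp granularity of the file system.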

0004 – pbs-config: add TTL window to token-secret cache
Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
checks so that fs::metadata calls are not performed on each request.
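
The TTL gating can be pictured roughly as follows. Only the constant
name and its default of 60 come from the patch description;
MetadataGate and should_check are hypothetical names used for this
sketch.

```rust
use std::time::{Duration, Instant};

/// Name and default taken from the patch description above.
pub const TOKEN_SECRET_CACHE_TTL_SECS: u64 = 60;

/// Hypothetical gate that decides whether file metadata must be re-read.
pub struct MetadataGate {
    ttl: Duration,
    last_check: Option<Instant>,
}

impl MetadataGate {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, last_check: None }
    }

    /// Returns true if no check happened yet or the TTL has elapsed at
    /// `now`, and records `now` as the time of the latest check.
    pub fn should_check(&mut self, now: Instant) -> bool {
        match self.last_check {
            Some(prev) if now.saturating_duration_since(prev) < self.ttl => false,
            _ => {
                self.last_check = Some(now);
                true
            }
        }
    }
}
```

The trade-off is that a manual edit to token.shadow may take up to one
TTL window to be noticed, which is what the documentation patch
describes.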

proxmox-access-control:

0005 – access-control: extend AccessControlConfig for token.shadow invalidation

Extends the AccessControlConfig trait with
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() for
proxmox-access-control to get the shared token.shadow generation number
and bump it on token shadow changes.
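
To show the shape of this decoupling, here is a sketch with a
stand-in trait. The two method names are the ones listed above; the
trait name TokenShadowGenerationSource, the return types and the
in-process implementation are assumptions made for illustration only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the two hooks added to AccessControlConfig.
pub trait TokenShadowGenerationSource {
    /// Current generation, or None if the product has no shared counter.
    fn token_shadow_cache_generation(&self) -> Option<usize>;
    /// Bumps the generation and returns the previous value.
    fn increment_token_shadow_cache_generation(&self) -> Option<usize>;
}

/// Toy implementation; a product would back this with its shared
/// ConfigVersionCache instead of a process-local atomic.
pub struct InProcessGeneration(AtomicUsize);

impl InProcessGeneration {
    pub fn new() -> Self {
        Self(AtomicUsize::new(0))
    }
}

impl TokenShadowGenerationSource for InProcessGeneration {
    fn token_shadow_cache_generation(&self) -> Option<usize> {
        Some(self.0.load(Ordering::Acquire))
    }

    fn increment_token_shadow_cache_generation(&self) -> Option<usize> {
        Some(self.0.fetch_add(1, Ordering::AcqRel))
    }
}

/// Cache-side check: entries are usable only for the current generation.
pub fn cache_is_current(source: &dyn TokenShadowGenerationSource, cached: usize) -> bool {
    source.token_shadow_cache_generation() == Some(cached)
}
```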

0006 – access-control: cache verified API token secrets
Mirrors PBS PATCH 0002.

0007 – access-control: invalidate token-secret cache on token.shadow changes
Mirrors PBS PATCH 0003.

0008 – access-control: add TTL window to token-secret cache
Mirrors PBS PATCH 0004.

proxmox-datacenter-manager:

0009 – pdm-config: implement token.shadow generation
Extends PDM ConfigVersionCache and implements
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() from AccessControlConfig for
PDM.

0010 – docs: document API token-cache TTL effects
Documents the effects of the TTL window on token.shadow edits.

Changes from v1 to v2:

* (refactor) Switched cache initialization to LazyLock
* (perf) Use parking_lot::RwLock and best-effort cache access on the
  read/refresh path (try_read/try_write) to avoid lock contention
* (doc) Document TTL-delayed effect of manual token.shadow edits
* (fix) Add generation guards (API_MUTATION_GENERATION +
  FILE_GENERATION) to prevent caching across concurrent set/delete and
  external edits
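
The best-effort access mentioned above can be sketched as follows.
This uses std::sync::RwLock::try_read as a stand-in for the
parking_lot lock; Lookup and try_lookup are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Outcome of a non-blocking cache lookup.
#[derive(Debug, PartialEq, Eq)]
pub enum Lookup {
    Hit,
    Miss,
    /// The lock was not immediately available; the caller falls back
    /// to the regular (slow) verification path instead of waiting.
    Busy,
}

pub fn try_lookup(
    cache: &RwLock<HashMap<String, String>>,
    tokenid: &str,
    secret: &str,
) -> Lookup {
    match cache.try_read() {
        Ok(guard) => match guard.get(tokenid) {
            Some(cached) if cached.as_str() == secret => Lookup::Hit,
            _ => Lookup::Miss,
        },
        // Contended (or poisoned) lock: skip the cache for this request.
        Err(_) => Lookup::Busy,
    }
}
```

Treating a contended lock like a cache miss keeps readers from
queueing behind a writer; the cost is an occasional extra hash
verification.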

Changes from v2 to v3:

* (refactor) Replace PBS per-process cache invalidation with a
  cross-process token.shadow generation based on PBS
  ConfigVersionCache, ensuring cache consistency between privileged
  and unprivileged daemons.
* (refactor) Decouple the generation source from the
  proxmox/proxmox-access-control cache implementation: extend
  AccessControlConfig hooks so that products can provide the shared
  token.shadow generation source.
* (refactor) Extend PDM's ConfigVersionCache with
  token_shadow_generation and introduce a
  pdm_config::AccessControlConfig wrapper implementing the new
  proxmox-access-control trait hooks. Switch server and CLI
  initialization to use pdm_config::AccessControlConfig instead of
  pdm_api_types::AccessControlConfig.
* (refactor) Adapt generation checks around cached-secret comparison to
  use the new shared generation source.
* (fix/logic) cache_try_insert_secret: Update the local cache
  generation if stale, allowing the new secret to be inserted
  immediately
* (refactor) Extract cache invalidation logic into an
  invalidate_cache_state helper to reduce duplication and ensure
  consistent state resets
* (refactor) Simplify refresh_cache_if_file_changed: handle the
  un-initialized/reset state and adjust the generation mismatch
  path to ensure file metadata is always re-read.
* (doc) Clarify TTL-delayed effects of manual token.shadow edits.

Please see the patch specific changelogs for more details.

Thanks for considering this patch series, I look forward to your
feedback.

Best,
Samuel Rufinatscha

[1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
[2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
[3] attachment 1794 [1]: Flamegraph PDM baseline
[4] https://bugzilla.proxmox.com/show_bug.cgi?id=6049

proxmox-backup:

Samuel Rufinatscha (4):
  pbs-config: add token.shadow generation to ConfigVersionCache
  pbs-config: cache verified API token secrets
  pbs-config: invalidate token-secret cache on token.shadow changes
  pbs-config: add TTL window to token secret cache

 Cargo.toml                             |   1 +
 docs/user-management.rst               |   4 +
 pbs-config/Cargo.toml                  |   1 +
 pbs-config/src/config_version_cache.rs |  18 ++
 pbs-config/src/token_shadow.rs         | 298 ++++++++++++++++++++++++-
 5 files changed, 321 insertions(+), 1 deletion(-)


proxmox:

Samuel Rufinatscha (4):
  proxmox-access-control: extend AccessControlConfig for token.shadow
    invalidation
  proxmox-access-control: cache verified API token secrets
  proxmox-access-control: invalidate token-secret cache on token.shadow
    changes
  proxmox-access-control: add TTL window to token secret cache

 Cargo.toml                                 |   1 +
 proxmox-access-control/Cargo.toml          |   1 +
 proxmox-access-control/src/init.rs         |  17 ++
 proxmox-access-control/src/token_shadow.rs | 299 ++++++++++++++++++++-
 4 files changed, 317 insertions(+), 1 deletion(-)


proxmox-datacenter-manager:

Samuel Rufinatscha (2):
  pdm-config: implement token.shadow generation
  docs: document API token-cache TTL effects

 cli/admin/src/main.rs                       |  2 +-
 docs/access-control.rst                     |  4 ++
 lib/pdm-config/Cargo.toml                   |  1 +
 lib/pdm-config/src/access_control_config.rs | 73 +++++++++++++++++++++
 lib/pdm-config/src/config_version_cache.rs  | 18 +++++
 lib/pdm-config/src/lib.rs                   |  2 +
 server/src/acl.rs                           |  3 +-
 7 files changed, 100 insertions(+), 3 deletions(-)
 create mode 100644 lib/pdm-config/src/access_control_config.rs


Summary over all repositories:
  16 files changed, 738 insertions(+), 5 deletions(-)

-- 
Generated by git-murpp 0.8.1

