From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead
Date: Fri, 2 Jan 2026 17:07:39 +0100 [thread overview]
Message-ID: <20260102160750.285157-1-s.rufinatscha@proxmox.com> (raw)
Hi,
this series improves the performance of token-based API authentication
in PBS (pbs-config) and in PDM (underlying proxmox-access-control
crate), addressing the API token verification hotspot reported in our
bugtracker #7017 [1].
When profiling PBS /status endpoint with cargo flamegraph [2],
token-based authentication showed up as a dominant hotspot via
proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
path from the hot section of the flamegraph. The same performance issue
was measured [2] for PDM. PDM uses the underlying shared
proxmox-access-control library for token handling, which is a
factored out version of the token.shadow handling code from PBS.
While this series fixes the immediate performance issue both in PBS
(pbs-config) and in the shared proxmox-access-control crate used by
PDM, PBS should eventually, ideally be refactored, in a separate
effort, to use proxmox-access-control for token handling instead of its
local implementation.
Problem
For token-based API requests, both PBS’s pbs-config token.shadow
handling and PDM proxmox-access-control’s token.shadow handling
currently:
1. read the token.shadow file on each request
2. deserialize it into a HashMap<Authid, String>
3. run password hash verification via
proxmox_sys::crypt::verify_crypt_pw for the provided token secret
Under load, this results in significant CPU usage spent in repeated
password hashing for the same token+secret pairs. The attached
flamegraphs for PBS [2] and PDM [3] show
proxmox_sys::crypt::verify_crypt_pw dominating the hot path.
Approach
The goal is to reduce the cost of token-based authentication preserving
the existing token handling semantics (including detecting manual edits
to token.shadow) and be consistent between PBS (pbs-config) and
PDM (proxmox-access-control). For both sites, this series proposes to:
1. Introduce an in-memory cache for verified token secrets and
invalidate it through a shared ConfigVersionCache generation. Note, a
shared generation is required to keep privileged and unprivileged
daemon in sync to avoid caching inconsistencies across processes.
2. Invalidate on token.shadow file API changes (set_secret,
delete_secret)
3. Invalidate on direct/manual token.shadow file changes (mtime +
length)
4. Avoid per-request file stat calls using a TTL window
Testing
*PBS (pbs-config)*
To verify the effect in PBS, I:
1. Set up test environment based on latest PBS ISO, installed Rust
toolchain, cloned proxmox-backup repository to use with cargo
flamegraph. Reproduced bug #7017 [1] by profiling the /status
endpoint with token-based authentication using cargo flamegraph [2].
2. Built PBS with pbs-config patches and re-ran the same workload and
profiling setup. Confirmed that
proxmox_sys::crypt::verify_crypt_pw path no longer appears in the
hot section of the flamegraph. CPU usage is now dominated by TLS
overhead.
3. Functionally-wise, I verified that:
* valid tokens authenticate correctly when used in API requests
* invalid secrets are rejected as before
* generating a new token secret via dashboard (create token for user,
regenerate existing secret) works and authenticates correctly
*PDM (proxmox-access-control)*
To verify the effect in PDM, I followed a similar testing approach.
Instead of PBS’ /status, I profiled the /version endpoint with cargo
flamegraph [2] and verified that the expensive hashing path disappears
from the hot section after introducing caching.
Functionally-wise, I verified that:
* valid tokens authenticate correctly when used in API requests
* invalid secrets are rejected as before
* generating a new token secret via dashboard (create token for user,
regenerate existing secret) works and authenticates correctly
Benchmarks:
Two different benchmarks have been run to measure caching effects
and RwLock contention:
(1) Requests per second for PBS /status endpoint (E2E)
Benchmarked parallel token auth requests for
/status?verbose=0 on top of the datastore lookup cache series [4]
to check throughput impact. With datastores=1, repeat=5000, parallel=16
this series gives ~172 req/s compared to ~65 req/s without it.
This is a ~2.6x improvement (and aligns with the ~179 req/s from the
previous series, which used per-process cache invalidation).
(2) RwLock contention for token create/delete under heavy load of
token-authenticated requests
The previous version of the series compared std::sync::RwLock and
parking_lot::RwLock contention for token create/delete under heavy
parallel token-authenticated readers. parking_lot::RwLock has been
chosen for the added fairness guarantees.
Patch summary
pbs-config:
0001 – pbs-config: add token.shadow generation to ConfigVersionCache
Extends ConfigVersionCache to provide a process-shared generation
number for token.shadow changes.
0002 – pbs-config: cache verified API token secrets
Adds an in-memory cache to cache verified, plain-text API token secrets.
Cache is invalidated through the process-shared ConfigVersionCache
generation number. Uses openssl’s memcmp constant-time for matching
secrets.
0003 – pbs-config: invalidate token-secret cache on token.shadow
changes
Stats token.shadow mtime and length and clears the cache when the
file changes, on each token verification request.
0004 – pbs-config: add TTL window to token-secret cache
Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
checks so that fs::metadata calls are not performed on each request.
proxmox-access-control:
0005 – access-control: extend AccessControlConfig for token.shadow invalidation
Extends the AccessControlConfig trait with
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() for
proxmox-access-control to get the shared token.shadow generation number
and bump it on token shadow changes.
0006 – access-control: cache verified API token secrets
Mirrors PBS PATCH 0002.
0007 – access-control: invalidate token-secret cache on token.shadow changes
Mirrors PBS PATCH 0003.
0008 – access-control: add TTL window to token-secret cache
Mirrors PBS PATCH 0004.
proxmox-datacenter-manager:
0009 – pdm-config: add token.shadow generation to ConfigVersionCache
Extends PDM ConfigVersionCache and implements
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() from AccessControlConfig for
PDM.
0010 – docs: document API token-cache TTL effects
Documents the effects of the TTL window on token.shadow edits
Changes from v1 to v2:
* (refactor) Switched cache initialization to LazyLock
* (perf) Use parking_lot::RwLock and best-effort cache access on the
read/refresh path (try_read/try_write) to avoid lock contention
* (doc) Document TTL-delayed effect of manual token.shadow edits
* (fix) Add generation guards (API_MUTATION_GENERATION +
FILE_GENERATION) to prevent caching across concurrent set/delete and
external edits
Changes from v2 to v3:
* (refactor) Replace PBS per-process cache invalidation with a
cross-process token.shadow generation based on PBS
ConfigVersionCache, ensuring cache consistency between privileged
and unprivileged daemons.
* (refactor) Decoupling generation source from the
proxmox/proxmox-access-control cache implementation: extend
AccessControlConfig hooks so that products can provide the shared
token.shadow generation source.
* (refactor) Extend PDM's ConfigVersionCache with
token_shadow_generation
and introduce a pdm_config::AccessControlConfig wrapper implementing
the new proxmox-access-control trait hooks. Switch server and CLI
initialization to use pdm_config::AccessControlConfig instead of
pdm_api_types::AccessControlConfig.
* (refactor) Adapt generation checks around cached-secret comparison to
use the new shared generation source.
* (fix/logic) cache_try_insert_secret: Update the local cache
generation if stale, allowing the new secret to be inserted
immediately
* (refactor) Extract cache invalidation logic into a
invalidate_cache_state helper to reduce duplication and ensure
consistent state resets
* (refactor) Simplify refresh_cache_if_file_changed: handle the
un-initialized/reset state and adjust the generation mismatch
path to ensure file metadata is always re-read.
* (doc) Clarify TTL-delayed effects of manual token.shadow edits.
Please see the patch specific changelogs for more details.
Thanks for considering this patch series, I look forward to your
feedback.
Best,
Samuel Rufinatscha
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
[2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
[3] attachment 1794 [1]: Flamegraph PDM baseline
[4] https://bugzilla.proxmox.com/show_bug.cgi?id=6049
proxmox-backup:
Samuel Rufinatscha (4):
pbs-config: add token.shadow generation to ConfigVersionCache
pbs-config: cache verified API token secrets
pbs-config: invalidate token-secret cache on token.shadow changes
pbs-config: add TTL window to token secret cache
Cargo.toml | 1 +
docs/user-management.rst | 4 +
pbs-config/Cargo.toml | 1 +
pbs-config/src/config_version_cache.rs | 18 ++
pbs-config/src/token_shadow.rs | 298 ++++++++++++++++++++++++-
5 files changed, 321 insertions(+), 1 deletion(-)
proxmox:
Samuel Rufinatscha (4):
proxmox-access-control: extend AccessControlConfig for token.shadow
invalidation
proxmox-access-control: cache verified API token secrets
proxmox-access-control: invalidate token-secret cache on token.shadow
changes
proxmox-access-control: add TTL window to token secret cache
Cargo.toml | 1 +
proxmox-access-control/Cargo.toml | 1 +
proxmox-access-control/src/init.rs | 17 ++
proxmox-access-control/src/token_shadow.rs | 299 ++++++++++++++++++++-
4 files changed, 317 insertions(+), 1 deletion(-)
proxmox-datacenter-manager:
Samuel Rufinatscha (2):
pdm-config: implement token.shadow generation
docs: document API token-cache TTL effects
cli/admin/src/main.rs | 2 +-
docs/access-control.rst | 4 ++
lib/pdm-config/Cargo.toml | 1 +
lib/pdm-config/src/access_control_config.rs | 73 +++++++++++++++++++++
lib/pdm-config/src/config_version_cache.rs | 18 +++++
lib/pdm-config/src/lib.rs | 2 +
server/src/acl.rs | 3 +-
7 files changed, 100 insertions(+), 3 deletions(-)
create mode 100644 lib/pdm-config/src/access_control_config.rs
Summary over all repositories:
16 files changed, 738 insertions(+), 5 deletions(-)
--
Generated by git-murpp 0.8.1
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
next reply other threads:[~2026-01-02 16:07 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-02 16:07 Samuel Rufinatscha [this message]
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 1/4] pbs-config: add token.shadow generation to ConfigVersionCache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 3/4] pbs-config: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 4/4] pbs-config: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 1/4] proxmox-access-control: extend AccessControlConfig for token.shadow invalidation Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 2/4] proxmox-access-control: cache verified API token secrets Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 3/4] proxmox-access-control: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 4/4] proxmox-access-control: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 1/2] pdm-config: implement token.shadow generation Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 2/2] docs: document API token-cache TTL effects Samuel Rufinatscha
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260102160750.285157-1-s.rufinatscha@proxmox.com \
--to=s.rufinatscha@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox