From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead
Date: Fri, 2 Jan 2026 17:07:39 +0100
Message-ID: <20260102160750.285157-1-s.rufinatscha@proxmox.com>
Hi,
this series improves the performance of token-based API authentication
in PBS (pbs-config) and in PDM (underlying proxmox-access-control
crate), addressing the API token verification hotspot reported in our
bugtracker #7017 [1].
When profiling the PBS /status endpoint with cargo flamegraph [2],
token-based authentication showed up as a dominant hotspot via
proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
path from the hot section of the flamegraph. The same performance issue
was measured [3] for PDM. PDM uses the shared
proxmox-access-control library for token handling, which is a
factored-out version of the token.shadow handling code from PBS.
While this series fixes the immediate performance issue both in PBS
(pbs-config) and in the shared proxmox-access-control crate used by
PDM, PBS should ideally be refactored in a separate effort to use
proxmox-access-control for token handling instead of its local
implementation.
Problem
For token-based API requests, both PBS’s pbs-config token.shadow
handling and PDM proxmox-access-control’s token.shadow handling
currently:
1. read the token.shadow file on each request
2. deserialize it into a HashMap<Authid, String>
3. run password hash verification via
proxmox_sys::crypt::verify_crypt_pw for the provided token secret
Under load, this results in significant CPU usage spent in repeated
password hashing for the same token+secret pairs. The attached
flamegraphs for PBS [2] and PDM [3] show
proxmox_sys::crypt::verify_crypt_pw dominating the hot path.
Approach
The goal is to reduce the cost of token-based authentication while
preserving the existing token handling semantics (including detecting
manual edits to token.shadow) and staying consistent between PBS
(pbs-config) and PDM (proxmox-access-control). For both, this series
proposes to:
1. Introduce an in-memory cache for verified token secrets and
invalidate it through a shared ConfigVersionCache generation. Note that
a shared generation is required to keep the privileged and unprivileged
daemons in sync and avoid caching inconsistencies across processes.
2. Invalidate on token.shadow file API changes (set_secret,
delete_secret)
3. Invalidate on direct/manual token.shadow file changes (mtime +
length)
4. Avoid per-request file stat calls using a TTL window
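As a rough illustration of step 1, the following is a minimal sketch of a generation-guarded cache, not the code from the patches. TokenSecretCache and its methods are hypothetical names, and the shared generation is passed in as a plain usize instead of being read from ConfigVersionCache.

```rust
use std::collections::HashMap;

/// Hypothetical per-process cache of token secrets that already passed
/// the expensive hash verification once.
#[derive(Default)]
pub struct TokenSecretCache {
    /// Shared generation observed when the entries below were stored.
    generation: usize,
    /// Token id -> verified plain-text secret.
    secrets: HashMap<String, String>,
}

impl TokenSecretCache {
    /// Drop all entries if the shared generation moved since the cache
    /// was last filled (for example after set_secret/delete_secret).
    fn sync_generation(&mut self, shared_generation: usize) {
        if self.generation != shared_generation {
            self.secrets.clear();
            self.generation = shared_generation;
        }
    }

    /// Returns true only on a cache hit with a matching secret. A miss
    /// means "fall back to the slow path", not "reject the request".
    pub fn is_cached(&mut self, shared_generation: usize, tokenid: &str, secret: &str) -> bool {
        self.sync_generation(shared_generation);
        // Plain comparison for brevity; real code should compare in
        // constant time.
        self.secrets.get(tokenid).is_some_and(|s| s == secret)
    }

    /// Remember a secret after the slow path has verified it.
    pub fn insert(&mut self, shared_generation: usize, tokenid: &str, secret: &str) {
        self.sync_generation(shared_generation);
        self.secrets.insert(tokenid.to_string(), secret.to_string());
    }
}
```

In this sketch a generation bump by any process empties the cache on the next lookup, so the next request for each token goes through full hash verification again.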
Testing
*PBS (pbs-config)*
To verify the effect in PBS, I:
1. Set up a test environment based on the latest PBS ISO, installed the
Rust toolchain, and cloned the proxmox-backup repository for use with
cargo flamegraph. Reproduced bug #7017 [1] by profiling the /status
endpoint with token-based authentication using cargo flamegraph [2].
2. Built PBS with the pbs-config patches and re-ran the same workload
and profiling setup. Confirmed that the
proxmox_sys::crypt::verify_crypt_pw path no longer appears in the
hot section of the flamegraph. CPU usage is now dominated by TLS
overhead.
3. Functionally, I verified that:
* valid tokens authenticate correctly when used in API requests
* invalid secrets are rejected as before
* generating a new token secret via dashboard (create token for user,
regenerate existing secret) works and authenticates correctly
*PDM (proxmox-access-control)*
To verify the effect in PDM, I followed a similar testing approach.
Instead of PBS’ /status, I profiled the /version endpoint with cargo
flamegraph [3] and verified that the expensive hashing path disappears
from the hot section after introducing caching.
Functionally, I verified that:
* valid tokens authenticate correctly when used in API requests
* invalid secrets are rejected as before
* generating a new token secret via dashboard (create token for user,
regenerate existing secret) works and authenticates correctly
Benchmarks:
Two different benchmarks have been run to measure caching effects
and RwLock contention:
(1) Requests per second for PBS /status endpoint (E2E)
Benchmarked parallel token auth requests for
/status?verbose=0 on top of the datastore lookup cache series [4]
to check throughput impact. With datastores=1, repeat=5000, parallel=16
this series gives ~172 req/s compared to ~65 req/s without it.
This is a ~2.6x improvement (and aligns with the ~179 req/s from the
previous series, which used per-process cache invalidation).
(2) RwLock contention for token create/delete under heavy load of
token-authenticated requests
The previous version of the series compared std::sync::RwLock and
parking_lot::RwLock contention for token create/delete under heavy
parallel token-authenticated readers. parking_lot::RwLock has been
chosen for the added fairness guarantees.
Patch summary
pbs-config:
0001 – pbs-config: add token.shadow generation to ConfigVersionCache
Extends ConfigVersionCache to provide a process-shared generation
number for token.shadow changes.
0002 – pbs-config: cache verified API token secrets
Adds an in-memory cache for verified, plain-text API token secrets.
The cache is invalidated through the process-shared ConfigVersionCache
generation number. Uses openssl’s constant-time memcmp for matching
secrets.
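To show why the comparison is constant-time, here is a standard-library-only sketch of the technique. It is a stand-in for illustration, not the openssl routine the patch uses. constant_time_eq is a hypothetical helper, and a hand-written loop like this gives weaker guarantees than a vetted primitive.

```rust
/// Compare two byte strings without returning early at the first
/// mismatching byte, so the run time does not reveal how long the
/// matching prefix is.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length is not treated as secret here; unequal lengths never match.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the XOR of every byte pair; any difference sets a bit.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    // black_box discourages the optimizer from short-circuiting the loop;
    // it is a hint, not a guarantee.
    std::hint::black_box(diff) == 0
}
```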
0003 – pbs-config: invalidate token-secret cache on token.shadow
changes
On each token verification request, stats token.shadow (mtime and
length) and clears the cache when the file has changed.
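A minimal sketch of the mtime + length check, assuming std::fs::metadata as the stat source. FileFingerprint and file_changed are hypothetical names, not the helpers from the patch.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Cheap change indicator for a file: modification time plus length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileFingerprint {
    mtime: SystemTime,
    len: u64,
}

/// Read the current fingerprint of `path` with a single metadata call.
pub fn fingerprint(path: &Path) -> io::Result<FileFingerprint> {
    let meta = fs::metadata(path)?;
    Ok(FileFingerprint {
        mtime: meta.modified()?,
        len: meta.len(),
    })
}

/// Returns true if the file differs from the fingerprint stored in
/// `last`, and stores the current fingerprint. The first call (with
/// `last == None`) always reports a change.
pub fn file_changed(path: &Path, last: &mut Option<FileFingerprint>) -> io::Result<bool> {
    let current = fingerprint(path)?;
    let changed = *last != Some(current);
    *last = Some(current);
    Ok(changed)
}
```

A caller would clear its secret cache whenever file_changed returns true. An edit that keeps the length and lands within the file system's timestamp granularity would go unnoticed by a check like this one.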
0004 – pbs-config: add TTL window to token-secret cache
Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
checks so that fs::metadata calls are not performed on each request.
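The TTL window can be pictured as a small gate in front of the metadata check. This is a sketch under assumptions: MetadataCheckGate is a hypothetical type, and only the constant name and its default of 60 come from the description above (its u64 type is assumed).

```rust
use std::time::{Duration, Instant};

/// Name and default taken from the patch description; the type is assumed.
pub const TOKEN_SECRET_CACHE_TTL_SECS: u64 = 60;

/// Decides whether the (comparatively expensive) file metadata check
/// should run for the current request.
pub struct MetadataCheckGate {
    ttl: Duration,
    last_check: Option<Instant>,
}

impl MetadataCheckGate {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, last_check: None }
    }

    /// Returns true if a check is due at `now` and records it as done.
    /// Within the TTL window after a check, returns false.
    pub fn check_due(&mut self, now: Instant) -> bool {
        let due = match self.last_check {
            None => true,
            Some(last) => now.saturating_duration_since(last) >= self.ttl,
        };
        if due {
            self.last_check = Some(now);
        }
        due
    }
}
```

With such a gate, a manual edit to token.shadow can take up to one TTL window to be noticed, which is the delayed effect that patch 0010 documents.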
proxmox-access-control:
0005 – access-control: extend AccessControlConfig for token.shadow invalidation
Extends the AccessControlConfig trait with
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() for
proxmox-access-control to get the shared token.shadow generation number
and bump it on token shadow changes.
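The shape of these two hooks might look roughly like the sketch below. Only the method names come from this series. The trait name, the Option<usize> return types and the process-local AtomicUsize (standing in for the shared-memory counter) are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the two new hooks; the real trait is
/// AccessControlConfig and its signatures may differ.
pub trait TokenShadowGenerationSource {
    /// Current generation, or None if the product provides none.
    fn token_shadow_cache_generation(&self) -> Option<usize>;
    /// Bump the generation and return the new value.
    fn increment_token_shadow_cache_generation(&self) -> Option<usize>;
}

/// Process-local counter used here in place of a counter shared
/// between the privileged and unprivileged daemons.
#[derive(Default)]
pub struct LocalGeneration(AtomicUsize);

impl TokenShadowGenerationSource for LocalGeneration {
    fn token_shadow_cache_generation(&self) -> Option<usize> {
        Some(self.0.load(Ordering::Acquire))
    }

    fn increment_token_shadow_cache_generation(&self) -> Option<usize> {
        // fetch_add returns the previous value, so add one for the new one.
        Some(self.0.fetch_add(1, Ordering::AcqRel) + 1)
    }
}
```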
0006 – access-control: cache verified API token secrets
Mirrors PBS PATCH 0002.
0007 – access-control: invalidate token-secret cache on token.shadow changes
Mirrors PBS PATCH 0003.
0008 – access-control: add TTL window to token-secret cache
Mirrors PBS PATCH 0004.
proxmox-datacenter-manager:
0009 – pdm-config: add token.shadow generation to ConfigVersionCache
Extends PDM ConfigVersionCache and implements
token_shadow_cache_generation() and
increment_token_shadow_cache_generation() from AccessControlConfig for
PDM.
0010 – docs: document API token-cache TTL effects
Documents the effects of the TTL window on token.shadow edits.
Changes from v1 to v2:
* (refactor) Switched cache initialization to LazyLock
* (perf) Use parking_lot::RwLock and best-effort cache access on the
read/refresh path (try_read/try_write) to avoid lock contention
* (doc) Document TTL-delayed effect of manual token.shadow edits
* (fix) Add generation guards (API_MUTATION_GENERATION +
FILE_GENERATION) to prevent caching across concurrent set/delete and
external edits
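The LazyLock initialization and the best-effort try_read/try_write access can be sketched as follows. std::sync::RwLock stands in for parking_lot::RwLock so that the example needs no external crate. TOKEN_CACHE, cache_lookup and cache_try_insert are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, RwLock};

/// Hypothetical global cache, initialized on first use.
static TOKEN_CACHE: LazyLock<RwLock<HashMap<String, String>>> =
    LazyLock::new(|| RwLock::new(HashMap::new()));

/// Best-effort lookup: if the lock cannot be taken right now, report a
/// miss so the caller falls back to the slow verification path instead
/// of waiting.
pub fn cache_lookup(tokenid: &str) -> Option<String> {
    let guard = TOKEN_CACHE.try_read().ok()?;
    guard.get(tokenid).cloned()
}

/// Best-effort insert: returns false if the entry was skipped because
/// the lock was busy. Skipping only costs a later cache miss.
pub fn cache_try_insert(tokenid: &str, secret: &str) -> bool {
    match TOKEN_CACHE.try_write() {
        Ok(mut guard) => {
            guard.insert(tokenid.to_string(), secret.to_string());
            true
        }
        Err(_) => false,
    }
}
```

The design choice this illustrates is that a busy lock is treated as a cache miss: authentication stays correct because the slow path still runs, and readers never queue up behind a writer.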
Changes from v2 to v3:
* (refactor) Replace PBS per-process cache invalidation with a
cross-process token.shadow generation based on PBS
ConfigVersionCache, ensuring cache consistency between privileged
and unprivileged daemons.
* (refactor) Decouple the generation source from the
proxmox/proxmox-access-control cache implementation: extend
AccessControlConfig hooks so that products can provide the shared
token.shadow generation source.
* (refactor) Extend PDM's ConfigVersionCache with
token_shadow_generation and introduce a
pdm_config::AccessControlConfig wrapper implementing the new
proxmox-access-control trait hooks. Switch server and CLI
initialization to use pdm_config::AccessControlConfig instead of
pdm_api_types::AccessControlConfig.
* (refactor) Adapt generation checks around cached-secret comparison to
use the new shared generation source.
* (fix/logic) cache_try_insert_secret: Update the local cache
generation if stale, allowing the new secret to be inserted
immediately
* (refactor) Extract cache invalidation logic into an
invalidate_cache_state helper to reduce duplication and ensure
consistent state resets
* (refactor) Simplify refresh_cache_if_file_changed: handle the
un-initialized/reset state and adjust the generation mismatch
path to ensure file metadata is always re-read.
* (doc) Clarify TTL-delayed effects of manual token.shadow edits.
Please see the patch-specific changelogs for more details.
Thanks for considering this patch series, I look forward to your
feedback.
Best,
Samuel Rufinatscha
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
[2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
[3] attachment 1794 [1]: Flamegraph PDM baseline
[4] https://bugzilla.proxmox.com/show_bug.cgi?id=6049
proxmox-backup:
Samuel Rufinatscha (4):
pbs-config: add token.shadow generation to ConfigVersionCache
pbs-config: cache verified API token secrets
pbs-config: invalidate token-secret cache on token.shadow changes
pbs-config: add TTL window to token secret cache
Cargo.toml | 1 +
docs/user-management.rst | 4 +
pbs-config/Cargo.toml | 1 +
pbs-config/src/config_version_cache.rs | 18 ++
pbs-config/src/token_shadow.rs | 298 ++++++++++++++++++++++++-
5 files changed, 321 insertions(+), 1 deletion(-)
proxmox:
Samuel Rufinatscha (4):
proxmox-access-control: extend AccessControlConfig for token.shadow
invalidation
proxmox-access-control: cache verified API token secrets
proxmox-access-control: invalidate token-secret cache on token.shadow
changes
proxmox-access-control: add TTL window to token secret cache
Cargo.toml | 1 +
proxmox-access-control/Cargo.toml | 1 +
proxmox-access-control/src/init.rs | 17 ++
proxmox-access-control/src/token_shadow.rs | 299 ++++++++++++++++++++-
4 files changed, 317 insertions(+), 1 deletion(-)
proxmox-datacenter-manager:
Samuel Rufinatscha (2):
pdm-config: implement token.shadow generation
docs: document API token-cache TTL effects
cli/admin/src/main.rs | 2 +-
docs/access-control.rst | 4 ++
lib/pdm-config/Cargo.toml | 1 +
lib/pdm-config/src/access_control_config.rs | 73 +++++++++++++++++++++
lib/pdm-config/src/config_version_cache.rs | 18 +++++
lib/pdm-config/src/lib.rs | 2 +
server/src/acl.rs | 3 +-
7 files changed, 100 insertions(+), 3 deletions(-)
create mode 100644 lib/pdm-config/src/access_control_config.rs
Summary over all repositories:
16 files changed, 738 insertions(+), 5 deletions(-)
--
Generated by git-murpp 0.8.1
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel