public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] superseded: [PATCH proxmox{-backup, , -datacenter-manager} v2 0/7] token-shadow: reduce api token verification overhead
Date: Fri, 2 Jan 2026 17:09:45 +0100	[thread overview]
Message-ID: <323772e0-ac91-4bd3-942c-e18e70119e05@proxmox.com> (raw)
In-Reply-To: <20251217162520.486520-1-s.rufinatscha@proxmox.com>

https://lore.proxmox.com/pbs-devel/20260102160750.285157-1-s.rufinatscha@proxmox.com/T/#t

On 12/17/25 5:25 PM, Samuel Rufinatscha wrote:
> Hi,
> 
> this series improves the performance of token-based API authentication
> in PBS (pbs-config) and in PDM (underlying proxmox-access-control
> crate), addressing the API token verification hotspot reported in our
> bugtracker #6049 [1].
> 
> When profiling PBS /status endpoint with cargo flamegraph [2],
> token-based authentication showed up as a dominant hotspot via
> proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
> path from the hot section of the flamegraph. The same performance issue
> was measured [2] for PDM. PDM uses the underlying shared
> proxmox-access-control library for token handling, which is a
> factored out version of the token.shadow handling code from PBS.
> 
> While this series fixes the immediate performance issue both in PBS
> (pbs-config) and in the shared proxmox-access-control crate used by
> PDM, PBS should eventually, ideally be refactored, in a separate
> effort, to use proxmox-access-control for token handling instead of its
> local implementation.
> 
> Problem
> 
> For token-based API requests, both PBS’s pbs-config token.shadow
> handling and PDM proxmox-access-control’s token.shadow handling
> currently:
> 
> 1. read the token.shadow file on each request
> 2. deserialize it into a HashMap<Authid, String>
> 3. run password hash verification via
>     proxmox_sys::crypt::verify_crypt_pw for the provided token secret
> 
> Under load, this results in significant CPU usage spent in repeated
> password hash computations for the same token+secret pairs. The
> attached flamegraphs for PBS [2] and PDM [3] show
> proxmox_sys::crypt::verify_crypt_pw dominating the hot path.
> 
> Approach
> 
> The goal is to reduce the cost of token-based authentication preserving
> the existing token handling semantics (including detecting manual edits
> to token.shadow) and be consistent between PBS (pbs-config) and
> PDM (proxmox-access-control). For both sites, the series proposes
> following approach:
> 
> 1. Introduce an in-memory cache for verified token secrets
> 2. Invalidate the cache when token.shadow changes (detect manual edits)
> 3. Control metadata checks with a TTL window
> 
> Testing
> 
> *PBS (pbs-config)*
> 
> To verify the effect in PBS, I:
> 1. Set up test environment based on latest PBS ISO, installed Rust
>     toolchain, cloned proxmox-backup repository to use with cargo
>     flamegraph. Reproduced bug #6049 [1] by profiling the /status
>     endpoint with token-based authentication using cargo flamegraph [2].
>     The flamegraph showed proxmox_sys::crypt::verify_crypt_pw is the
>     hotspot.
> 2. Built PBS with pbs-config patches and re-ran the same workload and
>     profiling setup.
> 3. Confirmed that the proxmox_sys::crypt::verify_crypt_pw path no
>     longer appears in the hot section of the flamegraph. CPU usage is
>     now dominated by TLS overhead.
> 4. Functionally verified that:
>     * token-based API authentication still works for valid tokens
>     * invalid secrets are rejected as before
>     * generating a new token secret via dashboard works and
>     authenticates correctly
> 
> *PDM (proxmox-access-control)*
> 
> To verify the effect in PDM, I followed a similar testing approach.
> Instead of /status, I profiled the /version endpoint with cargo
> flamegraph [2] and verified that the token hashing path disappears [4]
> from the hot section after applying the proxmox-access-control patches.
> 
> Functionally I verified that:
>     * token-based API authentication still works for valid tokens
>     * invalid secrets are rejected as before
>     * generating a new token secret via dashboard works and
>     authenticates correctly
> 
> Benchmarks:
> 
> Two different benchmarks have been run to measure caching effects
> and RwLock contention:
> 
> (1) Requests per second for PBS /status endpoint (E2E)
> (2) RwLock contention for token create/delete under
>      heavy parallel token-authenticated readers; compared
>      std::sync::RwLock and parking_lot::RwLock.
> 
> (1) benchmarked parallel token auth requests for
> /status?verbose=0 on top of the datastore lookup cache series [5]
> to check throughput impact. With datastores=1, repeat=5000, parallel=16
> this series gives ~179 req/s compared to ~65 req/s without it.
> This is a ~2.75x improvement.
> 
> (2) benchmarked token create/delete operations under heavy load of
> token-authenticated requests on top of the datastore lookup cache [5]
> series. This benchmark was done using against a 64-parallel
> token-auth flood (200k requests) against
> /admin/datastore/ds0001/status?verbose=0 while executing 50 token
> create + 50 token delete operations. After the series I saw the
> following e2e API latencies:
> 
> parking_lot::RwLock
>    - create avg ~27ms (p95 ~28ms) vs ~46ms (p95 ~50ms) baseline
>    - delete avg ~17ms (p95 ~19ms) vs ~33ms (p95 ~35ms) baseline
> 
> std::sync::RwLock
>    - create avg ~27ms (p95 ~28ms)
>    - create avg ~17ms (p95 ~19ms)
> 
> It appears that the both RwLock implementations perform similarly
> for this workload. The parking_lot version has been chosen for the
> added fairness guarantees.
> 
> Patch summary
> 
> pbs-config:
> 
> 0001 – pbs-config: cache verified API token secrets
> Adds an in-memory cache keyed by Authid that stores plain text token
> secrets after a successful verification or generation and uses
> openssl’s memcmp constant-time for comparison.
> 
> 0002 – pbs-config: invalidate token-secret cache on token.shadow
> changes
> Tracks token.shadow mtime and length and clears the in-memory
> cache when the file changes.
> 
> 0003 – pbs-config: add TTL window to token-secret cache
> Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
> checks so that fs::metadata is only called periodically.
> 
> proxmox-access-control:
> 
> 0004 – access-control: cache verified API token secrets
> Mirrors PBS PATCH 0001.
> 
> 0005 – access-control: invalidate token-secret cache on token.shadow changes
> Mirrors PBS PATCH 0002.
> 
> 0006 – access-control: add TTL window to token-secret cache
> Mirrors PBS PATCH 0003.
> 
> proxmox-datacenter-manager:
> 
> 0007 – docs: document API token-cache TTL effects
> Documents the effects of the TTL window on token.shadow edits
> 
> Changes since v1
> 
> - (refactor) Switched cache initialization to LazyLock
> - (perf) Use parking_lot::RwLock and best-effort cache access on the
>    read/refresh path (try_read/try_write) to avoid lock contention
> - (doc) Document TTL-delayed effect of manual token.shadow edits
> - (fix) Add generation guards (API_MUTATION_GENERATION +
>    FILE_GENERATION) to prevent caching across concurrent set/delete and
>    external edits
> 
> Please see the patch specific changelogs for more details.
> 
> Thanks for considering this patch series, I look forward to your
> feedback.
> 
> Best,
> Samuel Rufinatscha
> 
> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> [2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
> [3] attachment 1794 [1]: Flamegraph PDM baseline
> [4] attachment 1795 [1]: Flamegraph PDM patched
> [5] https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> 
> proxmox-backup:
> 
> Samuel Rufinatscha (3):
>    pbs-config: cache verified API token secrets
>    pbs-config: invalidate token-secret cache on token.shadow changes
>    pbs-config: add TTL window to token secret cache
> 
>   Cargo.toml                     |   1 +
>   docs/user-management.rst       |   4 +
>   pbs-config/Cargo.toml          |   1 +
>   pbs-config/src/token_shadow.rs | 238 ++++++++++++++++++++++++++++++++-
>   4 files changed, 243 insertions(+), 1 deletion(-)
> 
> 
> proxmox:
> 
> Samuel Rufinatscha (3):
>    proxmox-access-control: cache verified API token secrets
>    proxmox-access-control: invalidate token-secret cache on token.shadow
>      changes
>    proxmox-access-control: add TTL window to token secret cache
> 
>   Cargo.toml                                 |   1 +
>   proxmox-access-control/Cargo.toml          |   1 +
>   proxmox-access-control/src/token_shadow.rs | 238 ++++++++++++++++++++-
>   3 files changed, 239 insertions(+), 1 deletion(-)
> 
> 
> proxmox-datacenter-manager:
> 
> Samuel Rufinatscha (1):
>    docs: document API token-cache TTL effects
> 
>   docs/access-control.rst | 3 +++
>   1 file changed, 3 insertions(+)
> 
> 
> Summary over all repositories:
>    8 files changed, 485 insertions(+), 2 deletions(-)
> 



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

      parent reply	other threads:[~2026-01-02 16:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-17 16:25 [pbs-devel] " Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox-backup v2 1/3] pbs-config: cache verified API token secrets Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox-backup v2 2/3] pbs-config: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox-backup v2 3/3] pbs-config: add TTL window to token secret cache Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox v2 1/3] proxmox-access-control: cache verified API token secrets Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox v2 2/3] proxmox-access-control: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox v2 3/3] proxmox-access-control: add TTL window to token secret cache Samuel Rufinatscha
2025-12-17 16:25 ` [pbs-devel] [PATCH proxmox-datacenter-manager v2 1/1] docs: document API token-cache TTL effects Samuel Rufinatscha
2025-12-18 11:03 ` [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v2 0/7] token-shadow: reduce api token verification overhead Samuel Rufinatscha
2026-01-02 16:09 ` Samuel Rufinatscha [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=323772e0-ac91-4bd3-942c-e18e70119e05@proxmox.com \
    --to=s.rufinatscha@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal