all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	Samuel Rufinatscha <s.rufinatscha@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets
Date: Fri, 16 Jan 2026 16:29:17 +0100	[thread overview]
Message-ID: <176857735749.137827.7483218182007589047@yuna.proxmox.com> (raw)
In-Reply-To: <13d6a203-415d-45ff-b4bb-149903d08f94@proxmox.com>

Quoting Samuel Rufinatscha (2026-01-16 16:13:17)
> On 1/14/26 11:44 AM, Fabian Grünbichler wrote:
> > On January 2, 2026 5:07 pm, Samuel Rufinatscha wrote:
> >> Currently, every token-based API request reads the token.shadow file and
> >> runs the expensive password hash verification for the given token
> >> secret. This shows up as a hotspot in /status profiling (see
> >> bug #7017 [1]).
> >>
> >> This patch introduces an in-memory cache of successfully verified token
> >> secrets. Subsequent requests for the same token+secret combination only
> >> perform a comparison using openssl::memcmp::eq and avoid re-running the
> >> password hash. The cache is updated when a token secret is set and
> >> cleared when a token is deleted. Note, this does NOT include manual
> >> config changes, which will be covered in a subsequent patch.
> >>
> >> This patch is part of the series which fixes bug #7017 [1].
> >>
> >> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> >>
> >> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
> >> ---
> >> Changes from v1 to v2:
> >>
> >> * Replace OnceCell with LazyLock, and std::sync::RwLock with
> >> parking_lot::RwLock.
> >> * Add API_MUTATION_GENERATION and guard cache inserts
> >> to prevent “zombie inserts” across concurrent set/delete.
> >> * Refactor cache operations into cache_try_secret_matches,
> >> cache_try_insert_secret, and centralize write-side behavior in
> >> apply_api_mutation.
> >> * Switch fast-path cache access to try_read/try_write (best-effort).
> >>
> >> Changes from v2 to v3:
> >>
> >> * Replaced process-local cache invalidation (AtomicU64
> >> API_MUTATION_GENERATION) with a cross-process shared generation via
> >> ConfigVersionCache.
> >> * Validate shared generation before/after the constant-time secret
> >> compare; only insert into cache if the generation is unchanged.
> >> * invalidate_cache_state() on insert if shared generation changed.
> >>
> >>   Cargo.toml                     |   1 +
> >>   pbs-config/Cargo.toml          |   1 +
> >>   pbs-config/src/token_shadow.rs | 157 ++++++++++++++++++++++++++++++++-
> >>   3 files changed, 158 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/Cargo.toml b/Cargo.toml
> >> index 1aa57ae5..821b63b7 100644
> >> --- a/Cargo.toml
> >> +++ b/Cargo.toml
> >> @@ -143,6 +143,7 @@ nom = "7"
> >>   num-traits = "0.2"
> >>   once_cell = "1.3.1"
> >>   openssl = "0.10.40"
> >> +parking_lot = "0.12"
> >>   percent-encoding = "2.1"
> >>   pin-project-lite = "0.2"
> >>   regex = "1.5.5"
> >> diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
> >> index 74afb3c6..eb81ce00 100644
> >> --- a/pbs-config/Cargo.toml
> >> +++ b/pbs-config/Cargo.toml
> >> @@ -13,6 +13,7 @@ libc.workspace = true
> >>   nix.workspace = true
> >>   once_cell.workspace = true
> >>   openssl.workspace = true
> >> +parking_lot.workspace = true
> >>   regex.workspace = true
> >>   serde.workspace = true
> >>   serde_json.workspace = true
> >> diff --git a/pbs-config/src/token_shadow.rs b/pbs-config/src/token_shadow.rs
> >> index 640fabbf..fa84aee5 100644
> >> --- a/pbs-config/src/token_shadow.rs
> >> +++ b/pbs-config/src/token_shadow.rs
> >> @@ -1,6 +1,8 @@
> >>   use std::collections::HashMap;
> >> +use std::sync::LazyLock;
> >>   
> >>   use anyhow::{bail, format_err, Error};
> >> +use parking_lot::RwLock;
> >>   use serde::{Deserialize, Serialize};
> >>   use serde_json::{from_value, Value};
> >>   
> >> @@ -13,6 +15,18 @@ use crate::{open_backup_lockfile, BackupLockGuard};
> >>   const LOCK_FILE: &str = pbs_buildcfg::configdir!("/token.shadow.lock");
> >>   const CONF_FILE: &str = pbs_buildcfg::configdir!("/token.shadow");
> >>   
> >> +/// Global in-memory cache for successfully verified API token secrets.
> >> +/// The cache stores plain text secrets for token Authids that have already been
> >> +/// verified against the hashed values in `token.shadow`. This allows for cheap
> >> +/// subsequent authentications for the same token+secret combination, avoiding
> >> +/// recomputing the password hash on every request.
> >> +static TOKEN_SECRET_CACHE: LazyLock<RwLock<ApiTokenSecretCache>> = LazyLock::new(|| {
> >> +    RwLock::new(ApiTokenSecretCache {
> >> +        secrets: HashMap::new(),
> >> +        shared_gen: 0,
> >> +    })
> >> +});
> >> +
> >>   #[derive(Serialize, Deserialize)]
> >>   #[serde(rename_all = "kebab-case")]
> >>   /// ApiToken id / secret pair
> >> @@ -54,9 +68,27 @@ pub fn verify_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
> >>           bail!("not an API token ID");
> >>       }
> >>   
> >> +    // Fast path
> >> +    if cache_try_secret_matches(tokenid, secret) {
> >> +        return Ok(());
> >> +    }
> >> +
> >> +    // Slow path
> >> +    // First, capture the shared generation before doing the hash verification.
> >> +    let gen_before = token_shadow_shared_gen();
> >> +
> >>       let data = read_file()?;
> >>       match data.get(tokenid) {
> >> -        Some(hashed_secret) => proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret),
> >> +        Some(hashed_secret) => {
> >> +            proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret)?;
> >> +
> >> +            // Try to cache only if nothing changed while verifying the secret.
> >> +            if let Some(gen) = gen_before {
> >> +                cache_try_insert_secret(tokenid.clone(), secret.to_owned(), gen);
> >> +            }
> >> +
> >> +            Ok(())
> >> +        }
> >>           None => bail!("invalid API token"),
> >>       }
> >>   }
> >> @@ -82,6 +114,8 @@ fn set_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
> >>       data.insert(tokenid.clone(), hashed_secret);
> >>       write_file(data)?;
> >>   
> >> +    apply_api_mutation(tokenid, Some(secret));
> >> +
> >>       Ok(())
> >>   }
> >>   
> >> @@ -97,5 +131,126 @@ pub fn delete_secret(tokenid: &Authid) -> Result<(), Error> {
> >>       data.remove(tokenid);
> >>       write_file(data)?;
> >>   
> >> +    apply_api_mutation(tokenid, None);
> >> +
> >>       Ok(())
> >>   }
> >> +
> >> +struct ApiTokenSecretCache {
> >> +    /// Keys are token Authids, values are the corresponding plain text secrets.
> >> +    /// Entries are added after a successful on-disk verification in
> >> +    /// `verify_secret` or when a new token secret is generated by
> >> +    /// `generate_and_set_secret`. Used to avoid repeated
> >> +    /// password-hash computation on subsequent authentications.
> >> +    secrets: HashMap<Authid, CachedSecret>,
> >> +    /// Shared generation to detect mutations of the underlying token.shadow file.
> >> +    shared_gen: usize,
> >> +}
> >> +
> >> +/// Cached secret.
> >> +struct CachedSecret {
> >> +    secret: String,
> >> +}
> >> +
> >> +fn cache_try_insert_secret(tokenid: Authid, secret: String, shared_gen_before: usize) {
> >> +    let Some(mut cache) = TOKEN_SECRET_CACHE.try_write() else {
> >> +        return;
> >> +    };
> >> +
> >> +    let Some(shared_gen_now) = token_shadow_shared_gen() else {
> >> +        return;
> >> +    };
> >> +
> >> +    // If this process missed a generation bump, its cache is stale.
> >> +    if cache.shared_gen != shared_gen_now {
> >> +        invalidate_cache_state(&mut cache);
> >> +        cache.shared_gen = shared_gen_now;
> >> +    }
> >> +
> >> +    // If a mutation happened while we were verifying the secret, do not insert.
> >> +    if shared_gen_now == shared_gen_before {
> >> +        cache.secrets.insert(tokenid, CachedSecret { secret });
> >> +    }
> >> +}
> >> +
> >> +// Tries to match the given token secret against the cached secret.
> >> +// Checks the generation before and after the constant-time compare to avoid a
> >> +// TOCTOU window. If another process rotates/deletes a token while we're validating
> >> +// the cached secret, the generation will change, and we
> >> +// must not trust the cache for this request.
> >> +fn cache_try_secret_matches(tokenid: &Authid, secret: &str) -> bool {
> >> +    let Some(cache) = TOKEN_SECRET_CACHE.try_read() else {
> >> +        return false;
> >> +    };
> >> +    let Some(entry) = cache.secrets.get(tokenid) else {
> >> +        return false;
> >> +    };
> >> +
> >> +    let cache_gen = cache.shared_gen;
> >> +
> >> +    let Some(gen1) = token_shadow_shared_gen() else {
> >> +        return false;
> >> +    };
> >> +    if gen1 != cache_gen {
> >> +        return false;
> >> +    }
> >> +
> >> +    let eq = openssl::memcmp::eq(entry.secret.as_bytes(), secret.as_bytes());
> > 
> > should we invalidate the cache here for this particular authid in case
> > of a mismatch, to avoid making brute forcing too easy/cheap?
> >
> 
> We are not doing a cheap reject, in mismatch we do still fall through to
> verify_crypt_pw(). Evicting on mismatch could however enable cache
> thrashing where wrong secrets for a known tokenid would evict cached
> entries. So I think we should not invalidate here on mismatch.
> 
> >> +    let Some(gen2) = token_shadow_shared_gen() else {
> >> +        return false;
> >> +    };
> >> +
> >> +    eq && gen2 == cache_gen
> >> +}
> >> +
> >> +fn apply_api_mutation(tokenid: &Authid, new_secret: Option<&str>) {
> >> +    // Signal cache invalidation to other processes (best-effort).
> >> +    let new_shared_gen = bump_token_shadow_shared_gen();
> >> +
> >> +    let mut cache = TOKEN_SECRET_CACHE.write();
> >> +
> >> +    // If we cannot read/bump the shared generation, we cannot safely trust the cache.
> >> +    let Some(gen) = new_shared_gen else {
> >> +        invalidate_cache_state(&mut cache);
> >> +        cache.shared_gen = 0;
> >> +        return;
> >> +    };
> >> +
> >> +    // Update to the post-mutation generation.
> >> +    cache.shared_gen = gen;
> >> +
> >> +    // Apply the new mutation.
> >> +    match new_secret {
> >> +        Some(secret) => {
> >> +            cache.secrets.insert(
> >> +                tokenid.clone(),
> >> +                CachedSecret {
> >> +                    secret: secret.to_owned(),
> >> +                },
> >> +            );
> >> +        }
> >> +        None => {
> >> +            cache.secrets.remove(tokenid);
> >> +        }
> >> +    }
> >> +}
> >> +
> >> +/// Get the current shared generation.
> >> +fn token_shadow_shared_gen() -> Option<usize> {
> >> +    crate::ConfigVersionCache::new()
> >> +        .ok()
> >> +        .map(|cvc| cvc.token_shadow_generation())
> >> +}
> >> +
> >> +/// Bump and return the new shared generation.
> >> +fn bump_token_shadow_shared_gen() -> Option<usize> {
> >> +    crate::ConfigVersionCache::new()
> >> +        .ok()
> >> +        .map(|cvc| cvc.increase_token_shadow_generation() + 1)
> >> +}
> >> +
> >> +/// Invalidates the cache state and only keeps the shared generation.
> > 
> > both calls to this actually set the cached generation to some value
> > right after, so maybe this should take a generation directly and set it?
> >
> 
> patch 3/4 doesn’t always update the gen on cache invalidation
> (shadow_mtime_len() error branch in apply_api_mutation) but most other
> call sites do. Agreed this can be refactored, maybe:

that one sets the generation before (potentially) invalidating the cache
though, so we could unconditionally reset the generation to that value when
invalidating.. we should maybe also re-order the lock and bump there?

> 
> fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
>      cache.secrets.clear();
>      // clear other cache fields (mtime/len/last_checked) as needed
> }
> 
> fn invalidate_cache_state_and_set_gen(cache: &mut ApiTokenSecretCache, 
> gen: usize) {
>      invalidate_cache_state(cache);
>      cache.shared_gen = gen;
> }
> 
> We could also do a single helper with Option<usize> but two helpers make 
> the call sites more explicit.
> 
> >> +fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
> >> +    cache.secrets.clear();
> >> +}
> >> -- 
> >> 2.47.3
> >>
> >>
> >>
> >> _______________________________________________
> >> pbs-devel mailing list
> >> pbs-devel@lists.proxmox.com
> >> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> >>
> > 
> > 
> > _______________________________________________
> > pbs-devel mailing list
> > pbs-devel@lists.proxmox.com
> > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
>


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

  reply	other threads:[~2026-01-16 15:30 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-02 16:07 [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 1/4] pbs-config: add token.shadow generation to ConfigVersionCache Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 13:53     ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 15:13     ` Samuel Rufinatscha
2026-01-16 15:29       ` Fabian Grünbichler [this message]
2026-01-16 15:33         ` Samuel Rufinatscha
2026-01-16 16:00       ` Fabian Grünbichler
2026-01-16 16:56         ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 3/4] pbs-config: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 4/4] pbs-config: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 1/4] proxmox-access-control: extend AccessControlConfig for token.shadow invalidation Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 2/4] proxmox-access-control: cache verified API token secrets Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 3/4] proxmox-access-control: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 4/4] proxmox-access-control: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 1/2] pdm-config: implement token.shadow generation Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-16 16:28     ` Samuel Rufinatscha
2026-01-16 16:48       ` Shannon Sterz
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 2/2] docs: document API token-cache TTL effects Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-14 11:24     ` Samuel Rufinatscha

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=176857735749.137827.7483218182007589047@yuna.proxmox.com \
    --to=f.gruenbichler@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    --cc=s.rufinatscha@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal