public inbox for pbs-devel@lists.proxmox.com
From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	"Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets
Date: Fri, 16 Jan 2026 16:33:03 +0100	[thread overview]
Message-ID: <0be759dd-2254-4ae9-a326-6716a3bbbad8@proxmox.com> (raw)
In-Reply-To: <176857735749.137827.7483218182007589047@yuna.proxmox.com>

On 1/16/26 4:28 PM, Fabian Grünbichler wrote:
> Quoting Samuel Rufinatscha (2026-01-16 16:13:17)
>> On 1/14/26 11:44 AM, Fabian Grünbichler wrote:
>>> On January 2, 2026 5:07 pm, Samuel Rufinatscha wrote:
>>>> Currently, every token-based API request reads the token.shadow file and
>>>> runs the expensive password hash verification for the given token
>>>> secret. This shows up as a hotspot in /status profiling (see
>>>> bug #7017 [1]).
>>>>
>>>> This patch introduces an in-memory cache of successfully verified token
>>>> secrets. Subsequent requests for the same token+secret combination only
>>>> perform a comparison using openssl::memcmp::eq and avoid re-running the
>>>> password hash. The cache is updated when a token secret is set and
>>>> cleared when a token is deleted. Note that this does NOT include manual
>>>> config changes, which will be covered in a subsequent patch.
>>>>
>>>> This patch is part of the series which fixes bug #7017 [1].
>>>>
>>>> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
>>>>
>>>> Signed-off-by: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
>>>> ---
>>>> Changes from v1 to v2:
>>>>
>>>> * Replace OnceCell with LazyLock, and std::sync::RwLock with
>>>> parking_lot::RwLock.
>>>> * Add API_MUTATION_GENERATION and guard cache inserts
>>>> to prevent “zombie inserts” across concurrent set/delete.
>>>> * Refactor cache operations into cache_try_secret_matches,
>>>> cache_try_insert_secret, and centralize write-side behavior in
>>>> apply_api_mutation.
>>>> * Switch fast-path cache access to try_read/try_write (best-effort).
>>>>
>>>> Changes from v2 to v3:
>>>>
>>>> * Replaced process-local cache invalidation (AtomicU64
>>>> API_MUTATION_GENERATION) with a cross-process shared generation via
>>>> ConfigVersionCache.
>>>> * Validate shared generation before/after the constant-time secret
>>>> compare; only insert into cache if the generation is unchanged.
>>>> * invalidate_cache_state() on insert if shared generation changed.
>>>>
>>>>    Cargo.toml                     |   1 +
>>>>    pbs-config/Cargo.toml          |   1 +
>>>>    pbs-config/src/token_shadow.rs | 157 ++++++++++++++++++++++++++++++++-
>>>>    3 files changed, 158 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Cargo.toml b/Cargo.toml
>>>> index 1aa57ae5..821b63b7 100644
>>>> --- a/Cargo.toml
>>>> +++ b/Cargo.toml
>>>> @@ -143,6 +143,7 @@ nom = "7"
>>>>    num-traits = "0.2"
>>>>    once_cell = "1.3.1"
>>>>    openssl = "0.10.40"
>>>> +parking_lot = "0.12"
>>>>    percent-encoding = "2.1"
>>>>    pin-project-lite = "0.2"
>>>>    regex = "1.5.5"
>>>> diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
>>>> index 74afb3c6..eb81ce00 100644
>>>> --- a/pbs-config/Cargo.toml
>>>> +++ b/pbs-config/Cargo.toml
>>>> @@ -13,6 +13,7 @@ libc.workspace = true
>>>>    nix.workspace = true
>>>>    once_cell.workspace = true
>>>>    openssl.workspace = true
>>>> +parking_lot.workspace = true
>>>>    regex.workspace = true
>>>>    serde.workspace = true
>>>>    serde_json.workspace = true
>>>> diff --git a/pbs-config/src/token_shadow.rs b/pbs-config/src/token_shadow.rs
>>>> index 640fabbf..fa84aee5 100644
>>>> --- a/pbs-config/src/token_shadow.rs
>>>> +++ b/pbs-config/src/token_shadow.rs
>>>> @@ -1,6 +1,8 @@
>>>>    use std::collections::HashMap;
>>>> +use std::sync::LazyLock;
>>>>    
>>>>    use anyhow::{bail, format_err, Error};
>>>> +use parking_lot::RwLock;
>>>>    use serde::{Deserialize, Serialize};
>>>>    use serde_json::{from_value, Value};
>>>>    
>>>> @@ -13,6 +15,18 @@ use crate::{open_backup_lockfile, BackupLockGuard};
>>>>    const LOCK_FILE: &str = pbs_buildcfg::configdir!("/token.shadow.lock");
>>>>    const CONF_FILE: &str = pbs_buildcfg::configdir!("/token.shadow");
>>>>    
>>>> +/// Global in-memory cache for successfully verified API token secrets.
>>>> +/// The cache stores plain text secrets for token Authids that have already been
>>>> +/// verified against the hashed values in `token.shadow`. This allows for cheap
>>>> +/// subsequent authentications for the same token+secret combination, avoiding
>>>> +/// recomputing the password hash on every request.
>>>> +static TOKEN_SECRET_CACHE: LazyLock<RwLock<ApiTokenSecretCache>> = LazyLock::new(|| {
>>>> +    RwLock::new(ApiTokenSecretCache {
>>>> +        secrets: HashMap::new(),
>>>> +        shared_gen: 0,
>>>> +    })
>>>> +});
>>>> +
>>>>    #[derive(Serialize, Deserialize)]
>>>>    #[serde(rename_all = "kebab-case")]
>>>>    /// ApiToken id / secret pair
>>>> @@ -54,9 +68,27 @@ pub fn verify_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
>>>>            bail!("not an API token ID");
>>>>        }
>>>>    
>>>> +    // Fast path
>>>> +    if cache_try_secret_matches(tokenid, secret) {
>>>> +        return Ok(());
>>>> +    }
>>>> +
>>>> +    // Slow path
>>>> +    // First, capture the shared generation before doing the hash verification.
>>>> +    let gen_before = token_shadow_shared_gen();
>>>> +
>>>>        let data = read_file()?;
>>>>        match data.get(tokenid) {
>>>> -        Some(hashed_secret) => proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret),
>>>> +        Some(hashed_secret) => {
>>>> +            proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret)?;
>>>> +
>>>> +            // Try to cache only if nothing changed while verifying the secret.
>>>> +            if let Some(gen) = gen_before {
>>>> +                cache_try_insert_secret(tokenid.clone(), secret.to_owned(), gen);
>>>> +            }
>>>> +
>>>> +            Ok(())
>>>> +        }
>>>>            None => bail!("invalid API token"),
>>>>        }
>>>>    }
>>>> @@ -82,6 +114,8 @@ fn set_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
>>>>        data.insert(tokenid.clone(), hashed_secret);
>>>>        write_file(data)?;
>>>>    
>>>> +    apply_api_mutation(tokenid, Some(secret));
>>>> +
>>>>        Ok(())
>>>>    }
>>>>    
>>>> @@ -97,5 +131,126 @@ pub fn delete_secret(tokenid: &Authid) -> Result<(), Error> {
>>>>        data.remove(tokenid);
>>>>        write_file(data)?;
>>>>    
>>>> +    apply_api_mutation(tokenid, None);
>>>> +
>>>>        Ok(())
>>>>    }
>>>> +
>>>> +struct ApiTokenSecretCache {
>>>> +    /// Keys are token Authids, values are the corresponding plain text secrets.
>>>> +    /// Entries are added after a successful on-disk verification in
>>>> +    /// `verify_secret` or when a new token secret is generated by
>>>> +    /// `generate_and_set_secret`. Used to avoid repeated
>>>> +    /// password-hash computation on subsequent authentications.
>>>> +    secrets: HashMap<Authid, CachedSecret>,
>>>> +    /// Shared generation to detect mutations of the underlying token.shadow file.
>>>> +    shared_gen: usize,
>>>> +}
>>>> +
>>>> +/// Cached secret.
>>>> +struct CachedSecret {
>>>> +    secret: String,
>>>> +}
>>>> +
>>>> +fn cache_try_insert_secret(tokenid: Authid, secret: String, shared_gen_before: usize) {
>>>> +    let Some(mut cache) = TOKEN_SECRET_CACHE.try_write() else {
>>>> +        return;
>>>> +    };
>>>> +
>>>> +    let Some(shared_gen_now) = token_shadow_shared_gen() else {
>>>> +        return;
>>>> +    };
>>>> +
>>>> +    // If this process missed a generation bump, its cache is stale.
>>>> +    if cache.shared_gen != shared_gen_now {
>>>> +        invalidate_cache_state(&mut cache);
>>>> +        cache.shared_gen = shared_gen_now;
>>>> +    }
>>>> +
>>>> +    // If a mutation happened while we were verifying the secret, do not insert.
>>>> +    if shared_gen_now == shared_gen_before {
>>>> +        cache.secrets.insert(tokenid, CachedSecret { secret });
>>>> +    }
>>>> +}
>>>> +
>>>> +// Tries to match the given token secret against the cached secret.
>>>> +// Checks the generation before and after the constant-time compare to avoid a
>>>> +// TOCTOU window. If another process rotates/deletes a token while we're validating
>>>> +// the cached secret, the generation will change, and we
>>>> +// must not trust the cache for this request.
>>>> +fn cache_try_secret_matches(tokenid: &Authid, secret: &str) -> bool {
>>>> +    let Some(cache) = TOKEN_SECRET_CACHE.try_read() else {
>>>> +        return false;
>>>> +    };
>>>> +    let Some(entry) = cache.secrets.get(tokenid) else {
>>>> +        return false;
>>>> +    };
>>>> +
>>>> +    let cache_gen = cache.shared_gen;
>>>> +
>>>> +    let Some(gen1) = token_shadow_shared_gen() else {
>>>> +        return false;
>>>> +    };
>>>> +    if gen1 != cache_gen {
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    let eq = openssl::memcmp::eq(entry.secret.as_bytes(), secret.as_bytes());
>>>
>>> should we invalidate the cache here for this particular authid in case
>>> of a mismatch, to avoid making brute forcing too easy/cheap?
>>>
>>
>> We are not doing a cheap reject; on a mismatch we still fall through to
>> verify_crypt_pw(). Evicting on mismatch could, however, enable cache
>> thrashing, where wrong secrets for a known tokenid would evict cached
>> entries. So I think we should not invalidate here on a mismatch.
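
For illustration, the variant being discussed (and declined) would amount to
something like the following hypothetical helper, called from the fast path
on a mismatch (sketch only, not part of this patch):

// Hypothetical, NOT part of this patch: evict the cached entry whenever the
// presented secret does not match the cached one. An attacker who knows a
// valid tokenid could then cheaply evict the legitimate entry by sending
// wrong secrets, forcing the slow hash path for every real request.
fn cache_evict_on_mismatch(tokenid: &Authid) {
    // Best-effort, mirroring the fast path's try_write() usage.
    if let Some(mut cache) = TOKEN_SECRET_CACHE.try_write() {
        cache.secrets.remove(tokenid);
    }
}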
>>
>>>> +    let Some(gen2) = token_shadow_shared_gen() else {
>>>> +        return false;
>>>> +    };
>>>> +
>>>> +    eq && gen2 == cache_gen
>>>> +}
>>>> +
>>>> +fn apply_api_mutation(tokenid: &Authid, new_secret: Option<&str>) {
>>>> +    // Signal cache invalidation to other processes (best-effort).
>>>> +    let new_shared_gen = bump_token_shadow_shared_gen();
>>>> +
>>>> +    let mut cache = TOKEN_SECRET_CACHE.write();
>>>> +
>>>> +    // If we cannot read/bump the shared generation, we cannot safely trust the cache.
>>>> +    let Some(gen) = new_shared_gen else {
>>>> +        invalidate_cache_state(&mut cache);
>>>> +        cache.shared_gen = 0;
>>>> +        return;
>>>> +    };
>>>> +
>>>> +    // Update to the post-mutation generation.
>>>> +    cache.shared_gen = gen;
>>>> +
>>>> +    // Apply the new mutation.
>>>> +    match new_secret {
>>>> +        Some(secret) => {
>>>> +            cache.secrets.insert(
>>>> +                tokenid.clone(),
>>>> +                CachedSecret {
>>>> +                    secret: secret.to_owned(),
>>>> +                },
>>>> +            );
>>>> +        }
>>>> +        None => {
>>>> +            cache.secrets.remove(tokenid);
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>> +/// Get the current shared generation.
>>>> +fn token_shadow_shared_gen() -> Option<usize> {
>>>> +    crate::ConfigVersionCache::new()
>>>> +        .ok()
>>>> +        .map(|cvc| cvc.token_shadow_generation())
>>>> +}
>>>> +
>>>> +/// Bump and return the new shared generation.
>>>> +fn bump_token_shadow_shared_gen() -> Option<usize> {
>>>> +    crate::ConfigVersionCache::new()
>>>> +        .ok()
>>>> +        .map(|cvc| cvc.increase_token_shadow_generation() + 1)
>>>> +}
>>>> +
>>>> +/// Invalidates the cache state and only keeps the shared generation.
>>>
>>> both calls to this actually set the cached generation to some value
>>> right after, so maybe this should take a generation directly and set it?
>>>
>>
>> Patch 3/4 doesn't always update the generation on cache invalidation
>> (the shadow_mtime_len() error branch in apply_api_mutation), but most other
>> call sites do. Agreed, this can be refactored, maybe along these lines:
> 
> that one sets the generation before (potentially) invalidating the cache,
> though, so we could unconditionally reset the generation to that value when
> invalidating. We should maybe also re-order the lock and bump there?
>

Good point, I will check this! Thanks, Fabian! :)
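
To make sure I understood correctly, roughly what I have in mind (untested
sketch, names as in this patch): take the cache write lock first, then bump
the shared generation while holding it, and adopt the new generation
unconditionally; only fall back to a full invalidation if the bump itself
fails.

fn apply_api_mutation(tokenid: &Authid, new_secret: Option<&str>) {
    // Hold the cache lock across the bump so no reader in this process can
    // observe the new generation together with a stale cache.
    let mut cache = TOKEN_SECRET_CACHE.write();

    match bump_token_shadow_shared_gen() {
        Some(gen) => {
            // Unconditionally adopt the post-mutation generation.
            cache.shared_gen = gen;
            match new_secret {
                Some(secret) => {
                    cache.secrets.insert(
                        tokenid.clone(),
                        CachedSecret {
                            secret: secret.to_owned(),
                        },
                    );
                }
                None => {
                    cache.secrets.remove(tokenid);
                }
            }
        }
        None => {
            // Cannot bump the shared generation: do not trust the cache.
            invalidate_cache_state(&mut cache);
            cache.shared_gen = 0;
        }
    }
}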

>>
>> fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
>>       cache.secrets.clear();
>>       // clear other cache fields (mtime/len/last_checked) as needed
>> }
>>
>> fn invalidate_cache_state_and_set_gen(cache: &mut ApiTokenSecretCache,
>> gen: usize) {
>>       invalidate_cache_state(cache);
>>       cache.shared_gen = gen;
>> }
>>
>> We could also do a single helper taking an Option<usize>, but two helpers
>> make the call sites more explicit; see the sketch below.
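
For completeness, the single-helper variant mentioned above would be
something like this (sketch only):

fn invalidate_cache_state(cache: &mut ApiTokenSecretCache, gen: Option<usize>) {
    cache.secrets.clear();
    // Only update the stored generation if the caller has a trustworthy one.
    if let Some(gen) = gen {
        cache.shared_gen = gen;
    }
}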
>>
>>>> +fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
>>>> +    cache.secrets.clear();
>>>> +}
>>>> -- 
>>>> 2.47.3
>>>>
>>>>
>>>>



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

Thread overview: 25+ messages
2026-01-02 16:07 [pbs-devel] [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 1/4] pbs-config: add token.shadow generation to ConfigVersionCache Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 13:53     ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-16 15:13     ` Samuel Rufinatscha
2026-01-16 15:29       ` Fabian Grünbichler
2026-01-16 15:33         ` Samuel Rufinatscha [this message]
2026-01-16 16:00       ` Fabian Grünbichler
2026-01-16 16:56         ` Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 3/4] pbs-config: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-14 10:44   ` Fabian Grünbichler
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-backup v3 4/4] pbs-config: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 1/4] proxmox-access-control: extend AccessControlConfig for token.shadow invalidation Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 2/4] proxmox-access-control: cache verified API token secrets Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 3/4] proxmox-access-control: invalidate token-secret cache on token.shadow changes Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox v3 4/4] proxmox-access-control: add TTL window to token secret cache Samuel Rufinatscha
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 1/2] pdm-config: implement token.shadow generation Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-16 16:28     ` Samuel Rufinatscha
2026-01-16 16:48       ` Shannon Sterz
2026-01-02 16:07 ` [pbs-devel] [PATCH proxmox-datacenter-manager v3 2/2] docs: document API token-cache TTL effects Samuel Rufinatscha
2026-01-14 10:45   ` Fabian Grünbichler
2026-01-14 11:24     ` Samuel Rufinatscha
