From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] superseded: [PATCH proxmox-backup 0/3] datastore: remove config reload on hot path
Date: Fri, 14 Nov 2025 16:08:30 +0100 [thread overview]
Message-ID: <84a63b00-9964-45a9-bb07-590f98239a21@proxmox.com> (raw)
In-Reply-To: <20251111122941.110412-1-s.rufinatscha@proxmox.com>
https://lore.proxmox.com/pbs-devel/20251114150544.224839-1-s.rufinatscha@proxmox.com/T/#t
On 11/11/25 1:29 PM, Samuel Rufinatscha wrote:
> Hi,
>
> this series reduces CPU time in datastore lookups by avoiding repeated
> datastore.cfg reads/parses in both `lookup_datastore()` and
> `DataStore::Drop`. It also adds a TTL so manual config edits are
> noticed without reintroducing hashing on every request.
>
> While investigating #6049 [1], cargo-flamegraph [2] showed hotspots during
> repeated `/status` calls in `lookup_datastore()` and in `Drop`,
> dominated by `pbs_config::datastore::config()` (config parse).
>
> The parsing cost itself should eventually be investigated in a future
> effort. Furthermore, cargo-flamegraph showed that when using a
> token-based auth method to access the API, a significant amount of time
> is spent in validation on every request, likely related to bcrypt.
> Also this should be eventually revisited in a future effort.
>
> ## Approach
>
> [PATCH 1/3] Fast path for datastore lookups
> Use the shared-memory `ConfigVersionCache` generation for `datastore.cfg`.
> Tag each cached `DataStoreImpl` with the last seen generation; when it
> matches, reuse the cached instance. Fall back to the existing slow path
> on mismatch or when the cache is unavailable.
>
> [PATCH 2/3] Fast path for `Drop`
> Reuse the maintenance mode eviction decision captured at lookup time,
> removing the config reload from `Drop`.
>
> [PATCH 3/3] TTL to catch manual edits
> If a cached entry is older than `DATASTORE_CONFIG_CACHE_TTL_SECS`
> (default 60s), the next lookup refreshes it via the slow path. This
> detects manual file edits without hashing on every request.
>
> ## Results
>
> End-to-end `/status?verbose=0` (1000 stores, 5 req/store, parallel=16):
>
> Metric Baseline [1/3] [2/3]
> ------------------------------------------------
> Total time 13s 11s 10s
> Throughput (all) 384.62 454.55 500.00
> Cold RPS (round #1) 76.92 90.91 100.00
> Warm RPS (2..N) 307.69 363.64 400.00
>
> Patch 1 improves overall throughput by ~18% (−15% total time). Patch 2
> adds ~10% on top. Patch 3 is a robustness feature; a 0.1 s probe shows
> periodic latency spikes at TTL expiry and flat latencies otherwise.
>
> ## Reproduction steps
>
> VM: 4 vCPU, ~8 GiB RAM, VirtIO-SCSI; disks:
> - scsi0 32G (OS)
> - scsi1 1000G (datastores)
>
> Install PBS from ISO on the VM.
>
> Set up ZFS on /dev/sdb (adjust if different):
>
> zpool create -f -o ashift=12 pbsbench /dev/sdb
> zfs set mountpoint=/pbsbench pbsbench
> zfs create pbsbench/pbs-bench
>
> Raise file-descriptor limit:
>
> sudo systemctl edit proxmox-backup-proxy.service
>
> Add the following lines:
>
> [Service]
> LimitNOFILE=1048576
>
> Reload systemd and restart the proxy:
>
> sudo systemctl daemon-reload
> sudo systemctl restart proxmox-backup-proxy.service
>
> Verify the limit:
>
> systemctl show proxmox-backup-proxy.service | grep LimitNOFILE
>
> Create 1000 ZFS-backed datastores (as used in #6049 [1]):
>
> seq -w 001 1000 | xargs -n1 -P1 bash -c '
> id=$0
> name="ds${id}"
> dataset="pbsbench/pbs-bench/${name}"
> path="/pbsbench/pbs-bench/${name}"
> zfs create -o mountpoint="$path" "$dataset"
> proxmox-backup-manager datastore create "$name" "$path" \
> --comment "ZFS dataset-based datastore"
> '
>
> Build PBS from this series, then run the server under manually
> under flamegraph:
>
> systemctl stop proxmox-backup-proxy
> cargo flamegraph --release --bin proxmox-backup-proxy
>
> Benchmark script (`bench.sh`) used for the numbers above:
>
> #!/usr/bin/env bash
> set -euo pipefail
>
> # --- Config ---------------------------------------------------------------
> HOST='https://localhost:8007'
> USER='root@pam'
> PASS="$(cat passfile)"
>
> DATASTORE_PATH="/pbsbench/pbs-bench"
> MAX_STORES=1000 # how many stores to include
> PARALLEL=16 # concurrent workers
> REPEAT=5 # requests per store (1 cold + REPEAT-1 warm)
>
> PRINT_FIRST=false # true => log first request's HTTP code per store
>
> # --- Helpers --------------------------------------------------------------
> fmt_rps () {
> local n="$1" t="$2"
> awk -v n="$n" -v t="$t" 'BEGIN { if (t > 0) printf("%.2f\n", n/t); else print "0.00" }'
> }
>
> # --- Login ---------------------------------------------------------------
> auth=$(curl -ks -X POST "$HOST/api2/json/access/ticket" \
> -d "username=$USER" -d "password=$PASS")
> ticket=$(echo "$auth" | jq -r '.data.ticket')
>
> if [[ -z "${ticket:-}" || "$ticket" == "null" ]]; then
> echo "[ERROR] Login failed (no ticket)"
> exit 1
> fi
>
> # --- Collect stores (deterministic order) --------------------------------
> mapfile -t STORES < <(
> find "$DATASTORE_PATH" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' \
> | sort | head -n "$MAX_STORES"
> )
>
> USED_STORES=${#STORES[@]}
> if (( USED_STORES == 0 )); then
> echo "[ERROR] No datastore dirs under $DATASTORE_PATH"
> exit 1
> fi
>
> echo "[INFO] Running with stores=$USED_STORES, repeat=$REPEAT, parallel=$PARALLEL"
>
> # --- Temp counters --------------------------------------------------------
> SUCCESS_ALL="$(mktemp)"
> FAIL_ALL="$(mktemp)"
> COLD_OK="$(mktemp)"
> WARM_OK="$(mktemp)"
> trap 'rm -f "$SUCCESS_ALL" "$FAIL_ALL" "$COLD_OK" "$WARM_OK"' EXIT
>
> export HOST ticket REPEAT SUCCESS_ALL FAIL_ALL COLD_OK WARM_OK PRINT_FIRST
>
> SECONDS=0
>
> # --- Fire requests --------------------------------------------------------
> printf "%s\n" "${STORES[@]}" \
> | xargs -P"$PARALLEL" -I{} bash -c '
> store="$1"
> url="$HOST/api2/json/admin/datastore/$store/status?verbose=0"
>
> for ((i=1;i<=REPEAT;i++)); do
> code=$(curl -ks -o /dev/null -w "%{http_code}" -b "PBSAuthCookie=$ticket" "$url" || echo 000)
>
> if [[ "$code" == "200" ]]; then
> echo 1 >> "$SUCCESS_ALL"
> if (( i == 1 )); then
> echo 1 >> "$COLD_OK"
> else
> echo 1 >> "$WARM_OK"
> fi
> if [[ "$PRINT_FIRST" == "true" && $i -eq 1 ]]; then
> ts=$(date +%H:%M:%S)
> echo "[$ts] $store #$i HTTP:200"
> fi
> else
> echo 1 >> "$FAIL_ALL"
> if [[ "$PRINT_FIRST" == "true" && $i -eq 1 ]]; then
> ts=$(date +%H:%M:%S)
> echo "[$ts] $store #$i HTTP:$code (FAIL)"
> fi
> fi
> done
> ' _ {}
>
> # --- Summary --------------------------------------------------------------
> elapsed=$SECONDS
> ok=$(wc -l < "$SUCCESS_ALL" 2>/dev/null || echo 0)
> fail=$(wc -l < "$FAIL_ALL" 2>/dev/null || echo 0)
> cold_ok=$(wc -l < "$COLD_OK" 2>/dev/null || echo 0)
> warm_ok=$(wc -l < "$WARM_OK" 2>/dev/null || echo 0)
>
> expected=$(( USED_STORES * REPEAT ))
> total=$(( ok + fail ))
>
> rps_all=$(fmt_rps "$ok" "$elapsed")
> rps_cold=$(fmt_rps "$cold_ok" "$elapsed")
> rps_warm=$(fmt_rps "$warm_ok" "$elapsed")
>
> echo "===== Summary ====="
> echo "Stores used: $USED_STORES"
> echo "Expected requests: $expected"
> echo "Executed requests: $total"
> echo "OK (HTTP 200): $ok"
> echo "Failed: $fail"
> printf "Total time: %dm %ds\n" $((elapsed/60)) $((elapsed%60))
> echo "Throughput all RPS: $rps_all"
> echo "Cold RPS (round #1): $rps_cold"
> echo "Warm RPS (#2..N): $rps_warm"
>
> ## Maintainer notes
>
> - No dependency bumps, no API changes, no breaking changes in this
> series.
>
> ## Patch summary
>
> [PATCH 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
> [PATCH 2/3] partial fix #6049: datastore: use config fast-path in Drop
> [PATCH 3/3] datastore: add TTL fallback to catch manual config edits
>
> Thanks for reviewing!
>
> [1] Bugzilla #6049: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>
> Samuel Rufinatscha (3):
> partial fix #6049: datastore: impl ConfigVersionCache fast path for
> lookups
> partial fix #6049: datastore: use config fast-path in Drop
> datastore: add TTL fallback to catch manual config edits
>
> pbs-config/src/config_version_cache.rs | 10 ++-
> pbs-datastore/src/datastore.rs | 119 ++++++++++++++++++-------
> 2 files changed, 96 insertions(+), 33 deletions(-)
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
prev parent reply other threads:[~2025-11-14 15:08 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-11 12:29 [pbs-devel] " Samuel Rufinatscha
2025-11-11 12:29 ` [pbs-devel] [PATCH proxmox-backup 1/3] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups Samuel Rufinatscha
2025-11-12 13:24 ` Fabian Grünbichler
2025-11-13 12:59 ` Samuel Rufinatscha
2025-11-11 12:29 ` [pbs-devel] [PATCH proxmox-backup 2/3] partial fix #6049: datastore: use config fast-path in Drop Samuel Rufinatscha
2025-11-12 11:24 ` Fabian Grünbichler
2025-11-12 15:20 ` Samuel Rufinatscha
2025-11-11 12:29 ` [pbs-devel] [PATCH proxmox-backup 3/3] datastore: add TTL fallback to catch manual config edits Samuel Rufinatscha
2025-11-12 11:27 ` [pbs-devel] [PATCH proxmox-backup 0/3] datastore: remove config reload on hot path Fabian Grünbichler
2025-11-12 17:27 ` Samuel Rufinatscha
2025-11-14 15:08 ` Samuel Rufinatscha [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=84a63b00-9964-45a9-bb07-590f98239a21@proxmox.com \
--to=s.rufinatscha@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.