public inbox for pbs-devel@lists.proxmox.com
From: Samuel Rufinatscha <s.rufinatscha@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] superseded: [PATCH proxmox-backup v3 0/6] datastore: remove config reload on hot path
Date: Mon, 24 Nov 2025 16:35:59 +0100	[thread overview]
Message-ID: <22b6cec0-3375-4429-8040-57640dddb461@proxmox.com> (raw)
In-Reply-To: <20251120130342.248815-1-s.rufinatscha@proxmox.com>

https://lore.proxmox.com/pbs-devel/20251124153328.239666-1-s.rufinatscha@proxmox.com/T/#t

On 11/20/25 2:03 PM, Samuel Rufinatscha wrote:
> Hi,
> 
> this series reduces CPU time in datastore lookups by avoiding repeated
> datastore.cfg reads/parses in both `lookup_datastore()` and
> `DataStore::Drop`. It also adds a TTL so manual config edits are
> noticed without reintroducing hashing on every request.
> 
> While investigating #6049 [1], cargo-flamegraph [2] showed hotspots during
> repeated `/status` calls in `lookup_datastore()` and in `Drop`,
> dominated by `pbs_config::datastore::config()` (config parse).
> 
> The parsing cost itself should be investigated in a future effort.
> Furthermore, cargo-flamegraph showed that, with token-based
> authentication against the API, a significant amount of time is
> spent in token validation on every request [3].
> 
> ## Approach
> 
> [PATCH 1/6] Extend ConfigVersionCache for datastore generation
>    Expose a dedicated datastore generation counter and an increment
>    helper so callers can cheaply track datastore.cfg versions.
> 
> [PATCH 2/6] Fast path for datastore lookups
>    Cache the parsed datastore.cfg keyed by the shared datastore
>    generation. lookup_datastore() reuses both the cached config and an
>    existing DataStoreImpl when the generation matches, and falls back
>    to the old slow path otherwise.
> 
> [PATCH 3/6] Fast path for Drop
>    Make DataStore::Drop use the cached config if possible instead of
>    rereading datastore.cfg from disk.
> 
> [PATCH 4/6] TTL to catch manual edits
>    Add a small TTL around the cached config and bump the datastore
>    generation whenever the config is reloaded. This catches manual
>    edits to datastore.cfg without reintroducing hashing or
>    config parsing on every request.
> 
> [PATCH 5/6] Add reload flag to config cache helper
>    Add a flag to the config cache helper to indicate whether a
>    config reload is acceptable.
> 
> [PATCH 6/6] Only bump generation on config digest change
>    Avoid unnecessary generation bumps when the config is reloaded
>    but the digest did not change.
> 
> ## Benchmark results
> 
> All the following benchmarks are based on top of
> https://lore.proxmox.com/pbs-devel/20251112131525.645971-1-f.gruenbichler@proxmox.com/T/#u
> 
> ### End-to-end
> 
> Testing `/status?verbose=0` end-to-end with 1000 stores, 5 req/store
> and parallel=16 before/after the series:
> 
>      Metric                 Before     After
>      ----------------------------------------
>      Total time             12s        9s
>      Throughput (all)       416.67     555.56
>      Cold RPS (round #1)    83.33      111.11
>      Warm RPS (#2..N)       333.33     444.44
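For reference, the throughput rows above follow directly from the request counts (1000 stores × 5 requests each, of which 1000 are cold and 4000 warm) divided by wall time; a small sketch reproducing them:

```rust
fn main() {
    // 1000 stores x 5 requests/store = 5000 requests total,
    // 1000 cold (first request per store) and 4000 warm.
    for (label, secs) in [("before", 12.0_f64), ("after", 9.0)] {
        println!(
            "{label}: all={:.2} cold={:.2} warm={:.2} req/s",
            5000.0 / secs,
            1000.0 / secs,
            4000.0 / secs
        );
    }
}
```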
> 
> Running under flamegraph [2], TLS appears to consume a significant
> amount of CPU time and blurs the results. Still, the series yields
> ~33% higher overall throughput and ~25% lower end-to-end time for
> this workload.
> 
> ### Isolated benchmarks (hyperfine)
> 
> In addition to the end-to-end tests, I measured two standalone
> benchmarks with hyperfine, each using a config with 1000 datastores.
> `M` is the number of distinct datastores looked up and `N` is the
> number of lookups per datastore.
> 
> Drop-direct variant:
> 
> Drops the `DataStore` after every lookup, so the `Drop` path runs on
> every iteration:
> 
>      use anyhow::Error;
> 
>      use pbs_api_types::Operation;
>      use pbs_datastore::DataStore;
> 
>      fn main() -> Result<(), Error> {
>          let mut args = std::env::args();
>          args.next();
> 
>          let datastores = if let Some(n) = args.next() {
>              n.parse::<usize>()?
>          } else {
>              1000
>          };
> 
>          let iterations = if let Some(n) = args.next() {
>              n.parse::<usize>()?
>          } else {
>              1000
>          };
> 
>          for d in 1..=datastores {
>              let name = format!("ds{:04}", d);
> 
>          for _ in 1..=iterations {
>              // the returned handle is dropped immediately, so the
>              // `Drop` path runs on every iteration
>              DataStore::lookup_datastore(&name, Some(Operation::Write))?;
>          }
>          }
> 
>          Ok(())
>      }
> 
>      +------+-------+------------+------------+----------+
>      | M    | N     | Baseline   | Patched    | Speedup  |
>      +------+-------+------------+------------+----------+
>      | 1    | 1000  | 1.699  s   | 37.3  ms   | 45.5x    |
>      | 10   | 100   | 1.710  s   | 35.8  ms   | 47.7x    |
>      | 100  | 10    | 1.787  s   | 36.6  ms   | 48.9x    |
>      | 1000 | 1     | 1.899  s   | 46.0  ms   | 41.3x    |
>      +------+-------+------------+------------+----------+
> 
> 
> Bulk-drop variant:
> 
> Keeps the `DataStore` instances alive for
> all `N` lookups of a given datastore and then drops them in bulk,
> mimicking a task that performs many lookups while it is running and
> only triggers the expensive `Drop` logic when the last user exits.
> 
>      use anyhow::Error;
> 
>      use pbs_api_types::Operation;
>      use pbs_datastore::DataStore;
> 
>      fn main() -> Result<(), Error> {
>          let mut args = std::env::args();
>          args.next();
> 
>          let datastores = if let Some(n) = args.next() {
>              n.parse::<usize>()?
>          } else {
>              1000
>          };
> 
>          let iterations = if let Some(n) = args.next() {
>              n.parse::<usize>()?
>          } else {
>              1000
>          };
> 
>          for d in 1..=datastores {
>              let name = format!("ds{:04}", d);
> 
>              let mut stores = Vec::with_capacity(iterations);
>              for _ in 1..=iterations {
>                  stores.push(DataStore::lookup_datastore(&name, Some(Operation::Write))?);
>              }
>          } // `stores` goes out of scope here, dropping all handles in bulk
> 
>          Ok(())
>      }
> 
>      +------+-------+--------------+-------------+----------+
>      | M    | N     | Baseline     | Patched     | Speedup  |
>      +------+-------+--------------+-------------+----------+
>      | 1    | 1000  | 888.8  ms    | 39.3  ms    | 22.6x    |
>      | 10   | 100   | 890.8  ms    | 35.3  ms    | 25.3x    |
>      | 100  | 10    | 974.5  ms    | 36.3  ms    | 26.8x    |
>      | 1000 | 1     | 1.848  s     | 39.9  ms    | 46.3x    |
>      +------+-------+--------------+-------------+----------+
> 
> 
> Both variants show that the combination of the cached config lookups
> and the cheaper `Drop` handling reduces the hot-path cost from ~1.7 s
> per run to a few tens of milliseconds in these benchmarks.
> 
> ## Reproduction steps
> 
> VM: 4 vCPU, ~8 GiB RAM, VirtIO-SCSI; disks:
>    - scsi0 32G (OS)
>    - scsi1 1000G (datastores)
> 
> Install PBS from ISO on the VM.
> 
> Set up ZFS on /dev/sdb (adjust if different):
> 
>      zpool create -f -o ashift=12 pbsbench /dev/sdb
>      zfs set mountpoint=/pbsbench pbsbench
>      zfs create pbsbench/pbs-bench
> 
> Raise file-descriptor limit:
> 
>      sudo systemctl edit proxmox-backup-proxy.service
> 
> Add the following lines:
> 
>      [Service]
>      LimitNOFILE=1048576
> 
> Reload systemd and restart the proxy:
> 
>      sudo systemctl daemon-reload
>      sudo systemctl restart proxmox-backup-proxy.service
> 
> Verify the limit:
> 
>      systemctl show proxmox-backup-proxy.service | grep LimitNOFILE
> 
> Create 1000 ZFS-backed datastores (as used in #6049 [1]):
> 
>      seq -w 001 1000 | xargs -n1 -P1 bash -c '
>        id=$0
>        name="ds${id}"
>        dataset="pbsbench/pbs-bench/${name}"
>        path="/pbsbench/pbs-bench/${name}"
>        zfs create -o mountpoint="$path" "$dataset"
>        proxmox-backup-manager datastore create "$name" "$path" \
>          --comment "ZFS dataset-based datastore"
>      '
> 
> Build PBS from this series, then stop the proxy and run it manually
> under flamegraph:
> 
>      systemctl stop proxmox-backup-proxy
>      cargo flamegraph --release --bin proxmox-backup-proxy
> 
> ## Other resources:
> 
> ### E2E benchmark script:
> 
>      #!/usr/bin/env bash
>      set -euo pipefail
> 
>      # --- Config ---------------------------------------------------------------
>      HOST='https://localhost:8007'
>      USER='root@pam'
>      PASS="$(cat passfile)"
> 
>      DATASTORE_PATH="/pbsbench/pbs-bench"
>      MAX_STORES=1000      # how many stores to include
>      PARALLEL=16         # concurrent workers
>      REPEAT=5            # requests per store (1 cold + REPEAT-1 warm)
> 
>      PRINT_FIRST=false   # true => log first request's HTTP code per store
> 
>      # --- Helpers --------------------------------------------------------------
>      fmt_rps () {
>        local n="$1" t="$2"
>        awk -v n="$n" -v t="$t" 'BEGIN { if (t > 0) printf("%.2f\n", n/t); else print "0.00" }'
>      }
> 
>      # --- Login ---------------------------------------------------------------
>      auth=$(curl -ks -X POST "$HOST/api2/json/access/ticket" \
>        -d "username=$USER" -d "password=$PASS")
>      ticket=$(echo "$auth" | jq -r '.data.ticket')
> 
>      if [[ -z "${ticket:-}" || "$ticket" == "null" ]]; then
>        echo "[ERROR] Login failed (no ticket)"
>        exit 1
>      fi
> 
>      # --- Collect stores (deterministic order) --------------------------------
>      mapfile -t STORES < <(
>        find "$DATASTORE_PATH" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' \
>        | sort | head -n "$MAX_STORES"
>      )
> 
>      USED_STORES=${#STORES[@]}
>      if (( USED_STORES == 0 )); then
>        echo "[ERROR] No datastore dirs under $DATASTORE_PATH"
>        exit 1
>      fi
> 
>      echo "[INFO] Running with stores=$USED_STORES, repeat=$REPEAT, parallel=$PARALLEL"
> 
>      # --- Temp counters --------------------------------------------------------
>      SUCCESS_ALL="$(mktemp)"
>      FAIL_ALL="$(mktemp)"
>      COLD_OK="$(mktemp)"
>      WARM_OK="$(mktemp)"
>      trap 'rm -f "$SUCCESS_ALL" "$FAIL_ALL" "$COLD_OK" "$WARM_OK"' EXIT
> 
>      export HOST ticket REPEAT SUCCESS_ALL FAIL_ALL COLD_OK WARM_OK PRINT_FIRST
> 
>      SECONDS=0
> 
>      # --- Fire requests --------------------------------------------------------
>      printf "%s\n" "${STORES[@]}" \
>      | xargs -P"$PARALLEL" -I{} bash -c '
>        store="$1"
>        url="$HOST/api2/json/admin/datastore/$store/status?verbose=0"
> 
>        for ((i=1;i<=REPEAT;i++)); do
>          code=$(curl -ks -o /dev/null -w "%{http_code}" -b "PBSAuthCookie=$ticket" "$url" || echo 000)
> 
>          if [[ "$code" == "200" ]]; then
>            echo 1 >> "$SUCCESS_ALL"
>            if (( i == 1 )); then
>              echo 1 >> "$COLD_OK"
>            else
>              echo 1 >> "$WARM_OK"
>            fi
>            if [[ "$PRINT_FIRST" == "true" && $i -eq 1 ]]; then
>              ts=$(date +%H:%M:%S)
>              echo "[$ts] $store #$i HTTP:200"
>            fi
>          else
>            echo 1 >> "$FAIL_ALL"
>            if [[ "$PRINT_FIRST" == "true" && $i -eq 1 ]]; then
>              ts=$(date +%H:%M:%S)
>              echo "[$ts] $store #$i HTTP:$code (FAIL)"
>            fi
>          fi
>        done
>      ' _ {}
> 
>      # --- Summary --------------------------------------------------------------
>      elapsed=$SECONDS
>      ok=$(wc -l < "$SUCCESS_ALL" 2>/dev/null || echo 0)
>      fail=$(wc -l < "$FAIL_ALL" 2>/dev/null || echo 0)
>      cold_ok=$(wc -l < "$COLD_OK" 2>/dev/null || echo 0)
>      warm_ok=$(wc -l < "$WARM_OK" 2>/dev/null || echo 0)
> 
>      expected=$(( USED_STORES * REPEAT ))
>      total=$(( ok + fail ))
> 
>      rps_all=$(fmt_rps "$ok" "$elapsed")
>      rps_cold=$(fmt_rps "$cold_ok" "$elapsed")
>      rps_warm=$(fmt_rps "$warm_ok" "$elapsed")
> 
>      echo "===== Summary ====="
>      echo "Stores used:         $USED_STORES"
>      echo "Expected requests:   $expected"
>      echo "Executed requests:   $total"
>      echo "OK (HTTP 200):       $ok"
>      echo "Failed:              $fail"
>      printf "Total time:          %dm %ds\n" $((elapsed/60)) $((elapsed%60))
>      echo "Throughput all RPS:  $rps_all"
>      echo "Cold RPS (round #1): $rps_cold"
>      echo "Warm RPS (#2..N):    $rps_warm"
> 
> ## Patch summary
> 
> [PATCH 1/6] partial fix #6049: config: enable config version cache for datastore
> [PATCH 2/6] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups
> [PATCH 3/6] partial fix #6049: datastore: use config fast-path in Drop
> [PATCH 4/6] partial fix #6049: datastore: add TTL fallback to catch manual config edits
> [PATCH 5/6] partial fix #6049: datastore: add reload flag to config cache helper
> [PATCH 6/6] datastore: only bump generation when config digest changes
> 
> ## Changes from v2:
> 
> Added:
> - [PATCH 5/6]: Add a reload flag to the config cache helper.
> - [PATCH 6/6]: Only bump generation when the config digest changes.
> 
> ## Maintainer notes
> 
> No dependency bumps, no API changes, and no breaking changes.
> 
> Thanks,
> Samuel
> 
> [1] Bugzilla #6049: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
> [3] Bugzilla #7017: https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> 
> Samuel Rufinatscha (6):
>    partial fix #6049: config: enable config version cache for datastore
>    partial fix #6049: datastore: impl ConfigVersionCache fast path for
>      lookups
>    partial fix #6049: datastore: use config fast-path in Drop
>    partial fix #6049: datastore: add TTL fallback to catch manual config
>      edits
>    partial fix #6049: datastore: add reload flag to config cache helper
>    datastore: only bump generation when config digest changes
> 
>   pbs-config/src/config_version_cache.rs |  10 +-
>   pbs-datastore/Cargo.toml               |   1 +
>   pbs-datastore/src/datastore.rs         | 232 ++++++++++++++++++++-----
>   3 files changed, 197 insertions(+), 46 deletions(-)
> 



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


Thread overview: 14+ messages
2025-11-20 13:03 [pbs-devel] " Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 1/6] partial fix #6049: config: enable config version cache for datastore Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 2/6] partial fix #6049: datastore: impl ConfigVersionCache fast path for lookups Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 3/6] partial fix #6049: datastore: use config fast-path in Drop Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 4/6] partial fix #6049: datastore: add TTL fallback to catch manual config edits Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 5/6] partial fix #6049: datastore: add reload flag to config cache helper Samuel Rufinatscha
2025-11-20 14:50   ` Fabian Grünbichler
2025-11-20 18:15     ` Samuel Rufinatscha
2025-11-20 13:03 ` [pbs-devel] [PATCH proxmox-backup v3 6/6] datastore: only bump generation when config digest changes Samuel Rufinatscha
2025-11-20 14:50   ` Fabian Grünbichler
2025-11-21  8:37     ` Samuel Rufinatscha
2025-11-20 14:50 ` [pbs-devel] [PATCH proxmox-backup v3 0/6] datastore: remove config reload on hot path Fabian Grünbichler
2025-11-20 15:17   ` Samuel Rufinatscha
2025-11-24 15:35 ` Samuel Rufinatscha [this message]
