From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 04DBC1FF179 for ; Wed, 12 Nov 2025 12:26:37 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 275CA1FF76; Wed, 12 Nov 2025 12:27:24 +0100 (CET) Date: Wed, 12 Nov 2025 12:27:17 +0100 From: Fabian =?iso-8859-1?q?Gr=FCnbichler?= To: Proxmox Backup Server development discussion References: <20251111122941.110412-1-s.rufinatscha@proxmox.com> In-Reply-To: <20251111122941.110412-1-s.rufinatscha@proxmox.com> MIME-Version: 1.0 User-Agent: astroid/0.17.0 (https://github.com/astroidmail/astroid) Message-Id: <1762935902.jw31anddp2.astroid@yuna.none> X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1762946815836 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.047 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pbs-devel] [PATCH proxmox-backup 0/3] datastore: remove config reload on hot path X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Proxmox Backup Server development discussion Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pbs-devel-bounces@lists.proxmox.com Sender: "pbs-devel" On November 11, 2025 1:29 pm, Samuel Rufinatscha wrote: > Hi, > > this series reduces CPU time in datastore lookups by avoiding repeated > datastore.cfg reads/parses in both `lookup_datastore()` and > `DataStore::Drop`. It also adds a TTL so manual config edits are > noticed without reintroducing hashing on every request. > > While investigating #6049 [1], cargo-flamegraph [2] showed hotspots during > repeated `/status` calls in `lookup_datastore()` and in `Drop`, > dominated by `pbs_config::datastore::config()` (config parse). > > The parsing cost itself should eventually be investigated in a future > effort. Furthermore, cargo-flamegraph showed that when using a > token-based auth method to access the API, a significant amount of time > is spent in validation on every request, likely related to bcrypt. > Also this should be eventually revisited in a future effort. please file a bug for the token part, if there isn't already one! thanks for diving into this, it already looks promising, even though the effect on more "normal" systems with more reasonable numbers of datastores and clients will be less pronounced ;) the big TL;DR would be that we trade faster datastore lookups (which happen quite frequently, in particular if there are many datastores with clients checking their status) against slightly delayed reload of the configuration in case of manual, behind-our-backs edits, with one particular corner case that is slightly problematic, but also a bit contrived: - datastore is looked up - config is edited (manually) to set maintenance mode to one that requires removing from the datastore map once the last task exits - last task drops the datastore struct - no regular edits happened in the meantime - the Drop handler doesn't know it needs to remove the datastore from the map - open FD is held by proxy, datastore fails to be unmounted/.. we could solve this issue by not only bumping the generation on save, but also when we reload the config (in particular if we cache the whole config!). that would make the Drop handler efficient enough for idle systems that have mostly lookups but no long running tasks. as soon as a datastore has long running tasks, the last such task will likely exit long after the TTL for its config lookup has expired, so will need to do a refresh - although that refresh could again be from the global cache, instead of from disk? still wouldn't close the window entirely, but make it pretty unlikely to be hit in practice.. _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel