From: Aaron Lauterer <a.lauterer@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Fri, 23 May 2025 18:00:10 +0200
Message-Id: <20250523160029.404400-1-a.lauterer@proxmox.com>
Subject: [pve-devel] [RFC cluster/common/container/manager/pve9-rrd-migration-tool/qemu-server/storage 00/19] Expand and migrate RRD data

This patch series expands the RRD format for nodes and VMs. For all types
(nodes, VMs, storage) we adjust the aggregation to align it with how it is
done on the Backup Server. Therefore, we have new RRD definitions for all 3
types.

New values are added for nodes and VMs. In particular:

Nodes:
* memfree
* membuffers
* memcached
* arcsize
* pressures:
  * cpu some
  * io some
  * io full
  * mem some
  * mem full

VMs:
* memhost (memory consumption of all processes in the guest's cgroup, host view)
* pressures:
  * cpu some
  * cpu full
  * io some
  * io full
  * mem some
  * mem full

To not lose old RRD data, we need to migrate the old RRD files to the ones with
the new schema. Some initial performance tests showed that migrating 10k VM
RRD files took ~2m40s single threaded. This is way too long to do it within
pmxcfs itself. Therefore this will be a dedicated step.

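As a rough back-of-the-envelope illustration of the numbers above: 2m40s is
160 seconds, so the measured run works out to about 16 ms per RRD file. The
sketch below extrapolates that to other file counts. It assumes the cost scales
linearly with the number of files, which the test above does not establish; the
function names are hypothetical and not part of the series.

```python
# Measured data point from the cover letter: 10k VM RRD files in ~2m40s.
MEASURED_FILES = 10_000
MEASURED_SECONDS = 2 * 60 + 40  # 160 seconds


def per_file_seconds(total_seconds=MEASURED_SECONDS, files=MEASURED_FILES):
    """Average single-threaded migration time per RRD file."""
    return total_seconds / files


def estimate_migration_seconds(files):
    """Linear extrapolation (an assumption) from the single measured run."""
    return files * MEASURED_SECONDS / MEASURED_FILES


if __name__ == "__main__":
    print(f"per file: {per_file_seconds() * 1000:.0f} ms")
    print(f"2500 files: {estimate_migration_seconds(2500):.0f} s")
```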
I wrote a small Rust tool that binds to librrd to do the migration. We could
include it in a post-install step when upgrading to PVE 9.

To avoid missing data and key errors in the journal, we need to ship some
changes to PVE 8 that can handle the new format sent out by pvestatd. Those
patches are the first in the series and are marked with a "-pve8" postfix in
the repo name.

This RFC series so far only handles migration and any changes needed for the
new fields. It does not yet include any GUI patches to add additional graphs
to the summary pages of nodes and guests.

Plans:
* Add GUI parts:
  * Additional graphs, mostly for pressures.
  * add more info to the memory graph, e.g. ZFS ARC
  * add host memory view of guests in graph and gauge
* pve8to9:
  * check how many RRD files are present and verify that there is enough
    space on the root FS

How to test:
1. build pve-cluster with the pve8 patches and install it on all nodes.
2. build all the other packages and install them.
   build the migration tool with cargo and copy the binary to the nodes for now.
3. run the migration tool on the first host
4. continue running the migration tool on the other nodes one by one

If you uncomment the extra logging in pmxcfs/status.c you should see how
the different situations are handled.
In the PVE8 patches it starts at line 1373, in the later patches for PVE9 it
starts at line 1565.

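The pve8to9 check listed under Plans could look roughly like the sketch below.
It is only an illustration in Python, not the planned implementation: the
function names, the `growth_factor` safety margin and the directory passed in
are assumptions, since the series does not state how much larger the new RRD
files are or where the check would look.

```python
import os
import shutil


def count_rrd_files(base_dir):
    """Return (file_count, total_bytes) for all regular files below base_dir."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so that every file is only counted once.
            if os.path.isfile(path) and not os.path.islink(path):
                count += 1
                total += os.path.getsize(path)
    return count, total


def check_space(base_dir, growth_factor=2.0, free_bytes=None):
    """Compare free space with a rough estimate of the space the migration needs.

    growth_factor is a hypothetical margin: the migrated files are assumed to
    need at most growth_factor times the size of the existing ones.
    Returns (enough_space, file_count, needed_bytes).
    """
    count, total = count_rrd_files(base_dir)
    needed = int(total * growth_factor)
    if free_bytes is None:
        # Free space on the filesystem that holds base_dir.
        free_bytes = shutil.disk_usage(base_dir).free
    return free_bytes >= needed, count, needed
```

Called with the directory that holds the RRD files, it returns whether the
free space covers the estimate, together with the file count and the estimated
number of bytes needed.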
cluster-pve8:

Aaron Lauterer (2):
  cfs status.c: drop old pve2-vm rrd schema support
  status: handle new pve9- metrics update data

 src/pmxcfs/status.c | 56 ++++++++++++++++++++++++++++++++++-----------
 src/pmxcfs/status.h |  2 ++
 2 files changed, 45 insertions(+), 13 deletions(-)


pve9-rrd-migration-tool:

Aaron Lauterer (1):
  introduce rrd migration tool for pve8 -> pve9


cluster:

Aaron Lauterer (1):
  status: introduce new pve9- rrd and metric format

 src/pmxcfs/status.c | 242 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 217 insertions(+), 25 deletions(-)


common:

Aaron Lauterer (1):
  add helper to fetch value from smaps_rollup for pid

Folke Gleumes (3):
  fix error in pressure parsing
  add functions to retrieve pressures for vm/ct
  metrics: add buffer and cache to meminfo

 src/PVE/ProcFSTools.pm | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)


manager:

Aaron Lauterer (5):
  api2tools: drop old VM rrd schema
  pvestatd: collect and distribute new pve9- metrics
  api: nodes: rrd and rrddata fetch from new pve9-node rrd files if present
  api2tools: extract stats: handle existence of new pve9- data
  ui: rrdmodels: add new columns

 PVE/API2/Nodes.pm                    |   8 +-
 PVE/API2Tools.pm                     |  24 +----
 PVE/Service/pvestatd.pm              | 128 +++++++++++++++++++++------
 www/manager6/data/model/RRDModels.js |  16 ++++
 4 files changed, 126 insertions(+), 50 deletions(-)


storage:

Aaron Lauterer (1):
  status: rrddata: use new pve9 rrd location if file is present

 src/PVE/API2/Storage/Status.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


qemu-server:

Aaron Lauterer (3):
  vmstatus: add memhost for host view of vm mem consumption
  vmstatus: switch mem stat to PSS of VM cgroup
  rrddata: use new pve9 rrd location if file is present

Folke Gleumes (1):
  metrics: add pressure to metrics

 PVE/API2/Qemu.pm  |  4 +++-
 PVE/QemuServer.pm | 23 +++++++++++++++++++----
 2 files changed, 22 insertions(+), 5 deletions(-)


container:

Aaron Lauterer (1):
  rrddata: use new pve9 rrd location if file is present

 src/PVE/API2/LXC.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


Summary over all repositories:
  12 files changed, 457 insertions(+), 98 deletions(-)

-- 
Generated by git-murpp 0.8.1