From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id EA6D01FF15C for ; Fri, 17 Oct 2025 13:35:00 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 5A8325A; Fri, 17 Oct 2025 13:35:09 +0200 (CEST) To: pve-devel@lists.proxmox.com Date: Fri, 17 Oct 2025 12:25:25 +0100 MIME-Version: 1.0 Message-ID: List-Id: Proxmox VE development discussion List-Post: From: Tiago Sousa via pve-devel Precedence: list Cc: Tiago Sousa X-Mailman-Version: 2.1.29 X-BeenThere: pve-devel@lists.proxmox.com List-Subscribe: , List-Unsubscribe: , List-Archive: Reply-To: Proxmox VE development discussion List-Help: Subject: [pve-devel] [RFC pve-storage/qemu-server 00/10] introduce thin provisioned drives to thick LVM storage Content-Type: multipart/mixed; boundary="===============5264218947625369588==" Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" --===============5264218947625369588== Content-Type: message/rfc822 Content-Disposition: inline Return-Path: X-Original-To: pve-devel@lists.proxmox.com Delivered-To: pve-devel@lists.proxmox.com Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 9849AD0F2D for ; Fri, 17 Oct 2025 13:35:07 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id AE80C27D1E for ; Fri, 17 Oct 2025 13:35:06 +0200 (CEST) Received: from eurotux.com (mail.eurotux.com [185.98.249.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Fri, 17 Oct 2025 13:35:05 +0200 
(CEST) Received: from localhost (localhost [127.0.0.1]) by eurotux.com (Postfix) with ESMTP id 1E4FA30D17BA; Fri, 17 Oct 2025 12:26:02 +0100 (WEST) Authentication-Results: mail.prd.eurotux.pt (amavisd-new); dkim=pass (2048-bit key) reason="pass (just generated, assumed good)" header.d=eurotux.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=eurotux.com; h= content-transfer-encoding:content-type:content-type:mime-version :x-mailer:message-id:date:date:subject:subject:from:from; s= default; t=1760700361; x=1762514762; bh=ZC41+902jWEMbd6VuVwF1qPx AJvDwFRtwPzxQGaFO88=; b=vQCVmcIZIhyERovf1FhHhUnDlHaPRuAbRzZtui/s 4T1+EDAhbgDfL3alwiWHaFwWJVuhUYp12XkJd9Ve5lJbhmIaK1pXTkXj+52zx6HS Q8UK+9xD9tenQmZ5+o1Yenk+8qsA97aw1UHoTPHjBAnDlVpiIZxhZf36nP63FAX5 CnCi7ZgXKf1MWNTgAAk/rdcjKTPNB0Li6mtkL/widfwQx+UWGoInnTeZtF3VPdKu J92uiGtqBmds+crRkibCSiFeyuG3eHSXJJE24xBlKJxPhDkw0py8wIc/i0Ed1pXH BEUsohzPg5Nl/AS+d/mKCqI2iyYIZx2YOwI+gperqcUelw== X-Virus-Scanned: amavisd-new at mail.prd.eurotux.pt Received: from eurotux.com ([127.0.0.1]) by localhost (mail.prd.eurotux.pt [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Zypr6yNfjYvv; Fri, 17 Oct 2025 12:26:01 +0100 (WEST) Received: from proxmox.example (184.137.90.149.rev.vodafone.pt [149.90.137.184]) (Authenticated sender: joao.sousa@eurotux.com) by eurotux.com (Postfix) with ESMTPSA id 2131E30CFB0E; Fri, 17 Oct 2025 12:26:01 +0100 (WEST) From: Tiago Sousa To: pve-devel@lists.proxmox.com Subject: [RFC pve-storage/qemu-server 00/10] introduce thin provisioned drives to thick LVM storage Date: Fri, 17 Oct 2025 12:25:25 +0100 Message-ID: <20251017112539.26471-1-joao.sousa@eurotux.com> X-Mailer: git-send-email 2.47.3 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-SPAM-LEVEL: Spam detection results: 0 AWL -0.000 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DKIM_SIGNED 0.1 Message has a DKIM or DK signature, not necessarily valid 
 DKIM_VALID -0.1 Message has at least one valid DKIM or DK signature
 DKIM_VALID_AU -0.1 Message has a valid DKIM or DK signature from author's domain
 DKIM_VALID_EF -0.1 Message has a valid DKIM or DK signature from envelope-from domain
 DMARC_PASS -0.1 DMARC pass policy
 RCVD_IN_MSPIKE_H2 0.001 Average reputation (+2)
 SPF_HELO_PASS -0.001 SPF: HELO matches SPF record
 SPF_PASS -0.001 SPF: sender matches SPF record

As discussed with Alexandre Derumier, I’m sharing a prototype daemon
that monitors an extend queue and performs live VM disk resizes to
enable thin provisioning on LVM storage.

The idea is to create the backing volume smaller than the disk's qcow2
virtual size; the initial size is currently hardcoded to 2 GiB. This
applies when a VM disk has a snapshot. A write threshold is set on the
file blockdev node, so that it triggers only when the data actually
written to the underlying volume reaches it.

To handle this, a notion of underlay is introduced. For LVM, the
underlay controls the underlying LV of the disk. For example, a 200 GiB
qcow2 disk starts with a 2 GiB LV and grows incrementally as needed.
Qcow2 preallocation must be fully off, which has performance
implications.

The block write threshold is calculated from two storage config
variables, chunksize and chunk-percentage, using:

    underlay_size - chunksize * chunk-percentage

For example, if chunk-percentage = 0.3, the event fires when 30% of
chunksize remains.

When qmeventd receives an event, it appends it to /etc/pve/extend-queue
as vmid:blockdev_nodename. In a cluster, all nodes write to the same
queue.

pvestord (PVE Storage Daemon) runs a 1-second cycle checking the queue.
If a request is found and the VM is local, the entry is dequeued to
avoid conflicts. It then queries the QMP socket for the VM’s blockstats,
identifies the disk, and extends the LV.

Flow:

    qemu-vm -> qmeventd -> /etc/pve/extend-queue <- pvestord

So far, testing has only been done manually.

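To make the threshold formula and the queue format above concrete, here
is a rough sketch in Python. It is not the code from this series (the
actual patches are Perl and C). The function names, the
one-entry-per-line queue layout, the 1 GiB chunksize, and rounding the
chunk margin down to whole bytes are all assumptions made for
illustration only.

```python
from __future__ import annotations

from typing import Iterable

GIB = 1024 ** 3


def write_threshold(underlay_size: int, chunksize: int,
                    chunk_percentage: float) -> int:
    """Return underlay_size - chunksize * chunk_percentage, in bytes.

    The margin is truncated to whole bytes (an assumption; the cover
    letter does not say how rounding is handled).
    """
    if not 0 < chunk_percentage <= 1:
        raise ValueError("chunk_percentage must be in (0, 1]")
    return underlay_size - int(chunksize * chunk_percentage)


def parse_entry(line: str) -> tuple[int, str]:
    """Split one 'vmid:blockdev_nodename' queue entry."""
    vmid, sep, node_name = line.strip().partition(":")
    if not sep or not vmid.isdigit() or not node_name:
        raise ValueError(f"malformed queue entry: {line!r}")
    return int(vmid), node_name


def split_queue(lines: Iterable[str], local_vmids: set[int]):
    """Separate entries for local VMs from entries left for other nodes.

    Assumes one entry per line; blank lines are skipped.
    """
    local: list[tuple[int, str]] = []
    remaining: list[str] = []
    for line in lines:
        if not line.strip():
            continue
        vmid, node_name = parse_entry(line)
        if vmid in local_vmids:
            local.append((vmid, node_name))
        else:
            remaining.append(line.strip())
    return local, remaining


if __name__ == "__main__":
    # 2 GiB underlay, hypothetical 1 GiB chunksize, chunk-percentage 0.3.
    print(write_threshold(2 * GIB, 1 * GIB, 0.3))
    # Hypothetical node names; VM 100 is assumed to run on this node.
    print(split_queue(["100:node-a", "200:node-b"], {100}))
```

In this sketch, a daemon cycle would write the remaining entries back
to the queue and process the local ones; doing that safely across the
cluster is exactly the locking question raised below.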
Some problems and questions:

- Thin provisioning is currently hardcoded for drives that have a
  parent snapshot.
- The thin variable that is introduced in the drive config needs review
  before wider implementation.
- Consider making thin optional via snapshot prompt checkbox.
- Could eventually offer the option for all qcow2 disks on LVM.
- Re-evaluate blockdev name generation: sha256 vs reversible encoding
  (like base64) to identify drives and allow offline extends.
- Since all extend requests share the same queue, drives on different
  LVM storages must wait for their turn, even though actions on
  separate storages could run concurrently.
- qmeventd writes to the queue aren’t cluster-safe. I couldn’t find any
  primitives in the C code to lock the file via pmxcfs (like
  cfs_lock_file in Perl). Is there any function that handles this?

pve-storage:

Tiago Sousa (4):
  pvestord: setup new pvestord daemon
  storage: add extend queue handling
  lvmplugin: add thin volume support for LVM external snapshots
  plugin: lvmplugin: add underlay functions

 src/Makefile                  |   1 +
 src/PVE/Makefile              |   1 +
 src/PVE/Service/Makefile      |  10 ++
 src/PVE/Service/pvestord.pm   | 193 ++++++++++++++++++++++++++++++++++
 src/PVE/Storage.pm            | 100 +++++++++++++++++-
 src/PVE/Storage/Common.pm     |   4 +-
 src/PVE/Storage/LVMPlugin.pm  |  84 ++++++++++++---
 src/PVE/Storage/Plugin.pm     |  29 ++++-
 src/bin/Makefile              |   3 +
 src/bin/pvestord              |  24 +++++
 src/services/Makefile         |  14 +++
 src/services/pvestord.service |  15 +++
 12 files changed, 462 insertions(+), 16 deletions(-)
 create mode 100644 src/PVE/Service/Makefile
 create mode 100644 src/PVE/Service/pvestord.pm
 create mode 100755 src/bin/pvestord
 create mode 100644 src/services/Makefile
 create mode 100644 src/services/pvestord.service

qemu-server:

Tiago Sousa (4):
  qmeventd: add block write threshold event handling
  blockdev: add set write threshold
  blockdev: add query-blockstats qmp command
  blockdev: add underlay resize

 src/PVE/QemuServer.pm          | 22 ++++++++++
 src/PVE/QemuServer/Blockdev.pm | 80 ++++++++++++++++++++++++++++++++++
 src/PVE/QemuServer/Drive.pm    |  7 +++
 src/qmeventd/qmeventd.c        | 21 ++++++++-
 4 files changed, 129 insertions(+), 1 deletion(-)

pve-manager:

Tiago Sousa (1):
  services: add pvestord service

 PVE/API2/Services.pm | 1 +
 1 file changed, 1 insertion(+)

pve-cluster:

Tiago Sousa (1):
  observe extend queue

 src/PVE/Cluster.pm | 1 +
 1 file changed, 1 insertion(+)

-- 
2.47.3

--===============5264218947625369588==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============5264218947625369588==--