* [pve-devel] [RFC pve-storage/qemu-server 00/10] introduce thin provisioned drives to thick LVM storage
@ 2025-10-17 11:25 Tiago Sousa via pve-devel
From: Tiago Sousa via pve-devel @ 2025-10-17 11:25 UTC
To: pve-devel; +Cc: Tiago Sousa
From: Tiago Sousa <joao.sousa@eurotux.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC pve-storage/qemu-server 00/10] introduce thin provisioned drives to thick LVM storage
Date: Fri, 17 Oct 2025 12:25:25 +0100
Message-ID: <20251017112539.26471-1-joao.sousa@eurotux.com>
As discussed with Alexandre Derumier, I’m sharing a prototype daemon
that monitors an extend queue and performs live VM disk resizes to
enable thin provisioning on LVM storage.
The idea is to create disks whose backing volume is smaller than their
qcow2 virtual size; the initial size is currently hardcoded to 2 GiB.
This applies when a VM disk has a snapshot. A write threshold is set on
the file blockdev node, so the event triggers only when writes to the
underlying volume actually reach it.
To handle this, a notion of underlay is introduced. For LVM, the
underlay controls the underlying LV of the disk. For example, a
200 GiB qcow2 disk starts with a 2 GiB LV and grows incrementally as
needed. Qcow2 preallocation must be fully off, which has performance
implications.
The block write threshold is calculated from two storage config
variables, chunksize and chunk-percentage, as:
  underlay_size - chunksize * chunk-percentage
For example, with chunk-percentage = 0.3, the event fires when free
space equal to 30% of chunksize remains in the underlay.
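To make the arithmetic concrete, here is a minimal sketch of the threshold
calculation in Python. The function name, the validation and the 1 GiB
chunksize are assumptions for illustration only; the series itself
implements this in Perl.

```python
GIB = 1024 ** 3

def write_threshold(underlay_size: int, chunksize: int, chunk_percentage: float) -> int:
    """Return the byte offset at which the write threshold event should fire.

    Implements: underlay_size - chunksize * chunk-percentage
    """
    if not 0 < chunk_percentage <= 1:
        raise ValueError("chunk_percentage must be in (0, 1]")
    # Free space that should still be left in the underlay when the event fires.
    headroom = int(chunksize * chunk_percentage)
    if headroom >= underlay_size:
        raise ValueError("headroom must be smaller than the underlay")
    return underlay_size - headroom

# 2 GiB underlay (the hardcoded initial size), hypothetical 1 GiB chunksize.
threshold = write_threshold(2 * GIB, 1 * GIB, 0.3)
print(f"threshold at {threshold / GIB:.2f} GiB of a 2 GiB underlay")
```

With these assumed values the event fires at about 1.7 GiB, leaving
roughly 0.3 GiB of headroom for the extend to complete.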
When qmeventd receives an event, it appends it to /etc/pve/extend-queue
as vmid:blockdev_nodename. In a cluster, all nodes write to the same
queue.
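As an illustration of the vmid:blockdev_nodename entry format, a small
Python sketch of formatting and parsing one queue line follows. The
validation rule (numeric VMID, node name without colons or whitespace)
and the function names are assumptions; the real writer lives in
qmeventd (C).

```python
import re

# Assumed shape of an entry: decimal VMID, a colon, then a non-empty node
# name without colons or whitespace. The series may validate differently.
ENTRY_RE = re.compile(r"(\d+):([^\s:]+)")

def format_entry(vmid: int, node_name: str) -> str:
    """Build one queue line (without trailing newline)."""
    entry = f"{vmid}:{node_name}"
    if ENTRY_RE.fullmatch(entry) is None:
        raise ValueError(f"invalid queue entry: {entry!r}")
    return entry

def parse_entry(line: str) -> tuple[int, str]:
    """Split one queue line back into (vmid, node_name)."""
    match = ENTRY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"invalid queue entry: {line!r}")
    return int(match.group(1)), match.group(2)

# "f0123abc" is a made-up node name used only for this example.
entry = format_entry(100, "f0123abc")
print(entry, parse_entry(entry))
```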
pvestord (PVE Storage Daemon) runs a 1-second cycle checking the
queue. If a request is found and the VM is local, the entry is dequeued
to avoid conflicts. It then queries the QMP socket for the VM’s
blockstats, identifies the disk, and extends the LV.
Flow: qemu-vm -> qmeventd -> /etc/pve/extend-queue <- pvestord
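The dequeue step of one pvestord cycle can be sketched roughly as
follows, in Python for brevity. Only the "take entries for local VMs,
leave the rest" logic is shown; the function name and the handling of
malformed lines are assumptions, and reading or locking
/etc/pve/extend-queue, the QMP query and the LV extend are left out.

```python
def split_queue(lines, local_vmids):
    """Separate queue lines into requests for local VMs and everything else.

    Returns (local, remaining): `local` is a list of (vmid, node_name)
    tuples this node should handle, `remaining` are the raw entries that
    stay in the queue for other nodes.
    """
    local, remaining = [], []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        vmid_text, sep, node_name = entry.partition(":")
        if sep and vmid_text.isdecimal() and int(vmid_text) in local_vmids:
            local.append((int(vmid_text), node_name))
        else:
            # Not ours (or malformed): keep it untouched in the queue.
            remaining.append(entry)
    return local, remaining

# Hypothetical queue content; VM 100 runs on this node, VM 205 does not.
queue = ["100:drive-a", "205:drive-b", "100:drive-c"]
local, remaining = split_queue(queue, {100})
print(local)      # requests to process on this node
print(remaining)  # entries written back to the queue
```

Keeping malformed lines in the queue is a choice made for this sketch;
a real daemon might prefer to log and drop them.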
So far, tests have been done manually.
Some problems and questions:
- Thin provisioning is currently hardcoded for drives that have a parent
  snapshot.
- The thin variable that is introduced in the drive config needs
  review before wider implementation.
- Consider making thin optional via a checkbox in the snapshot prompt.
- It could eventually be offered as an option for all qcow2 disks on LVM.
- Re-evaluate blockdev name generation: sha256 vs. a reversible
  encoding (like base64) to identify drives and allow offline extends.
- Since all extend requests share the same queue, drives on different
  LVM storages must wait for their turn, even though actions on
  separate storages could run concurrently.
- qmeventd writes to the queue aren’t cluster-safe. I couldn’t find any
  primitives in the C code to lock the file via pmxcfs (like
  cfs_lock_file in Perl). Is there any function that handles this?
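On the sha256 vs. reversible encoding question, the trade-off can be
shown with a short Python sketch. The drive identifier "scsi0" and the
function names are hypothetical, and this does not reproduce how
qemu-server actually derives blockdev node names.

```python
import base64
import hashlib

def hashed_name(drive_id: str) -> str:
    """One-way: fixed length, but the drive cannot be recovered from it."""
    return hashlib.sha256(drive_id.encode()).hexdigest()

def encoded_name(drive_id: str) -> str:
    """Reversible: length grows with the input."""
    return base64.urlsafe_b64encode(drive_id.encode()).decode()

def decode_name(name: str) -> str:
    """Recover the original identifier from an encoded name."""
    return base64.urlsafe_b64decode(name).decode()

drive_id = "scsi0"  # hypothetical drive identifier
print(hashed_name(drive_id))                # always 64 hex characters
print(encoded_name(drive_id))               # short here, longer for longer ids
print(decode_name(encoded_name(drive_id)))  # round-trips to the original id
```

With a hash, mapping a queue entry back to a drive requires a running VM
(or a lookup table) to compare against; with a reversible encoding the
name itself carries the drive identity, which is what would make offline
extends possible.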
pve-storage:
Tiago Sousa (4):
pvestord: setup new pvestord daemon
storage: add extend queue handling
lvmplugin: add thin volume support for LVM external snapshots
plugin: lvmplugin: add underlay functions
src/Makefile | 1 +
src/PVE/Makefile | 1 +
src/PVE/Service/Makefile | 10 ++
src/PVE/Service/pvestord.pm | 193 ++++++++++++++++++++++++++++++++++
src/PVE/Storage.pm | 100 +++++++++++++++++-
src/PVE/Storage/Common.pm | 4 +-
src/PVE/Storage/LVMPlugin.pm | 84 ++++++++++++---
src/PVE/Storage/Plugin.pm | 29 ++++-
src/bin/Makefile | 3 +
src/bin/pvestord | 24 +++++
src/services/Makefile | 14 +++
src/services/pvestord.service | 15 +++
12 files changed, 462 insertions(+), 16 deletions(-)
create mode 100644 src/PVE/Service/Makefile
create mode 100644 src/PVE/Service/pvestord.pm
create mode 100755 src/bin/pvestord
create mode 100644 src/services/Makefile
create mode 100644 src/services/pvestord.service
qemu-server:
Tiago Sousa (4):
qmeventd: add block write threshold event handling
blockdev: add set write threshold
blockdev: add query-blockstats qmp command
blockdev: add underlay resize
src/PVE/QemuServer.pm | 22 ++++++++++
src/PVE/QemuServer/Blockdev.pm | 80 ++++++++++++++++++++++++++++++++++
src/PVE/QemuServer/Drive.pm | 7 +++
src/qmeventd/qmeventd.c | 21 ++++++++-
4 files changed, 129 insertions(+), 1 deletion(-)
pve-manager:
Tiago Sousa (1):
services: add pvestord service
PVE/API2/Services.pm | 1 +
1 file changed, 1 insertion(+)
pve-cluster:
Tiago Sousa (1):
observe extend queue
src/PVE/Cluster.pm | 1 +
1 file changed, 1 insertion(+)
--
2.47.3