* [RFC storage, qemu-server] offload full/live clone to storage backend
@ 2026-06-30 19:20 Ciro Iriarte
From: Ciro Iriarte @ 2026-06-30 19:20 UTC
To: pve-devel
Hello,
I would like feedback on a storage-plugin contract that lets a backend
perform full-clone copies itself, instead of PVE moving every block
through the host.
Decision I am asking about
--------------------------
For a full clone where source and destination live on the same
offload-capable storage, should PVE delegate the copy to the storage
plugin (array-internal copy) rather than run qemu-img convert /
drive-mirror on the host? If yes, I propose the minimal contract below.
Background
----------
clone_image already offloads LINKED clones (CoW). Full clones do not
go through it: clone_disk() does vdisk_alloc + qemu-img convert
(offline) or drive-mirror (running), reading and writing the whole
volume through the host even when the array could copy it internally in
near-constant time with zero host I/O. volume_has_feature returns
copy => 1 today, but that only means "may be copied", not "the storage
can copy it itself" -- there is no hook for the storage to do the copy.
Proposed contract (pve-storage)
-------------------------------
- Add an optional plugin method, copy_image($scfg, $storeid,
$src_volname, $vmid, $dst_format, $opts), that allocates and copies
on the backend atomically and returns the new volname. Base class
returns undef (unsupported), so existing plugins are unaffected --
the driver is the sole source of the capability.
- Add a copy-offload feature to volume_has_feature, negotiated with the
target storeid/format, so the driver also decides per copy whether
this specific src->dst pair can be offloaded.
- Add a per-storage copy-offload option as a policy override (default
on), exposed only in the options of drivers that advertise the
capability, so it can disable offload but never enable it where
unsupported. Setting it to 0 routes that storage back to the
host-assisted path with no other change.
- Bump the storage APIVER/APIAGE.
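To make the interaction of the three pieces concrete, here is a minimal
model of the contract. It is written in Python purely for illustration,
not in the Perl that pve-storage uses, and it is not proposed code. The
names BasePlugin, ArrayPlugin, has_copy_offload, and offload_allowed are
hypothetical, as are the "copy-offload" option key and the volname
returned by the example driver.

```python
from typing import Optional


class BasePlugin:
    """Stand-in for the plugin base class: offload is unsupported by default."""

    def has_copy_offload(self, src_volname: str, dst_storeid: str,
                         dst_format: str) -> bool:
        return False

    def copy_image(self, scfg: dict, storeid: str, src_volname: str,
                   vmid: int, dst_format: str,
                   opts: Optional[dict] = None) -> Optional[str]:
        # Corresponds to "returns undef (unsupported)" in the proposal.
        return None


class ArrayPlugin(BasePlugin):
    """Hypothetical driver that can copy raw volumes within its own storage."""

    def __init__(self, storeid: str) -> None:
        self.storeid = storeid

    def has_copy_offload(self, src_volname, dst_storeid, dst_format):
        # The driver decides per src->dst pair whether it can offload.
        return dst_storeid == self.storeid and dst_format == "raw"

    def copy_image(self, scfg, storeid, src_volname, vmid, dst_format,
                   opts=None):
        # A real driver would allocate and copy on the array here;
        # the returned volname is made up for this example.
        return f"vm-{vmid}-disk-0"


def offload_allowed(plugin: BasePlugin, scfg: dict, src_volname: str,
                    dst_storeid: str, dst_format: str) -> bool:
    # The policy option defaults to on and can only disable offload.
    if not scfg.get("copy-offload", True):
        return False
    # The capability itself always comes from the driver.
    return plugin.has_copy_offload(src_volname, dst_storeid, dst_format)
```

Because the option is combined with the driver check by a logical AND,
setting it on a storage whose driver lacks the capability has no effect.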
Consumer (qemu-server)
----------------------
- In clone_disk(), in the offline full path, when src and dst are the
same storage, formats match, the disk is not a special disk
(cloudinit/efidisk0/tpmstate0), and the plugin advertises
copy-offload, call the new copy_image hook instead of qemu-img
convert. Otherwise run the existing path unchanged.
- A failed offload aborts the clone (existing rollback frees the
target); it does not silently fall back to a host copy, so
misbehaviour surfaces as a clean error rather than a bad volume.
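The eligibility test and the no-fallback rule can be sketched as
follows. Again this is a Python model for discussion, not the actual
clone_disk() code: can_offload, clone_disk_offline, OffloadError, and
the callback parameters are hypothetical, and identifying special disks
by a single name is a simplification.

```python
SPECIAL_DISKS = {"cloudinit", "efidisk0", "tpmstate0"}


class OffloadError(RuntimeError):
    """Raised when the backend copy fails or is declined."""


def can_offload(src_storeid, dst_storeid, src_format, dst_format,
                disk_name, advertised):
    # All four conditions from the proposal must hold.
    return bool(
        src_storeid == dst_storeid
        and src_format == dst_format
        and disk_name not in SPECIAL_DISKS
        and advertised
    )


def clone_disk_offline(src_storeid, dst_storeid, src_format, dst_format,
                       disk_name, advertised, copy_image, host_copy):
    if not can_offload(src_storeid, dst_storeid, src_format, dst_format,
                       disk_name, advertised):
        # Existing path, unchanged (qemu-img convert in the real code).
        return host_copy()
    # Errors from copy_image propagate; host_copy is deliberately
    # not tried after a failed offload.
    volname = copy_image()
    if volname is None:
        raise OffloadError("backend declined the copy")
    return volname
```

The choice between the two paths is made once, before any data moves,
so a clone never ends up partly copied by the array and partly by the
host.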
Live clone (running VM)
-----------------------
A clone is not a migration: the source keeps running on its own volume,
so the target only needs a consistent point-in-time copy, not
drive-mirror convergence. For a running VM, quiesce the guest once
around the whole disk set (guest-fsfreeze-freeze) at the clone loop
level, flush QEMU's caches, run the array copy per disk, then thaw.
Online offload runs only when the freeze succeeds, giving an
FS-consistent copy; if there is no agent or the freeze fails, fall back
to drive-mirror. This needs no separate switch -- the same copy-offload
option governs it, and the freeze requirement keeps it from ever
producing a crash-consistent copy silently.
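The ordering described above can be sketched like this. It is a Python
model only; live_clone and its callback parameters (freeze, thaw,
flush, array_copy, drive_mirror) are hypothetical placeholders for the
guest-agent, QEMU, and storage calls, and the sketch assumes that a
failed freeze leaves nothing to thaw.

```python
def live_clone(disks, agent_available, freeze, thaw, flush,
               array_copy, drive_mirror):
    """Return (copies, mode) where mode is "offload" or "drive-mirror"."""
    frozen = False
    if agent_available:
        try:
            freeze()  # one freeze around the whole disk set
            frozen = True
        except Exception:
            frozen = False
    if not frozen:
        # No agent or freeze failed: use the existing host path.
        return [drive_mirror(d) for d in disks], "drive-mirror"
    try:
        flush()  # flush QEMU's caches before the array copies
        copies = [array_copy(d) for d in disks]
    finally:
        thaw()  # always thaw, even if a copy raised
    return copies, "offload"
```

Putting the thaw in a finally block keeps the guest from staying frozen
when one of the per-disk copies fails; the failure itself still
propagates and aborts the clone.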
Out of scope
------------
- Cross-storage and format-converting clones (inherently
host-mediated).
- VM move-disk needs no separate work: it already calls
clone_disk(full) and its "same storage, same format" case is
rejected, so it inherits this hook without ever mis-firing.
- Container volumes (storage_migrate path) -- a possible follow-up.
If the contract looks acceptable, I will follow up with an [RFC PATCH]
series against pve-storage and then qemu-server. My CLA is on file.
Tracking/details:
https://github.com/ciroiriarte/pve-FCLUPlugin/issues/11
Thanks,
Ciro Iriarte