public inbox for pve-devel@lists.proxmox.com
From: Ciro Iriarte <cyruspy@gmail.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC storage, qemu-server] offload full/live clone to storage backend
Date: Tue, 30 Jun 2026 12:20:21 -0700 (PDT)
Message-ID: <6a4416f5.01f0a1da.1533d8.2fdf@mx.google.com>

Hello,

I would like feedback on a storage-plugin contract that lets a backend
perform full-clone copies itself, instead of PVE moving every block
through the host.

Decision I am asking about
--------------------------
For a full clone where source and destination live on the same
offload-capable storage, should PVE delegate the copy to the storage
plugin (array-internal copy) rather than run qemu-img convert /
drive-mirror on the host? If yes, I propose the minimal contract below.

Background
----------
clone_image already offloads LINKED clones (CoW). Full clones do not
come through it: clone_disk() does vdisk_alloc + qemu-img convert
(offline) or drive-mirror (running), reading and writing the whole
volume over the host even when the array could copy it internally in
near-constant time with zero host I/O. volume_has_feature returns
copy => 1 today, but that only means "may be copied", not "the storage
can copy it itself" -- there is no hook for the storage to do the copy.

Proposed contract (pve-storage)
-------------------------------
- Add an optional plugin method, copy_image($scfg, $storeid,
  $src_volname, $vmid, $dst_format, $opts), that allocates and copies
  on the backend atomically and returns the new volname. Base class
  returns undef (unsupported), so existing plugins are unaffected --
  the driver is the sole source of the capability.
- Add a copy-offload feature to volume_has_feature, negotiated with the
  target storeid/format, so the driver also decides per copy whether
  this specific src->dst pair can be offloaded.
- Add a per-storage copy-offload option as a policy override (default
  on), exposed only in the options of drivers that advertise the
  capability, so it can disable offload but never enable it where
  unsupported. Setting it to 0 routes that storage back to the
  host-assisted path with no other change.
- Bump the storage APIVER/APIAGE.
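
To make the contract above concrete, here is a minimal sketch of how the
three pieces could fit together. It is a Python model for illustration
only, not PVE code (the real plugins are Perl). ArrayPlugin,
offload_allowed and the returned volname are hypothetical, and checking
the copy-offload option in a helper rather than in the driver is an
assumption.

```python
from typing import Optional


class Plugin:
    """Model of the base class: no capability unless a driver overrides it."""

    def volume_has_feature(self, scfg, feature, src_volname,
                           dst_storeid=None, dst_format=None) -> bool:
        return False

    def copy_image(self, scfg, storeid, src_volname, vmid, dst_format,
                   opts=None) -> Optional[str]:
        # None stands in for Perl's undef: offload is unsupported.
        return None


class ArrayPlugin(Plugin):
    """Hypothetical driver for an array that can copy volumes internally."""

    def __init__(self, storeid: str, fmt: str = "raw"):
        self.storeid, self.fmt = storeid, fmt

    def volume_has_feature(self, scfg, feature, src_volname,
                           dst_storeid=None, dst_format=None) -> bool:
        if feature != "copy-offload":
            return False
        # The driver decides per copy: same storage and same format only.
        return dst_storeid == self.storeid and dst_format == self.fmt

    def copy_image(self, scfg, storeid, src_volname, vmid, dst_format,
                   opts=None) -> Optional[str]:
        # A real driver would allocate and copy on the array here.
        return f"vm-{vmid}-disk-0"


def offload_allowed(plugin, scfg, src_volname, dst_storeid, dst_format) -> bool:
    """Policy override first (default on), then the driver's own answer."""
    if not scfg.get("copy-offload", 1):
        return False
    return plugin.volume_has_feature(
        scfg, "copy-offload", src_volname, dst_storeid, dst_format)
```

Because the option is only consulted before the driver is asked, it can
turn offload off but cannot turn it on for a plugin that never
advertises the feature.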

Consumer (qemu-server)
----------------------
- In clone_disk(), in the offline full path, when src and dst are the
  same storage, formats match, the disk is not a special disk
  (cloudinit/efidisk0/tpmstate0), and the plugin advertises
  copy-offload, call the new copy_image hook instead of qemu-img
  convert. Otherwise run the existing path unchanged.
- A failed offload aborts the clone (existing rollback frees the
  target); it does not silently fall back to a host copy, so
  misbehaviour surfaces as a clean error rather than a bad volume.
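
The consumer-side rules above can be sketched as a single predicate plus
a dispatch that refuses to fall back. This is again a Python model, not
the actual clone_disk() code; can_offload, clone_full, OffloadError and
the callables passed in are hypothetical names used only for
illustration.

```python
SPECIAL_DRIVES = {"efidisk0", "tpmstate0"}


class OffloadError(RuntimeError):
    """Raised when the backend was asked to copy and did not deliver."""


def can_offload(drivename, is_cloudinit, src_storeid, dst_storeid,
                src_format, dst_format, running, plugin_advertises) -> bool:
    # Every condition from the list must hold; otherwise use the host path.
    return bool(
        not running
        and not is_cloudinit
        and drivename not in SPECIAL_DRIVES
        and src_storeid == dst_storeid
        and src_format == dst_format
        and plugin_advertises
    )


def clone_full(offload_ok, copy_image, host_copy):
    """Run exactly one copy path; a failed offload is an error, not a retry."""
    if not offload_ok:
        return host_copy()
    new_volname = copy_image()
    if new_volname is None:
        # No silent fallback: the caller's rollback cleans up the target.
        raise OffloadError("storage backend failed to copy the volume")
    return new_volname
```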

Live clone (running VM)
-----------------------
A clone is not a migration: the source keeps running on its own volume,
so the target only needs a consistent point-in-time copy, not
drive-mirror convergence. For a running VM, quiesce the guest once
around the whole disk set (guest-fsfreeze-freeze) at the clone loop
level, flush QEMU's caches, run the array copy per disk, then thaw.
Online offload runs only when the freeze succeeds, giving an
FS-consistent copy; if there is no agent or the freeze fails, fall back
to drive-mirror. This needs no separate switch -- the same copy-offload
option governs it, and the freeze requirement keeps it from ever
producing a crash-consistent copy silently.
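
The ordering described above (freeze once, flush, copy every disk, thaw)
can be sketched as follows. This is a Python model under stated
assumptions: live_clone and all the callables are hypothetical
stand-ins, and no real QEMU or guest-agent API is being called. The
thaw sits in a finally block so the guest is released even if an array
copy fails.

```python
def live_clone(disks, agent_available, freeze, thaw, flush,
               array_copy, drive_mirror):
    """Return (new volumes, path used) for a full clone of a running VM."""
    frozen = False
    if agent_available:
        try:
            freeze()  # one freeze around the whole disk set
            frozen = True
        except Exception:
            frozen = False
    if not frozen:
        # No agent or failed freeze: keep the existing drive-mirror path.
        return [drive_mirror(d) for d in disks], "drive-mirror"
    try:
        flush()  # flush QEMU's caches before the array copies anything
        return [array_copy(d) for d in disks], "offload"
    finally:
        thaw()  # always thaw, even if a copy raised
```

A copy failure in the offload branch propagates to the caller after the
thaw, which matches the "abort, do not fall back" rule from the offline
path.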

Out of scope
------------
- Cross-storage and format-converting clones (inherently
  host-mediated).
- VM move-disk needs no separate work: it already calls
  clone_disk(full), and it rejects the "same storage, same format"
  case up front, so the offload conditions can never match there and
  the hook cannot mis-fire.
- Container volumes (storage_migrate path) -- a possible follow-up.

If the contract looks acceptable, I will follow up with an [RFC PATCH]
series against pve-storage and then qemu-server. My CLA is on file.

Tracking/details:
https://github.com/ciroiriarte/pve-FCLUPlugin/issues/11

Thanks,
Ciro Iriarte


