From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH-SERIES qemu/common/storage/qemu-server/container/manager v5 00/32] backup provider API
Date: Fri, 21 Mar 2025 14:48:20 +0100
Message-ID: <20250321134852.103871-1-f.ebner@proxmox.com>

v4: https://lore.proxmox.com/pve-devel/20241114150754.374376-1-f.ebner@proxmox.com/
v3: https://lore.proxmox.com/pve-devel/20241107165146.125935-1-f.ebner@proxmox.com/

Changes in v5:
* Drop already applied patches.
* Rebase on latest master.
* Set finishing state and end time in backup state in QEMU to avoid
  wrongly detecting previous backup as still active leading to a
  warning.
* Unbreak container restore with a bandwidth limit.
* Switch from 'block-device' mechanism to 'file-handle', providing
  access to the image contents as a (virtual) file created with
  'nbdfuse'. The reason is that block devices created via qemu-nbd
  will lead to a lot of ugly error messages if the device is
  disconnected before partition probing is finished (which can be
  delayed even until the end of the backup, apparently there is a
  lock). This also avoids the need to load the 'nbd' kernel module
  and not exposing the devices as host block devices is just nicer,
  potentially also security-wise. This requires using the fallocate
  syscall instead of the previous BLKDISCARD ioctl. A helper for that
  is provided via the Storage/Common.pm module.
* Indicate support for QEMU feature via 'backup-access-api' flag
  returned by 'query-proxmox-support' QMP command rather than
  requiring a version guard, so it can land whenever.
* Update storage's ApiChangelog, split out API age+version bump into
  its own patch.
* Let plugins define their own list of sensitive properties.
* Add postinst snippet to load NBD module during qemu-server upgrade.
* Check nbd/parameters directory instead of nbd/coresize to determine
  whether the module is loaded. coresize was just picked on a whim,
  parameters seems cleaner.
* Handle edge case with size not being a single number for character
  devices when parsing/checking tar contents.
* Borg plugin:
  * improve SSH options/handling
  * improve handling and permissions of run and ssh directories
  * mount borg archives to subdirectory of run dir
  * add helper for common command environment
  * move to Custom subfolder
* Directory example plugin:
  * use NBD device node reservation mechanism.
  * die if NBD module is not loaded.
  * Unbreak logging warnings by using the correct function.

Changes in v4:
* Drop already applied patches.
* Rework run_in_userns() and related helpers.
* Improve child PID handling for VM 'block-device' method backup
* Improve child PID and volume cleanup for VM 'block-device' method
  backup
* Improve /dev/nbdX handling by adding reservation and checks
* Add modules-load and modprobe configs for 'nbd' module
* Add some more logging

Changes in v3:
* Add storage_has_feature() helper and use it to decide on whether the
  storage uses a backup provider, instead of having this be implicit
  with whether a backup provider is returned by new_backup_provider().
* Fix querying block-node size for fleecing in stop mode, by issuing
  the QMP command only after the VM is enforced running.
* Run backup_container() in user namespace associated to the
  container.
* Introduce a 'prepare' phase for backup_hook(), to be used to
  prepare for running in that user namespace context.
* Pass in guest and firewall config as raw data instead of by file
  name (so files don't have to be accessible in user namespace context
  for containers).
* Run restore of containers with 'directory' mechanism in user
  namespace switching from 'rsync' to 'tar' which is easier to "split"
  into a privileged and unprivileged half.
* Check potentially untrusted tar archives.
* Borg plugin: make SSH work and use that.

Changes in v2:
* Add 'block-device' backup mechanism for VMs. The NBD export is
  mounted by Proxmox VE and only the block device path (as well as a
  callback to get the next dirty range for bitmaps) is passed to the
  backup provider.
* Add POC example for Borg - note that I tested with borg 1.2.4 in
  Debian and only tested with a local repository, not SSH yet.
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better
  flexibility to allow preparing the whole restore. Question is
  if restore_vm_volume_init() and restore_vm_volume_cleanup() should
  be dropped (but certain providers might prefer using only those)?
  Having both is more flexible, but makes the API longer of course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and
  backup_container(), passing along the configuration files, rather
  than having dedicated methods for the configuration files, for
  giving the backup provider more flexibility.
* Some renames in API methods/params to improve clarity.
* Pass backup time to backup 'start' hook and use that in the
  directory example rather than the job start time.
* Use POD for base plugin documentation and flesh out documentation.
* Use 'BackupProvider::Plugin::' namespace.
* Various smaller improvements in the directory provider example.

======

A backup provider needs to implement a storage plugin as well as a
backup provider plugin. The storage plugin is for integration in
Proxmox VE's front-end, so users can manage the backups via
UI/API/CLI. The backup provider plugin is for interfacing with the
backup provider's backend to integrate backup and restore with that
backend into Proxmox VE.

This is an initial draft of an API and required changes to the backup
stack in Proxmox VE to make it work. Depending on feedback from other
developers and interested parties, it can still substantially change.

======

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the whole
   backup job.
2. backup_hook() is called during the start/end/abort phases of the
   backup of an individual guest. There also is a 'prepare' phase
   useful for container backups, because the backup method for
   containers itself is executed in the user namespace context
   associated to the container.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'file-handle' or 'nbd' for VMs, and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup or not. It is enough
to implement one mechanism for VMs and one mechanism for containers.
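
As a rough illustration of the decision described above, the following
Python sketch models which mechanisms are valid for which guest type.
It is not the Perl API from this series: the function name and the
guest type strings 'qemu' and 'lxc' are assumptions made for the
example, only the mechanism names are taken from the text.

```python
# Illustrative model only; the actual plugin API is Perl and documented
# in PVE::BackupProvider::Plugin. All names except the mechanism
# strings are assumptions made for this sketch.
BACKUP_MECHANISMS = {
    "qemu": ("file-handle", "nbd"),  # VM mechanisms named in the text
    "lxc": ("directory",),           # container mechanism named in the text
}


def check_backup_mechanism(guest_type, mechanism):
    """Return the mechanism if it is valid for the guest type, else raise."""
    allowed = BACKUP_MECHANISMS.get(guest_type)
    if allowed is None:
        raise ValueError(f"unknown guest type '{guest_type}'")
    if mechanism not in allowed:
        raise ValueError(
            f"mechanism '{mechanism}' not supported for '{guest_type}', "
            f"expected one of: {', '.join(allowed)}"
        )
    return mechanism
```

A provider that only implements 'nbd' for VMs and 'directory' for
containers would pass this check for both guest types, which matches
the "one mechanism each is enough" rule above.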

Next, there are methods for backing up the guest's configuration and
data, backup_vm() for VM backup and backup_container() for container
backup, with the latter running in the user namespace associated to
the container.

Finally, there are some helpers, e.g. for getting the provider name or
the volume ID for the backup target, as well as for handling the
backup log.

1.1. Backup Mechanisms

VM:

Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).

VM mechanism 'file-handle':

The snapshot access is exposed via a file descriptor. A subroutine to
read the dirty regions for incremental backup is provided as well.
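
The read-then-discard pattern for an incremental backup could look
roughly like the Python sketch below. It is a model under stated
assumptions, not the Perl interface: the function and all four
parameters are hypothetical, with the callbacks standing in for the
dirty-region subroutine, the provider's own storage logic and the
discard step (which this series performs via the fallocate syscall).

```python
def backup_dirty_ranges(image, next_dirty_range, write_chunk, discard):
    """Copy only the dirty ranges and discard each one once it is saved.

    image: seekable binary file object standing in for the snapshot access
    next_dirty_range: callable returning (offset, length), or None when done
    write_chunk: callable(offset, data) storing data on the backup target
    discard: callable(offset, length) releasing the range on the source side
    All four are hypothetical stand-ins for this sketch.
    """
    total = 0
    while (dirty := next_dirty_range()) is not None:
        offset, length = dirty
        # Only read inside the reported range; reading outside of the
        # dirty parts results in an error according to the text above.
        image.seek(offset)
        data = image.read(length)
        if len(data) != length:
            raise IOError(f"short read at offset {offset}")
        write_chunk(offset, data)
        # Discard after saving, so the fleecing image does not grow
        # more than necessary.
        discard(offset, length)
        total += length
    return total
```

Passing the discard step in as a callback keeps the loop independent
of how the range is actually released.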

VM mechanism 'nbd':

The snapshot access and, if used, bitmap are exported via NBD.

Container mechanism 'directory':

A copy or snapshot of the container's filesystem state is made
available as a directory. The method is executed inside the user
namespace associated to the container.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall
configuration. The restore mechanism itself is implemented via a pair
of methods: an init method for making the data available to Proxmox
VE, and a cleanup method that is called after the restore.
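
The init/cleanup pairing can be pictured as a resource scope around
the restore. The Python sketch below models that idea with a context
manager; it is not the Perl API. The name restore_access() and both
callables are hypothetical, and running cleanup also when the restore
fails is an assumption of the sketch rather than something stated
above.

```python
from contextlib import contextmanager


@contextmanager
def restore_access(init, cleanup):
    """Pair an init step with a cleanup step around a restore.

    init: callable returning whatever describes the restore source,
          for example a path (hypothetical stand-in for the init method)
    cleanup: callable releasing what init set up (hypothetical stand-in
             for the cleanup method)
    """
    source = init()
    try:
        yield source
    finally:
        # Assumption: cleanup should run even if the restore raised.
        cleanup()


# Usage: the body of the 'with' block is where the restore would run.
with restore_access(lambda: "/path/to/image", lambda: print("cleaned up")) as src:
    print("restoring from", src)
```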

For VMs, a restore_vm_get_device_info() helper is also required, to
get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img':

The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
so it can also be an NBD URI or similar.

Container mechanism 'directory':

The backup provider gives the path to a directory with the full
filesystem structure of the container.

Container mechanism 'tar':

The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.

See the PVE::BackupProvider::Plugin module for the full API
documentation.

======

This series adapts the backup stack in Proxmox VE to allow using the
above API. For QEMU, backup access setup and teardown QMP commands are
implemented to be able to provide access to a consistent disk state to
the backup provider.

The series also provides an example implementation for a backup
provider as a proof-of-concept, exposing the different features.

======

Open questions:

Should the backup provider plugin system also follow the same API
age+version schema with a Custom/ directory for external plugins
derived from the base plugin?

Should the bitmap action be passed directly to the backup provider?
I.e. have 'not-used', 'not-used-removed', 'new', 'used', 'invalid',
instead of only 'none', 'new' and 'reuse'. It makes the API slightly
more complicated. Is there any situation where a backup provider could
care whether the bitmap is new because it was the first one, or new
because the previous one was invalid? Both cases require the backup
provider to do a full backup.
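
To make the trade-off concrete, the Python sketch below shows one way
the five detailed actions could collapse into the three modes. The
mapping is inferred from the wording of this question and is an
assumption, not taken from the implementation; the function names are
hypothetical as well.

```python
# Assumed mapping from the detailed bitmap actions to the three modes
# currently passed to the provider. Inferred from the text above, not
# copied from the actual code.
BITMAP_ACTION_TO_MODE = {
    "not-used": "none",
    "not-used-removed": "none",
    "new": "new",
    "used": "reuse",
    "invalid": "new",  # previous bitmap was invalid, so start over
}


def bitmap_mode(action):
    """Translate a detailed bitmap action into the simplified mode."""
    try:
        return BITMAP_ACTION_TO_MODE[action]
    except KeyError:
        raise ValueError(f"unknown bitmap action '{action}'") from None


def needs_full_backup(action):
    """Only a reused bitmap allows an incremental backup."""
    return bitmap_mode(action) != "reuse"
```

Under this mapping, 'new' and 'invalid' are indistinguishable to the
provider, which is exactly the information the question asks about.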

======

Feedback is very welcome, especially from people wishing to implement
such a backup provider plugin! Please tell me what issues you see with
the proposed API, what would and wouldn't work from your perspective?

======

Dependencies: libpve-storage-perl depends on pve-common. pve-manager,
pve-container and qemu-server all depend on new libpve-storage-perl.
pve-manager also build-depends on the new libpve-storage-perl for its
tests. pve-container depends on new pve-common, i.e. 8.2.9 for the
"tools: run fork: allow running code in parent after fork"
functionality. To keep things clean, pve-manager should also depend on
new pve-container and qemu-server.

======

qemu:

Fiona Ebner (5):
  PVE backup: add target ID in backup state
  PVE backup: get device info: allow caller to specify filter for which
    devices use fleecing
  PVE backup: implement backup access setup and teardown API for
    external providers
  PVE backup: implement bitmap support for external backup access
  PVE backup: backup-access api: indicate situation where a bitmap was
    recreated

 pve-backup.c         | 473 ++++++++++++++++++++++++++++++++++++++++++-
 pve-backup.h         |  16 ++
 qapi/block-core.json |  74 ++++++-
 system/runstate.c    |   6 +
 4 files changed, 563 insertions(+), 6 deletions(-)
 create mode 100644 pve-backup.h


common:

Fiona Ebner (1):
  syscall: expose fallocate syscall

 src/PVE/Syscall.pm | 1 +
 1 file changed, 1 insertion(+)


storage:

Fiona Ebner (8):
  add storage_has_feature() helper function
  common: add deallocate helper function
  plugin: introduce new_backup_provider() method
  config api/plugins: let plugins define sensitive properties themselves
  plugin api: bump api version and age
  extract backup config: delegate to backup provider for storages that
    support it
  add backup provider example
  Borg example plugin

 ApiChangeLog                                  |   32 +
 src/PVE/API2/Storage/Config.pm                |    4 +-
 src/PVE/BackupProvider/Makefile               |    3 +
 src/PVE/BackupProvider/Plugin/Base.pm         | 1161 +++++++++++++++++
 src/PVE/BackupProvider/Plugin/Borg.pm         |  465 +++++++
 .../BackupProvider/Plugin/DirectoryExample.pm |  802 ++++++++++++
 src/PVE/BackupProvider/Plugin/Makefile        |    5 +
 src/PVE/Makefile                              |    1 +
 src/PVE/Storage.pm                            |   31 +-
 src/PVE/Storage/BTRFSPlugin.pm                |    1 +
 src/PVE/Storage/CIFSPlugin.pm                 |    1 +
 src/PVE/Storage/CephFSPlugin.pm               |    1 +
 src/PVE/Storage/Common.pm                     |   30 +
 .../Custom/BackupProviderDirExamplePlugin.pm  |  308 +++++
 src/PVE/Storage/Custom/BorgBackupPlugin.pm    |  689 ++++++++++
 src/PVE/Storage/Custom/Makefile               |    6 +
 src/PVE/Storage/DirPlugin.pm                  |    1 +
 src/PVE/Storage/ESXiPlugin.pm                 |    1 +
 src/PVE/Storage/GlusterfsPlugin.pm            |    1 +
 src/PVE/Storage/ISCSIDirectPlugin.pm          |    1 +
 src/PVE/Storage/ISCSIPlugin.pm                |    1 +
 src/PVE/Storage/LVMPlugin.pm                  |    1 +
 src/PVE/Storage/LvmThinPlugin.pm              |    1 +
 src/PVE/Storage/Makefile                      |    1 +
 src/PVE/Storage/NFSPlugin.pm                  |    1 +
 src/PVE/Storage/PBSPlugin.pm                  |    5 +
 src/PVE/Storage/Plugin.pm                     |   37 +
 src/PVE/Storage/RBDPlugin.pm                  |    1 +
 src/PVE/Storage/ZFSPlugin.pm                  |    1 +
 src/PVE/Storage/ZFSPoolPlugin.pm              |    1 +
 30 files changed, 3590 insertions(+), 4 deletions(-)
 create mode 100644 src/PVE/BackupProvider/Makefile
 create mode 100644 src/PVE/BackupProvider/Plugin/Base.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/Makefile
 create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
 create mode 100644 src/PVE/Storage/Custom/BorgBackupPlugin.pm
 create mode 100644 src/PVE/Storage/Custom/Makefile


qemu-server:

Fiona Ebner (9):
  backup: keep track of block-node size for fleecing
  backup: fleecing: use exact size when allocating non-raw fleecing
    images
  backup: allow adding fleecing images also for EFI and TPM
  backup: implement backup for external providers
  backup: implement restore for external providers
  backup restore: external: hardening check for untrusted source image
  backup: future-proof checks for QEMU feature support
  backup: support 'missing-recreated' bitmap action
  backup: bitmap action to human: lie about TPM state

 PVE/API2/Qemu.pm         |  30 ++-
 PVE/QemuServer.pm        | 145 ++++++++++++
 PVE/VZDump/QemuServer.pm | 462 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 622 insertions(+), 15 deletions(-)


container:

Fiona Ebner (7):
  add LXC::Namespaces module
  backup: implement backup for external providers
  backup: implement restore for external providers
  external restore: don't use 'one-file-system' tar flag when restoring
    from a directory
  create: factor out compression option helper
  restore tar archive: check potentially untrusted archive
  api: add early check against restoring privileged container from
    external source

 src/PVE/API2/LXC.pm       |  14 ++
 src/PVE/LXC/Create.pm     | 273 +++++++++++++++++++++++++++++++++++---
 src/PVE/LXC/Makefile      |   1 +
 src/PVE/LXC/Namespaces.pm |  60 +++++++++
 src/PVE/VZDump/LXC.pm     |  40 +++++-
 5 files changed, 365 insertions(+), 23 deletions(-)
 create mode 100644 src/PVE/LXC/Namespaces.pm


manager:

Fiona Ebner (2):
  ui: backup: also check for backup subtype to classify archive
  backup: implement backup for external providers

 PVE/VZDump.pm                      | 57 ++++++++++++++++++++++++++----
 test/vzdump_new_test.pl            |  3 ++
 www/manager6/Utils.js              | 10 +++---
 www/manager6/grid/BackupView.js    |  4 +--
 www/manager6/storage/BackupView.js |  4 +--
 5 files changed, 63 insertions(+), 15 deletions(-)


Summary over all repositories:
  48 files changed, 5204 insertions(+), 63 deletions(-)

-- 
Generated by git-murpp 0.5.0


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

